Python Engineer (Data Engineering Focus)

About the job

Who You Are

You are a highly seasoned backend engineer with deep Python expertise and a strong command of end-to-end data systems. You thrive on architecting, scaling, and optimizing complex data ecosystems, whether structured or unstructured, batch or streaming. You understand that APIs are the surface area; the real engineering happens in the data pipelines, distributed systems, and performance-critical internals you build. You are process-oriented, emphasize high-quality engineering practices such as TDD, and excel at designing maintainable systems and shared libraries that can be reused across teams, knowing exactly when abstraction adds value and when it does not. You're deeply comfortable with Linux environments and containerization (Docker) as fundamental tools of your workflow.

What You'll Do

  • Architect, design, and maintain high-performance backend data pipelines capable of processing large-scale telemetry, diagnostics, and sensor data across distributed systems.
  • Lead ingestion, transformation, and storage strategies for both real-time streams and large batch workloads, ensuring reliability, scalability, and observability.
  • Optimize relational and NoSQL database schemas and queries for high-throughput, low-latency scenarios.
  • Partner closely with DevOps/SRE teams to build robust ETL/ELT workflows, data lakes, and automated data quality processes.
  • Integrate and orchestrate external APIs, data streams, and distributed messaging systems (Kafka, RabbitMQ, SQS).
  • Implement comprehensive logging, tracing, monitoring, and performance tuning strategies for backend services.
  • Build and maintain RESTful and event-driven services using frameworks such as FastAPI or Flask, with a strong emphasis on reliability, testability, and clean architecture.
  • Develop internal tools and shared libraries used across teams, establishing best practices and reusable patterns.

What You Need to Succeed

  • 7+ years of professional Python experience with an emphasis on large-scale, data-intensive backend systems.
  • Deep expertise with data processing frameworks (Pandas, Dask, PySpark, or equivalents) and distributed computing concepts.
  • Strong SQL & NoSQL expertise—PostgreSQL, MySQL, DynamoDB, or similar—combined with a senior-level understanding of indexing, query planning, and storage internals.
  • Hands-on experience with message queues (Kafka, RabbitMQ, SQS) in distributed, asynchronous architectures.
  • Proven experience designing ETL/ELT pipelines, data lake architectures, and high-throughput ingestion systems.
  • Strong competency in cloud platforms (AWS preferred), including scalable storage, compute, orchestration, and security best practices.
  • Strong understanding of containerization (Docker), CI/CD workflows, and Linux-based systems.
  • Process-oriented mindset with practical experience in TDD, automated testing, and maintaining high-quality codebases.

Nice-to-Have Skills (But Not Required)

  • Experience with Kubernetes, container orchestration, or AWS ECS.
  • Understanding of ML/AI workflows or an interest in collaborating with data science teams.
  • Familiarity with Java for navigating or refactoring legacy components.