About the job
Senior Data Engineer
Location: Remote (Global)
Team: Data Engineering
Summary
You'll step into an existing Databricks and SQL data environment, assess what's there, and take ownership of stabilizing, maintaining, and improving it. This means reading real production pipelines, identifying what's broken or brittle, writing clean SQL and PySpark scripts for data flows, and making the system more reliable over time. This is not a greenfield build — it's a technical ownership role that demands analytical depth and the ability to move confidently inside someone else's codebase.
Requirements (Must-Haves)
Excellent English communication skills — you'll interact directly with the team, ask the right questions, and communicate findings and blockers clearly without hand-holding
4+ years of hands-on SQL experience — writing and reading complex queries is your baseline, and you can optimize a slow query by reading an execution plan, not by guessing
Recent (2022–2025) Databricks proficiency in production contexts: Delta Lake, notebooks, jobs, cluster configuration — you've debugged failed jobs and made real architectural decisions on the platform
Python and/or PySpark scripting for data import/export pipelines — you've built and maintained these in real environments, not just notebooks
Experience working inside an existing codebase or data system — you can read legacy SQL and pipeline code, form a clear opinion on what to keep vs. rewrite, and act on it
End-to-end data pipeline ownership: from ingestion through transformation to loading, including scheduling and failure handling
SQL query optimization that goes beyond adding an index — partitioning strategies, caching, join optimization, execution plan analysis
Data modeling experience: star/snowflake schemas, dimensional modeling best practices
Nice-to-Haves
Experience with cloud data warehouses: Snowflake, Redshift, or BigQuery
Familiarity with orchestration tools: Apache Airflow, dbt, or Prefect
AWS data services exposure: S3, Glue, Lambda, or RDS
Git-based workflows and CI/CD practices applied to data projects
Understanding of data quality frameworks, lineage tracking, or observability tooling
Bonus Points
Databricks Certified Data Engineer Associate or Professional
Prior experience doing technical assessments or migrations of legacy data systems — not just greenfield builds
Streaming pipeline experience: Kafka, Spark Streaming, or Delta Live Tables
Open-source contributions or public work in the data engineering space