Job Openings M11 - Data Engineer

About the job M11 - Data Engineer

Responsibilities

  • Design, develop, and maintain data pipelines that extract data from various sources and formats, transform it according to business requirements, and load it into target systems.
  • Perform data extraction, cleaning, transformation, and loading.
  • Design, build, launch and maintain efficient and reliable large-scale batch and real-time data pipelines with data processing frameworks.
  • Integrate and consolidate data silos in a manner that is both scalable and compliant.
  • Collaborate with the Project Manager, Data Architect, Business Analysts, Frontend Developers, Designers, and Data Analysts to build scalable, data-driven products.
  • Work in an Agile Environment that practices Continuous Integration and Delivery.
  • Work closely with fellow developers through pair programming and code review process.

Requirements

  • Bachelor's degree in Computer Science, Software Engineering, or related field.
  • 3–5 years' experience in ETL/data integration projects.
  • Proficient in data cleaning and transformation using scripting languages (mandatory: SQL, Python) to ensure data accuracy and consistency; knowledge of R is an added advantage.
  • Proficient in building ETL pipelines (mandatory: SQL Server Integration Services (SSIS), Python, Snowflake; added advantages: AWS Lambda, ECS container tasks, EventBridge, AWS Glue, Spring, etc).
  • Proven hands-on experience with Microsoft SSIS and Snowflake.
  • Proficient in database design and various databases (mandatory: SQL, AWS S3, RDS; added advantages: PostgreSQL, Athena, MongoDB, PostGIS, MySQL, SQLite, VoltDB, Apache Cassandra, etc).
  • Experience in and passion for data engineering in big data environments using cloud platforms such as GCC and GCC+ (i.e. AWS, Azure).
  • Experience in building production-grade data pipelines, ETL/ELT data integration.
  • Experience in CI/CD pipelines and DevOps tools (e.g. GitLab).
  • Experience in automated provisioning tools (Ansible, Terraform, Puppet, Vagrant) will be an advantage.
  • Familiar with data modelling, data access, and data storage infrastructure like Data Mart, Data Lake, Data Virtualisation and Data Warehouse for efficient storage and retrieval.
  • Familiar with REST API and web requests/protocols in general.
  • Familiar with data governance policies, access control and security best practices.
  • Knowledge of system design, data structure and algorithms.
  • Knowledge of AI/ML concepts such as RAG (Retrieval-Augmented Generation) and MCP (Model Context Protocol).
  • Comfortable in both Windows and Linux development environments.
  • Interest in being the bridge between engineering and analytics.