Data Engineer
Who We Are
Provoke is a global consulting firm building AI-native solutions that transform how work gets done. Founded on a culture of innovation, growth, and curiosity, we partner with global clients to design and deploy agentic AI embedded directly into workflows so teams move faster, think smarter, and scale with purpose.
What We Do
We build bespoke software using modern technologies to help our clients solve complex business problems, achieve operational excellence, and flourish in an increasingly digital world.
The Provoke Experience
We are committed to building high-performing, diverse teams. We provide tangible rewards for effort, with purposeful in-person collaboration and learning opportunities to fast-track your career. We focus on diversity in race, gender, orientation, and experience, because we know diversity fuels innovation.
Job Overview:
As a Data Engineer, you will be responsible for designing, developing, and managing scalable data pipelines that feed into our data lake using MS Fabric. You will work closely with our Lead Data Engineer, analysts, and other stakeholders to ensure that the data infrastructure supports both current and future needs. Your role will involve integrating various data sources, ensuring data quality, and building a robust, scalable data lake architecture.
Key Responsibilities:
- Design, implement, and manage data lake solutions using MS Fabric.
- Develop and maintain data pipelines to extract, transform, and load (ETL) data from various structured and unstructured sources.
- Collaborate with cross-functional teams to understand data requirements and ensure seamless integration of new data sources into the data lake.
- Optimize data storage and retrieval processes to improve performance and scalability.
- Ensure the integrity, security, and availability of the data lake by implementing best practices in data governance and management.
- Perform data profiling, cleansing, and transformation to ensure data quality.
- Monitor and troubleshoot data flows to ensure reliable operation of the data pipelines.
- Stay up to date with the latest trends and technologies in data engineering, particularly in relation to MS Fabric and data lake architectures.
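The pipeline work described above follows the standard extract-transform-load pattern. As a rough illustration only (not Provoke's actual stack), a minimal sketch in Python with pandas might look like the following; the source records, column names, and cleansing rules are all hypothetical:

```python
import pandas as pd

def extract(records):
    # Extract: in production this would pull from an API, database, or
    # cloud storage; here we accept an in-memory list of dicts.
    return pd.DataFrame(records)

def transform(df):
    # Transform: basic data-quality steps — de-duplicate rows,
    # normalize column names, and fill missing numeric values.
    df = df.drop_duplicates()
    df.columns = [c.strip().lower() for c in df.columns]
    df["amount"] = df["amount"].fillna(0.0)
    return df

def load(df, path):
    # Load: write to Parquet, a common columnar format in
    # lakehouse architectures (requires pyarrow or fastparquet).
    df.to_parquet(path, index=False)

raw = [
    {"ID": 1, "Amount": 10.5},
    {"ID": 1, "Amount": 10.5},   # duplicate record
    {"ID": 2, "Amount": None},   # missing value
]
clean = transform(extract(raw))
```

In a Fabric context the same extract/transform/load stages would typically run inside a notebook or Data Pipeline, landing data in OneLake rather than a local Parquet file.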
Requirements:
- Proven experience as a Data Engineer, with a strong focus on data lake architecture and development.
- Hands-on experience with MS Fabric and building data lakes.
- Proficiency in data integration techniques for structured and unstructured data from multiple sources (e.g., APIs, databases, cloud services).
- Strong programming skills in Python, SQL, or other relevant languages.
- Experience with cloud platforms like Azure, AWS, or Google Cloud for data storage and processing.
- Solid understanding of data warehousing, ETL processes, and big data technologies.
- Strong experience with Power BI and DAX for data visualization.
- Knowledge of data governance principles and practices.
- Strong problem-solving skills with attention to detail.
- Excellent communication and collaboration skills.
Preferred:
- Experience with additional data tools and platforms, such as Apache Spark, Hadoop, or Databricks.
- Familiarity with machine learning workflows and model deployment in data lake environments.
- Experience working in Agile or DevOps environments.