Data Engineer
Job Description:
Project: Energy Sector
Location: EU
Start Date: ASAP
Duration: 6-12 months
Seniority: Mid / Senior
Language: Lithuanian (required)
Role Overview:
We are looking for an experienced Data Engineer / PySpark Developer to join a project in the energy sector. The role involves designing, developing, and maintaining scalable data pipelines and processing solutions using PySpark. You will collaborate closely with analysts, data scientists, and other engineers to ensure reliable, high-quality data delivery that supports business intelligence and analytics needs.
Key Responsibilities:
- Design, develop, and maintain data pipelines using PySpark
- Optimize data processing workflows for performance and scalability
- Ensure data accuracy, consistency, and quality across platforms
- Collaborate with cross-functional teams to translate business requirements into data solutions
- Participate in code reviews and contribute to best practices in data engineering
Must-Have Requirements:
- Minimum 3 years of experience in PySpark development
- Strong understanding of ETL processes and data pipeline design
- Experience with optimizing performance and working with large datasets (~20 TB/year)
- Familiarity with data lake or data warehouse concepts
- Fluent in Lithuanian
Nice to Have:
- Experience with cloud data platforms (preferably Azure and Microsoft Fabric)
- Knowledge of CI/CD pipelines and version control (e.g., Git)
- Background in energy sector projects
Required Skills:
Data Engineering, PySpark, ETL, Data Pipelines, Data Processing, Scalability, CI/CD, Version Control, Git, Azure, Business Intelligence, Analytics