Job Openings
Data Engineer
About the Job
Key Responsibilities
Data Operations & Pipeline Support
- Assist in collecting, ingesting, validating, and storing structured and unstructured batch data arriving via edge nodes or direct database connections
- Support ETL/ELT jobs running on Hadoop, Hive, Impala, and Spark
- Monitor daily data loads, troubleshoot failures, and ensure data availability for analytics use cases
- Maintain HDFS directory structure, Hive tables, and data partitions
- Perform file-level data quality checks, checksum validations, and table-level validations to ensure data consistency
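As a rough illustration of the file-level checksum validation mentioned above, a minimal sketch in Python might look like the following (the function names and the idea of comparing against a supplied expected checksum are illustrative assumptions, not a description of this team's actual tooling):

```python
import hashlib


def md5_of_file(path, chunk_size=8192):
    """Compute the MD5 checksum of a file, reading in chunks
    so large batch files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_file(path, expected_md5):
    """Return True if the file's checksum matches the expected value
    (e.g. one published alongside the file by the data source)."""
    return md5_of_file(path) == expected_md5
```

In practice a check like this would run after each batch landing, with mismatches flagged for the troubleshooting workflow described above.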
Platform & Infrastructure Operations
- Support the operation of on-prem Hadoop clusters (Cloudera)
- Assist in OS-level tasks: log checks, service restarts, disk usage monitoring, user/permission handling
- Assist in regular Big Data cluster health checks
- Support platform upgrades, patches, configuration changes, and security hardening efforts managed by the senior engineer
- Work with network and system teams during installation, troubleshooting, or hardware issues
Tools & Technologies
- Assist in running and maintaining data flows involving Hive, Impala, HDFS, Spark, Kafka (basic), HBase (basic), and Linux environments
- Use tools such as NiFi and SFTP for data movement, including NiFi flow development and NiFi cluster management
- Support API-based data push/pull if required for integrations
Data Governance & Documentation
- Maintain metadata, data dictionary updates, and platform documentation
- Ensure compliance with Kerberos/LDAP authentication and Cloudera Navigator governance processes
- Record operational runbooks and incident logs
Collaboration & Support
- Work under the senior engineer to ensure continuous operations of the client environment
- Participate in joint troubleshooting with the client team during data-source onboarding
- Provide L1/L2 support for data ingestion, cluster operations, and daily job executions
Work Complexity and Role Expectation
- Work on assigned operational tasks within the Big Data platform under guidance
- Support development, testing, and automation of simple data flows
- Assist with routine batch workloads and testbed validations
- Participate as a team member in platform enhancements, monitoring improvements, and data integration activities
Person Specifications
Education
- Bachelor's degree in Computer Science, IT, Electronics/Telecom Engineering, or a related field
Technical Skills
- Basic knowledge of the Hadoop ecosystem: HDFS, Hive, Spark, YARN (hands-on exposure is an added benefit)
- Familiarity with Linux shell commands; ability to navigate logs and services
- Good understanding of SQL; able to write and troubleshoot complex queries
- Exposure to Python/Scala/Java is an added advantage
- Basic understanding of data pipelines, ETL processes, and batch data workflows
- Exposure to Cloudera platform is a plus
Experience
- 1 - 2 years of experience in Data Engineering, Database operations, or Big Data platform support
- Experience in telecom domain or enterprise data environments is an added advantage
Soft Skills
- Good analytical and troubleshooting mindset
- Ability to collaborate with senior engineers and follow structured operational practices
- Effective communication and willingness to learn complex distributed systems