Job Opening: ETL Developer

Roles and Responsibilities of the ETL Developer:

  • Must be able to understand and develop ANSI SQL queries.
  • Must be familiar with handling semi-structured data formats such as JSON, XML, and HTML.
  • Must be comfortable with query tools for distributed systems, such as Hive and Impala, and with other SQL clients (RDBMS, Python, Scala, etc.).
  • Must be comfortable transforming data to suit business needs.
  • Must be able to use data transformation and/or ETL tools through both tool-based and code-based development, e.g. Informatica or Talend for tool-based development, or the Spark or MapReduce frameworks in Python, Scala, or Java for code-based development.
  • Must have a background in data warehousing and reporting technologies.

Additional Bonus Qualifications:

  • Experience with end-to-end DevOps techniques is a plus, including the ability to create Jenkins pipelines and write unit test cases for automated testing. Experience with Git version control is a plus.
  • Experience with Docker and other container technologies is a plus.
  • Experience with the Flink streaming framework.
  • Experience using cloud technologies (AWS Redshift, S3, etc.).
  • Must be comfortable designing sustainable data architecture for big data systems, that is, balancing partitioning against the target HDFS block size and applying compaction techniques, bucketing, etc.
  • Must understand the small files problem in HDFS, storage formats such as the columnar formats (Parquet and ORC), and transmit formats (Avro).
  • Must understand resource allocation and proper resource sizing for distributed jobs, e.g. Spark executor and driver sizes.

Required Technological Skills:

  • Strong SQL skills
  • Data warehouse concepts
  • Experience in any of the following languages: Python, Java, Scala
  • Experience in Talend, Informatica, Ab Initio, and other GUI-based ETL tools
  • Comfortable working with Hive partitioning and bucketing