Data Operations Engineer

Washington, District of Columbia, United States

About the job Data Operations Engineer

EDUCATION / QUALIFICATIONS / EXPERIENCE

B.S. in a computer science or information systems field, or 5+ years of related work experience, required
Strong analytical and critical-thinking skills for solving complex problems
Strong technical background with a mix of development and automation skills
Outstanding attention to detail and the ability to consistently meet deadlines
Exceptional communication and interpersonal skills
Ability to work within a highly collaborative team and, as a self-starter, to work independently with little guidance
Experience in troubleshooting, performance tuning, and optimization
Proficient in shell scripting, Python, Scala or other programming languages
Knowledge of Spark/PySpark
Excellent SQL knowledge, ability to read/write SQL queries
Skilled in Hive (HQL) and HDFS
Experience working with both unstructured and structured data sets, including flat files, JSON, XML, ORC, Parquet and AVRO
Comfortable working with big data environments and dealing with large diverse data sets
Proficient in Linux environments
Familiarity with source code management/versioning tools such as GitHub
Understanding of CI/CD principles and best practices in data processing
Experience building data visualization dashboards that capture data quality metrics, using tools such as Tableau or Big Data Studio
Understanding of public cloud technologies such as AWS, GCP and Azure is a plus

MAJOR JOB RESPONSIBILITIES

Contribute to the maintenance, documentation, and monitoring of supported data pipelines
Continuously analyze supported data workflows for opportunities to improve reliability and timeliness against established SLAs
Conceive, develop, and apply improvements to workflows and monitoring to minimize the occurrence and impact of defects
Communicate with stakeholders when data is in error or delayed, providing clear plans and timelines for recovery and future prevention
Develop modifications to workflows using Git and GitHub
Assist with production support tickets and inquiries from consumers of supported data pipelines
Facilitate the onboarding of new products and pipelines into our suite of supported production processes

ABOUT YOU

You are passionate about improving the integrity, accuracy and reliability of data across the organization. You are a highly motivated individual with excellent analytical, critical-thinking and problem-solving skills. You bring substantial value to the team with your prior experience in building and supporting production data and reporting pipelines. Strong verbal, written and interpersonal skills give you the flexibility to work collaboratively with a team or independently with minimal supervision. You're a detail-oriented self-starter with the ability to multitask and thrive in a dynamic environment. Furthermore, your familiarity with techniques for automating, cleansing and standardizing data, both at rest and in motion, makes you a great fit for this role.

WHAT YOU'D BE DOING

As a Data Operations Engineer, you will be responsible for monitoring and managing the maintenance of multiple ETL data pipelines. These pipelines power hundreds to thousands of business-critical applications and reports that many teams throughout the organization, as well as external customers, use every day. Because of the scale, variety, and complexity of the processes we support, monitoring and maintaining the pipelines requires standardized or automated practices and tools. It is your duty to ensure that monitoring alerts Data Operations to any pipeline issues with appropriate timeliness and sensitivity, and that every alert is handled promptly and appropriately. Handling an alert can involve job regeneration, communication with stakeholders, definition and development of process enhancements via code, or updates to process documentation. In addition, you will assist with production support inquiries from stakeholders about the quality or timeliness of data in reporting. Finally, you will work with other teams to facilitate the onboarding of new data pipelines and products into our suite of supported production processes. Strong analytical and problem-solving skills are needed throughout to model, monitor, and troubleshoot the production processes.
