San Francisco, California, United States

ML Engineer focused on Data Engineering

 Job Description:

Job brief

Join our San Francisco office as an ML Engineer focused on Data Engineering. Visa sponsorship is available for global talent.


We are looking for a machine learning engineer with an emphasis on data engineering to manage and improve our massive datasets, which include hundreds of millions of photos and videos. This position is essential to supporting our research and development teams by ensuring effective data access, storage, and versioning. Your contributions will fuel state-of-the-art research and development initiatives while directly improving the efficiency of our data handling procedures.

What you'll do:


  • Implement and maintain distributed storage solutions to provide seamless data access across all training machines.

  • Develop strategies to reduce data storage costs while ensuring high availability and reliability.

  • Optimize input/output operations to accelerate data retrieval and processing during training and inference phases.

  • Build and maintain backend systems that track dataset versions, ensuring reproducibility and consistency in experiments.


Our culture:

  • We work full-time and in-person at our waterfront office in San Francisco.

  • We believe that demonstrated interest in the creative space is key: our team includes musicians, designers, visual artists and more.


Example tacit skills we're looking for:


  • Experience with distributed storage systems, including deployment, configuration, and optimization.

  • Strong skills in Python and (ideally) C++ for developing data processing pipelines and integrating storage solutions.

  • Experience in building and maintaining data pipelines capable of handling large-scale datasets efficiently.

  • Experience with Kubernetes (K8s).

  • Generalist back-end experience, with familiarity with ETL and infrastructure.


What we offer:


  • Openness to sponsoring international candidates (e.g., STEM OPT, OPT, H-1B, O-1, E-3)

  • Work alongside a world-class team developing the future of AI tooling

  • Significant impact on our market presence and growth

  • Competitive compensation (75th percentile of market rates) with significant equity upside


  Required Skills:

ETL, Data Engineering, Organization, Operations, High Availability, Data Processing, Output, Pipelines, Compensation, Storage, Machine Learning, Infrastructure, Availability, C++, Research, Engineering, Python, Training