Machine Learning Engineer
Role Overview:
We're looking for a Machine Learning Engineer who thrives at the intersection of research and production. You'll be joining a fast-growing AI organization building scalable systems that power next-generation language and reasoning models. You'll work on end-to-end pipelines: prototyping novel algorithms, deploying them in production, and monitoring and optimizing them at scale.
What You'll Do:
- Collaborate with research teams to implement and refine machine learning/AI models (LLMs, multimodal, reasoning agents)
- Build robust ML pipelines: data collection, preprocessing, model training/fine-tuning, deployment, and monitoring
- Work with production infrastructure: containerized environments, distributed systems, scalable compute resources
- Develop and deploy models in a cloud-native setting (AWS/Azure/GCP) under production constraints (latency, throughput, reliability)
- Monitor model performance in production: latency, accuracy drift, cost, resilience
- Debug and optimize systems for scale: memory, compute, inference cost, streaming data
- Collaborate cross-functionally with data engineers, software engineers, product teams, and researchers
- Stay abreast of the latest ML/AI research and bring actionable improvements into production
What You Bring:
- M.Sc. or Ph.D. in Computer Science, Electrical Engineering, Mathematics, or a related field (or equivalent experience)
- 3+ years of hands-on experience implementing ML/AI systems in production
- Strong programming skills in Python (and optionally C++/Go)
- Experience with ML frameworks (e.g., PyTorch, TensorFlow, JAX)
- Proven track record of deploying models into production (cloud, containers, microservices)
- Familiarity with distributed computing, scalable architectures, and production ML constraints
- Good understanding of ML operations: monitoring, model drift, post-deployment maintenance
- Strong debugging and optimization skills: performance, cost, latency
- Excellent collaboration and communication skills
- Bonus: experience with large language models, retrieval-augmented generation, and multimodal systems