Machine Learning Engineer - Drug Discovery
Role Summary
We are seeking a Machine Learning Engineer who has experience working with small molecule drug discovery datasets and can hit the ground running to deliver high-impact, production-grade solutions that advance our programs. The ideal candidate will build and scale data and ML infrastructure across the early research, lead optimization, and development phases of the drug discovery pipeline. You will enable data-driven science while ensuring robust engineering practices and adherence to FAIR data principles.
Key Responsibilities
- Support management of biobank-scale datasets in Polaris, Maze's internal platform supporting Compass, by building scalable data ingestion, cleaning, processing, and validation pipelines.
- Work with scientific compute teams to design and deploy machine learning models to support workflows in research and small molecule drug discovery (compound property prediction, assay data prediction, data analysis).
- Lead the evaluation and integration of Large Language Models (LLMs) to automate data ingestion workflows, enhance intelligent querying, and support user-facing variant association and scientific visualization platforms.
- Design and operate scalable ML and data platforms leveraging Terraform (IaC) and Git-based CI/CD pipelines, incorporating workflow orchestration, automated model lifecycle management, and production-grade monitoring and reliability.
- Collaborate with the development organization to evaluate and deploy ML tools that support workflows across Regulatory, Clinical Operations, and Medical Affairs.
- Collaborate cross-functionally to translate scientific requirements into production-grade systems.
Required Qualifications
- Master's degree in Computer Science, Machine Learning, Bioinformatics, Data Engineering, or a related field.
- 3+ years of industry experience building production-grade data and ML pipelines, preferably in life sciences supporting drug discovery.
- Hands-on experience deploying AI/ML models in drug discovery applications (e.g., computational biology/chemistry workflows).
- Experience with FAIR data principles and strong programming skills in Python and SQL (R is a plus).
- Proven experience in deploying and maintaining ML systems, including CI/CD, workflow orchestration, and monitoring.
- Experience with workflow orchestration tools (e.g., Airflow, Prefect).
- Experience with containerization and cloud infrastructure (Docker, Kubernetes, AWS or similar).