About the job ML Engineer (Austin, TX)
About Us
We are a stealth-mode startup building next-generation infrastructure for the AI industry. Our mission is to make advanced language models portable, efficient, and customizable for real-world deployments. We’re building tools that allow vendors to fine-tune models easily and deploy them securely on diverse hardware.
Role
We are seeking a Machine Learning Engineer (Python) to help design and implement our training engine. This is not an academic research role: you will productize and automate existing fine-tuning techniques (LoRA/QLoRA) so that vendors can train and manage their own adapters with minimal effort.
You’ll work closely with backend engineers (Node.js) who orchestrate jobs and dashboards, while you focus on the training pipelines and adapter export logic.
Eligibility Requirement: Although this is a remote role, it is open only to U.S. citizens or green card holders based in Austin, Texas.
Responsibilities
- Implement and maintain LoRA/QLoRA fine-tuning pipelines using PyTorch + Hugging Face Transformers + PEFT.
- Develop logic for incremental training and adapter stacking, producing clean, versioned “delta packs.”
- Automate data preprocessing (tokenization, formatting, filtering) for user-supplied datasets.
- Build training scripts/workflows that integrate with orchestration backends (Node.js, REST/gRPC, or job queues).
- Implement monitoring hooks (loss curves, checkpoints, eval metrics) to feed into dashboards.
- Collaborate with DevOps to ensure reproducible, portable training environments.
- Write tests to guarantee reproducibility and correctness of adapter outputs.
- Occasionally come into the office for discussions and team collaboration.
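To illustrate the kind of "delta pack" tooling this role involves, here is a minimal sketch of packaging an adapter weights file into a versioned directory with a content-hashed manifest. Every name here (`pack_adapter`, the manifest layout, the file naming) is invented for the example and is not an existing API; a real pipeline would operate on PEFT adapter checkpoints.

```python
import hashlib
import json
from pathlib import Path


def pack_adapter(adapter_file: Path, out_dir: Path, version: str) -> Path:
    """Copy an adapter weights file into a versioned "delta pack" directory
    and write a manifest with a SHA-256 content hash, so downstream checks
    can verify that the packed adapter is byte-identical to the original.
    All names and the manifest layout are illustrative only.
    """
    payload = adapter_file.read_bytes()
    digest = hashlib.sha256(payload).hexdigest()

    # One directory per pack version, e.g. out/delta-pack-1.0.0/
    pack_dir = out_dir / f"delta-pack-{version}"
    pack_dir.mkdir(parents=True, exist_ok=True)
    (pack_dir / adapter_file.name).write_bytes(payload)

    manifest = {
        "version": version,
        "file": adapter_file.name,
        "sha256": digest,
    }
    manifest_path = pack_dir / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return manifest_path
```

Re-hashing the packed file and comparing it against the manifest is the sort of reproducibility check the testing responsibility above refers to.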
Requirements
- Strong programming skills in Python.
- Hands-on experience with PyTorch and the Hugging Face ecosystem (Transformers, Datasets, PEFT).
- Familiarity with LoRA/QLoRA or parameter-efficient fine-tuning methods.
- Understanding of mixed precision training (FP16/BF16) and memory optimization techniques.
- Experience building training scripts that are production-ready (reproducibility, logging, error handling).
- Comfortable working in Linux GPU environments (CUDA, ROCm).
- Ability to collaborate with backend/frontend engineers who are not ML specialists.
Nice to Have
- Experience with bitsandbytes, xformers, or flash-attention.
- Familiarity with distributed training (multi-GPU, NCCL, DeepSpeed, or Accelerate).
- Prior work in MLOps or packaging ML pipelines for deployment.
- Contributions to open-source ML libraries.
Why Join
- Build the core training product that lets vendors adapt models safely and efficiently.
- Focus on product engineering, not open-ended research.
- Collaborate with a lean, highly technical team at the intersection of AI and systems.
- Competitive compensation, equity potential, and flexible remote work.
To apply, please use this link:
https://www.baasi.com/career/apply/3135729