Founding Applied ML Engineer, Audio & Speech
Job Description:
Location: San Francisco (strongly preferred) or New York City
Position Type: Full-Time (Permanent)
Start Timeline: ASAP
Hiring Target: 2
About
A fast-scaling early-stage startup is building the world's most advanced audio dataset infrastructure: global, studio-grade, and optimized for ML training. The team is leveraging audio to bridge digital and real-world AI applications, with early traction from major tech labs and customers.
Why join
Founding team role with high ownership and early equity
Lead ML architecture for cutting-edge speech models
Fast-paced ML environment (75% engineering / 25% research)
Significant impact across research, infra, and production deployment
Paid meals, strong benefits, and 401(k) access
Compensation
- Base Salary: $200,000 - $250,000
- Equity: Competitive (~0.5%)
- Relocation: Required (must be onsite 5 days/week in SF or NY)
- Visa Sponsorship: Available
Tech environment
- Languages: Python
- Frameworks: PyTorch
- Cloud: AWS or GCP
- Domain: DSP, Audio ML, Generative AI
Role summary: As a Founding Applied ML Engineer, you'll own the full ML lifecycle, from research and prototyping to inference and resilient deployment. You'll collaborate with Ops to source high-quality training data and drive the roadmap for core ML models.
Responsibilities
- Develop cutting-edge ML models for audio and speech
- Implement inference systems and production-ready APIs
- Architect durable pipelines and evaluations
- Translate research into high-performance code
- Drive technical roadmap and cross-functional integration
Must-have qualifications
- 5+ years in ML, including 2+ years in audio/speech (DSP/ML)
- Demonstrated ownership of ML systems from POC to production
- Proficient in Python and PyTorch
- Experience with cloud-based ML development (AWS or GCP)
- Prior software engineering experience (roughly 1 year minimum before moving into ML)
- Ability to connect model quality to user experience and business value
- Strong trajectory (e.g., fast promotions) at top companies
Preferred qualifications
- Graduate degree (MS or PhD), especially from top schools such as Stanford or MIT
- Stanford Music/Audio Tech program experience
- Led ML teams or drove roadmap direction
- Experience training generative AI models
- Speech research background
- Experience with classical DSP and ML-based audio processing
Required Skills:
Relocation, Pipelines, Ownership, Hiring, Salary, User Experience, AWS, Music, Compensation, Architecture, Infrastructure, Integration, Research, Software, Python, Engineering, Business, Training
Salary Package:
$200,000.00 - $250,000.00 (US Dollar)