San Francisco, California, United States

Founding Applied ML Engineer, Audio & Speech

 Job Description:

Location: San Francisco (strongly preferred) or New York City

Position Type: Full-Time (Permanent)

Start Timeline: ASAP

Hiring Target: 2

About

A fast-scaling early-stage startup is building the world's most advanced audio dataset infrastructure: global, studio-grade, and optimized for ML training. The team is leveraging audio to bridge digital and real-world AI applications, with early traction from major tech labs and customers.

Why join

  • Founding team role with high ownership and early equity
  • Lead ML architecture for cutting-edge speech models
  • Fast-paced ML environment (75% engineering / 25% research)
  • Significant impact across research, infra, and production deployment
  • Paid meals, strong benefits, and 401(k) access

Compensation

  • Base Salary: $200,000 to $250,000
  • Equity: Competitive (~0.5%)
  • Relocation: Required (must be onsite 5 days/week in SF or NY)
  • Visa Sponsorship: Available

Tech environment

  • Languages: Python
  • Frameworks: PyTorch
  • Cloud: AWS or GCP
  • Domain: DSP, Audio ML, Generative AI

Role summary: As a Founding Applied ML Engineer, you'll own the full ML lifecycle, from research and prototyping to inference and resilient deployment. You'll collaborate with Ops to source high-quality training data and drive the roadmap for core ML models.

Responsibilities

  • Develop cutting-edge ML models for audio and speech
  • Implement inference systems and production-ready APIs
  • Architect durable pipelines and evaluations
  • Translate research into high-performance code
  • Drive technical roadmap and cross-functional integration

Must-have qualifications

  • 5+ years in ML, including 2+ years in audio/speech (DSP/ML)
  • Demonstrated ownership of ML systems from POC to production
  • Proficient in Python and PyTorch
  • Experience with cloud-based ML development (AWS or GCP)
  • Prior software engineering experience (roughly 1 year minimum before moving into ML)
  • Ability to connect model quality to user experience and business value
  • Strong trajectory (e.g., fast promotions) at top companies

Preferred qualifications

  • Graduate degree (MS or PhD), especially from top schools such as Stanford or MIT
  • Stanford Music/Audio Tech program experience
  • Led ML teams or drove roadmap direction
  • Experience training generative AI models
  • Speech research background
  • Experience with classical DSP and ML-based audio processing

Required Skills:

Relocation, Pipelines, Ownership, Hiring, Salary, User Experience, AWS, Music, Compensation, Architecture, Infrastructure, Integration, Research, Software, Python, Engineering, Business, Training

Salary Package:

$200,000.00 - $250,000.00 (US Dollar)