Research Engineer — Judgment Labs

Location: Chinatown, San Francisco, CA (Onsite, 5.5 days/week)
Compensation: $225,000 – $400,000 base + competitive equity
Visa Sponsorship: H-1B supported
Experience Level: 1–4 years
Employment Type: Full-Time

About Judgment Labs

Judgment Labs builds infrastructure for Agent Behavior Monitoring (ABM), helping organizations evaluate and monitor AI agent performance in production environments. Its platform identifies behavioral anomalies such as instruction drift, retrieval degradation, and reliability failures across complex workflows.

The company has raised more than $30M from investors including Lightspeed, SV Angel, Valor Equity Partners, Chris Manning, Michael Ovitz, Michael Abbott, Cory Levy, and Kevin Hartz.

About the Role

Judgment Labs is seeking Research Engineers to build AI systems focused on analyzing agent interaction data, evaluating long-running agent behaviors, and improving autonomous systems through feedback and optimization workflows. This is a highly hands-on applied AI engineering role focused on production systems rather than pure academic research. Engineers will work directly with real-world agent data and deploy systems into production environments supporting finance, legal, operations, and other high-stakes domains.

What You'll Own

  • Build systems to aggregate, index, and analyze large-scale agent interaction data
  • Develop agent-based systems for evaluating long-running agent behaviors
  • Design post-training and optimization workflows for AI agents
  • Build tooling and infrastructure for experimentation, analysis, and training
  • Work with retrieval systems, evaluation harnesses, and production AI infrastructure
  • Own projects end to end with significant autonomy
  • Collaborate closely with engineering and research teams

Requirements

  • 1–4 years of industry experience in applied AI or generative AI
  • Experience building and evaluating AI agents in production
  • Strong problem-solving ability with high agency and intellectual curiosity
  • Comfortable handling large-scale, messy, real-world datasets
  • Experience with retrieval systems, search algorithms, or evaluation harnesses
  • Ability to work onsite in San Francisco 5.5 days per week

Nice to Have

  • Experience with sandboxed or autonomous evaluation environments
  • Agent trajectory analysis or long-running behavior evaluation
  • Self-improving or continual learning systems
  • Experience at fast-moving AI startups or applied AI organizations
  • Reinforcement learning or machine learning systems expertise

This Role Is NOT For

  • Candidates who need heavily structured task management
  • Candidates with limited production AI experience
  • Candidates with pure research backgrounds and no shipped systems
  • Candidates who cannot relocate to or work onsite in San Francisco

Interview Process

  1. Initial approval review
  2. Founder vibe check and technical discussion
  3. Technical interview and problem-solving round
  4. Work trial
  5. Offer stage

Logistics

  • Role is onsite in San Francisco, 5.5 days/week — please only apply if you can commit to this
  • H-1B visa sponsorship available

Shortlisted candidates will be contacted by David Joseph & Co., the recruiting partner managing this search on behalf of Judgment Labs.