Senior Software Engineer – LLM Evaluation - RibbitZ

About the job

Our client, RibbitZ, is looking for a Senior Software Engineer – LLM Evaluation to work remotely.

As a Software Engineering evaluator, you will create cutting-edge datasets for training, benchmarking, and advancing large language models, collaborating closely with researchers. This includes curating code examples, providing precise solutions, and making corrections in Python, JavaScript (including ReactJS), C/C++, Java, Rust, and Go; evaluating and refining AI-generated code for efficiency, scalability, and reliability; and working with cross-functional teams to enhance enterprise-level AI-driven coding solutions.

What Does a Typical Day Look Like?

  • Work on AI model training initiatives by curating code examples, building solutions, and correcting code in Python, JavaScript (including ReactJS), C/C++, Java, Rust, and Go.
  • Evaluate and refine AI-generated code to ensure that it is efficient, scalable, and reliable.
  • Collaborate with cross-functional teams to improve AI-driven coding solutions and measure them against industry performance benchmarks.
  • Build agents that can verify the quality of the code and identify error patterns.
  • Form hypotheses about the steps in the software engineering cycle (prototyping, architecture design, API design, production implementation, launch, experiments, monitoring, operational maintenance) and evaluate model capabilities on each of them.
  • Design verification mechanisms that can automatically verify a solution to a software engineering task.

Required Skills

  • 5+ years of software engineering experience, including 2+ years of continuous full-time experience at a top-tier product company (e.g., Google, Stripe, Amazon, Apple, Meta, Netflix, Microsoft, Datadog, Dropbox, Shopify, PayPal, IBM Research).
  • Strong expertise in building full-stack applications and deploying scalable, production-grade software using modern languages and tools.
  • Deep understanding of software architecture, design, development, debugging, and code quality/review assessment.
  • Excellent oral and written communication skills for clear, structured evaluation rationales.

Eligibility (Strictly Enforced):

  • Software Engineering profiles only
  • Candidates must be based in the US
  • 5+ years of relevant experience
  • Immediate assessment availability

Top companies:

  • Google (Alphabet)
  • Apple
  • Amazon
  • Meta (Facebook)
  • Netflix
  • Microsoft
  • Tesla
  • NVIDIA
  • Adobe
  • Salesforce
  • GitHub
  • Atlassian
  • HashiCorp
  • Databricks
  • Snowflake
  • Cloudflare, DigitalOcean, MongoDB
  • Elastic, Confluent, Airbnb, Dropbox
  • Stripe, Palantir, Uber, Lyft
  • Square (Block), Twilio, Snap Inc.
  • Pinterest, Figma, Oracle, Cisco
  • PayPal, DoorDash, Rivian, Reddit, Coinbase, Splunk
  • Spotify, Goldman Sachs, Morgan Stanley
  • JPMorgan Chase, Capital One
  • Plaid, Shopify, Intuit, Workday, ServiceNow
  • Hugging Face, VMware, Brex, Wise
  • Epic Games, Unity Technologies
  • Activision Blizzard, Riot Games, Valve
  • Huawei, Bloomberg, ByteDance
  • Alibaba, Baidu, Notion, Klarna
  • Instacart, Zillow