ML Research Engineer - Legal Reasoning
Role Overview
The Justice Lab & Tax-Litigation Co-Pilot
You'll join The Justice Lab, our skunk-works unit focused on ideas six to twelve months ahead of production.
Flagship project: a Tax-Litigation Co-Pilot that ingests full judgments, statutes, and filings, then produces defensible predictions and transparent explanations.
- Project Coordinator: Arghya Bhattacharya (CTO, Adalat AI)
- Project Oversight: Prof. Daron Acemoglu (Nobel Laureate, MIT) & Prof. Daniel Kang (UIUC)
Role in a Nutshell
As a Research Engineer - Legal Reasoning, you will turn cutting-edge ideas into artifacts that ship:
- Frame the problem: formalize legal reasoning for outcome prediction.
- Design experiments: benchmark LLMs on labeled tax-law and civil-procedure tasks.
- Prototype systems: retrieval-augmented generation, evidence tracing, causal inference—pipelines that think like lawyers.
- Build eval suites: factual consistency, citation faithfulness, policy impact (e.g., case-load reduction).
- Ship hand-offables: lightweight services or notebooks that engineers can harden.
- Publish: co-author internal memos and external papers with academic partners.
Key Responsibilities
- Data & Evaluation
  - Curate, label, and version corpora spanning four court tiers.
  - Create task sets for prediction, entailment, and explanation.
- Modeling & Experimentation
  - Fine-tune / distill LLMs with RL-, DPO-, or SFT-style feedback.
  - Explore parameter-efficient adaptation, long-context, and retrieval strategies (LoRA, RAG, chunking).
- Legal-Reasoning Research
  - Model precedential hierarchies, detect conflicts, and generate citation-grounded chains of thought.
- Collaboration
  - Sync daily on design and code quality.
  - Present findings to Professors Acemoglu, Kang, and policy advisors.
- Documentation & Dissemination
  - Maintain reproducible logs, polished reports, and publish-ready code.
Qualifications
No one ticks every box. If the mission resonates, let's talk.
What You Will Achieve in a Year
- A prototype that classifies appeal merit with 75% F1 on held-out High-Court cases.
- An evaluation methodology poised to become the standard for legal-AI outcome prediction in the Global South.
- A first-author or co-author paper submission (e.g., NeurIPS L4DC, ICML LawML).
- Pilot deployment inside real-world Tax Offices.