I lead evaluation systems at Groq. I'm interested in how we can better understand AI capabilities through systematic evaluation. In my spare time, I build benchmarks that test the capabilities of LLMs.
Before Groq, I was at Nous Research developing synthetic data pipelines for training language models.
AIME Problem #1 - 2025 AIME I
Problem: Find the sum of all integer bases $b>9$ for which $17_b$ is a divisor of $97_b$.
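The page states the problem without a solution, so the following is only a quick sanity-check sketch. In base $b$, $17_b = b + 7$ and $97_b = 9b + 7$. Since $9b + 7 = 9(b + 7) - 56$, the divisor $b + 7$ must also divide $56$, which bounds the search at $b + 7 \le 56$. The function name `valid_bases` and its `limit` parameter are hypothetical, introduced only for illustration.

```python
def valid_bases(limit: int = 56) -> list[int]:
    """Return all integer bases b with 9 < b <= limit where 17_b divides 97_b.

    Assumption: limit=56 is enough, because b + 7 must divide 56,
    so no base with b + 7 > 56 can qualify.
    """
    bases = []
    for b in range(10, limit + 1):
        divisor = b + 7        # value of 17 written in base b
        dividend = 9 * b + 7   # value of 97 written in base b
        if dividend % divisor == 0:
            bases.append(b)
    return bases


if __name__ == "__main__":
    found = valid_bases()
    print(found, sum(found))
```

Under these assumptions the search finds $b = 21$ and $b = 49$, for a sum of $70$, matching the divisors $28$ and $56$ of $56$ that exceed $16$.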
Projects
AI Debate Evaluation
Topic: The potential of quantum computing in breaking current encryption standards
Affirmative Position
Negative Position
Judge's Analysis
Final Verdict
✓ Negative position wins the debate