

Evals for Everyone
AI products fail silently when you can't measure them. In this three-part Lightning Lesson series, learn how top PMs design evals users trust, scale them without drowning in complexity, and deploy them in production. From first principles to real-world implementation, discover how to tell whether your products actually work.
Wed Feb 18·5:00 PM UTC
Design Evals Users Will Trust
Most AI teams ship products that pass benchmarks but fail users. The gap between "model works" and "product works" is where careers and products stall. This session gives you the systematic approach used by production AI teams at companies like OpenAI and Google to evaluate what actually matters before your users find the problems first.
You'll learn from

Aishwarya Naresh Reganti
AI Founder & Advisor to F500 Leaders
Wed Feb 25·4:00 PM UTC
Scale Evals Without the Chaos
Your AI product passed testing. Now what? The gap between "works in development" and "works in production" destroys more AI projects than bad models ever will. Real users behave differently than test cases. Edge cases multiply. And without the right monitoring, you won't know something broke until your support queue explodes. This session gives you a battle-tested playbook for evaluating AI system
You'll learn from

Aishwarya Naresh Reganti
AI Founder & Advisor to F500 Leaders
Fri Feb 27·5:00 PM UTC
Evals in Action With Arize
You've learned why evals matter and what to measure. Now you need to actually build them. Most teams get stuck here because the gap between "understanding evals" and "shipping evals" feels enormous. This hands-on session bridges that gap with live code, real tools, and templates you can steal. You'll leave with working evaluators, not just concepts.
You'll learn from

Laurie Voss
Head of DevRel at Arize; co-founder of npm, Inc.