Updates — Voices from the AI Socratic Community

Aug 28, 2025 | Research

Benchmarks & Metrics: Models Increasingly Overfitted to Leaderboards

Over the past two years, the AI NY community has been actively reviewing and discussing various benchmarks while tracking the rapid progress of new models. What has become increasingly clear is that m

Federico Ulfo
