LLM Models Vibe Check & Benchmarks: OpenRouter, lmarena, and IQ
May 20, 2025

Top models according to OpenRouter. Notably, Gemini 2.5 is climbing the ladder while Anthropic's Claude 3.7 is slowly sliding down.

Companies are overfitting their models to the benchmarks. @lmarena_ai has become the go-to evaluation for AI progress. Their last release demonstrates the difficulty in maintaining fair evaluations on @lmarena_ai, despite best intentions. Read more.

Benchmarks collection from Hugging Face
IQ bench changes in just one year. o3 has an IQ of 160, placing it among the 100,000 smartest people in the world.
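
To put an IQ score in perspective, here is a rough sketch of the arithmetic behind "how many people score this high". It assumes IQ follows a normal distribution with mean 100 and standard deviation 15, and a world population of about 8 billion; both are assumptions of this sketch, not numbers from the benchmark itself. The function names are made up for illustration.

```python
import math

def upper_tail_fraction(iq, mean=100.0, sd=15.0):
    """Fraction of a normal(mean, sd) population scoring at or above `iq`."""
    z = (iq - mean) / sd
    # Survival function of the standard normal, via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))

def people_at_or_above(iq, population=8_000_000_000, mean=100.0, sd=15.0):
    """Expected head count at or above `iq` (population size is an assumption)."""
    return upper_tail_fraction(iq, mean, sd) * population

frac = upper_tail_fraction(160)
print(f"IQ 160 is roughly 1 in {1 / frac:,.0f} people")
print(f"That is about {people_at_or_above(160):,.0f} people worldwide")
```

Under these assumptions, an IQ of 160 sits four standard deviations above the mean, which works out to roughly 1 in 31,600 people, or about 250,000 people worldwide. That is the same order of magnitude as the figure quoted above. The exact count shifts a lot with the assumed standard deviation (some tests use 16 or 24) and population.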
