Skip to main content
AI Socratic
March 2025
Models

Anthropic Claude Sonnet 3.7 (Aurora) 🛸

Codename Aurora 🛸 🌌

On March 10, 2025, Anthropic introduced Claude Sonnet 3.7, dubbed "Aurora," marking it as their most sophisticated model yet. Designed with a focus on seamless human-like interaction, enhanced interpretability, and superior safety features, Anthropic continues to push boundaries in AI development. Alongside the release, they published a Transparency Report detailing safety protocols, performance metrics, and training insights.

Here are the highlights of Claude Sonnet 3.7:

Anthropic API Update

This is not the only update from Anthropic, they also just improved their API via prompt caching, reducing the amount of tokens and therefore inference cost 🔥!

Federico UlfoFederico Ulfo
Models

OpenAI releases GPT-4.5 (Orion) 🚀

Codename Orion 🚀 🌌

On February 27, 2025, OpenAI launched GPT-4.5, codenamed "Orion," as its most advanced model to date. GPT-4.5 focuses on natural conversation, reduced hallucinations, and improved general-purpose capabilities. OpenAI started using a System Card to share details including safety, benchmark, and training strategy.

Here are the highlights of GPT 4.5:

  • Performance: Excels in factual accuracy (PersonQA: 0.78) and multilingual tasks (MMLU English: 0.896), with a lower hallucination rate (0.19), helicone.ai.

  • New Base Model: Replaces previous models as the core for OpenAI’s reasoning systems, emphasizing natural dialogue.

  • Limitations: Strong in practical tasks but lags in complex reasoning compared to specialized models (vellum.ai)

  • Rollout: The rollout was staggered due to GPU shortages, with plans to expand access to other tiers.

Federico UlfoFederico Ulfo
Models

Google Flash 2.0 Image Generation, Gemma 3 27B

Google has been on a role! Releasing image generation for Gemini Flash 2.0 and releasing Gemma 3 27B, the new king in the small model ELO arena.

Google Flash 2.0 Image Generation

Launched March 12, 2025, Gemini 2.0 Flash by Google introduces native image generation within the model. It enables fast, context-aware image creation and editing, with a strong text rendering and storytelling consistency. We played with it, is quite fast and consistent, although in our experience the quality of the image is relatively low.

Gemma 3 27B — King of SLMs 👑

Released the same day, Gemma 3 27B is Google’s largest open-source model, built from Gemini research. With 27 billion parameters, it handles text and image inputs, supports a 128K token context, and excels in multilingual tasks and reasoning. Trained on 14T tokens, it runs efficiently on a single GPU and pairs with ShieldGemma 2 for safety. It’s ideal for developers seeking customizable, high-performance AI.

Federico UlfoFederico Ulfo
Models

Baidu ERNIE-4.5 — DeepSeek R1 level at half the price 👸

Another Small Language Model from the Chinese 🇨🇳 Baidu. This model is a DeepSeek R1 level but at half the cost.

The benchmark reported by Baidu shows that Ernie-4.5 beats 4o in multimodal capability and in some benchmark even gpt-4.5 in text capability — models today are getting overfitted for winning benchmark, running this model by hand generally shows a different story, and well, you can try it yourself here https://yiyan.baidu.com/ (just be aware that your data is getting collected).

Federico UlfoFederico Ulfo

Search

Search across events, members, and blog posts