AI Socratic, Feb 2025 — Part 1

February 12, 2025 | Updated February 2, 2026 | 4 min read

We collect all the most important AI updates from February.

AI Dinner 7.0 — DeepSeek Workshop 🚀

Let's start by announcing our next AI Dinner. The invitation is open to all applied AI engineers, researchers, and founders. At this event we'll read DeepSeek R1 research papers and Ivo will run a workshop to install it on your laptop.

February 20th, Manhattan NY, https://lu.ma/ai-dinner-7.0

O3, Operator, And DeepResearch

We're only at the beginning of the year and OpenAI has already launched three new products under the $200/month Pro subscription.

OpenAI's o3 is a new family of reasoning models that scores 87% on the ARC challenge. OpenAI released o3-mini and o3-mini-high for coding.
Operator is an AI agent mode, powered by a GPT-4o-based model, that operates a simulated desktop to carry out tasks that require browsing web pages and clicking links. Here's Karpathy's take on Operator.
DeepResearch runs long research tasks: it collects content from multiple sources and summarizes it into a coherent report. It's a powerful tool that has been very well received, and it makes the OpenAI Pro subscription worth the price. DeepResearch currently has the highest score on Humanity's Last Exam.

https://x.com/tomaspueyo/status/1887270096013529530

Gemini 2.0 Flash — Cheapest Model Yet

Google released Gemini 2.0 Flash, their most impressive LLM yet. What sets 2.0 Flash apart from other LLMs is its very low cost and its ability to process PDFs. It costs only $0.40 per million tokens and has a 1M-token context window, which means you can now parse roughly 6,000 PDF pages at near-perfect quality for $1.
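The cost claim above is simple arithmetic, and a quick sketch makes it easy to check. The snippet below assumes a single flat price of $0.40 per million tokens, as quoted in the text; real pricing usually bills input and output tokens at different rates, so treat the result as a rough budget. The function and constant names are hypothetical, chosen only for illustration.

```python
# Back-of-the-envelope token budget at a flat per-token price.
# Assumption: one blended price for all tokens; real pricing usually
# bills input and output tokens at different rates.

PRICE_PER_MILLION_USD = 0.40        # figure quoted in the text
CONTEXT_WINDOW_TOKENS = 1_000_000   # figure quoted in the text
ITEMS_PER_DOLLAR = 6_000            # figure quoted in the text


def tokens_per_dollar(price_per_million_usd: float) -> float:
    """How many tokens one dollar buys at the given price."""
    return 1_000_000 / price_per_million_usd


def implied_tokens_per_item(price_per_million_usd: float, items_per_dollar: int) -> float:
    """Average token budget per item if `items_per_dollar` items cost $1."""
    return tokens_per_dollar(price_per_million_usd) / items_per_dollar


def cost_usd(total_tokens: int, price_per_million_usd: float) -> float:
    """Cost in dollars of processing `total_tokens` tokens."""
    return total_tokens / 1_000_000 * price_per_million_usd


if __name__ == "__main__":
    print(round(tokens_per_dollar(PRICE_PER_MILLION_USD)))  # 2500000
    print(round(implied_tokens_per_item(PRICE_PER_MILLION_USD, ITEMS_PER_DOLLAR)))  # 417
    print(round(cost_usd(CONTEXT_WINDOW_TOKENS, PRICE_PER_MILLION_USD), 2))  # 0.4
```

Under these assumptions, $1 buys 2.5 million tokens, so a 6,000-per-dollar figure leaves about 417 tokens per item on average, and filling the full 1M-token context once costs about $0.40.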

Flash 2.0 is the new king 👑 on the block

Gemini 2.0 Flash has a higher Elo rating than DeepSeek R1 and is cheaper.

Mistral, Le Chat

Mistral just shipped Le Chat, a competitor to ChatGPT that is 13x faster, 100% open-source, and completely free (vs $20/month).

https://x.com/itsolelehmann/status/1888290407127388497

Cerebras, Fast Inference

Cerebras is currently the fastest inference platform. Running DeepSeek-R1-Distill-Llama-70B, it reaches 1,200 tokens/s, which is 10x faster than any comparable model and 3x faster than Groq.

https://x.com/CerebrasSystems/status/1885444050859487324

Top Research Papers

SFT Memorizes, RL Generalizes. DeepSeek has shown the power of Reinforcement Learning (RL) without Supervised Fine-Tuning (SFT). What does RL learn differently from SFT? As the title says, SFT memorizes and RL generalizes.
As AIs get smarter, they develop their own coherent value systems.

For example, they rank the value of human lives by country in the order Pakistan > India > China > US.

These are not just random biases but internally consistent values that shape their behavior, with many implications for AI alignment.
Humanity's Last Exam is a dataset with 3,000 questions, with known and verifiable answers, developed with hundreds of subject matter experts to capture the human frontier of knowledge and reasoning.

Videos Worth Watching

3 hours of pure learning

https://www.youtube.com/watch?v=7xTGNNLPyMI

Jürgen Schmidhuber, co-inventor of the LSTM and a pioneer of much of deep learning, on the Machine Learning Street Talk podcast

https://youtu.be/fZYUqICYCAk?si=xlq5LB1XRTNQtEIP

Interesting Numbers And Random Takes

A few more updates worth mentioning.

Anthropic is eating OpenAI's market share

https://x.com/itsandrewgao/status/1885144792323285183

LLM beats doctor in treating patients
Computer Science is more than just coding
YC on why vertical AI agents could be 10X bigger than SaaS
Number of H100s bought in 2024:
- MSFT: 450,000
- META: 350,000
- AMZN: 196,000
- GOOG: 169,000
Artificial Intelligence Roadmap

The top use of Claude inference is engineering. Through the Anthropic Economic Index, Anthropic will track how these patterns evolve as AI advances.
This research from Gradient Updates analyzes the impact of AGI on human wages. It argues that AGI could fully substitute for human labor and might cause wages to crash below subsistence level. The author thinks humans will eventually lose their wealth through expropriation or violent revolution. If this occurs, AGI will be negative for human welfare.

Decentralized AI

New distributed training paper from Google DeepMind

https://x.com/osanseviero/status/1885301292131582347

Scaling through decentralization

https://x.com/Ronangmi/status/1885373092777910749


About the Authors

Federico Ulfo

Founder, Engineer

New York City
