AI Socratic

AI Socratic, Feb 2025 — Part 1

Federico Ulfo · February 12, 2025
Market News

We collect all the most important AI updates from February.

AI Dinner 7.0 — DeepSeek Workshop 🚀

Let's start by announcing our next AI Dinner. The invitation is open to all applied AI engineers, researchers, and founders. At this event we'll read the DeepSeek R1 research papers, and Ivo will run a workshop on installing the model on your laptop.

February 20th, Manhattan NY, https://lu.ma/ai-dinner-7.0

O3, Operator, And DeepResearch

The year has barely started and OpenAI has already launched three new products under the $200/month Pro subscription.

  • o3 is a new family of OpenAI reasoning models that scores 87% on the ARC challenge. OpenAI released o3-mini and o3-mini-high for coding.

  • Operator is an AI agent mode that works with GPT-4o to drive a simulated desktop and carry out tasks that require browsing web pages and clicking links. Here's Karpathy's take on Operator.

  • DeepResearch runs long research tasks that collect content from multiple sources and summarize it into a coherent report. It's a very powerful tool that has been received enthusiastically, and it makes the OpenAI Pro subscription worth the price. DeepResearch currently holds the highest score on Humanity's Last Exam.

https://x.com/tomaspueyo/status/1887270096013529530

Gemini 2.0 Flash — Cheapest Model Yet

Google released Gemini 2.0 Flash, their most impressive LLM yet. What sets 2.0 Flash apart from other LLMs is its very low cost and its ability to process PDFs. It costs only $0.40 per million tokens and has a 1M-token context window, which means you can now parse roughly 6,000 PDF pages at near-perfect quality for $1.

Flash 2.0 is the new king 👑 on the block

Gemini 2.0 Flash has a higher Elo rating than DeepSeek R1 and is cheaper.
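To sanity-check the pricing claim in this section, here is a minimal back-of-the-envelope sketch. It uses only the two figures quoted above ($0.40 per million tokens and a 1M-token context window) and assumes, as a simplification, one flat price for all tokens. The function names are hypothetical and this is not an official pricing calculator.

```python
# Back-of-the-envelope cost check using the figures quoted in the text.
# Assumption: a single flat price per token (real pricing may differ
# between input and output tokens).
PRICE_PER_MILLION_TOKENS_USD = 0.40  # figure quoted in the text
CONTEXT_WINDOW_TOKENS = 1_000_000    # figure quoted in the text


def tokens_per_budget(budget_usd: float,
                      price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> int:
    """Number of tokens a budget buys at a flat per-million-token price."""
    return round(budget_usd / price_per_million * 1_000_000)


def tokens_per_item(budget_usd: float, items: int) -> float:
    """Average token budget per item if the budget is split evenly."""
    return tokens_per_budget(budget_usd) / items


def cost_usd(tokens: int,
             price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> float:
    """Cost of processing a given number of tokens at the flat price."""
    return tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    print("Tokens bought with $1:", tokens_per_budget(1.0))
    print("Tokens per item across 6,000 items:", round(tokens_per_item(1.0, 6000), 1))
    print("Cost of one full context window (USD):", cost_usd(CONTEXT_WINDOW_TOKENS))
```

Under these assumptions, $1 buys 2.5 million tokens, so spreading it over 6,000 items leaves a little over 400 tokens each, which is roughly the size of a single page of text.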

Mistral, Le Chat

Mistral just shipped Le Chat, a ChatGPT competitor that is 13x faster, 100% open-source, and completely free (vs. $20/month).

https://x.com/itsolelehmann/status/1888290407127388497

Cerebras, Fast Inference

Cerebras is currently the fastest inference platform. Running DeepSeek-R1-Distill-Llama-70B, it reaches 1,200 tokens/s, which is 10x faster than any comparable model and 3x faster than Groq.

https://x.com/CerebrasSystems/status/1885444050859487324
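To put the throughput numbers in perspective, the sketch below converts them into wall-clock time for one long response. The 1,200 tokens/s figure and the 3x and 10x ratios come from the text above; the baseline speeds are simply derived from those ratios, and the 6,000-token response length is a hypothetical example, not a measured value.

```python
# Rough latency comparison derived from the speed figures quoted in the text.
CEREBRAS_TOKENS_PER_SECOND = 1_200  # figure quoted in the text
RESPONSE_TOKENS = 6_000             # hypothetical long reasoning response


def implied_baseline_speed(fast_speed: float, speedup: float) -> float:
    """Speed of a baseline that the fast platform beats by `speedup` times."""
    return fast_speed / speedup


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream `num_tokens` at a constant decoding speed."""
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    speeds = {
        "Cerebras (quoted)": CEREBRAS_TOKENS_PER_SECOND,
        "3x slower baseline (derived)": implied_baseline_speed(CEREBRAS_TOKENS_PER_SECOND, 3),
        "10x slower baseline (derived)": implied_baseline_speed(CEREBRAS_TOKENS_PER_SECOND, 10),
    }
    for name, speed in speeds.items():
        seconds = generation_seconds(RESPONSE_TOKENS, speed)
        print(f"{name}: {speed:.0f} tokens/s, {seconds:.0f} s for {RESPONSE_TOKENS} tokens")
```

The gap matters most for reasoning models, which tend to emit long chains of thought before the final answer: a response that streams in seconds at the quoted speed would take close to a minute at one tenth of it.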

Top Research Papers

  • SFT Memorizes, RL Generalizes. DeepSeek has shown the power of Reinforcement Learning (RL) without Supervised Fine-Tuning (SFT). What does RL learn differently from SFT? As the title says, SFT memorizes while RL generalizes.

  • As AIs get smarter, they develop their own coherent value systems.

    For example, they rank the value of human lives in the order Pakistan > India > China > US.

    These are not just random biases but internally consistent values that shape their behavior, with many implications for AI alignment.

  • Humanity's Last Exam is a dataset with 3,000 questions, with known and verifiable answers, developed with hundreds of subject matter experts to capture the human frontier of knowledge and reasoning.

Videos Worth Watching

3 hours of pure learning

https://www.youtube.com/watch?v=7xTGNNLPyMI

Jürgen Schmidhuber, the inventor of CNN and most of DL, on the Machine Learning Street Talk podcast

https://youtu.be/fZYUqICYCAk?si=xlq5LB1XRTNQtEIP

Interesting Numbers And Random Takes

A few more updates worth mentioning.

  • Anthropic is eating OpenAI's market share

https://x.com/itsandrewgao/status/1885144792323285183

Decentralized AI

  • New distributed training paper from Google DeepMind

https://x.com/osanseviero/status/1885301292131582347

  • Scaling through decentralization

https://x.com/Ronangmi/status/1885373092777910749
