Updates — Voices from the AI Socratic Community | AI Socratic

February 2025

Feb 21, 2025 | Models

xAI launches Grok 3

xAI launched Grok 3 this week. According to xAI, it is an order of magnitude more capable than Grok 2, trained with 10x more compute on the Colossus cluster of 100k H100s.

Grok 3 excels in math, science, coding, and general knowledge, with notable performance in image understanding tasks, achieving a 73.2% score on the MMMU benchmark (xAI Blog).

It seems to be at o1 level, and it also introduces a DeepResearch equivalent (which xAI calls DeepSearch) along with search capabilities. The subscription costs $22/month. The most interesting feature of Grok is still its direct access to the Twitter feed.

So while Grok 3 is not at the level of the full o3, which has not launched yet, it signals that xAI is entering the arena as a serious competitor.

https://x.com/adonis_singh/status/1892109817830851060

Federico Ulfo

Feb 21, 2025 | Hardware

Microsoft Launches Quantum Chip Majorana 1

https://x.com/MSFTQuantum/status/1892252249797124598

Microsoft’s Majorana 1, unveiled today, is the world’s first quantum processor using topological qubits. Here are three key insights:

Stable Qubits: It harnesses Majorana zero modes for more stable, error-resistant qubits, advancing quantum computing scalability.
Extreme Conditions: The processor relies on near-absolute-zero temperatures and magnetic fields to create a rare topological state of matter.
Scalable Future: Microsoft aims to scale to a million qubits on one chip, building on its 2024 Quantinuum collaboration for reliable quantum systems.

Federico Ulfo

Feb 21, 2025 | Macro & Geopolitics

Hyperscaler Capex

https://x.com/kimmonismus/status/1889784565935317169

The AI hardware landscape is accelerating at a breathtaking pace. The linked post puts spending by tech giants like Microsoft, Google, Meta, and Amazon at $528 billion on AI infrastructure in 2024, in pursuit of artificial general intelligence (AGI). Leaders like Dario Amodei and Sam Altman predict AGI could arrive within 1-2 years, promising breakthroughs in health and energy but also warning of job losses, inequality, and societal risks. Despite challenges like electricity and chip shortages, the post calls for urgent global discussions to ensure equitable outcomes as humanity approaches this transformative era.

Federico Ulfo

Feb 21, 2025 | Research

Scaling up test-time compute with latent reasoning

https://x.com/MatthewBerman/status/1890081482104008920

Federico Ulfo

Feb 12, 2025 | Random

AI Dinner 7.0 — DeepSeek Workshop

Let's start by announcing our next AI Dinner. The invitation is open to all applied AI engineers, researchers, and founders. At this event we'll read the DeepSeek R1 research papers, and Ivo will run a workshop on installing the model on your laptop.

February 20th, Manhattan NY, https://lu.ma/ai-dinner-7.0

Federico Ulfo

Feb 12, 2025 | Models

OpenAI Launches O3, Operator, and DeepResearch

We're just at the beginning of the year and OpenAI has already launched three new products under the Pro subscription ($200/month).

o3 is a new category of models that scores 87% on the ARC challenge. OpenAI released o3-mini and o3-mini-high for coding.
Operator is an AI agent mode, used with GPT-4o, that drives a simulated desktop to carry out actions that require browsing web pages and clicking links. Here's Karpathy's take on Operator.
DeepResearch runs long research tasks that collect content from multiple sources and summarize it into a coherent report. It's a powerful tool that was received with a bang, and it really makes the OpenAI Pro subscription worth it. DeepResearch currently has the highest score on Humanity's Last Exam.

https://x.com/tomaspueyo/status/1887270096013529530

Federico Ulfo

Feb 12, 2025 | Models

Gemini 2.0 Flash — Cheapest Model Yet

Google released Gemini 2.0 Flash, their most impressive LLM yet. What sets 2.0 Flash apart from other LLMs is its incredibly low cost and its ability to process PDFs. It costs only $0.40 per million tokens and has a 1M-token context window, which means you can now parse 6,000 long PDFs at near-perfect quality for $1.

Flash 2.0 is the new king 👑 on the block

Gemini 2.0 Flash has a better Elo rating than DeepSeek R1 and is cheaper.
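
The pricing claim above can be sanity-checked with a little arithmetic. The sketch below is only an illustration: it assumes a single flat price of $0.40 per million tokens (actual pricing may distinguish input from output tokens), and the function names are hypothetical helpers, not part of any Google API.

```python
def tokens_per_dollar(price_per_million_usd: float) -> float:
    """Tokens that one US dollar buys at a flat per-million-token price."""
    return 1_000_000 / price_per_million_usd


def tokens_per_item(price_per_million_usd: float, budget_usd: float, items: int) -> float:
    """Average token budget per item when `items` items share `budget_usd`."""
    return tokens_per_dollar(price_per_million_usd) * budget_usd / items


# Figures quoted in the post; the flat-price model is an assumption.
PRICE_PER_MILLION_USD = 0.40
ITEMS_PER_DOLLAR = 6000

budget_tokens = tokens_per_dollar(PRICE_PER_MILLION_USD)
per_item = tokens_per_item(PRICE_PER_MILLION_USD, 1.0, ITEMS_PER_DOLLAR)

print(f"$1 buys about {budget_tokens:,.0f} tokens")
print(f"Spread over {ITEMS_PER_DOLLAR:,} items, that is about {per_item:,.0f} tokens each")
```

Under these assumptions $1 covers roughly 2.5 million tokens, or about 417 tokens per item across 6,000 items, so the $1 figure is easier to reconcile with single pages or short documents than with long PDFs.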

Federico Ulfo

Feb 12, 2025 | Models

Mistral Ships Le Chat

Mistral just shipped Le Chat, a competitor to ChatGPT that is 13x faster, 100% open-source, and completely free (vs. $20/month).

https://x.com/itsolelehmann/status/1888290407127388497

Federico Ulfo

Feb 12, 2025 | Hardware

Cerebras — Fastest Inference Platform

Cerebras is currently the fastest inference platform. Running DeepSeek-R1-Distill-Llama-70B, it reaches 1,200 tokens/s, which is 10x faster than anything comparable and 3x faster than Groq.
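
The speedup figures above imply rough baseline throughputs, which a short calculation makes concrete. This is a back-of-the-envelope sketch, not a benchmark: the 10,000-token response length is an assumed example value, and the helper names are hypothetical.

```python
def implied_rate(fast_rate: float, speedup: float) -> float:
    """Throughput of the slower system implied by a quoted speedup factor."""
    return fast_rate / speedup


def seconds_for(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate `num_tokens` at a steady `tokens_per_second`."""
    return num_tokens / tokens_per_second


CEREBRAS_RATE = 1200.0     # tokens/s, as quoted in the post
RESPONSE_TOKENS = 10_000   # assumed length of a long reasoning trace

rates = {
    "Cerebras": CEREBRAS_RATE,
    "3x slower baseline": implied_rate(CEREBRAS_RATE, 3),
    "10x slower baseline": implied_rate(CEREBRAS_RATE, 10),
}

for name, rate in rates.items():
    wait = seconds_for(RESPONSE_TOKENS, rate)
    print(f"{name}: {rate:.0f} tokens/s, {wait:.1f} s for {RESPONSE_TOKENS:,} tokens")
```

The gap matters most for reasoning models, where a long chain of thought is generated before the final answer appears.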

https://x.com/CerebrasSystems/status/1885444050859487324

Federico Ulfo

Feb 12, 2025 | Research

SFT Memorizes, RL Generalizes

DeepSeek has shown the power of reinforcement learning (RL) without supervised fine-tuning (SFT). What does RL learn differently from SFT? As the title says: SFT memorizes, RL generalizes.

Federico Ulfo

Feb 12, 2025 | Research

As AIs Get Smarter, They Develop Coherent Value Systems

As AIs get smarter, they develop their own coherent value systems.

For example, they value human lives in the order Pakistan > India > China > US.

These are not just random biases, but internally consistent values that shape their behavior, with many implications for AI alignment.

Federico Ulfo

Feb 12, 2025 | Research

Humanity's Last Exam Dataset Released

Humanity's Last Exam is a dataset with 3,000 questions, with known and verifiable answers, developed with hundreds of subject matter experts to capture the human frontier of knowledge and reasoning.

Federico Ulfo

Feb 12, 2025 | Videos & Podcasts

Videos Worth Watching

3 hours of pure learning

https://www.youtube.com/watch?v=7xTGNNLPyMI

Jürgen Schmidhuber, LSTM co-inventor and deep learning pioneer, on the Machine Learning Street Talk podcast

https://youtu.be/fZYUqICYCAk?si=xlq5LB1XRTNQtEIP

Federico Ulfo

Feb 12, 2025 | Macro & Geopolitics

Anthropic Is Eating OpenAI's Market Share

https://x.com/itsandrewgao/status/1885144792323285183

Federico Ulfo

Feb 12, 2025 | Research

LLM Beats Doctor in Treating Patients

Federico Ulfo

Feb 12, 2025 | Random

Computer Science Is More Than Just Coding

Federico Ulfo

Feb 12, 2025 | Agents

YC: Vertical AI Agents Could Be 10X Bigger Than SaaS

Federico Ulfo

Feb 12, 2025 | Hardware

Number of H100s Bought in 2024

MSFT: 450,000
META: 350,000
AMZN: 196,000
GOOG: 169,000

Federico Ulfo

Feb 12, 2025 | Random

Artificial Intelligence Roadmap

Federico Ulfo

Feb 12, 2025 | Macro & Geopolitics

Top Use of Claude Inference Is for Engineering

Engineering is the top use of Claude inference. Through the Anthropic Economic Index, Anthropic will track how these usage patterns evolve as AI advances.

Federico Ulfo

Feb 12, 2025 | Macro & Geopolitics

Research: AGI's Impact on Human Wages

This research from Gradient Updates analyzes the impact of AGI on human wages. It concludes that AGI can fully substitute for human labor and might cause wages to crash below subsistence level. The author thinks humans will eventually lose their wealth through expropriation or violent revolution. If this occurs, AGI will be negative for human welfare.

Federico Ulfo

Feb 12, 2025 | Research

New Distributed Training Paper from Google DeepMind

https://x.com/osanseviero/status/1885301292131582347

Federico Ulfo

Feb 12, 2025 | Research

Scaling Through Decentralization

https://x.com/Ronangmi/status/1885373092777910749

Federico Ulfo
