Updates — Voices from the AI Socratic Community

Dec 17, 2025 · Models

Model Wars: GPT-5.2 vs Opus 4.5 vs Gemini 3 vs Grok 4.1

The "Model Wars" have intensified with major releases from all top providers, focusing heavily on reasoning and efficiency.

GPT-5.2: OpenAI’s latest step is less “bigger model” and more “better worker”. It ships in Instant, Thinking, and Pro variants tuned for deep, multi-step knowledge work (coding, long-context synthesis, and tool-heavy agent workflows like spreadsheets and presentations). On ARC-AGI-2 (Verified), GPT-5.2 Thinking posts 52.9% and Pro reaches 54.2%, positioning it as OpenAI’s flagship for coding and agentic tasks. Even at higher per-token pricing, it’s pitched as cheaper per unit of quality thanks to improved token efficiency (note: GPT-5.1 already signaled massive efficiency gains, reportedly reaching o3-level performance at 150x lower cost).
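The "cheaper despite higher per-token pricing" argument comes down to simple arithmetic: cost per task is the per-token price multiplied by the tokens a model needs to finish the task. A minimal sketch of that reasoning follows. The prices and token counts are hypothetical placeholders chosen only for illustration, not real pricing or usage figures for any model, and `cost_per_task` is a made-up helper name.

```python
def cost_per_task(price_per_million_tokens: float, tokens_per_task: int) -> float:
    """Dollar cost of one task: per-token price times tokens consumed."""
    return price_per_million_tokens * tokens_per_task / 1_000_000


# Hypothetical numbers, for illustration only (not real pricing or usage data).
# The "newer" model charges 40% more per token but needs 40% fewer tokens.
older = cost_per_task(price_per_million_tokens=10.0, tokens_per_task=50_000)
newer = cost_per_task(price_per_million_tokens=14.0, tokens_per_task=30_000)

print(f"older model: ${older:.2f} per task")
print(f"newer model: ${newer:.2f} per task")
```

Under these assumed numbers the pricier model still comes out cheaper per task, because the drop in tokens used outweighs the rise in price per token.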

Grok 4.1:

Gemini 3: Google released Gemini 3 (Pro and Flash), described as a massive leap over GPT-5.1 in reasoning, speed, and video. It reportedly "one-shotted" an entire website build, leading some to declare front-end development "dead".

Claude Opus 4.5: Anthropic's new flagship model is a significant step forward. It outperforms its predecessors while reportedly being cheaper than Sonnet 4.5. Notably, it reportedly embeds reasoning directly into files when traces are disabled, and it is marketed as the best model for coding and agentic computer use. Many engineers in the community consider it the best coding model available.

Model Wars Cycle

ARC Prize Leaderboard

Federico Ulfo


Dec 17, 2025 · Models

Nano Banana Pro: turn earnings PDFs into infographics and slides

Nano Banana Pro: A standout tool for visuals this month. It can compress entire earnings PDFs into infographics, generate insights from papers, and create slides.

Federico Ulfo


Dec 17, 2025 · Models

Intellect-3: 100B+ MoE trained with decentralized compute by Prime Intellect

Intellect-3: A 100B+ parameter Mixture of Experts (MoE) model released by Prime Intellect (PI) and trained using decentralized compute. It reports state-of-the-art performance in math and code.

Federico Ulfo


Dec 17, 2025 · Models

NotebookLM gets Deep Research

NotebookLM now has Deep Research.

Federico Ulfo


Dec 17, 2025 · Models

OpenAI silently testing next-gen image backend "Image 2"

OpenAI is silently testing its next-gen image backend, which people are informally calling "Image 2" and which is allegedly in the same frontier tier as Nano Banana Pro.

Federico Ulfo

