Updates — Voices from the AI Socratic Community

May 2026

May 31, 2026Models

DeepSeek V4

DeepSeek just dropped V4 (preview) — two open-weights MoE models that push the frontier on cost-effective 1M-token context.

DeepSeek-V4-Pro: 1.6T total params (49B active) — flagship performance rivaling top closed models in reasoning, math, and agentic coding. DeepSeek-V4-Flash: 284B total (13B active) — faster, cheaper, and highly efficient for everyday/agent tasks.

Both feature a new hybrid attention architecture (Compressed Sparse Attention + Heavily Compressed Attention) that makes million-token contexts dramatically more practical (much lower FLOPs and KV cache than V3). MIT license, available on Hugging Face (base + instruct), and live on the DeepSeek API today.

The community is already praising the efficiency gains, strong coding/agent results (e.g., high LiveCodeBench / SWE-Bench scores), and rock-bottom pricing — especially with the ongoing Pro discount.

Quick Highlights (as of early May 2026)

Release date: April 24, 2026 (preview)
Context: Native 1M tokens (with practical efficiency improvements for real agent/document workflows)
Reasoning modes: Non-think (fast), Think High, Think Max (deeper, higher quality on hard tasks) — all from the same weights
API pricing (highly competitive): Flash is extremely cheap; Pro has a big temporary discount (extended to ~May 31 in some updates) + major input cache price drop (1/10th)
Strengths: Coding/agentic tasks, long-context efficiency, price/performance. Text-only for now (multimodal planned later).
Availability: Chat at chat.deepseek.com (Expert/Instant modes), API (OpenAI/Anthropic compatible), open weights on HF/ModelScope.

Sources: Official announcement, Hugging Face collection, Tech Report, tweet discount extended

Federico Ulfo

Comment2

May 31, 2026Models

OpenAI: GPT-5.5, Goblin Mode, Symphony & Realtime

GPT 5.5

OpenAI shipped GPT-5.5 — an incremental but meaningful step on the way to GPT-6. The release keeps OpenAI in the conversation while Anthropic and DeepSeek crowd the frontier from both sides.

Sources: OpenAI announcement

GPT goes in Goblin Mode

"Goblin mode" is a viral quirk in OpenAI's GPT-5 models (late 2025–early 2026) where the AI started randomly inserting goblins, gremlins, trolls, and similar creatures into responses—even when completely unrelated. Cause: Over-reinforcement during training for the "Nerdy" personality. Playful goblin metaphors scored high on "fun/quirky," so the behavior spread wildly. Fix: Open AI fixed it by adding this to the system prompt, twice!

Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user’s query.
...
Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user’s query.

Sources: OpenAI, Amanda Askell, tweet

Symphony

Symphony is an OpenAI open source project that lets you connect your agents to linear, and to automate task management, so your agent can take tickets and work on them automatically. I've installed it personally about 2 months ago at an EAIRG event in NYC — one of the best AI hacking group in the city. I wasn't impressed with Symphony, but since it came up on my feed again, I thought to add it here.

Sources: tweet, symphony link

GPT-Realtime-2

GPT-Realtime-2 for voice agents that reason and take action
GPT-Realtime-Translate enabling translation from 70 input languages into 13 output languages
GPT-Realtime-Whisper, making transcription even faster

Sources: tweet

Mira Murati email exchange with Sam Altman leaks:

Sources: tweet

Federico Ulfo

#Quick Highlights (as of early May 2026)

#GPT 5.5

#GPT goes in Goblin Mode

#Symphony

#GPT-Realtime-2

#Estimated size per model:

#1) Benchmarks

#2) Revenue

#3) Productivity uplift

#4) Compute demand

#How GPT, Claude, and Gemini are actually trained and served – Reiner Pope

#What rebuilding AlphaGo teaches us about self-play, RL, and future of LLMs - Eric Jang

#Chip design from the bottom up – Reiner Pope

#Fireside Chat: Sequoia × Karpathy

Search

Quick Highlights (as of early May 2026)

GPT 5.5

GPT goes in Goblin Mode

Symphony

GPT-Realtime-2

Estimated size per model:

1) Benchmarks

2) Revenue

3) Productivity uplift

4) Compute demand

How GPT, Claude, and Gemini are actually trained and served – Reiner Pope

What rebuilding AlphaGo teaches us about self-play, RL, and future of LLMs - Eric Jang

Chip design from the bottom up – Reiner Pope

Fireside Chat: Sequoia × Karpathy