Updates — Voices from the AI Socratic Community


February 2026

Feb 6, 2026 · Research

Genomic Models: Bytedance SeedFold and Google AlphaGenome

ByteDance SeedFold surpasses AlphaFold 3 (the protein-folding arms race continues).
Google released AlphaGenome (weights available to academic researchers). We're excited for X (I'll update the name once I have confirmation that I can share it) from our community, who joined AlphaGenome's incredible team 👏, and a bit sad that he's moving to Zurich.

Sources: Bytedance Tweet, AlphaGenome Tweet.

Federico Ulfo


Feb 6, 2026 · Research

World Models Are Taking Over

This was a huge month for simulated worlds. The vibe shifted from “make pretty video” to “build systems that can live inside a world.” Here are the world models worth mentioning:

Project Genie: playable worlds from text prompts (rolling out to Google AI Ultra subscribers in the US).
World API (World Labs): persistent 3D worlds from text, images, video.
D4RT (DeepMind): unifies video into full 4D (space + time) representations.
Roblox 4D generation powered by Cube Foundation Model (beta).
Alibaba’s LingBot-World (open-source Genie competitor) getting attention.
Hot take making the rounds: “World models, not video generation models, will dominate AI in 2026.”

Why does it matter? Agents that can reason are good. Agents that can reason inside a stable world are the foundation of robotics, game economies, simulation-heavy science, and eventually autonomous businesses.

Yann LeCun is the strongest anti-LLM advocate. He thinks world models are the right direction because, to reach AGI, models need to predict the outcome of their actions.
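To make "predict the outcome of their actions" concrete, here is a minimal sketch of an agent that plans by querying a world model before it acts. Everything in it (`world_model`, `choose_action`, `run`, the 1-D line world) is a hypothetical toy for illustration, not taken from any of the systems listed above.

```python
from typing import Callable, List, Sequence


def world_model(state: int, action: int) -> int:
    """Toy dynamics (assumption): a position on a line, clamped to [0, 10]."""
    return max(0, min(10, state + action))


def choose_action(
    state: int,
    goal: int,
    actions: Sequence[int],
    model: Callable[[int, int], int],
) -> int:
    """Imagine each action's outcome and keep the one that lands closest to the goal."""
    return min(actions, key=lambda action: abs(model(state, action) - goal))


def run(state: int, goal: int, actions: Sequence[int] = (-1, 0, 1), max_steps: int = 20) -> List[int]:
    """Act step by step, always consulting the model before moving."""
    path = [state]
    for _ in range(max_steps):
        if state == goal:
            break
        action = choose_action(state, goal, actions, world_model)
        # In this toy the real environment and the model are the same function.
        state = world_model(state, action)
        path.append(state)
    return path


print(run(2, 5))  # [2, 3, 4, 5]
```

In a real system the model would be learned and imperfect, so the gap between imagined and actual outcomes is where most of the difficulty lives.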

Sources: DeepMind Project Genie, World Labs, tweet 1, DeepMind "Teaching AI to see the world in 4D", tweet 2, Roblox tweet, Alibaba tweet, hot take tweet, Yann tweet

Federico Ulfo


Feb 6, 2026 · Research

Recursive Language Models (RLM)

This paper from December 2025 recently had a resurgence, driven by improvements in LLMs and the proven success of sub-agents in Claude Code.

RLMs are an agent architecture that overcomes LLM context and reasoning limits by giving agents programmatic control over their own context via a REPL.

An RLM works over a mutable context and can recursively spawn sub-agents to work on sub-tasks.

An RLM agent has:

A context object: a mutable structure that can scale to very large contexts
A recursive agent function: rlm_agent(query, context) → response, which can spawn child agents
A Python execution environment: enabling search, filtering, and computation over context

The agent alternates between writing code, inspecting results, and delegating work recursively.
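The three pieces above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the paper's implementation: only the `rlm_agent(query, context)` shape comes from the post, while the `Context` class, the `llm` callable, `toy_llm`, and the split-in-half delegation strategy are hypothetical stand-ins, and the REPL is reduced to ordinary methods on the context.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence

# Hypothetical model interface: (query, visible lines) -> answer text.
LLM = Callable[[str, Sequence[str]], str]


@dataclass
class Context:
    """Mutable context the agent inspects with code instead of reading it whole."""
    lines: List[str] = field(default_factory=list)

    def search(self, keyword: str) -> "Context":
        # Programmatic filtering: keep only lines that mention the keyword.
        needle = keyword.lower()
        return Context([line for line in self.lines if needle in line.lower()])

    def split(self, parts: int = 2) -> List["Context"]:
        size = max(1, -(-len(self.lines) // parts))  # ceiling division
        return [Context(self.lines[i:i + size]) for i in range(0, len(self.lines), size)]


def rlm_agent(query: str, context: Context, llm: LLM,
              max_lines: int = 4, depth: int = 0, max_depth: int = 3) -> str:
    # Base case: the context is small enough (or we are deep enough) to answer directly.
    if len(context.lines) <= max_lines or depth >= max_depth:
        return llm(query, context.lines)
    # Recursive case: delegate each slice to a child agent, then combine their answers.
    child_answers = [
        rlm_agent(query, child, llm, max_lines, depth + 1, max_depth)
        for child in context.split(2)
    ]
    return llm(query, child_answers)


def toy_llm(query: str, lines: Sequence[str]) -> str:
    """Stand-in for a model call: return the first line that mentions the query."""
    for line in lines:
        if query.lower() in line.lower():
            return line
    return ""


logs = Context([f"request {i} ok" for i in range(20)])
logs.lines[13] = "error: disk full"  # the context is mutable
print(rlm_agent("error", logs, toy_llm))  # error: disk full
```

No single call ever sees more than `max_lines` lines of raw context, which is the point of the design: the full context lives in the environment, and the model only sees the slices it asks for.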

RLM with Google's Agent Development Kit (ADK)

ADK adapts RLMs for production by providing low-level control over execution, memory, and orchestration via BaseAgent. Key features:

Lazy context loading from files instead of massive in-memory prompts
Parallel recursive delegation for scalable reasoning
Tool-first reasoning: code before language
Built-in observability for debugging recursive behavior

ADK preserves the core RLM idea—recursive, compute-over-context agents—while making it practical to deploy at scale.
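Two of the features above, lazy context loading and parallel delegation, can be illustrated with the standard library alone. The sketch below deliberately does not use ADK's API; `iter_chunks`, `count_matches`, and `parallel_count` are hypothetical names, and the "sub-agent" is just a keyword counter standing in for a model call.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Iterator, List


def iter_chunks(path: Path, lines_per_chunk: int) -> Iterator[List[str]]:
    """Lazily yield chunks of lines so the file is never pasted into one giant prompt."""
    chunk: List[str] = []
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            chunk.append(line.rstrip("\n"))
            if len(chunk) == lines_per_chunk:
                yield chunk
                chunk = []
    if chunk:
        yield chunk


def count_matches(chunk: List[str], keyword: str) -> int:
    """Hypothetical sub-agent: a real one would call a model on its chunk."""
    return sum(keyword in line for line in chunk)


def parallel_count(path: Path, keyword: str, lines_per_chunk: int = 100, workers: int = 4) -> int:
    """Fan chunks out to worker threads and combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Note: Executor.map submits every chunk up front, so this bounds the size
        # of each sub-task, not the total memory used.
        partials = pool.map(lambda chunk: count_matches(chunk, keyword),
                            iter_chunks(path, lines_per_chunk))
        return sum(partials)
```

Calling `parallel_count(Path("server.log"), "ERROR")` (with a file of your own) returns the total number of matching lines, with each worker only ever seeing its own chunk.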

Sources: Zhang Tweet, RLM with ADK Tweet, paper

Federico Ulfo


Feb 6, 2026 · Research

Prompt Repetition Improves Non-Reasoning LLMs

LLMs read prompts left to right, so early context can’t “know” what question is coming. This paper tests a simple fix: repeat the entire prompt so that it appears twice, giving every token a second chance to attend to everything else.

Across seven benchmarks and seven major models (from the Gemini, ChatGPT, Claude, and DeepSeek families), accuracy improves, sometimes dramatically, without longer outputs or meaningful slowdown, and without fine-tuning, extra training, or other prompting techniques.
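The trick is small enough to show in full. The helper below is a minimal sketch: the function name, the `copies` parameter, and the blank-line separator are assumptions for illustration, and the paper's exact formatting may differ.

```python
def repeat_prompt(prompt: str, copies: int = 2, separator: str = "\n\n") -> str:
    """Return the prompt repeated `copies` times, joined by `separator`."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return separator.join([prompt] * copies)


question = "Which of these is a prime number? Options: 21, 29, 33."
print(repeat_prompt(question))
```

The repeated string is sent as the user message in place of the original. Only the input grows, which is consistent with the post's note that outputs do not get longer.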

Sources: tweet, paper (https://arxiv.org/pdf/2512.14982)

Federico Ulfo


Feb 6, 2026 · Research

Patterning: The Duality Of Interpretability

“Neural networks are grown, not programmed.” This paper changes that. Mechanistic interpretability (mech interp) investigates how models generalize beyond their training data by studying the resulting internal structure. The authors introduce patterning as the dual: given a desired structure, determine what data produces it.

This is done with the language of susceptibilities. In physics, susceptibilities measure how a system responds to perturbations. Here, the neural network is treated as such a system, and shifts in the training distribution as such perturbations.

The post's accompanying visual shows a small language model (3M) across training, visualised with a new interpretability technique: susceptibilities. The authors call this handsome critter the rainbow serpent.

In a synthetic parentheses-balancing task, the authors show that, given two solutions that both achieve perfect training accuracy and loss, they can effectively steer which solution the model chooses to implement. They do this using only in-distribution data.

This is closely related to, but distinct from, influence functions and training data attribution. Those study the effects of data at the behavioral level, such as the impact of a data point on test loss, whereas patterning is concerned with the structure underlying that behavior.

Sources: tweet, paper, NN are grown tweet

Federico Ulfo


Feb 6, 2026 · Research

Anthropic: How Misalignment Scales with Bigger Models

AI failures on hard tasks tend to be incoherent and unpredictable (“hot mess”) rather than systematically pursuing the wrong goal.

More scale ≠ more coherence: bigger models don’t reliably behave more consistently and can get worse on very hard problems.
Longer reasoning can backfire: “overthinking” increases error variance; ensembling helps but isn’t practical for real-time agents.
Safety implication: future risks look more like industrial accidents from complexity and goal misspecification than deliberate, coherent misalignment.
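One way to make "incoherent versus systematically wrong" measurable is a bias-variance split of repeated answers to the same question. The sketch below is an illustration under assumptions: the function names and the definition of incoherence as variance divided by total error are a reading of the summary above, not a copy of the blog's exact metric.

```python
from statistics import fmean, pvariance
from typing import Dict, List, Sequence


def decompose_error(samples: Sequence[float], target: float) -> Dict[str, float]:
    """Split mean squared error into a systematic part (bias squared)
    and a scatter part (variance) across repeated attempts."""
    mean = fmean(samples)
    bias_sq = (mean - target) ** 2
    variance = pvariance(samples, mu=mean)
    total = bias_sq + variance
    return {
        "bias_sq": bias_sq,
        "variance": variance,
        # Share of the error caused by scatter rather than a consistent wrong answer.
        "incoherence": variance / total if total else 0.0,
    }


def ensemble(samples: Sequence[float], k: int) -> List[float]:
    """Average consecutive groups of k samples, as a simple ensembling step."""
    return [fmean(samples[i:i + k]) for i in range(0, len(samples) - k + 1, k)]


answers = [7.0, 13.0, 9.0, 11.0]  # four attempts at a question whose answer is 9
print(decompose_error(answers, target=9.0))
print(decompose_error(ensemble(answers, k=2), target=9.0))
```

In this hand-picked example the raw attempts have bias squared 1 and variance 5, so most of the error is scatter. Averaging pairs removes the variance but leaves the bias untouched, which is why ensembling helps with a "hot mess" but not with a model that is consistently wrong.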

Takeaway for AI engineers: build simple systems that are easy to test, then combine them. In other words, the SOLID and KISS principles translate from software engineering to AI.

Source: blog

Federico Ulfo


Feb 6, 2026 · Research

Top Papers of the Month from DairAI

DairAI is one of our favorite AI accounts on Twitter and consistently posts the top papers of the week, so here are the top papers from the last 3 weeks. I didn't find anything particularly interesting this month, but I recommend following the account.

Sources: tweet 1, tweet 2, tweet 3

Federico Ulfo
