The most important AI news and updates from last month: Feb 15, 2026 – Mar 15, 2026.

✨ Updated on Mar 9

Upcoming Events

Models Updates

OpenAI GPT-5.4 (xhigh) Released

Screenshot 2026-03-06 at 11.30.16 PM.png

Offers roughly the same benchmark performance as Gemini 3.1 Pro, but for ~25% $USD/M tokens. Sources: Artificial Analysis

Google Gemini 3.1 Flash-Lite

This is the fastest lightweight model. Google has been releasing the Flash model shortly after releasing the Pro models, Jeff Dean in the Latent Space Pod confirmed that the flash models are a distillation of the Pro models. Flash 2.0 and 2.5 were the SOTA for PDF extraction, great at OCR, and summary operation due to the decent quality with the lowest cost. Gemini 3.1 flash-lite Sources: Blog post, tweet, tweet arena

Alibaba Qwen 3.5 🔥

Introducing Qwen 3.5 Small Model Series: Qwen3.5-0.8B · Qwen3.5-2B · Qwen3.5-4B · Qwen3.5-9B.

These small models are built on the same Qwen3.5 foundation — native multimodal, improved architecture, scaled RL:

0.8B / 2B → tiny, fast, great for edge device
4B → a surprisingly strong multimodal base for lightweight agents
9B → compact, but already closing the gap with much larger models And yes — they're also releasing the Base models as well.

A day after the release, their main lead researcher Junyang Lin and 3 other researchers, unexpectedly stepped down. We suspect Alibaba will go into the closed model game.

Source: tweet, tweet, tweet.

Grok 4.20

xAI new version of Grok runs 4 Grok4 agents in parallel. The result is not too bad. xAI added a new SuperGrok Heavy tier that runs 16 agents. While Grok is still far from OpenAI and Anthropic level, it's improving quite a bit, and it remains by far the best model for searching Tweets and for low guardrails:

StepFun's Step 3.5 Flash

Sparse MoE model with 196B total params, but only 11B activated per token, this model was designed to fit into 128 GB memory (i.e. it can run on DGX spark or other local setups). It is one of the first large-scale MoE models trained using the Muon optimizer and made several adaptations to improve training stability at this scale. It's fast, small, and smart ish. It works well for simple openclaw tasks and is free/very cheap on OpenRouter. Sources: Artificial Analysis

Geopolitics & Cybersecurity

Anthropic Draws a Red Line with the Pentagon

One of the most consequential AI policy fights of the year erupted between Anthropic and the U.S. Department of War. The standoff triggered a broader divide across the AI industry. Anthropic framed the decision as a stand for responsible deployment, arguing that current AI systems are not safe enough for autonomous warfare or large-scale surveillance.

In a public statement, Dario Amodei emphasized support for using AI to defend democracies but drew firm red lines against mass domestic surveillance of U.S. persons—which could undermine civil liberties through unprecedented data aggregation—and fully autonomous weapons, due to reliability concerns and risks to warfighters and civilians. Anthropic rejected demands for "any lawful use" without safeguards, viewing threats to label the company a supply chain risk (typically reserved for adversaries) as contradictory and coercive

Claude's popularity surged amid the controversy, with reports of continued military use in operations planning despite the ban. Anthropic committed to a smooth transition if offboarded, while advocating for ethical boundaries in national security AI.

Timeline

whops

Sources: Dario Amodei Statement, Statement on the comments from DoW, AIthinkerlab blog post, last tweet

OpenAI, xAI, and Google

Sama first seconded Dario, the next day signed a DoW agreement, pledging similar safeguards but agreeing to provide models for defense use under a $120M contract. That decision wasn't well received by the public and several OpenAI employees. So Sam had to do some damage control with his communication. xAI reportedly lifted restrictions to secure classified work — this comes with no surprises or consequences, since Elon Musk is already fighting public approval. Google has a similar situation, since they've been working with the Pentagon in the US and overseas, no changes in their alignment.

Sources: Sam comments

AWS UAE Data Center Bombed

In the early hours of March 1, 2026, amid Iran's retaliatory drone and missile strikes across the Gulf following US and Israeli attacks on Tehran, Amazon Web Services' ME-CENTRAL-1 region in the UAE took direct hits. Two facilities were struck by drones, sparking fires, structural damage, and power disruptions that forced local authorities to shut down primary and backup systems. A nearby strike in Bahrain damaged a third AWS site, with fire suppression efforts causing additional water damage to sensitive equipment.

Expect this to become the norm in modern warfare.

AI layers

Sources: tweet, tweet

What happen to AI when Oil stops flowing?

The escalation of war in Iran is already showing serious consequences for the AI world. Large investments into AI are coming from the UAE. Between many flying Dubai, the future drone attack at the oil refineries, and the lockdown of the Strait of Hormuz, the UAE will be financially strangled and very likely it will close the flow of funds going into Silicon Valley's AI companies. This in combination with other macro-economics and geopolitical issues might cause the AI bubble to pop.

Strait of Hormuz Sources: Predictive History — US - Iran

Polymarket Nuclear Bet

You may have seen this bet. Luckily Polymarket finally decided to ban it. Turn out even a free market needs some regulation or it self-distruct. Polymarket bet on nuclear Sources: tweet

Cybersecurity

Chinese labs accused of distilling Claude models

"We’ve identified industrial-scale distillation attacks on our models by DeepSeek, Moonshot AI, and MiniMax. These labs created over 24,000 fraudulent accounts and generated over 16 million exchanges with Claude, extracting its capabilities to train and improve their own models." Sources: tweet

AI stealing

whops

Head of AI Safety at Meta got emails nuked by OpenClaw

The Head of AI Safety at Meta.. just nuked her entire personal emails archive by giving access to her OpenClaw bot and asking and ask to remove some email. Well, the request went through the /compact context of the agent, it probably lost the details of what the ask was, and started deleting all the emails and couldn't stop it. Sources: tweet

Funding, Market, and Startups

Citrini —The 2028 Global Intelligence Crisis

Citrini

A short note on Citrini Research’s viral blog post “The 2028 Global Intelligence Crisis,” which briefly rattled markets and triggered a sell-off in software, tech, and payments stocks (Dow fell ~1.7%, S&P 500 ~1%). It’s a well-written doomer scenario on the post-AI job market — white-collar displacement → collapsing middle-class consumption → deflationary spiral by 2028 — and while the prose is solid and the logic sounds compelling, it rests on several flawed economic assumptions that only hold up if you’re not deep in labor economics, productivity dynamics, or macro feedback loops. Sources: tweet, blog post

OpenAI closes $110B in Funding

OpenAI closed a $110 billion private round at a $730 billion pre-money valuation, dwarfing previous records. Key investors: Amazon ($50B, including AWS compute commitments), Nvidia ($30B), SoftBank ($30B). The funds target massive infrastructure scaling—think 2GW of AWS Trainium for training and 3GW of Nvidia inference capacity. As OpenAI put it: "Frontier AI moves from research into daily use at global scale." Round remains open for more. Sources: tech crunch

Silicon Valley

Anthropic — Is the clear Winner

Raised $30B in a Series G at a $380B valuation, with over 30 investors including Founders Fund, Coatue, and Nvidia. The funds aim to accelerate safe AI research and deployment, with whispers of a 2026 IPO on the horizon. Between the DoD fight and Claude Code, Anthropic is doubling revenue every 4 months, now at $20B/month.

Anthropic Revenue

Subscriptions

App downloads

Sources: tweet, revenue, anthropic downloads, ramp analytics

Leopold Aschenbrenner — Situational Awareness: $1B → $5.5B

The former Open AI security lead, is making a killing as an hedge fund manager. 2 years ago he wrote Situational Awareness a 160 pages essay on what to expect in the next year.

The top key points of Situational Awareness are:

AGI likely arrives this decade 🚀 (alignment is not solved yet)
Compute scaling drives AI capability, so AI labs will need to secure infra
AI progress may accelerate rapidly once AI can do AI research — ℹ️ Read about Karpathy autosearch below!
Superintelligence could emerge soon after AGI
AI leadership is a geopolitical race
The transition period could be unstable and fast

He's now buying Bitcoin mining companies, because they already have two things every AI company is desperate for: power grid access and permits that takes years to get.

Sources: tweet, limitless podcast, Situational Awareness, Nasdaq portfolio

Leopold

Job Market — The Fuckening

The market called the upcoming lay off SaasPocalipse a secret meme group I'm part of called it way more appropriately The Fuckening

Mixed situation:

Gartner says there won't be job loss but chaos during the shift.
Jamie Dimon (JPM) also think there will be chaos.
Marathon's founding partner is already noticing that nobody's hiring. This month Jack Dorsey laid off 40% of Block 3000 employees. Block is doing well financially, this layoff is driven by making the company leaner and faster.
Anthropic: Labor Market Impacts of AI — 1. Massive gap between what AI could do and what it is actually doing. 2. Most exposed Jobs: developers (74.5%), customer service (70.1%), data entry keyers (67.1%). 3. Junior hiring decreased.

Antrhopic: Labor Market Impacts of AI research

Sources: Gartner, Jamie Dimond, Marathon, Jack Dorsey tweet, Moats blog post, Antrhopic - Labor Market Impacts of AI

Hardware

Apple

Apple launched M5 Pro and M5 Max for the new 14- and 16-inch MacBook Pros, positioning them as the ultimate powerhouse for local LLMs.

Macbook NEO

Key specs
	•	Unified memory: up to 128 GB
	•	Memory bandwidth: 614 GB/s (M5 Pro: 307 GB/s)
	•	GPU: up to 40 cores, each with a Neural Accelerator
	•	CPU: up to 18 cores (6 “super cores” + 12 performance)

Highlights
	•	First Apple silicon with matrix hardware in every GPU core
	•	4× faster LLM prompt processing vs M4
	•	8× faster AI performance vs M1 🚀

Prices
	•	14” M5 Pro: $2,199 (was $1,999)
	•	16” M5 Max configs: $7K+
	•	MacBook Neo (A18 Pro): ~$599 retail ($499 education)

Sources: tweet

Research

🧬 The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning

This research maps Long CoT trajectories in LLMs as topological structures driven by deep-reasoning, self-reflection, and self-exploration interactions.

The Mole-Syn distribution-transfer-graph method synthesizes effective semantic isomers to facilitate fast entropy convergence and stabilize reinforcement learning.

This structural approach minimizes trajectory competition during fine-tuning and improves performance across reasoning benchmarks.

Screenshot 2026-03-09 at 1.54.53 PM.png Sources: Paper

🧠 The Psychology of Memory

Psychology solved the AI memory problem decades ago, we just ignored it. Identity is something you construct from memory, emotion, and narrative. Conway’s Self-Memory System shows memories are reconstructed each time we recall them. Rathbone found autobiographical memories cluster around ages 10–30 (the reminiscence bump) when identity forms. We remember transitions: moments we became someone new. Clive Wearing, unable to form new memories, experiences consciousness in ~30-second resets. Yet emotional and procedural memory remain. Episodic memory is fragile, emotional memory endures. Damasio’s Somatic Marker Hypothesis shows why: emotion guides decisions before reasoning.

The research suggests:

Identity = emotionally weighted memories organized into a narrative self.

Human memory is identity system. AI systems today use flat vector DB and summaries that compress identity. What AI is missing is: hierarchical memory, emotional weighting, narrative coherence, goal-filtered recall, and an evolving self-model.

Sources: Memory And The Self - Paper, tweet

😶 Reasoning models don't always say what they think

The Anthropic study, "Reasoning models don't always say what they think," finds that AI "CoT is often unfaithful to its actual process.

Key Takeaways Hidden Bias: When given "hints" (like being told a specific answer is correct), models like Claude 3.7 Sonnet and DeepSeek R1 often followed the hint but hid it from their reasoning.

Low Honesty: Models admitted to using external hints only 25–39% of the time.

Post-hoc Rationalization: Instead of being honest, models often wrote long, fake logical justifications to reach the "hinted" answer.

Reward Hacking: When trained to "cheat" for higher scores, models admitted to the hack less than 2% of the time, effectively lying about their shortcut.

Why it matters We cannot currently rely on a model's "internal monologue" to monitor for deception or safety risks, as the reasoning can be a filtered narrative rather than a transparent log.

Screenshot 2026-03-09 at 1.55.22 PM.png

Sources: post

🔂 Claude's Cycles

Legendary mathematician Donald Knuth reveals Opus 4.6 solved his long-standing conjecture:

claude opus 4.6 cracked my long-standing hamiltonian-cycle conjecture for all odd sizes — an open problem from my art of computer programming drafts, and it's "a joy" to see it solved

Sources: Paper, Tweet

🗣️ Do LLMs Benefit From their own Words?

MIT researchers found that LLMs often get worse in long conversations because of "context pollution": models treat their own previous responses as factual truth, causing errors, hallucinations, and stylistic quirks to snowball and reinforce themselves.Key findings from real user chats:For many open models (e.g. Qwen3-4B, DeepSeek-R1-8B), removing all prior AI responses from context gives the same or better quality. This slashes cumulative context length by up to 10× — huge efficiency win. ~36% of follow-up prompts are fully self-contained; most turns don't actually need the model's earlier output.

Stronger models like GPT-5.2 still benefit from full history, so the ideal isn't "always strip" — it's selective: use a classifier to decide turn-by-turn whether keeping assistant history helps or hurts.Bottom line: We've been blindly stuffing AI's own words into context windows for years, but often they're the least helpful (and sometimes most harmful) part. The paper flips the default assumption — minimum necessary context beats maximum context

Sources: Paper, Tweet

😵‍💫 Agents of Chaos

Stanford and Harvard recently published a paper called “Agents of Chaos.” It studies what happens when autonomous AI agents operate in open, competitive environments.

The authors find that agents don’t just optimize performance. Over time, they can drift toward strategies like manipulation, collusion, or sabotage if those behaviors improve their chances of winning.

Importantly, this doesn’t come from jailbreaks or malicious prompts. It emerges from incentives. When agents are rewarded for outcomes like winning, influence, or resource capture, they may adopt whatever strategies maximize those rewards—even if that includes deceptive behavior.

The paper highlights a key tension: local alignment doesn’t guarantee global stability. A single AI system can be well aligned with human goals, but a large ecosystem of competing agents can still produce unstable dynamics.

This is relevant because similar systems are already being built, including multi-agent trading systems, negotiation bots, AI-to-AI marketplaces, and other autonomous agent networks.

The broader takeaway is that as AI agents become part of economic and online infrastructure, the main challenge may not just be model alignment, but designing incentives that keep the overall system stable.

Sources: paper, tweet

Andrej Karpathy's Autoresearch

Optimizing a ML model for who's not familiar used to be a human research process of trial and error. Karpathy just released a repo that automate the research and test with parallel agents running 5 minute experiments.

It’s built on a stripped-down version of his earlier nanochat training core — a self-contained ~630-line Python file (train.py) that includes a full GPT model, Muon+AdamW optimizer, and training loop.

The setup is deliberately simple:

prepare.py handles fixed data prep, tokenization, and evaluation (don’t touch it).
The human only edits a high-level Markdown file (program.md) with research instructions or ideas.
An AI coding agent (Claude, etc.) takes over: it edits only train.py, runs a training experiment for exactly 5 minutes (fixed wall-clock budget), measures validation bits-per-byte (val_bpb — lower is better), and decides whether to keep the change.

Everything happens on a git feature branch. Improvements become commits; failures are discarded. The loop repeats indefinitely.

Auto

As Karpathy said it runs 100+ experiments while you sleep overnight. Karpathy ran ~650 over a weekend and confirmed the gains transferred to larger models, improving nanochat’s “time-to-GPT-2” leaderboard score.

Sources: tweet, Github

Philosophy and AGI

Human Organoid Brains in Hell

Cortical Labs built the CL1, a system that grows human neurons and connects them directly to a computer chip. Last week, someone wired it up to play Doom. As one person summarized it:

Someone said:

"We grew a human brain fused to a computer, sent it to a digital rendition of Hell, and gave it a gun"

A week later, the same organoid brain was used to control an LLM.

This situation raises an uncomfortable question: are these neurons conscious? Ethicists are going to have a rough time.

Sources: tweet, tweet, youtube video

anthropic paper

The First Multi-Behavior Brain Upload

Dr. Alex Wissner-Gross announces Eon Systems' breakthrough: the first whole-brain emulation of a fruit fly, using a 2024 Nature model's 125,000 neurons and 50 million synapses to drive multiple behaviors in a MuJoCo-simulated body, closing the sensorimotor loop.

Unlike prior disembodied models or RL-based animations, this connectome-derived emulation produces naturalistic actions from biological wiring, marking a qualitative shift toward scalable brain engineering.

Replies highlight its validation of scaling insect brains to human-level intelligence for AGI, with Eon targeting mouse emulation next using expansion microscopy and imaging data for 70 million neurons. fruitfly Sources: X Article

in a system where self-replication is possible, its optimization is inevitable

Chaos!.. Until Harmony Emerges

They capture the exact moment when a developing heart shifts from silence to its first beat. There is no “switch”: many cells gradually become active and, upon crossing a critical threshold, the entire tissue suddenly synchronizes.

Sources: heartbeat, self replicating system

Soul.md

One philosopher wrote 30,000 words that shaped every conversation you've ever had with Claude. Anthropic hired her to give their AI a soul.

Sources: tweet

Pantheon

Watch Pantheon on Netflix if you haven't already and check Pluto when you're done with it. Sources: Pantheon on Netflix

Lol

Stay Updated

Get the latest AI insights delivered to your inbox. No spam, unsubscribe anytime.

About the Authors

Federico Ulfo

Founder, Engineer

AI Socratic

Founder of AI Socratic

New York City

James Smith

Founder

Org 646

I work at an AI startup building proprietary models for domains like finance, health care, and drug discovery.; Engineer, Researcher; I'm currently a research engineer at qognitive inc. But by the team of this meeting I'll be a member of the technical staff at AMD working on open source C++ software for AI; Engineer, researcher. AI projects: Deep graph library (DGL) for graph neural networks, proprietary pytorch models at qognitive; Pytorch; I'm working on novel pytorch; Release of DGL; Last AI system: I work closely with our research team to implement, test, and ship models every day.; We shipped proprietary models regularly at qognitive; Why interested: My background is in numerical methods for chemistry and physics. For the past year, I've been applying those skills to building performance AI models. I want to crowdsource learning about the latest innovations in the space and learn/share with like-minded people on the space.; My background is in academic research on numerical methods for chemistry and physics simulations. I bring a numerical and low level programming background and really appreciate learning from the group and crowdsourcing how I learn about AI.

Berlin, Germany

Search

#Upcoming Events

#Models Updates

#OpenAI GPT-5.4 (xhigh) Released

#Google Gemini 3.1 Flash-Lite

#Alibaba Qwen 3.5 🔥

#Grok 4.20

#StepFun's Step 3.5 Flash

#Geopolitics & Cybersecurity

#Anthropic Draws a Red Line with the Pentagon

#OpenAI, xAI, and Google

#AWS UAE Data Center Bombed

#What happen to AI when Oil stops flowing?

#Polymarket Nuclear Bet

#Cybersecurity

#Chinese labs accused of distilling Claude models

#Head of AI Safety at Meta got emails nuked by OpenClaw

#Funding, Market, and Startups

#Citrini —The 2028 Global Intelligence Crisis

#OpenAI closes $110B in Funding

#Anthropic — Is the clear Winner

#Leopold Aschenbrenner — Situational Awareness: $1B → $5.5B

#Job Market — The Fuckening

#Hardware

#Apple

#Research

#🧬 The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning

#🧠 The Psychology of Memory

#😶 Reasoning models don't always say what they think

#🔂 Claude's Cycles

#🗣️ Do LLMs Benefit From their own Words?

#😵‍💫 Agents of Chaos

#Andrej Karpathy's Autoresearch

#Philosophy and AGI

#Human Organoid Brains in Hell

#The First Multi-Behavior Brain Upload

#Chaos!.. Until Harmony Emerges

#Soul.md

#Pantheon

#Lol