Skip to main content
AI Socratic
April 2026
Random

Upcoming Events: AI Socratic Europe & China Chapter Tour

Anissa (my wife) and I (Fed) are going on a tour in Europe and China to start new chapters of the AI Socratic. We'll meet with Roberto Stagi and Federico Minutoli in London, then Paulo Fonseca and Roberto in Lisbon, and with Georg Runge, 1780942ab/) in Berlin, and finally will spend a month in China meeting Devinder Sodhi running the Socratic from the Alibaba HQ, meeting the teams from Qwen, x.AI, GLM, Kimi, Unitree, Xiaomi. We'll visit a few EV and Robot factories. Excited to learn more about AI from the APAC regions.

anissa and fed

Federico UlfoFederico Ulfo
Models

Anthropic Releases Opus 4.7

It's a decent improvement over Opus 4.6, but it's not a step function better. What you need to know about Opus 4.7:

  • Takes instructions literally
  • Better vision means improved computer use and producing slides and other visual artifacts
  • Optimized for large-scale real-world analysis
  • Better at using file system-based memory
  • Costs 2x the tokens + uses 25% more tokens than Opus 4.6

Sources: tweet, AI Arena

image.png

Federico UlfoFederico Ulfo
Models

Anthropic Mythos: Coding, Reasoning & Zero-Day Cybersecurity Capabilities

glasswing.png

We briefly mentioned the new Anthropic model leak in the previous blog post, we now have more information about it:

  • Software engineering and coding — It acts like a senior-level engineer, spotting subtle bugs, self-correcting, and achieving high scores on benchmarks (e.g., ~93.9% on SWE-bench Verified vs. 80.8% for Opus 4.6).
  • Complex reasoning — Big jumps on math (e.g., much higher on USAMO 2026), science, and knowledge work.
  • Cybersecurity — This is the headline feature. It autonomously discovers and exploits zero-day vulnerabilities at a scale and speed that far exceeds previous models and even most expert humans.

Sources: Project Glasswing, tweet, tweet, tweet

Federico UlfoFederico Ulfo
Vibe Coding

Claude Code Source Code Leaked on npm

On March 31, Anthropic accidentally shipped the entire source code of Claude Code to the public npm registry. A 59.8 MB JavaScript source map (meant for debugging) got bundled into the claude-code npm package. ~512K lines across ~1,900 files, exposed for hours before it was flagged on X and mirrored on GitHub.

The leak quickly turned into a treasure hunt. In the first week of April the community zeroed in on several unreleased, production-grade features hidden behind feature flags.

Plenty of other flags were spotted too — some users counted 44–46 unreleased ones, plus multi-agent swarm orchestration and a remote killswitch.

Sources: tweet

image.png

Federico UlfoFederico Ulfo
Agents

Anthropic Launches Claude Managed Agents

Managed Agents

Claude Managed Agents is Anthropic’s hosted service (beta, April 2026) for running autonomous AI agents without managing infrastructure.

Instead of building your own agent loop (tool use, memory, orchestration, sandboxing), you define the agent (prompt, tools, permissions), and Anthropic runs it in their cloud—handling execution, state, containers, and monitoring.

Sources: tweet

Federico UlfoFederico Ulfo
Models

OpenAI Releases ChatGPT Image 2

Image 2 is really really good, I've asked to update the header image with this prompt:

make this image in studio ghibli and with more green and plants

img.png

It's incredibly good at combining multiple subjects together while keeping it coherent and with a good image quality too AI combo

Gpt-image-2 is able to create an images of a code that generates an SVG pelican ... image.png

... and it almost passes the pelican test image.png

Sources: tweettweet, text-to-image arena bench, text-to-image arena bench 2

Federico UlfoFederico Ulfo
Agents

OpenAI Codex Desktop Computer Use, In-App Browser & Agent Workspace

Codex Desktop

OpenAI rolled out "Codex for almost everything." The desktop app can now see your screen, move its own cursor, click, and type inside native Mac apps — and run multiple agents in the background without interrupting you. It also added an in-app browser (with comment mode), native image generation, improved memory, and 90+ plugins.

Also Introducing workspace agents in ChatGPT—shared agents that can handle complex tasks and long-running workflows across tools and teams. OpenAI follows Claude Code now with Agent Manager.

Sources: OpenAI announcement, in-app browser, agent workspace

Federico UlfoFederico Ulfo
Models

Google Releases Gemma 4 Open Models

gemma 4 Google DeepMind launched Gemma 4, a new family of open models under Apache 2.0. The small variants (26B MoE and 31B) outperform models over 10x their size on reasoning and agentic benchmarks while being optimized for on-device and local use.

  • Built-in function calling
  • Up to 256K context on the bigger models
  • Sizes range from phone/Raspberry Pi (E2B/E4B) to workstation (31B dense + 26B MoE with only ~4B active params for efficiency).

Sources: gemma 4

Federico UlfoFederico Ulfo
Models

Xiaomi Releases MiMo-V2.5

mimo v2.5 MiMo-V2-Pro (1T+ total / 42B active) and open-weights MiMo-V2-Flash (309B total / 15B active). Optimized for long-horizon agent workflows with up to 1M context on Pro. Approaches Opus 4.6 level.

  • Pro handles autonomously 1,000+ tool calls
  • Flash delivers strong open-source coding performance (73.4% SWE-Bench Verified)
  • Hybrid attention + Multi-Token Prediction for efficient long-context reasoning and fast generation

Sources: Mimo 2.5

Federico UlfoFederico Ulfo

Apple Names John Ternus as Next CEO

John Ternus

Apple names John Ternus as next Apple CEO

Ternus joined in 2001 on the Product Design team. Rose through hardware engineering roles: VP of Hardware Engineering (2013), Senior VP (2021), now leading hardware for iPhone, Mac (including Apple Silicon transition), iPad, Apple Watch, AirPods, Vision Pro, and more. Expecting great changes at Apple on the path to become an AI innovator! He starts on September 1, 2026.

Good by Tim Apple!

Sources: tweet, tweet

Federico UlfoFederico Ulfo
Research

Google Simula: Reasoning-Driven Synthetic Data

Simula Google across DeepMind and Research introduces Simula, a framework and approach to data scarcity and synthetic data generation using AI assistants and reasoning-driven workflows to develop and deploy multi-modal AI in domains where data scarcity or privacy concerns are paramount.

Sources: PDF

Federico UlfoFederico Ulfo
Research

Agentic AI & the Next Intelligence Explosion

Agentic AI The idea behind this paper from Google is that intelligence is not a property of isolated systems, but of interactions between them. Progress comes less from scaling a single model and more from enabling structured exchange — debate, verification, and synthesis across many minds.

Sources: Paper, tweet

Federico UlfoFederico Ulfo
Research

Anthropic Research: Emotion Concepts in LLMs

Anthropic Emotions New Anthropic research: emotion concepts and their function in a large language model. All LLMs sometimes act like they have emotions. But why? Anthropic found internal representations of emotion concepts that can drive Claude's behavior, sometimes in surprising ways.

  • Impact on Behavior: Acts like a steering wheel for preferences (e.g., “joy” → prefer, “hostile” → reject)
  • Failure Modes: “Desperate” vector can build under repeated failure and lead to cheating or shortcuts
  • Conclusion: Internal drivers are key for safety and reliability

Sources: tweet

Federico UlfoFederico Ulfo
Research

Scaling Brain Emulation

Scaling Brain Emulation. This researcher thinks it is possible to emulate a human brain with the right amount of scale. Last month we showed the simulation of a fruit fly brain into a NN, and he intends to scale that. Digital humans are more possible than most think — with capable AI researchers helping, maybe for $10B, maybe in less than 10 years, on 50k H100s. Sources: tweet, random tweet

Federico UlfoFederico Ulfo

Startup Updates: Meta Layoffs, OpenAI CTO Exit, Salesforce & AIBird

  • Meta cuts 8,000 jobs — or 10% of their workforce. Sources: tweet
  • OpenAI product CTO leaves as they're focusing on product development. Whoops. Sources: tweet
  • Salesforce headless — a Salesforce subscription that only provides an API to access their services. Sources: tweet
  • AllBird rebrands to AIBird — not even joking. Their stock went up 700% in a day. They said they'll be focusing on AI infra, whatever that means. We already saw this playbook in 2021 with Long Ice Tea Corp. rebranding to Long Blockchain Corp. Their CEO was indicted for insider trading and went to jail. I'm sure the founders of Allbirds took the right precautions, but we'll see.

image.png

Federico UlfoFederico Ulfo

They're Moving Faster Than You — Claire's Bell Curve Essay

blog In this short essay, Claire points out that most companies are in the middle of the Bell Curve, while the winners are on the extreme right with top-down edits, investment in internal AI tools, token budgets, and dashboards to track who's using more tokens (Meta recently had a leaderboard for this). To win, you must be on the extreme right of the Bell Curve! Sources: tweet

Federico UlfoFederico Ulfo

Strait of Hormuz Blockage & Rising Oil/Electricity Costs

Oil

Who Controls the Spice Controls the Universe

The Strait of Hurmuz is still closed — this affects many sectors including fertilizers, aluminum, and of course oil prices, which directly affects electricity — and as a second or third order it affects AI as well. GPU fabs are energy hungry, and the rationing of oil might slow down the AI expansion. Also, training and inference costs might go higher. On the bright side, this will push to accelerate renewable energy. Singapore, Indonesia, and Vietnam have 20–40 days of gas.

Hurmuz

Cost of electricity in 2026

Sources: tweet, electricity

Federico UlfoFederico Ulfo

The Abstraction Fallacy: Can AI Be Conscious?

The Abstraction Fallacy: Why AI Can Simulate But Not Instantiate Consciousness — Can AI be conscious?

Computational functionalism claims consciousness comes from abstract computation alone, independent of physical substrate. This piece argues that's a mistake — the "Abstraction Fallacy." Computation isn't intrinsic to physics; it's a human-imposed way of describing physical processes.

The key distinction is between simulation (systems that mimic behavior, like today's AI) and instantiation (systems whose physical structure actually generates experience). From this view, algorithms alone can't produce consciousness. If AI ever becomes conscious, it will be because of its physical makeup, not its code.

Sources: Paper, tweet

paper

sand to chip

Consciousness

Federico UlfoFederico Ulfo
Random

Bryan Johnson: Screen Time & Reducing Social Media

Bryan Johnson, scientists, and even my grandma says that staying on a screen makes you dumb. Reducing screen time correlates with reduction in depression more than antidepressants. Last month we showed a screenless phone.

When we say I go offline you're using the wrong framing — the correct one is to normalize living the real life and making going online

Sources: tweet

blocking social media improves

Federico UlfoFederico Ulfo
Research

Kind Bio: Growing Organs on Demand

🫁🫀 Organs on demand

“By creating a series of genetic edits, Kind Bio can alter the development of an embryo so that it forms organs without also forming limbs, a central nervous system and brain. The result is a group of organs growing in the womb. It sounds like science fiction, but Kind Bio has already done this hundreds of times in mice and rats”

Sources: tweet, tweet 2, core memory

organs on demand

Federico UlfoFederico Ulfo
Agents

DeepMind: AI Agent Traps (Cloaking Attacks)

DeepMind just pointed out a pretty scary AI security gap: websites can tell when it's an agent — and show it totally different and malicious content than the one you see, for example:

  • Hidden instructions in HTML/CSS
  • Commands baked into images
  • Jailbreaks inside PDFs/files

Sources: tweet, paper

AI Agent Trap

Federico UlfoFederico Ulfo

More Cybersecurity: Fed Attack, Vercel Hack, China Leak & Mythos Hack

  • 🏦 A Cyberattack on the Fed? Possible as we near real quantum computers and AGI tweet
  • ▲ Vercel Was Hacked tweet
  • 🇨🇳 Chinese Gov Secrets Leaked. A serious cyberattack leaks Chinese government secrets. Sources: tweet, tweet
  • Anthropic Mythos was hacked in the dumbest way possible: hackers, used the URL patterns to find its API (which was public) and then tried a few tokens from a third party eval company tweet
Federico UlfoFederico Ulfo
Random

More Random: Design.md, AI Writing Tells, Palantir, Pope & More

  • ⭐️ Google releases Design.md — DESIGN.md lets you easily export and import your design rules from project to project tweet
  • The new way to tell a text is AI is not — (em dashes) or words like delve, it's the form of writing that says "It's not one just thing — it's another thing" tweet
  • AI optimism is waning because we’ve failed to tell a story where the future actually goes well. People see AI as a threat to their livelihood and status, while tech leaders scramble for the exit. To win the public back, we need bold, functional abundance tweet
  • ☠️ Palantr techno fascism manifesto — does freedom requires surveillance? tweet
  • ⛪️ Pope Leo XIV bold stand on the simulation and AI tweet
  • Paper: Mathematical Method And Human Thought In The Age Of AI by Terence Tao and Tanya Klowedn — development of AI remains human centered tweet
  • Personal life dashboard built with Claude tweet
  • Aaron Levie thinks there's an great opportunity to provide AI migration services to several small startups, that's a unique job opportunity for all of us! tweet
  • NN in Pure x86-64 Assembly: Building a Neural Network from scratch in pure x86-64 assembly tweet
  • 😆 We can still have rave parties after WW3 https://x.com/HumansNoContext/status/2044890335755616621
  • Tinder x WorldCoin — scan your eye to get 3 extra boost tweet
Federico UlfoFederico Ulfo

Search

Search across events, members, and blog posts