Updates — Voices from the AI Socratic Community

March 2026

Mar 30, 2026Models

Anthropic Reduces Claude Rate Limits During Peak Hours

Rate Limits Reduced 😢

rate hours

"To manage growing demand for Claude we're adjusting our 5 hour session limits for free/Pro/Max subs during peak hours. Your weekly limits remain unchanged. During weekdays between 5am–11am PT / 1pm–7pm GMT, you'll move through your 5-hour session limits faster than before."

Sources: tweet

Federico Ulfo

Comment

Mar 30, 2026Models

Anthropic Models Mythos and Capybara Leaked

New Models Leaked: Mythos and Capybara.

Anthropic accidentally exposed internal assets due to a CMS misconfiguration, revealing development of Claude Mythos and Capybara models. Cybersecurity stock crashes right after.

Claude Mythos

Federico Ulfo

Comment

Mar 30, 2026Models

Apple Opening Up Siri to Other Models

Apple Opening Up Siri To Other

Federico Ulfo

Comment

Mar 30, 2026Models

Gemini Embedding 2: Natively Multimodal Embedding Model

Gemini Embedding 2

Gemini Embedding 2 is our first natively multimodal embedding model that maps text, images, video, audio and documents into a single embedding space, enabling multimodal retrieval and classification across different types of media — and it’s available now in public preview. Sources: tweet

Federico Ulfo

Comment

Mar 3, 2026Models

OpenAI GPT-5.4 (xhigh) Released

Screenshot 2026-03-06 at 11.30.16 PM.png

Offers roughly the same benchmark performance as Gemini 3.1 Pro, but for ~25% $USD/M tokens. Sources: Artificial Analysis

Federico Ulfo

Comment

Mar 3, 2026Models

Google Gemini 3.1 Flash-Lite

This is the fastest lightweight model. Google has been releasing the Flash model shortly after releasing the Pro models, Jeff Dean in the Latent Space Pod confirmed that the flash models are a distillation of the Pro models. Flash 2.0 and 2.5 were the SOTA for PDF extraction, great at OCR, and summary operation due to the decent quality with the lowest cost. Gemini 3.1 flash-lite Sources: Blog post, tweet, tweet arena

Federico Ulfo

Comment

Mar 3, 2026Models

Alibaba Qwen 3.5 Small Model Series

Introducing Qwen 3.5 Small Model Series: Qwen3.5-0.8B · Qwen3.5-2B · Qwen3.5-4B · Qwen3.5-9B.

These small models are built on the same Qwen3.5 foundation — native multimodal, improved architecture, scaled RL:

0.8B / 2B → tiny, fast, great for edge device
4B → a surprisingly strong multimodal base for lightweight agents
9B → compact, but already closing the gap with much larger models And yes — they're also releasing the Base models as well.

A day after the release, their main lead researcher Junyang Lin and 3 other researchers, unexpectedly stepped down. We suspect Alibaba will go into the closed model game.

Source: tweet, tweet, tweet.

Federico Ulfo

Comment

Mar 3, 2026Models

xAI Grok 4.20 with Parallel Agents

xAI new version of Grok runs 4 Grok4 agents in parallel. The result is not too bad. xAI added a new SuperGrok Heavy tier that runs 16 agents. While Grok is still far from OpenAI and Anthropic level, it's improving quite a bit, and it remains by far the best model for searching Tweets and for low guardrails:

Federico Ulfo

Comment

Mar 3, 2026Models

StepFun's Step 3.5 Flash

Sparse MoE model with 196B total params, but only 11B activated per token, this model was designed to fit into 128 GB memory (i.e. it can run on DGX spark or other local setups). It is one of the first large-scale MoE models trained using the Muon optimizer and made several adaptations to improve training stability at this scale. It's fast, small, and smart ish. It works well for simple openclaw tasks and is free/very cheap on OpenRouter. Sources: Artificial Analysis

Federico Ulfo

Comment

← NewerMarch 2026Older →

#Rate Limits Reduced 😢

#New Models Leaked: Mythos and Capybara.

#Apple Opening Up Siri To Other

#Gemini Embedding 2

Search

Rate Limits Reduced 😢

New Models Leaked: Mythos and Capybara.

Apple Opening Up Siri To Other

Gemini Embedding 2