AI Socratic
March 2026
Models

Anthropic Reduces Claude Rate Limits During Peak Hours

Rate Limits Reduced 😢

"To manage growing demand for Claude we're adjusting our 5 hour session limits for free/Pro/Max subs during peak hours. Your weekly limits remain unchanged. During weekdays between 5am–11am PT / 1pm–7pm GMT, you'll move through your 5-hour session limits faster than before."
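For anyone scheduling heavy Claude sessions around the new window, here is a minimal sketch of a peak-hours check. It takes a wall-clock time already expressed in Pacific Time. The function name and the half-open boundary (5:00 inclusive, 11:00 exclusive) are assumptions; the announcement only says "between 5am–11am PT" on weekdays.

```python
from datetime import datetime

PEAK_START_HOUR = 5   # 5am PT, from the announcement
PEAK_END_HOUR = 11    # 11am PT, from the announcement


def in_peak_window(pt_local: datetime) -> bool:
    """Return True if a Pacific Time wall-clock datetime falls in the peak window.

    Assumption (not stated in the announcement): the window is half-open,
    [5:00, 11:00), and "weekdays" means Monday through Friday.
    """
    is_weekday = pt_local.weekday() < 5  # Monday=0 ... Friday=4
    return is_weekday and PEAK_START_HOUR <= pt_local.hour < PEAK_END_HOUR
```

To check the current moment, one option is to pass `datetime.now(ZoneInfo("America/Los_Angeles"))` using the standard-library `zoneinfo` module, which handles daylight saving time.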

Sources: tweet

Federico Ulfo
Models

Anthropic Models Mythos and Capybara Leaked

New Models Leaked: Mythos and Capybara.

Anthropic accidentally exposed internal assets through a CMS misconfiguration, revealing that Claude Mythos and Capybara models are in development. Cybersecurity stocks crashed right after.

Federico Ulfo
Models

Gemini Embedding 2: Natively Multimodal Embedding Model

Gemini Embedding 2

Gemini Embedding 2 is Google's first natively multimodal embedding model. It maps text, images, video, audio and documents into a single embedding space, enabling multimodal retrieval and classification across different types of media, and it's available now in public preview.

Sources: tweet
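To make "a single embedding space" concrete, here is a minimal sketch of cross-media retrieval by cosine similarity. It does not call the Gemini API: the 3-dimensional vectors, labels, and function names are made up for illustration, and real embeddings would have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, corpus):
    """Sort (label, vector) pairs by similarity to the query, best first."""
    scored = [(label, cosine_similarity(query_vec, vec)) for label, vec in corpus]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical embeddings: items of different media types share one space,
# so a text query can be compared against all of them directly.
corpus = [
    ("image: cat photo", [0.9, 0.1, 0.0]),
    ("audio: podcast clip", [0.1, 0.9, 0.2]),
    ("document: cat care PDF", [0.8, 0.2, 0.1]),
]
query = [1.0, 0.0, 0.0]  # stand-in for the embedded text query "cat"

ranking = rank_by_similarity(query, corpus)
for label, score in ranking:
    print(f"{score:.3f}  {label}")
```

Because every media type lands in the same space, a single index and a single similarity function cover all of them; there is no per-modality retrieval path to maintain.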

Federico Ulfo
Models

Google Gemini 3.1 Flash-Lite

Gemini 3.1 Flash-Lite is billed as the fastest lightweight model. Google has been releasing the Flash models shortly after the Pro models, and Jeff Dean confirmed on the Latent Space podcast that the Flash models are distilled from the Pro models. Flash 2.0 and 2.5 were the SOTA for PDF extraction, and strong at OCR and summarization, thanks to decent quality at the lowest cost.

Sources: Blog post, tweet, tweet arena

Federico Ulfo
Models

Alibaba Qwen 3.5 Small Model Series

Introducing Qwen 3.5 Small Model Series: Qwen3.5-0.8B · Qwen3.5-2B · Qwen3.5-4B · Qwen3.5-9B.

These small models are built on the same Qwen3.5 foundation — native multimodal, improved architecture, scaled RL:

  • 0.8B / 2B → tiny, fast, great for edge devices
  • 4B → a surprisingly strong multimodal base for lightweight agents
  • 9B → compact, but already closing the gap with much larger models

And yes, they're also releasing the Base models.

A day after the release, lead researcher Junyang Lin and three other researchers unexpectedly stepped down. We suspect Alibaba will move toward closed models.

Sources: tweet, tweet, tweet

Federico Ulfo
Models

xAI Grok 4.20 with Parallel Agents

xAI's new version of Grok runs 4 Grok 4 agents in parallel, and the results are decent. xAI also added a new SuperGrok Heavy tier that runs 16 agents. While Grok is still far from OpenAI and Anthropic level, it's improving quite a bit, and it remains by far the best model for searching Tweets and for low guardrails.
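As a rough illustration of the fan-out pattern, here is a minimal sketch that runs several agents concurrently and picks the most common answer. How xAI actually coordinates or merges its agents is not described here; the stub agent, the majority vote, and all function names are assumptions for illustration only.

```python
import asyncio
from collections import Counter


async def run_agent(agent_id: int, prompt: str) -> str:
    """Hypothetical stand-in for one agent; a real agent would call a model."""
    await asyncio.sleep(0)  # yield control, simulating an I/O-bound model call
    # Toy behavior: agent 3 disagrees with the others.
    return "B" if agent_id == 3 else "A"


async def run_parallel(prompt: str, n_agents: int = 4) -> str:
    """Run n_agents concurrently and return the most common answer."""
    answers = await asyncio.gather(
        *(run_agent(i, prompt) for i in range(n_agents))
    )
    # Majority vote is an assumed aggregation step, not xAI's documented method.
    winner, _count = Counter(answers).most_common(1)[0]
    return winner


def answer(prompt: str, n_agents: int = 4) -> str:
    """Synchronous entry point for the sketch."""
    return asyncio.run(run_parallel(prompt, n_agents))
```

Calling `answer("some question", n_agents=16)` mirrors the 16-agent tier in shape only; the cost of such a setup scales roughly with the number of agents.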

Federico Ulfo
Models

StepFun's Step 3.5 Flash

Step 3.5 Flash is a sparse MoE model with 196B total params, of which only 11B are activated per token. It was designed to fit into 128 GB of memory (i.e. it can run on a DGX Spark or other local setups). It is one of the first large-scale MoE models trained with the Muon optimizer, and the team made several adaptations to improve training stability at this scale. It's fast, small, and smart-ish. It works well for simple OpenClaw tasks and is free or very cheap on OpenRouter.

Sources: Artificial Analysis
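A quick back-of-the-envelope check shows what the 128 GB target implies. The sketch below estimates weight memory only, at a few precisions. The bit widths are assumptions (the post does not say which quantization is used), 1 GB is taken as 10^9 bytes, and KV cache, activations, and runtime overhead are ignored.

```python
TOTAL_PARAMS = 196e9     # total parameters, from the post
ACTIVE_PARAMS = 11e9     # parameters activated per token, from the post
MEMORY_BUDGET_GB = 128   # target memory, from the post


def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


def fits_budget(bits_per_param: int) -> bool:
    """True if all weights fit in the budget at the given precision."""
    return weight_memory_gb(TOTAL_PARAMS, bits_per_param) <= MEMORY_BUDGET_GB


# Assumed precisions for illustration: 16-bit, 8-bit, and 4-bit weights.
for bits in (16, 8, 4):
    size = weight_memory_gb(TOTAL_PARAMS, bits)
    print(f"{bits:>2}-bit weights: {size:.0f} GB, fits in 128 GB: {fits_budget(bits)}")

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"Active per token: {active_fraction:.1%} of all parameters")
```

Under these assumptions the full weights need about 392 GB at 16-bit and 196 GB at 8-bit, so a 128 GB machine would need roughly 4-bit weights (about 98 GB). All 196B parameters must still be resident in memory even though only about 5.6% are used per token, which is why sparse activation helps speed more than it helps memory.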

Federico Ulfo
