AI Socratic
February 2025
Models

xAI launches Grok 3

xAI launched Grok 3 this week. It is billed as an order of magnitude more capable than Grok 2, with 10x more computing power thanks to xAI's Colossus cluster of 100k H100s.

Grok 3 excels in math, science, coding, and general knowledge, with notable performance in image understanding tasks, achieving a 73.2% score on the MMMU benchmark (xAI Blog).

It seems to be at o1 level, and it also introduces a DeepResearch-style feature (which xAI calls DeepSearch) along with Search capabilities. The subscription costs $22/month. The most interesting feature of Grok is still its direct access to the Twitter feed.

So while Grok 3 is not at full o3 level (and o3 itself hasn't launched yet), it signals that xAI is entering the arena in a competitive way.

https://x.com/adonis_singh/status/1892109817830851060

Federico Ulfo
Models

OpenAI Launches O3, Operator, and DeepResearch

We're just at the beginning of the year, and OpenAI has already launched three new products under the Pro subscription at $200/month.

  • o3 is a new category of GPT models that scores 87% on the ARC challenge. OpenAI released o3-mini and o3-mini-high for coding.

  • Operator is an AI agent mode that can be used with ChatGPT-4o to drive a simulated desktop and carry out actions that require browsing web pages and clicking links. Here's Karpathy's take on Operator.

  • DeepResearch runs long research tasks that collect content across multiple sources and summarize it into a coherent report. It's a super powerful tool that has been received with a bang, and it really makes the OpenAI Pro subscription worth it. DeepResearch currently holds the highest score on Humanity's Last Exam.

https://x.com/tomaspueyo/status/1887270096013529530

Federico Ulfo