AI Socratic

For those not familiar, optimizing an ML model used to be a human research process of trial and error. Karpathy just released a repo that automates the research and testing with parallel agents running 5-minute experiments.

It’s built on a stripped-down version of his earlier nanochat training core — a self-contained ~630-line Python file (train.py) that includes a full GPT model, Muon+AdamW optimizer, and training loop.

The setup is deliberately simple:

  • prepare.py handles fixed data prep, tokenization, and evaluation (don’t touch it).
  • The human only edits a high-level Markdown file (program.md) with research instructions or ideas.
  • An AI coding agent (Claude, etc.) takes over: it edits only train.py, runs a training experiment for exactly 5 minutes (fixed wall-clock budget), measures validation bits-per-byte (val_bpb — lower is better), and decides whether to keep the change.
  • Everything happens on a git feature branch. Improvements become commits; failures are discarded. The loop repeats indefinitely.
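The keep-or-discard loop described above can be sketched roughly as follows. This is a minimal illustration, not the code from Karpathy's repo: `research_loop`, `run_experiment`, and `LoopResult` are hypothetical names, the git commit and discard steps are only indicated in comments, and the scores in the usage lines are made-up numbers chosen to show the control flow.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class LoopResult:
    best_bpb: float
    kept: List[str]       # ideas whose change would be committed
    discarded: List[str]  # ideas whose change would be thrown away


def research_loop(
    ideas: Iterable[str],
    run_experiment: Callable[[str], float],
    baseline_bpb: float,
) -> LoopResult:
    """Greedy loop: keep a change only if validation bits-per-byte improves."""
    best = baseline_bpb
    kept: List[str] = []
    discarded: List[str] = []
    for idea in ideas:
        # Hypothetical hook: apply the idea to train.py, train for the fixed
        # 5-minute budget, and return the measured val_bpb.
        bpb = run_experiment(idea)
        if bpb < best:  # lower val_bpb is better
            best = bpb
            kept.append(idea)       # real setup: commit on the feature branch
        else:
            discarded.append(idea)  # real setup: discard the change
    return LoopResult(best, kept, discarded)


# Usage with made-up scores standing in for real 5-minute training runs.
fake_scores = {"bigger lr": 1.02, "new warmup": 0.97, "wider mlp": 0.99}
result = research_loop(fake_scores, fake_scores.get, baseline_bpb=1.00)
print(result)
```

Because every run gets the same wall-clock budget and the same evaluation from prepare.py, a single number comparison is enough to decide whether a change stays.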


As Karpathy said, it runs 100+ experiments overnight while you sleep. He ran ~650 over a weekend and confirmed the gains transferred to larger models, improving nanochat’s “time-to-GPT-2” leaderboard score.

Sources: tweet, GitHub
