Mini-ML: 12M Parameter LLM Trained From Scratch in Rust
The author trained a 12M-parameter LLM on their own ML framework, which uses a Rust backend and CUDA kernels for flash attention, AdamW, and more. It is an inspirational project for anyone who wants to better understand how LLMs are built.
Sources: tweet
Google, across DeepMind and Research, introduces Simula, a framework for addressing data scarcity through synthetic data generation. It uses AI assistants and reasoning-driven workflows to develop and deploy multimodal AI in domains where data scarcity or privacy concerns are paramount.
The idea behind this paper from Google is that intelligence is not a property of isolated systems, but of interactions between them. Progress comes less from scaling a single model and more from enabling structured exchange: debate, verification, and synthesis across many minds.
New Anthropic research: emotion concepts and their function in a large language model. All LLMs sometimes act like they have emotions. But why? Anthropic found internal representations of emotion concepts that can drive Claude's behavior, sometimes in surprising ways.