Talaas ASIC Chip: Weights Printed on Silicon
Talaas released a new chip that has the model weights printed directly onto the silicon and runs Llama 3.1 8B 8x faster than Cerebras and 45x faster than an NVIDIA B200 🚀:
- 17,000 tokens/sec
- raised $169 million with 24
- runs Llama 3.1 8B (released July 2024), aggressively quantized to 3-bit and 6-bit precision, with measurable quality degradation
- 0.75 per 1M tokens vs 10 cents on Cerebras
- they're betting that models won't change so frequently. They might be right on the direction, but are they right on the timing?
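
To get a feel for what those numbers mean in practice, here is a rough back-of-the-envelope sketch in Python. The Cerebras and B200 rates are not measured figures; they are simply what the quoted 8x and 45x ratios imply. The 500-token response length is an arbitrary assumption, and prefill and network time are ignored.

```python
# Figures quoted in the post. The baseline rates are derived from the
# post's own speedup ratios, not independently measured.
TALAAS_TOKENS_PER_SEC = 17_000
SPEEDUP_VS_CEREBRAS = 8
SPEEDUP_VS_B200 = 45


def implied_rates() -> dict[str, float]:
    """Return tokens/sec per platform as implied by the quoted speedups."""
    return {
        "Talaas": float(TALAAS_TOKENS_PER_SEC),
        "Cerebras": TALAAS_TOKENS_PER_SEC / SPEEDUP_VS_CEREBRAS,
        "B200": TALAAS_TOKENS_PER_SEC / SPEEDUP_VS_B200,
    }


def generation_ms(tokens: int, tokens_per_sec: float) -> float:
    """Milliseconds to emit `tokens` at a steady rate (no prefill, no network)."""
    return 1000.0 * tokens / tokens_per_sec


if __name__ == "__main__":
    # 500 tokens is an assumed response length, chosen only for illustration.
    for name, rate in implied_rates().items():
        ms = generation_ms(500, rate)
        print(f"{name:9s} {rate:9.1f} tok/s  {ms:8.1f} ms per 500 tokens")
```

Under these assumptions a medium-length answer goes from "noticeable wait" to "tens of milliseconds", which is the range where a model call starts to feel like an ordinary function call.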

Why This is Interesting
Ultra-fast inference makes operations synchronous instead of asynchronous. That means you can use inference inside code the way you use any other operator, with the understanding that the output is probabilistic. While costs are really high today, I'd imagine ASICs becoming a viable option for several reasons:
- Model updates will slow down. We’re running out of high-quality internet data and synthetic data is degrading the pool. It’s plausible hyperscalers freeze major datasets around ~2027 and move to slower weight updates. If weights stabilize, printing them onto silicon makes economic sense.
- Speed compounds — internally first. Hyperscalers are already experimenting with Groq/Cerebras for faster internal inference. Imagine 40× faster Opus-level models powering internal agents. Recursive improvement accelerates. If one company crosses a self-improvement threshold, the velocity gap could become impossible to close.
- Inference becomes a coding primitive. Today LLM calls are async API requests. On-chip inference makes them synchronous and inline — effectively a probabilistic operator inside code. That opens an entirely new programming paradigm.
- IQ ceiling + capex crossover. For most human use cases, GPT-5 / Opus 4.x are already beyond what’s needed. Printing those weights onto chips for large-scale Q&A becomes viable once capex amortized over 2–3 years is cheaper than GPU inference. If that crossover happens, the industry shifts — assuming no new architecture leap resets the game.
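
To make the "inference as a coding primitive" point concrete, here is a minimal Python sketch of what a synchronous, inline probabilistic operator could look like. Everything in it is hypothetical: `make_stub_model` is a random stand-in for an on-chip model, not a real API, and `classify` is just one way to tame probabilistic output (sample a few times and take the majority).

```python
import random
from collections import Counter
from typing import Callable, Sequence

# Hypothetical stand-in for an on-chip model: a blocking function that
# returns one label. A real backend would replace this stub.
Model = Callable[[str, Sequence[str]], str]


def make_stub_model(seed: int, accuracy: float = 0.8) -> Model:
    """Build a fake classifier that picks the 'right' label with the given probability."""
    rng = random.Random(seed)

    def model(text: str, labels: Sequence[str]) -> str:
        # Toy ground truth: texts mentioning "refund" count as billing questions.
        truth = "billing" if "refund" in text.lower() else "other"
        if truth in labels and rng.random() < accuracy:
            return truth
        return rng.choice(list(labels))

    return model


def classify(model: Model, text: str, labels: Sequence[str], votes: int = 5) -> str:
    """Inline probabilistic operator: sample `votes` times, return the majority label."""
    counts = Counter(model(text, labels) for _ in range(votes))
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    model = make_stub_model(seed=0)
    ticket = "I would like a refund for my last invoice"
    # The call reads like any other expression inside a conditional.
    if classify(model, ticket, ["billing", "other"]) == "billing":
        print("route to billing")
    else:
        print("route to general support")
```

Majority voting over several samples only makes sense when each call is nearly free in latency, which is exactly the regime the post is describing. With today's async API calls, five round trips per `if` statement would be hard to justify.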
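
The capex crossover can also be sketched as simple arithmetic. The Python below is a toy model, not a real cost analysis: the $50,000 capex, 50% utilization, and $0.10 per 1M token GPU price are made-up placeholders, and it treats the 17,000 tokens/sec figure as sustained throughput per device, which is an assumption.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def asic_cost_per_million(capex_usd, tokens_per_sec, utilization, years,
                          opex_per_million=0.0):
    """Amortized cost in USD per 1M tokens over `years` of operation."""
    millions = tokens_per_sec * utilization * SECONDS_PER_YEAR * years / 1e6
    return capex_usd / millions + opex_per_million


def breakeven_years(capex_usd, tokens_per_sec, utilization,
                    gpu_price_per_million, opex_per_million=0.0):
    """Years of operation after which the ASIC matches the GPU price per 1M tokens.

    Returns None if running costs alone already exceed the GPU price.
    """
    margin = gpu_price_per_million - opex_per_million
    if margin <= 0:
        return None
    millions_per_year = tokens_per_sec * utilization * SECONDS_PER_YEAR / 1e6
    return capex_usd / (millions_per_year * margin)


if __name__ == "__main__":
    # All inputs except 17,000 tok/s are hypothetical placeholders.
    years = breakeven_years(
        capex_usd=50_000,
        tokens_per_sec=17_000,
        utilization=0.5,
        gpu_price_per_million=0.10,
    )
    print(f"break-even after {years:.2f} years")
```

The interesting knob is utilization: a chip frozen to one model only pays off if that model stays in demand for the whole amortization window, which is the same bet on slower weight updates described above.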