SakanaAI × Nvidia: Sparser, Faster, Lighter Transformer (TwELL)
May 31, 2026
Sakana AI and NVIDIA's ICML 2026 paper introduces TwELL, a new sparse format for LLM feedforward layers. It achieves more than 95% unstructured sparsity (via ReLU plus a light L1 penalty) while staying fully compatible with fast tiled matrix multiplies on GPUs. The reported result is inference and training that run more than 20% faster, with lower memory and energy use on billion-parameter-scale models and minimal accuracy loss. The CUDA kernels are open source.
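To make the "ReLU + light L1" idea concrete, here is a minimal pure-Python sketch. It is not the TwELL format and not the paper's CUDA kernels. It assumes that the sparsity in question is in the hidden activations of the feedforward layer, and every function name, the toy sizes, the negative bias (used only to mimic a layer that has already been trained to be sparse), and the penalty coefficient are hypothetical choices made for illustration. It shows three things: ReLU produces exact zeros, an L1 term on the activations is the quantity a training loss would add to push more units to zero, and a down-projection can skip zero units without changing the result.

```python
import random


def relu(x):
    """Rectified linear unit: negative inputs become exactly 0.0."""
    return x if x > 0.0 else 0.0


def hidden_activations(x, w_up, bias):
    """ReLU(x @ w_up + bias) for a single token.

    w_up is stored as one weight list per hidden unit (illustrative layout).
    """
    return [
        relu(sum(xi * wi for xi, wi in zip(x, unit_weights)) + b)
        for unit_weights, b in zip(w_up, bias)
    ]


def l1_penalty(h, coeff):
    """L1 term that would be added to the training loss (coeff is hypothetical)."""
    return coeff * sum(abs(v) for v in h)


def sparsity(h):
    """Fraction of hidden units that are exactly zero."""
    return sum(1 for v in h if v == 0.0) / len(h)


def dense_down_projection(h, w_down):
    """Reference h @ w_down that touches every hidden unit."""
    d_out = len(w_down[0])
    return [sum(h[j] * w_down[j][k] for j in range(len(h))) for k in range(d_out)]


def sparse_down_projection(h, w_down):
    """Same product, but rows of w_down for zero activations are never read."""
    out = [0.0] * len(w_down[0])
    for j, hj in enumerate(h):
        if hj == 0.0:
            continue  # skipped work is where the potential savings come from
        for k, w in enumerate(w_down[j]):
            out[k] += hj * w
    return out


# Toy sizes and a negative bias, chosen only to produce many zeros.
random.seed(0)
d_model, d_hidden = 16, 256
x = [random.gauss(0.0, 1.0) for _ in range(d_model)]
w_up = [[random.gauss(0.0, d_model ** -0.5) for _ in range(d_model)]
        for _ in range(d_hidden)]
bias = [-2.0] * d_hidden
w_down = [[random.gauss(0.0, d_hidden ** -0.5) for _ in range(d_model)]
          for _ in range(d_hidden)]

h = hidden_activations(x, w_up, bias)
frac_zero = sparsity(h)
penalty = l1_penalty(h, coeff=1e-4)
dense_out = dense_down_projection(h, w_down)
sparse_out = sparse_down_projection(h, w_down)

print(f"zero activations: {frac_zero:.1%}, L1 penalty: {penalty:.6f}")
```

The per-element skip above is exactly the kind of irregular access pattern that tends to run poorly on GPUs, which is why a format that keeps unstructured sparsity compatible with tiled matrix multiplies matters. How TwELL lays out the nonzeros is described in the paper and is not reproduced here.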
