AI Socratic
Anthropic: Natural Language Autoencoders (NLAs)
Research


Models don't always say what they think; instead, they encode their thinking into tokens that are not human-readable. Anthropic introduces a method that trains models to convert internal neural activati

Federico Ulfo