
· Claim: Decoder‑only transformer LMs are almost‑surely injective: different prompts map to distinct last‑token hidden states; this holds at initialization and is preserved under gradient descent.
· Method: Prove that the model's components are real‑analytic, show that collisions occur only on a measure‑zero set of parameters, and show that a finite number of gradient‑descent (GD) updates cannot move the parameters into that set.
· Evidence: Billions of collision tests on six SOTA LMs found no collisions.
· Algorithm (SipIt): Reconstructs exact input text from hidden activations by exploiting causality; sequentially matches each token’s hidden state given the known prefix; offers linear‑time guarantees.
· Failure cases: The result covers decoder‑only transformers with analytic activations and continuous initialization; quantization, weight tying, duplicated embeddings, or non‑analytic components can break injectivity. So there are still ways to preserve the "privacy" of the prompt.
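
To make the sequential-matching idea concrete, here is a minimal sketch on a toy model. It is not the paper's SipIt implementation: the "model" is a hypothetical stand-in (a small tanh recurrence instead of a transformer), the names `step`, `hidden_states`, and `invert` are made up for illustration, and the candidate search is a plain scan over the vocabulary. What it keeps from the summary above is the structure: each hidden state depends only on the prefix and the current token, so tokens can be recovered one at a time, left to right.

```python
import math
import random

VOCAB_SIZE = 50  # toy vocabulary size (assumption)
DIM = 8          # toy hidden size (assumption)

# Hypothetical toy parameters drawn from a continuous distribution,
# mirroring the "continuous initialization" condition in the summary.
_rng = random.Random(0)
EMBED = [[_rng.gauss(0, 1) for _ in range(DIM)] for _ in range(VOCAB_SIZE)]
W = [[_rng.gauss(0, 1) / math.sqrt(DIM) for _ in range(DIM)] for _ in range(DIM)]


def step(prev_state, token):
    """Causal update: the new state depends only on the prefix state and the current token."""
    return [
        math.tanh(sum(W[i][j] * prev_state[j] for j in range(DIM)) + EMBED[token][i])
        for i in range(DIM)
    ]


def hidden_states(tokens):
    """Return one hidden state per position for a token sequence."""
    state = [0.0] * DIM
    states = []
    for tok in tokens:
        state = step(state, tok)
        states.append(state)
    return states


def invert(observed, tol=1e-9):
    """Recover tokens left to right by matching each observed state given the known prefix."""
    state = [0.0] * DIM
    recovered = []
    for target in observed:
        # Scan the vocabulary for tokens whose resulting state matches the observation.
        matches = [
            v for v in range(VOCAB_SIZE)
            if max(abs(a - b) for a, b in zip(step(state, v), target)) < tol
        ]
        if len(matches) != 1:
            # Zero matches: the state did not come from this model.
            # Several matches: a collision, i.e. injectivity failed.
            raise ValueError(f"expected exactly one matching token, found {len(matches)}")
        recovered.append(matches[0])
        state = step(state, matches[0])
    return recovered


prompt = [3, 14, 15, 9, 26, 5, 35]
print(invert(hidden_states(prompt)) == prompt)
```

The scan costs at most one model step per vocabulary entry per position, so the total work grows linearly with the prompt length. If two tokens ever produced the same state for the same prefix, `invert` would raise instead of guessing, which is the toy analogue of a collision.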
Paper: arxiv.org/abs/2510.15511