Skip to main content
AI Socratic

ML Street Talk, is one of my new favorite AI podcast, incredible topic quality and guests.

Federico Barbero discusses why transformers struggle with tasks like counting and copying long text due to architectural bottlenecks and limitations in maintaining information fidelity. He draws comparisons to over-squashing in graph neural networks and highlights the role of the softmax function in these challenges, while also proposing practical modifications to improve transformer performance.

https://www.youtube.com/watch?v=FAspMnu4Rt0

React:

Comments

Sign in as a member to join the conversation.

Loading comments…

Stay Updated

Get the latest AI insights delivered to your inbox. No spam, unsubscribe anytime.

Search

Search across events, members, and blog posts