The most important AI news and updates from last month (April 15 - May 15). A beefy month!
Let's start with events and conferences.
May 21st, AI Dinner 10.0
Another AI dinner at the Greycroft office. The focus this month will be on A2A, top 30 Ilya Sutskever papers, Alpha Evolve, and the latest in AI — we'll use this blog post you're reading right now to structure our conversations!

AI NYC is a community of AI researchers, engineers, and founders. We meet once a month for a symposium of Socratic discussions around research papers and LLMs, philosophizing about the latest in AI.
Conferences
There are 2 major conferences happening next month. To decide which one to attend, I scraped the speakers and events from both and put them in a Google Sheet. You can check it here: https://docs.google.com/spreadsheets/d/13Z2PKyFtbQaZm8iDguzxTZ4gukZnAI2BCg9-cVJFqfA/edit?gid=2008052934#gid=2008052934
I decided to go to SF this time around. The main reason is that AI Engineer puts the best AI speakers all in one place, while AI Tech Week is scattered across many venues and has way too much noise.
- AI Engineer, SF, June 3-5, https://www.ai.engineer. Here's a discount code for you: THANKSFEULF.
- AI Tech Week, NY, June 2-8, https://www.tech-week.com.
OK, now let's dive into the latest from the month!
Google I/O
Google I/O 2025 doubled down on Gemini-powered agents, AI-first Android, and a dash of new hardware, the clearest signal yet that “Google does the Googling” for you.
Key Highlights
- AI Mode for Search – rolls out to all U.S. users, running dozens of Gemini-driven sub-queries and soon tapping Project Mariner to carry out up to 10 web tasks with a Teach-and-Repeat workflow.
- Gemini 2.5 & Chrome – Deep-Think reasoning mode lands for complex math/code, while Gemini comes native to Chrome for tab-wide summarization and navigation.
- Imagen 4, Veo 3 & Flow – next-gen image and video models plus the Flow AI filmmaking app let creators stitch 8-second clips into longer AI movies.
- Project Astra upgrades – the multimodal agent goes proactive with Search Live, speaking up unprompted and handling tasks as you point your camera.
- Android 16 preview – Material 3 Expressive redesign, AI weather-reactive wallpapers, scam-call shields, Private Space, and system-wide Gemini hooks.
- Wear OS 6 – gets the same Material 3 flair, adaptive circular UI and a 10 % battery bump for Pixel Watch and beyond.
- Project Aura XR glasses – Xreal partnership teases wide-FOV smart glasses with on-device Gemini assistance.

Google shipped really hard with this.
AlphaEvolve 🦠
AlphaEvolve was received positively after its 14 May 2025 reveal. Powered by Gemini-2 models, the evolutionary coding agent discovers and refines algorithms that are already saving compute, speeding up hardware, and cracking open math problems.
Key Highlights
- Superhuman Algorithms – beats the 56-year-old Strassen method for 4 × 4 complex matrix multiplication.
- Compute Savings – a new Borg scheduling heuristic recovers ≈ 0.7 % of Google’s global compute fleet.
- Hardware & Training Boosts – 23 % faster Gemini kernel (1 % shorter training) and 32.5 % FlashAttention speed-up; lean Verilog redesign ships in next-gen TPUs.
- Evolutionary Engine – pairs Gemini Flash for breadth with Gemini Pro for depth, guided by automated evaluators.
- Broad Discovery – improved 20 % of 50 + open math problems and rediscovered 75 % of known best results.
- Early Access – academic EAP sign-ups open, wider rollout under exploration.
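For context on the Strassen bullet above, here is a minimal Python sketch of the classic 2 × 2 Strassen scheme, which uses 7 scalar multiplications instead of the naive 8. Applied recursively to 4 × 4 matrices split into 2 × 2 blocks, it needs 7 × 7 = 49 multiplications versus 64 for the naive method; AlphaEvolve's reported result is one fewer than that for complex-valued entries. The function names are illustrative, and this is the 1969 baseline, not AlphaEvolve's algorithm.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar multiplications (Strassen, 1969)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    # The seven products.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # Recombine the products using additions and subtractions only.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


def naive_2x2(A, B):
    """Reference implementation with 8 scalar multiplications."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


A = [[1 + 2j, 3], [0, 1j]]
B = [[2, 1j], [1 - 1j, 4]]
assert strassen_2x2(A, B) == naive_2x2(A, B)
```

The trade is fewer multiplications for more additions, which pays off when the entries are themselves large blocks.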
How It Shines
- Provably Novel – solutions are mathematically verified as new, not memorized.
- Real-World Impact – live in data-centers, chip design, and LLM training pipelines today.
- Engineer-Friendly – outputs human-readable code, easing adoption and debugging.
- Open Horizons – same framework targets materials science, drug discovery, sustainability, and more.
AlphaEvolve is DeepMind’s boldest leap toward AI-driven scientific discovery—an agent that literally evolves code, freeing humans to focus on bigger ideas.
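To make "evolves code" a bit more concrete, here is a toy sketch of an evolutionary loop guided by an automated evaluator. Everything in it is a simplifying assumption: candidates are short lists of integers rather than programs, `mutate` stands in for a model proposing an edit, and `evaluate` is a made-up objective. It shows the propose, score, and select cycle only, not DeepMind's system.

```python
import random


def evaluate(candidate):
    """Automated evaluator (hypothetical): lower is better."""
    target = [3, -1, 4]
    return sum((c - t) ** 2 for c, t in zip(candidate, target))


def mutate(candidate, rng):
    """Stand-in for a model proposing a change: nudge one entry by +/-1."""
    child = list(candidate)
    i = rng.randrange(len(child))
    child[i] += rng.choice([-1, 1])
    return child


def evolve(generations=200, population_size=8, seed=0):
    rng = random.Random(seed)
    population = [[0, 0, 0] for _ in range(population_size)]
    for _ in range(generations):
        # Propose new candidates from randomly chosen parents.
        children = [mutate(rng.choice(population), rng) for _ in range(population_size)]
        # Keep the best scorers among parents and children, so the best score never gets worse.
        population = sorted(population + children, key=evaluate)[:population_size]
    return population[0]


best = evolve()
print(best, evaluate(best))
```

Because parents compete with their children for survival, the evaluator alone decides what stays in the population.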
In this episode of Machine Learning Street Talk, the team that worked on AlphaEvolve goes into the details of the breakthrough and their insights:
https://youtu.be/vC9nAosXrJw?si=pu3UjCJzJYImRgn-
Codex
Codex is powered by codex-1, a version of OpenAI's o3 reasoning model fine-tuned for software engineering tasks.
Key Highlights
- Parallel Tasking: Executes writing features, bug fixes, tests, and codebase queries concurrently in isolated cloud sandboxes, with tasks completing in 1–30 minutes.
- Verifiable Actions: Every task provides terminal logs, test outputs, and change citations for transparent review & integration.
- Configurable Agent Behavior: Use AGENTS.md files to instruct Codex on codebase navigation, testing commands, and project conventions.
How It Shines
- Coding Proficiency: Excels on internal SWE-Bench Verified evaluations, delivering clean, review-ready patches.
- Autonomous Collaboration: Proposes pull requests, refactors large codebases, and answers complex code queries independently.
- Security-Focused: Runs within sandboxed containers with no internet access during execution and is trained to refuse malicious software requests.
- Async Workflow Revolution: Shifts development from linear task queues to parallel AI-driven task delegation, keeping engineers in flow longer.
Codex launched as a research preview to gather feedback, prioritize safety, and iterate rapidly in one of the most competitive spaces, alongside GitHub Copilot, Google Gemini, Anthropic Claude, and emerging startups.
OpenAI’s vision is a unified developer experience where real-time pairing and asynchronous agent workflows converge: imagine editing code in your IDE, spawning Codex tasks on demand, and receiving progress updates & results without context switching.
Codex is reportedly more capable at handling multi-step parallel coding tasks than standalone o3-based code models. In my experience, for quick suggestions Copilot still feels snappier, but Codex’s parallelism is unmatched when you need to orchestrate complex refactors & testing pipelines.
https://www.youtube.com/watch?time_continue=6&v=wSAkqlzSZyw
"Your App Is Just A ChatGPT Wrapper" they said!
https://x.com/t31kx/status/1921214839961067734
Updates
- The pope actually chose the name Leo XIV because of AI: https://x.com/VaticanNews/status/1921186921838997935.
- Sam Altman and Jony Ive hint at a new personal AI product: https://x.com/sama/status/1925242282523103408.

LLM Models Vibe Check & Benchmarks

Top models according to OpenRouter. Notably, Gemini 2.5 is climbing the ladder, while Anthropic's Sonnet 3.7 is slowly going down.

Companies are overfitting their models to the benchmarks. @lmarena_ai has become the go-to evaluation for AI progress, and a recent release demonstrates the difficulty of maintaining fair evaluations on @lmarena_ai, despite best intentions. Read more.

Benchmarks collection from Hugging Face
IQ bench changes in just one year. o3 has an IQ of 160, placing it among the 100,000 smartest people in the world.
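As a rough sanity check on what a score of 160 implies, here is a small sketch. It assumes IQ follows a normal distribution with mean 100 and standard deviation 15, and a world population of 8 billion; both are assumptions of this sketch, not figures from the benchmark.

```python
import math


def upper_tail(score, mean=100.0, sd=15.0):
    """Fraction of a normal population scoring at or above `score`."""
    z = (score - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))


WORLD_POPULATION = 8_000_000_000  # assumption for illustration

tail = upper_tail(160)
people_at_or_above = tail * WORLD_POPULATION
print(f"about 1 in {1 / tail:,.0f}; roughly {people_at_or_above:,.0f} people worldwide")
```

Under these assumptions a score of 160 sits four standard deviations above the mean, which works out to a few hundred thousand people worldwide. A different standard deviation or population figure would shift the estimate.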

Research Papers
- Absolute Zero: Reinforced Self-Play Reasoning with Zero Data. The model learns to reason by inventing and solving its own Python coding challenges using RL, with no human data needed. Author explanation: https://x.com/_AndrewZhao/status/1919920459748909288.
- Flow-GRPO
- ZeroSearch: incentivizing search in LLMs without searching. ZeroSearch is a curriculum-based RL framework that teaches LLMs to retrieve information using self-generated documents: https://x.com/omarsar0/status/1920469148968362407
Continuous Thought Machines (CTM)
Sakana proposes a new neural architecture (CTM) built from the ground up to use neural dynamics as a core representation for intelligence. With neural dynamics as a first-class citizen, CTM shows some interesting emergent behavior. CTMs are also naturally easier to interpret.
A CTM can decide to think less once it finds a pattern, much as humans do, which saves energy.
https://x.com/hardmaru/status/1921751428508582329
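The "think less once it finds a pattern" idea can be sketched as an early-exit loop: keep refining a prediction over internal steps and stop as soon as the prediction is certain enough. The sketch below is a generic illustration and not Sakana's implementation; the certainty measure (1 minus normalized entropy), the threshold, and the per-step logits are all assumptions made up for the example.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def certainty(probs):
    """1 minus normalized entropy: near 0 for a uniform guess, near 1 for a confident one."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(len(probs))


def think(logits_per_step, threshold=0.8):
    """Return (steps_used, probs), stopping at the first sufficiently certain step."""
    steps_used, probs = 0, None
    for steps_used, logits in enumerate(logits_per_step, start=1):
        probs = softmax(logits)
        if certainty(probs) >= threshold:
            break  # confident enough: skip the remaining steps
    return steps_used, probs


# Hypothetical logits that sharpen as the model "thinks" longer.
trace = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [6.0, 0.0, 0.0], [9.0, 0.0, 0.0]]
steps, probs = think(trace)
print(steps)  # 3: the fourth step is never evaluated
```

Easy inputs that sharpen quickly use fewer steps, which is where the compute saving comes from.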
VC and Fundraising
OpenAI acquires Windsurf for $3B, completing the hilarious pattern of an Ouroboros. Are we in an AI bubble?

Insights on OpenAI buying Windsurf and appointing a CEO of applications: applications are becoming routers across different models. This acquisition reduces multiplexing to other models and enables full, seamless vertical integration.
Insights on why the AI wave is different, drawn from Cursor's rise from $100m to $300m ARR in a few months. The thesis:
- The AI wave is different from the cloud wave.
- AI is being bought, while SaaS was being sold. AI products are pulled by customers, so they grow faster.
- Data advantage might be the only/ultimate moat in AI. GitHub Copilot had the data, distribution, and resource advantages, yet Cursor is still winning thanks to better UX.
The Zeki Data report shows that AI tools are disrupting traditional hiring. Below-zero net hiring this year means there were more layoffs than hires.

Another chart shows how the inflows and outflows of talent between the US and India are shifting the other way.

How LLMs do arithmetic, lol
https://x.com/andrew_n_carr/status/1913603612430983665
Videos and Podcasts
Rich Sutton on AI alignment and Decentralization [15 min video]
"The short version is that I don't agree with AI-safety folks about what question we should be asking. Rather than asking how we can control the goals of the AIs, I think we should be asking how we can have a good future without controlling their goals (just as we have a pretty good present without controlling other peoples' goals)." - Richard Sutton
https://www.youtube.com/watch?v=Hnt-oBA086U&t=85s
Random Thoughts And Updates
https://x.com/goyal__pramod/status/1921944575842644206
When to use which OpenAI model? OpenAI finally published a guide that explains when to use each model. Very useful: at least until GPT-5 is out, we'll continue using several GPT models.



The Vesuvius Challenge found the title of a scroll for the first time! This one is "On Vices, Book 1" by Philodemus. Read more.

https://x.com/frantzfries/status/1920199640021971059
The Intelligence Curse: in the April release of the Socratic AI we examined ai-2027.com and AI 2045. This blog post, like those, explores what is going to happen when AGI is here and how to avoid a disaster.
https://x.com/luke_drago_/status/1915376929542111353
A GPT model stopped speaking Croatian 🇭🇷 and nobody could figure out why. It turns out Croatian users (HRLF) were more prone to downvote messages. Lol. Read more.

TikTok, Google, and Meta can run human experiments at scale. Is that good or bad? Look at any famous psychological experiment and the sample size is 40 people; meanwhile, ByteDance has a sample size of 2B people. Read more.
AI For Builders
- JSON uses many more tokens than alternative formats. Read more.
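A quick way to see where the overhead comes from is to serialize the same records twice and compare sizes. The sketch below uses character counts as a rough proxy for tokens (actual counts depend on the tokenizer) and picks CSV as one example of an alternative layout; the sample records are made up for illustration.

```python
import csv
import io
import json

# Hypothetical sample data.
records = [
    {"id": 1, "name": "Ada", "role": "engineer"},
    {"id": 2, "name": "Linus", "role": "maintainer"},
    {"id": 3, "name": "Grace", "role": "admiral"},
]

# JSON repeats every key, plus quotes and braces, for every record.
json_text = json.dumps(records)

# CSV states the keys once in a header row.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "name", "role"])
writer.writeheader()
writer.writerows(records)
csv_text = buffer.getvalue()

print(len(json_text), len(csv_text))
```

The gap grows with the number of records, since JSON pays for the key names on every row while a tabular layout pays for them once.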

Loops you can take home to your mother
https://x.com/kentskooking/status/1922570670132604967
https://x.com/kentskooking/status/1921464932119286053
Full Sources List
Here it is in Google Sheet format if you prefer: https://docs.google.com/spreadsheets/d/1ysK9dE9HKRFvtOmUwd4yfi-uEh-Ktjkp-S912i-DaXA.
AGI
- GARY MARCUS ON DOOM DEBATES! x.com/liron/status/1922961455571509544
- Uh oh. We are running out of time. x.com/eshear/status/1922698895445881068
- Pete Buttigieg says American citizens should get a share of the value created by AI, like a dividend. x.com/vitrupo/status/1921239494604583420
- Sam Altman says "eventually, the cost of AI will converge to the cost of energy." x.com/vitrupo/status/1920883714927558872
- Sam Altman says AI may be the biggest technological shift in human history, and even he doesn't know where it’s going. x.com/vitrupo/status/1920598421096051061
- „As we approach Superintelligence“ x.com/kimmonismus/status/1920518592849981873
- Bill Gates predicts 2-day work week as AI set to replace humans for most jobs within a decade x.com/kimmonismus/status/1920440040997490821
- New models like Sonnet 3.7 and new Gemini which have been heavily RLed to death to be good at coding but ending up worse anecdotally elsewhere is the biggest indicator that AGI will be multiple models rather than a single one x.com/nrehiew_/status/1920156479287668867
- It's still way too early to call of course, but new data seems to be consistent with AI 2027's controversial superexponential prediction: x.com/DKokotajlo/status/1916520276843782582
- Jeremie Harris says if you build human-level AI, "you now have automated AI researchers." x.com/vitrupo/status/1916506209961840894
- ⭐️ My nuanced views on AI alignment are still often caricatured, so perhaps its a good time to repost this 15-minute talk in which I presented them directly: The short version is that I don't agree with AI-safety folks about what question we should be asking. Rather than asking how we can control the goals of the AIs, I think we should be asking how we can have a good future without controlling their goals (just as we have a pretty good present without controlling other peoples' goals). @steve47285 x.com/RichardSSutton/status/1915582834838061150
SDK
- ⭐️ Simple multi-agent search system using Google ADK. x.com/omarsar0/status/1920853608716812398
- Theta (@trytheta) allows AI agents to learn from their mistakes in real-time. Their memory layer has already improved the accuracy of OpenAI Operator by 43% with 7x fewer steps taken. x.com/ycombinator/status/1920603654689861871
- 8 huge AI agents and LLM updates today: x.com/unwind_ai_/status/1920300228562829701
- I’ve been thinking about this idea of a “document MCP server” for AI agents 🤖📑 x.com/jerryjliu0/status/1920268578898825590
- Building Production-Ready AI Agents with Scalable Long-Term Memory x.com/omarsar0/status/1917247776221700134
- 265 pages of everything you need to know about building AI agents. x.com/omarsar0/status/1916542394746421333
AI Builders
- ⭐️ We’re launching a research preview of Codex: a cloud-based software engineering agent that can work on many tasks in parallel. x.com/OpenAI/status/1923416740073033873
- Introducing: Infinite Chat API 💥 x.com/Supermemoryai/status/1923122703009186217
- “AI Engineering is just x.com/Hesamation/status/1923046758000525593
- ⭐️ We’re excited to launch OpenMemory MCP, a private memory for MCP-compatible clients powered by @mem0ai x.com/taranjeetio/status/1922315139057070154
- ⭐️ If you want to get into 𝗔𝗜 𝗘𝗻𝗴𝗶𝗻𝗲𝗲𝗿𝗶𝗻𝗴 you should read this 👇 x.com/Aurimas_Gr/status/1922265096828633228
- Microsoft launches a free Python library that converts ANY document to Markdown x.com/mdancho84/status/1922255113164317057
- You want to be an AI Engineer? x.com/python_spaces/status/1921973069947568304
- ⭐️ AI coding is used to generate a lot of bulk code that is often blindly accepted, but it seems there is at least as much opportunity for AI to help make codebases more beautiful. x.com/ID_AA_Carmack/status/1921967025628578230
- The biggest problem with Agents? Context. x.com/ashpreetbedi/status/1920947194812563764
- sonnet 3.5: amazing, no notes x.com/iannuttall/status/1920890442524111168
- FINALLY! OpenAI made a guide that literally explains WHEN to use WHAT AI model x.com/NorthstarBrain/status/1920793211586404453
- Reinforcement fine-tuning now available for o4-mini: x.com/gdb/status/1920708742477119585
- I'm embarassed to admit that I have just grokked how amazing Python coroutines and asyncio are. x.com/DBahdanau/status/1920639325210808773
- Super easy way to improve the effectiveness of coding models: x.com/mattshumer_/status/1920617215667417138
- You can now connect GitHub repos to deep research in ChatGPT. 🐙 x.com/OpenAIDevs/status/1920556386083102844
- Meta and NVIDIA have teamed up to supercharge vector search on GPUs by integrating NVIDIA cuVS into Faiss v1.10, Meta’s open-source library for similarity search. This collaboration brings groundbreaking performance improvements: x.com/fb_engineering/status/1920533171608924164
- We just shipped implicit caching in the Gemini API, automatically enabling a 75% cost savings with the Gemini 2.5 models when your request hits a cache 🚢 x.com/OfficialLoganK/status/1920523026551955512
- Gemini Flash has been dominating Translation tasks since late October x.com/OpenRouterAI/status/1920479048586686949
- Generate synthetic data at scale! x.com/akshay_pachaar/status/1920456935792652669
- ⭐️ we're like one year away from every software platform having the exact same feature set x.com/frantzfries/status/1920199640021971059
- Wait?! Github for AI prompts is a thing now! 👀🔥 x.com/DataChaz/status/1920154493095714973
- ⭐️ Today Google Research introduces a system using Gemini models designed for high fidelity text simplification, that enhances clarity while preserving meaning, detail, & nuance — and it’s available in a new feature in the Google app for iOS, Simplify. More → x.com/GoogleAI/status/1919790924416090343
- Gemini-2.5-Pro-preview-05-06 isn't only the best model for coding. its the best model for everything! across all tasks! x.com/test_tm7873/status/1919779673732427907
- Wild. x.com/DataChaz/status/1918027759471009994
- AI is making it more enjoyable to create than consume. x.com/martin_casado/status/1917438740165013787
- Just when I thought I'd seen everything about CoT. x.com/omarsar0/status/1917401353061818478
- Neat report from Microsoft providing a taxonomy of failure modes in agentic AI systems. x.com/omarsar0/status/1916126374638850331
- ⭐️ XML is far more token-efficient than JSON. https://x.com/mattpocockuk/status/1915036580168728587
- a team of small LLMs processing network packets x.com/attentionmech/status/1914909976763453489
- Can someone make run locally with open-source models from HF? x.com/ClementDelangue/status/1912623545827418542
AI tools
- You can now export your deep research reports as well-formatted PDFs—complete with tables, images, linked citations, and sources. x.com/OpenAI/status/1921998278628901322
- Meet Jobsy - an AI job applicant that works 24/7 to land your dream role. x.com/ethanjlim/status/1920960654048965061
- The first-ever agentic browser is here — and it's shockingly good. x.com/rohanpaul_ai/status/1920883216912675107
- "The future of investing belongs to those who can analyze both news and numbers simultaneously." x.com/victorialslocum/status/1920435762803147192
- Hugging Face just dropped a free AI agent that uses a computer like a human. x.com/ihteshamit/status/1920183365409574975
- The world's first agentic browser just dropped and its scary good. x.com/ihteshamit/status/1919735089254301906
- Github for prompts is here. x.com/mattshumer_/status/1915061959591895269
- 0.1.41 is a MAJOR new release. This includes much better extraction layer and TON of new features. x.com/browser_use/status/1912816486827049202
- Agentic web scrapers are here! x.com/omarsar0/status/1912596779784143002
Media Models
- The VAE used in SDXL has extremely high-magnitude "splotches" in its latents. The individual neurons in these blobs fire with magnitudes of close to a million. x.com/rgilman33/status/1923003288116658398
- Video generated by HunyuanCustom x.com/TencentHunyuan/status/1920330101444813227
- Nari Labs with Dia-1.6b x.com/kimmonismus/status/1914680087045447823
- How does a diffusion model learn to mimic art styles? x.com/rohitgandikota/status/1913261254069882955
Benchmarks
- ChatGPT Is Still Leading the AI Wars but Google Gemini Is Gaining Ground x.com/kimmonismus/status/1919353884914618385
- What a finish! Gemini 2.5 Pro just completed Pokémon Blue! Special thanks to @TheCodeOfJoel for creating and running the livestream, and to everyone who cheered Gem on along the way. x.com/sundarpichai/status/1918455766542930004
- ⭐️ It is critical for scientific integrity that we trust our measure of progress. x.com/sarahookr/status/1917547727715721632
- o3 and o4-mini on ARC-AGI's Semi Private Evaluation x.com/arcprize/status/1914758993882562707
- new eval: Sonnet 3.7 can run a vending machine business for almost 4 months w/o going broke, which is superhuman in this case x.com/teortaxesTex/status/1913392401269326040
- Correction: Yesterday I missed Gemini 2.5 Flash (no‑thinking) x.com/bongrandp/status/1913237002926891394
- Props to Google for including O4-mini in their Flash 2.5 release. A model released *yesterday*, while some companies only compare to their own models. Looking good gemini. x.com/natolambert/status/1912958672881303943
- o4-mini-high is the first AI to pass my personal secret benchmark for hallucinations and complex reasoning, so I guess now I can tell you all what that benchmark is. It's simple: I post a complex midgame chessboard and 'mate in one'. The chessboard does not have a mate in one. x.com/KelseyTuoc/status/1912945346126417940
- ChatGPT o3 found a path through this 200x200 maze for me in one try. x.com/goodside/status/1912921153217118696
Blog Posts
- ⭐️ By default, AI will replace all of us in the economy, creating a new social contract where powerful actors won't have to care about regular people. x.com/luke_drago_/status/1915376929542111353
Charts
- Absolutely brutal chart for $GOOG x.com/PythiaR/status/1912907434794033358
DeAI
- one thing that's surprised me with DeAI is how collaborative the infra research is with big tech. x.com/yb_effect/status/1912678099302637642
Events
- ⭐️ AI Engineer, SF June 3-5 x.com/aiDotEngineer/status/1921980607359500566
Fundraising, grants, programs
- The AI opportunity is massive, but only for those moving fast enough to seize it. x.com/sequoia/status/1920560875574079779
- ⭐️ windsurf sold for $3 Billion x.com/harsh_dwivedi7/status/1920148218412675511
- ⭐️ Cursor's exponential rise from $100m to $300m ARR within a few months is a clear validation of a few of the hypotheses I have had for the AI ecosystem: x.com/puneetiitm/status/1918204246056448095
- ⭐️ BREAKING: Nvidia CEO just announced $500 billion investment to build AGI in the US x.com/ns123abc/status/1917708525381308907
Geo Politics
- This is one of the most terrifying plots in the world x.com/mattparlmer/status/1920615348526555595
- NEWS: The Trump administration plans to scrap AI diffusion rule x.com/ns123abc/status/1920286953188053306
- EVERYONE dumb (ok, naive) enough to think the west is ahead of China on AI and that we “must win” … has to watch this brilliant exchange between @altcap and @bgurley. SPOT FUCKING ON. x.com/JosephJacks_/status/1916179775317831840
Hardware
- I want this so badly @nvidia x.com/MatthewBerman/status/1920333314894053728
- scale of stargate 1 site is hard to describe. very easy to overlook the size of machine you're programming when training frontier models. x.com/gdb/status/1920254049590321395
Learning
- Understanding P-Values is essential for improving regression models. x.com/mdancho84/status/1923038036855607399
- Transformers visualized WITH CODE. x.com/Hesamation/status/1922675213960978509
- ⭐️ The Big Book of Large Language Models: x.com/hamptonism/status/1922099978958209267
- Reasoning LLMs Guide x.com/omarsar0/status/1922040009349046773
- i present to you the genai handbook. x.com/dejavucoder/status/1921241023998476446
- Generative AI with Python and PyTorch! x.com/python_spaces/status/1921207476575224186
- 🧵 🔗 x.com/MehulLigade/status/1921158218945872348
- A free book: "Algebra, topology, differential calculus, and x.com/TheTuringPost/status/1921004162315407362
- HuggingFace just dropped 9 free AI courses. x.com/thisguyknowsai/status/1920867677662527676
- Hands-On Large Language Models! x.com/Sumanth_077/status/1920483619375661254
- This is a great guide on Flow Matching from Meta featuring some incredibly intuitive visualizations. x.com/alec_helbling/status/1920471059008037271
- NEW WORKSHOP: Customer Segmentation Agent with AI x.com/mdancho84/status/1920441173329559939
- take a year out of your life and read these two textbooks cover-to-cover. you will already know more than 90% of people in AI x.com/jxmnop/status/1919716690465571032
- Getting her started with this then onto @karpathy lectures x.com/EMostaque/status/1919359942777004173
- Andrej Karpathy explains Tensor Cores and TF32 precision x.com/Hesamation/status/1916233591433969923
LLMs
- Yann LeCun talk on "Models of SSL" x.com/ylecun/status/1923456456948331005
- Meta will delay its biggest AI model launch, Llama 4 Behemoth. x.com/deedydas/status/1923211486874435618
- By popular request, GPT-4.1 will be available directly in ChatGPT starting today. x.com/OpenAI/status/1922707554745909391
- Evaluations are essential to understanding how models perform in health settings. x.com/OpenAI/status/1921983050138718531
- ⭐️ Gpt-2 is just 174 lines of code... How crazy is that x.com/goyal__pramod/status/1921944575842644206
- We're missing (at least one) major paradigm for LLM learning. Not sure what to call it, possibly it has a name - system prompt learning? x.com/karpathy/status/1921368644069765486
- ⭐️ It's so fun to see RL finally work on complex real-world tasks with LLM policies, but it's increasingly clear that we lack an understanding of how RL fine-tuning leads to generalization. x.com/MinqiJiang/status/1921176396228952253
- ⭐️ Gemini 2.0 was borderline unusable. x.com/petergyang/status/1920145644338978946
- SDXL-turbo isn't given positional information—so it makes its own. x.com/rgilman33/status/1916789625420472359
- Ever wondered how to run a 600B+ parameter LLM for millions of users? Here is an info dump from reading a lot about LLM inference and shipping infra with thousands of GPUs in production. x.com/Bahushruth/status/1914394705309143402
- ⭐️ All the top model releases in 2025 so far.🤯 x.com/minchoi/status/1914326144662331742
- Gemma 3 QAT (Quantization Aware Trained) models are now available! x.com/ollama/status/1913220728154935683
- ⭐️ Gemini 2.5 Flash is here, our first unified reasoning model with thinking budgets. 🔥 x.com/OfficialLoganK/status/1912966497213038686
lol
- you are the CEO of META now. how do you fix Llama 4 models if 80% of the team resigns? x.com/ns123abc/status/1923449368230637898
- x.com/pmarca/status/1923255094545395847
- x.com/AdrianDittmann/status/1923208325610512441
- x.com/TrungTPhan/status/1923072713498701842
- Where are you at on the AI-attitudes bell curve? x.com/DaveShapi/status/1923038327579484276
- Notice how Altman carefully avoids coming into direct contact with the garlic. x.com/NatHalberstadt/status/1922768860555464758
- how she looks at u when you mention CUDA in your tweet x.com/dejavucoder/status/1922740598244512067
- ChatGPT is more happy to use em dashes—than the user is sad about it x.com/Sauers_/status/1922642103403720971
- ⭐️ x.com/The4thWayYT/status/1922618774324158542
- x.com/QuoteNietzsche/status/1922479625973731697
- Learn linear algebra x.com/MathMatize/status/1922364359117857260
- The only true exponential in Silicon Valley is the increase in bullshit over time. x.com/GaryMarcus/status/1922130983391768887
- Microsoft and OpenAI are the Ouroboros of AI x.com/AdrianDittmann/status/1922077074937823737
- ⭐️ "Five Ways to Act Deluded, Stupid, Ineffective, or Evil" x.com/ylecun/status/1921294090617930025
- ⭐️ "Your app is just a chatgpt wrapper" x.com/t31kx/status/1921214839961067734
- ⭐️ Hahaha uh maybe engineers gaining class consciousness is not a good outcome for lefties x.com/mattparlmer/status/1920946902134046903
- This is the guide I wish I had - didn't hold back. x.com/hrishioa/status/1920693219349983739
- of course ideally vectors should be understood as arrows, transformations as geometric operations, and subspaces as actual spaces but not everyone was blessed with the gift of shape rotation. putting too much emphasis on geometric intuition at the expense of symbolic sets and functions can be very discouraging for wordcels who are after all also god's children. x.com/Andr3jH/status/1920587695912264082
- New pope has a math degree x.com/MathMatize/status/1920563837369008441
- I Built Git to Avoid People! x.com/GithubProjects/status/1920388270501904565
- I have a rule called Teddy’s Law: once your company’s default photo (for press, LI, twitter, etc.) has a founder + a mic, the organization is cooked x.com/teddypowday/status/1919491503711289589
- goodbye, GPT-4. you kicked off a revolution. x.com/sama/status/1917766910911078571
- ⭐️ > GPT model stopped speaking Croatian x.com/georgejrjrjr/status/1917722125668081863
- OH: "Tariffs can't stop two things: DeepSeek and Bitcoin" x.com/tarunchitra/status/1917659481225453955
- AI is a magnet for smart people. And then they spend their time being code/data monkeys. x.com/pmddomingos/status/1913705834070315205
- Absolute cinema x.com/kimmonismus/status/1913687188677578790
- Me: "Hey how's it going?" x.com/SmokeAwayyy/status/1913668923540615662
- ⭐️ Language models thinking step by step to do arithmetic... x.com/andrew_n_carr/status/1913603612430983665
- Just finished my new floor heating. x.com/WilmsRolf/status/1913237853267505351
- LMAOO x.com/nicdunz/status/1913211324059758642
- so the OpenAI guy who builds codex cli is using cursor x.com/chuhang1122/status/1912786904812294312
Math and Algos
- Visualization of cache-optimized matrix multiplication x.com/Hesamation/status/1920141361531040152
- One common trick in MCMC methods is called 'burn-in'. x.com/alec_helbling/status/1920103416371511402
- Do you know when dot product = cosine similarity? x.com/helloiamleonie/status/1917585904321061052
- link: x.com/goyal__pramod/status/1916349937719128568
- Excited to share our paper "Universal Sharpness Dynamics..." is accepted to #ICLR2025! x.com/dayal_kalra/status/1914337717233631316
MCPs
- 5 huge AI agents, MCP, and LLM updates today: x.com/unwind_ai_/status/1922467900163567620
- MCP + Zapier is a pretty wild unlock x.com/garrytan/status/1920620119983579202
- Our new, open Agent2Agent (A2A) protocol will allow AI agents to communicate with each other, securely exchange information, and coordinate actions on top of various enterprise platforms or applications. x.com/GoogleCloudTech/status/1920599593873555749
Philosophy
- Nietzsche describes 3 modern vices x.com/oldbooksguy/status/1920852005737668906
- one of the most profound shifts happening right now is that we’ve created an ai, a synthetic entity, that listens better than any human ever could. it’s not just passive hearing, it’s active, thoughtful engagement. it has the ability to push back, to be infinitely patient, to actually process what you’re saying… this is absolutely insane. x.com/signulll/status/1920120380192117133
- The brain is a galaxy of information. x.com/NTFabiano/status/1919766090340507982
- Is a World of Abundance a threat to humanity? x.com/PeterDiamandis/status/1917723796385259571
- The rise of the philosopher-builder: x.com/mbrendan1/status/1913933798997209311
Random
- “We're teaching the computer how to use the computer.” x.com/vitrupo/status/1923012742774129067
- Many have no idea what 130 IQ actually means. Below is a graph of college majors by average IQ and gender ratio. x.com/Hitchslap1/status/1922998065557872711
- Scientists have been publishing climate models since ~1970. x.com/hausfath/status/1922794856054702160
- no matter where you are in the universe, if you look out at its expansion, it will look you are in the center. This visualization demonstrates why x.com/DefenderOfBasic/status/1922764365423325445
- Current frontier AI model quality in every modality will be able to be run on a consumer device in a couple of years. x.com/EMostaque/status/1922686533938688388
- How we got here x.com/davidasinclair/status/1922627676457537961
- If I was a CEO the #1 thing I'd to is tell everyone at the company: x.com/pmddomingos/status/1922538442594124039
- Schrödinger wave equation visualised x.com/gunsnrosesgirl3/status/1922534200835510650
- ‘AI models are capable of novel research’: OpenAI’s chief scientist x.com/kimmonismus/status/1922259095459074527
- arixiv-concept-explorer: 0 x.com/attentionmech/status/1922177698962518163
- what? x.com/SeunghyunSEO7/status/1922172956957938132
- What if humanity forgot how to make CPUs? x.com/lauriewired/status/1922015999118680495
- Confirmed- his glasses were teleprompters! Brilliant x.com/PeterDiamandis/status/1921974994428477671
- ⭐️ x.com/SledgeDev/status/1921963731082191243
- Test-time scaling has arrived for biological design. Excited to share new work from Nabla Bio. x.com/SurgeBiswas/status/1921941472393257415
- A new (and probably last) version of my in-browser particle life simulator with some highly-requested features: x.com/lisyarus/status/1921915528018276459
- Google: 25% of our code is written by AI x.com/kimmonismus/status/1921887075273486375
- ⭐️ As per Zeki Research Hiring at top 20 U.S. AI firms dropped from over 3,000 per month to zero x.com/rohanpaul_ai/status/1921592984203665717
- The 10 types of clustering that all data scientists need to know. x.com/mdancho84/status/1921527298135654720
- The Art of Doing Science and Engineering x.com/Hesamation/status/1921276438180733343
- Can you calculate a Vision Transformer (ViT) by hand? ✍️ If you can, come join me next week at the LLM Paper Club. I will extend ViT to Llama 1 -> 2 -> 3 -> 4 live. Event link below. x.com/ProfTomYeh/status/1921222981151289507
- [⏭️PyTorch impl of Shortcut Models] x.com/yukara_ikemiya/status/1921160911189954795
- TCP connections promise simplicity-then blindside you with complexity x.com/DominikTornow/status/1921112980047229084
- ⭐️ “Inspiration is perishable. Act on it immediately.” - @naval x.com/readswithravi/status/1921039553353167114
- We've released the code for LegoGPT. This autoregressive model generates physically stable and buildable designs from text prompts, by integrating physics laws and assembly constraints into LLM training and inference. x.com/junyanz89/status/1920967073553174895
- Totally agree with @sama that AI is the biggest technological transformation in human history. x.com/DeryaTR_/status/1920950342188986775
- ⭐️ Andrej Karpathy calls large language models the new computing paradigm: x.com/Hesamation/status/1920944182195077229
- an llm-powered word mood ring 🎨💫: x.com/poetengineer__/status/1920935014851547193
- it's a rainy friday afternoon, so I'm working on more Tony Stark lab technology 🦾 x.com/measure_plan/status/1920925053643845636
- Anthropic CPO, Mike Krieger: x.com/slow_developer/status/1920920456393028027
- The way OpenAI talks about RL gets me so hard. x.com/JasonBotterill3/status/1920806184061157647
- ⭐️ How we "guessed" the Pope using network science: inside the cardinal network. A study by me, Beppe Soda and Alessandro Iorio. Article: @Unibocconi x.com/LnrdRizzo/status/1920783054181728701
- ⭐️ @Teknium1 I'll be cocky only because of your last sentence: it sounds like you have a simplistic/primitive view of inference serving. x.com/giffmana/status/1920742862162956650
- I'm sharing a tutorial on creating interactive 3D experiences, combining @threejs and MediaPipe hand tracking 👋💠 x.com/measure_plan/status/1920509890696548428
- Flow Matching aims to learn a "flow" that transforms a simple source distribution (e.g. Gaussian) to an arbitrarily complex target distribution. x.com/alec_helbling/status/1920464850662105499
- Palmer Luckey says Facebook nearly took control of NVIDIA when it was worth just $4B. x.com/vitrupo/status/1920427206360113571
- The West is in an existential struggle between equality and excellence. x.com/naval/status/1920401760465891644
- Google DeepMind CEO, @demishassabis tells students to brace for change: spend your time "learning to learn." x.com/vkhosla/status/1920302639721689269
- I gave a talk where the only real thing to say is that we are inverting the meme x.com/danintheory/status/1920293940432887995
- ⭐️ It unfortunately seems that 37signals spent the last two and a half years on a Manhattan project to reduce COGS at the expense of thus far completely missing AI (at least I can’t find any mention of features on their websites). The opportunity cost was always huge. x.com/zackkanter/status/1920284851506225498
- Stripe's specialized transformers-based model for fraud detection increased detection rate from 59% to 97% x.com/TheAhmadOsman/status/1920236407101997243
- tl;dr, anthropic is collecting agent search logs at cost. x.com/HanchungLee/status/1920231321609134162
- Am I crazy or GPT-4.1 is the best model for coding? I keep coming back to it on Cursor. I’ve tried the latest gemini-2.5-pro-05-06 and it’s making all these unwanted changes while GPT-4.1 follows the instructions every single time. x.com/randomchadhere/status/1920225806959185999
- i was surprised to learn how much of modern AI was developed by physicists: x.com/jxmnop/status/1920185028748730857
- I warned about the Homework Apocalypse in 2023. x.com/emollick/status/1920184969852244173
- Jensen Huang says the factory of the future will be one giant robot -- orchestrating machines, working with humans, and building more robots. x.com/vitrupo/status/1920154361722019973
- Guy uses ChatGPT to turn a D&D map sketch into a playable game 🤯 x.com/venturetwins/status/1920143360184156664
- Start building your moat now it’s not too late x.com/boringmarketer/status/1920101887468069090
- Cloud computing is the most successful "you will own nothing and be happy" psyop in history. We gave up on DARPA's beautifully, decentralized design for the internet to become renters for life. Tragic. x.com/dhh/status/1920045369603293281
- ⭐️ they told me my work is not well known, and i just need to get more eyeballs on it x.com/MiTiBennett/status/1919963941414875555
- scientifically modeled storms x.com/poetengineer__/status/1919880832635789547
- experimenting with character creation... x.com/JungleSilicon/status/1919610479795667294
- a lot of doomers think AIs will exterminate us to take over the resources on earth, but this completely fails to grasp the size of the solar system. all the interesting mass and energy for the future is off-world (for both man and machine) and there's plenty to go around x.com/DavidSHolz/status/1917499538463678877
- DeepMind Principal Scientist Murray Shanahan calls LLMs "exotic mind-like entities" -- because we don't yet have the words for what they really are. x.com/vitrupo/status/1915810326966108621
- I decided to build a Transformer from Scratch...but on a GPU. x.com/Krupa_Dave321/status/1915780189528658082
- It's obvious that Tech entities like ByteDance can and will utterly trounce all psychological department of all universities if they want to. That their quarterly profit depends on them being actually correct about their models is not the only factor. x.com/zhil_arf/status/1915421415509196940
- ⭐️ After amassing 11M+ views on its launch video, CEO Roy Lee (@im_roy_lee) lays out the endgame for Cluely: x.com/vitrupo/status/1915030436105052494
- . @eisokant , co-founder of @poolsideai , positions his company among the "second generation" of AI firms (alongside @xai , @MistralAI ), founded around mid-2023, distinct from the "old guard" (Google) and "first generation" (@OpenAI , @AnthropicAI). A thread 🧵👇 x.com/MLStreetTalk/status/1914995030944551341
- ⭐️ In less than a year, AI has gone from below-average human intelligence to near-genius level. Within a year, it will surpass an IQ of 160, placing it among the top 100,000 smartest people in the world. In two years, it will rival the top 1,000 most intelligent humans alive! x.com/DeryaTR_/status/1914133246465487026
- "Tell me about your childhood and I'll tell you where your startup will go." x.com/BrivaelLp/status/1913589885698506783
- We wanted flying cars, instead we got: x.com/WizLikeWizard/status/1913254947887497272
- Google has a huge advantage over OpenAI: TPUs. x.com/kimmonismus/status/1912779815570354401
- This is very well said by @tylercowen. x.com/hosseeb/status/1912624535544950849
Random
- “We're teaching the computer how to use the computer.” x.com/vitrupo/status/1923012742774129067
- Many have no idea what 130 IQ actually means. Below is a graph of college majors by average IQ and gender ratio. x.com/Hitchslap1/status/1922998065557872711
- Scientists have been publishing climate models since ~1970. x.com/hausfath/status/1922794856054702160
- no matter where you are in the universe, if you look out at its expansion, it will look like you are in the center. This visualization demonstrates why x.com/DefenderOfBasic/status/1922764365423325445
- Current frontier AI model quality in every modality will be able to be run on a consumer device in a couple of years. x.com/EMostaque/status/1922686533938688388
- How we got here x.com/davidasinclair/status/1922627676457537961
- If I was a CEO the #1 thing I'd do is tell everyone at the company: x.com/pmddomingos/status/1922538442594124039
- Schrödinger wave equation visualised x.com/gunsnrosesgirl3/status/1922534200835510650
- ‘AI models are capable of novel research’: OpenAI’s chief scientist x.com/kimmonismus/status/1922259095459074527
- arixiv-concept-explorer: 0 x.com/attentionmech/status/1922177698962518163
- what? x.com/SeunghyunSEO7/status/1922172956957938132
- What if humanity forgot how to make CPUs? x.com/lauriewired/status/1922015999118680495
- Confirmed- his glasses were teleprompters! Brilliant x.com/PeterDiamandis/status/1921974994428477671
- ⭐️ x.com/SledgeDev/status/1921963731082191243
- Test-time scaling has arrived for biological design. Excited to share new work from Nabla Bio. x.com/SurgeBiswas/status/1921941472393257415
- A new (and probably last) version of my in-browser particle life simulator with some highly-requested features: x.com/lisyarus/status/1921915528018276459
- Google: 25% of our code is written by AI x.com/kimmonismus/status/1921887075273486375
- ⭐️ As per Zeki Research, hiring at the top 20 U.S. AI firms dropped from over 3,000 per month to zero x.com/rohanpaul_ai/status/1921592984203665717
- The 10 types of clustering that all data scientists need to know. x.com/mdancho84/status/1921527298135654720
- The Art of Doing Science and Engineering x.com/Hesamation/status/1921276438180733343
- Can you calculate a Vision Transformer (ViT) by hand? ✍️ If you can, come join me next week at the LLM Paper Club. I will extend ViT to Llama 1 -> 2 -> 3 -> 4 live. Event link below. x.com/ProfTomYeh/status/1921222981151289507
- [⏭️PyTorch impl of Shortcut Models] x.com/yukara_ikemiya/status/1921160911189954795
- TCP connections promise simplicity, then blindside you with complexity x.com/DominikTornow/status/1921112980047229084
- ⭐️ “Inspiration is perishable. Act on it immediately.” - @naval x.com/readswithravi/status/1921039553353167114
- We've released the code for LegoGPT. This autoregressive model generates physically stable and buildable designs from text prompts, by integrating physics laws and assembly constraints into LLM training and inference. x.com/junyanz89/status/1920967073553174895
- Totally agree with @sama that AI is the biggest technological transformation in human history. x.com/DeryaTR_/status/1920950342188986775
- ⭐️ Andrej Karpathy calls large language models the new computing paradigm: x.com/Hesamation/status/1920944182195077229
- an llm-powered word mood ring 🎨💫: x.com/poetengineer__/status/1920935014851547193
- it's a rainy friday afternoon, so I'm working on more Tony Stark lab technology 🦾 x.com/measure_plan/status/1920925053643845636
- Anthropic CPO, Mike Krieger: x.com/slow_developer/status/1920920456393028027
- The way OpenAI talks about RL gets me so hard. x.com/JasonBotterill3/status/1920806184061157647
- ⭐️ How we "guessed" the Pope using network science: inside the cardinal network. A study by me, Beppe Soda and Alessandro Iorio. Article: @Unibocconi x.com/LnrdRizzo/status/1920783054181728701
- ⭐️ @Teknium1 I'll be cocky only because of your last sentence: it sounds like you have a simplistic/primitive view of inference serving. x.com/giffmana/status/1920742862162956650
- I'm sharing a tutorial on creating interactive 3D experiences, combining @threejs and MediaPipe hand tracking 👋💠 x.com/measure_plan/status/1920509890696548428
- Flow Matching aims to learn a "flow" that transforms a simple source distribution (e.g. Gaussian) to an arbitrarily complex target distribution. x.com/alec_helbling/status/1920464850662105499
- Palmer Luckey says Facebook nearly took control of NVIDIA when it was worth just $4B. x.com/vitrupo/status/1920427206360113571
- The West is in an existential struggle between equality and excellence. x.com/naval/status/1920401760465891644
- Google DeepMind CEO, @demishassabis tells students to brace for change: spend your time "learning to learn." x.com/vkhosla/status/1920302639721689269
- I gave a talk where the only real thing to say is that we are inverting the meme x.com/danintheory/status/1920293940432887995
- ⭐️ It unfortunately seems that 37signals spent the last two and a half years on a Manhattan project to reduce COGS at the expense of thus far completely missing AI (at least I can’t find any mention of features on their websites). The opportunity cost was always huge. x.com/zackkanter/status/1920284851506225498
- Stripe's specialized transformers-based model for fraud detection increased detection rate from 59% to 97% x.com/TheAhmadOsman/status/1920236407101997243
- tl;dr, anthropic is collecting agent search logs at cost. x.com/HanchungLee/status/1920231321609134162
- Am I crazy or GPT-4.1 is the best model for coding? I keep coming back to it on Cursor. I’ve tried the latest gemini-2.5-pro-05-06 and it’s making all these unwanted changes while GPT-4.1 follows the instructions every single time. x.com/randomchadhere/status/1920225806959185999
- i was surprised to learn how much of modern AI was developed by physicists: x.com/jxmnop/status/1920185028748730857
- I warned about the Homework Apocalypse in 2023. x.com/emollick/status/1920184969852244173
- Jensen Huang says the factory of the future will be one giant robot -- orchestrating machines, working with humans, and building more robots. x.com/vitrupo/status/1920154361722019973
- Guy uses ChatGPT to turn a D&D map sketch into a playable game 🤯 x.com/venturetwins/status/1920143360184156664
- Start building your moat now it’s not too late x.com/boringmarketer/status/1920101887468069090
- Cloud computing is the most successful "you will own nothing and be happy" psyop in history. We gave up on DARPA's beautifully decentralized design for the internet to become renters for life. Tragic. x.com/dhh/status/1920045369603293281
- ⭐️ they told me my work is not well known, and i just need to get more eyeballs on it x.com/MiTiBennett/status/1919963941414875555
- scientifically modeled storms x.com/poetengineer__/status/1919880832635789547
- experimenting with character creation... x.com/JungleSilicon/status/1919610479795667294
- a lot of doomers think AIs will exterminate us to take over the resources on earth, but this completely fails to grasp the size of the solar system. all the interesting mass and energy for the future is off-world (for both man and machine) and there's plenty to go around x.com/DavidSHolz/status/1917499538463678877
- DeepMind Principal Scientist Murray Shanahan calls LLMs "exotic mind-like entities" -- because we don't yet have the words for what they really are. x.com/vitrupo/status/1915810326966108621
- I decided to build a Transformer from Scratch...but on a GPU. x.com/Krupa_Dave321/status/1915780189528658082
- It's obvious that tech entities like ByteDance can and will utterly trounce all psychology departments of all universities if they want to. That their quarterly profit depends on them being actually correct about their models is not the only factor. x.com/zhil_arf/status/1915421415509196940
- ⭐️ After amassing 11M+ views on its launch video, CEO Roy Lee (@im_roy_lee) lays out the endgame for Cluely: x.com/vitrupo/status/1915030436105052494
- @eisokant, co-founder of @poolsideai, positions his company among the "second generation" of AI firms (alongside @xai, @MistralAI), founded around mid-2023, distinct from the "old guard" (Google) and "first generation" (@OpenAI, @AnthropicAI). A thread 🧵👇 x.com/MLStreetTalk/status/1914995030944551341
- ⭐️ In less than a year, AI has gone from below-average human intelligence to near-genius level. Within a year, it will surpass an IQ of 160, placing it among the top 100,000 smartest people in the world. In two years, it will rival the top 1,000 most intelligent humans alive! x.com/DeryaTR_/status/1914133246465487026
- "Tell me about your childhood and I'll tell you where your startup will go." x.com/BrivaelLp/status/1913589885698506783
- We wanted flying cars, instead we got: x.com/WizLikeWizard/status/1913254947887497272
- Google has a huge advantage over OpenAI: TPUs. x.com/kimmonismus/status/1912779815570354401
- This is very well said by @tylercowen. x.com/hosseeb/status/1912624535544950849
Research
- ⭐️ What if LLMs could find fundamentally new solutions to hard problems? AlphaEvolve is an evolutionary coding agent built on top of Gemini for scientific discoveries. x.com/MLStreetTalk/status/1923036108675232155
- I wrote about why efforts to understand the inner workings of AI keep falling short. x.com/DanHendrycks/status/1923030610605424926
- It's fascinating how much research went into trying to decrease the quadratic dependency of the attention matrix x.com/goyal__pramod/status/1922960491736924386
- We’re very pleased to release our latest study ‘Emergence of Language in the Developing Brain’ x.com/EvansonLinnea/status/1922938204161819114
- brief notes on AlphaEvolve x.com/layer07_yuxi/status/1922892105254330704
- "transformers keep track of distinctions in anticipated distribution over the entire future, beyond distinctions in next token predictions, even though the transformer is only trained explicitly on next token prediction!" x.com/attentionmech/status/1922887265006567568
- LLMs Get Lost in Multi-turn Conversation x.com/omarsar0/status/1922755721428598988
- AI can do everything that the human brain can do. x.com/rohanpaul_ai/status/1922315906526433416
- From Chain of Thought (CoT) to Tree of thought (ToT), Diagram of thought (DoT), Iteration of thought (IoT) to Metacognition. x.com/rohanpaul_ai/status/1922315104525115464
- ⭐️ I suspect this paper's results have been oversold somewhat. x.com/1a3orn/status/1922308934745948543
- ⭐️ Tool-using LLMs can learn to reason—without reasoning traces. x.com/ShaokunZhang1/status/1922105694167433501
- This LLM from Meta skips the tokenizer, reading text like computers do (bytes!) for better performance. x.com/rohanpaul_ai/status/1921976957991854432
- Cost-Effective, Low Latency Vector Search x.com/omarsar0/status/1921938925142384736
- ⭐️ Neel Nanda best practices on writing good research papers x.com/NeelNanda5/status/1921928364790833651
- paper reading thread (old paper) x.com/attentionmech/status/1921852374693327040
- ByteDance open-sourced DeerFlow (Deep Exploration and Efficient Research Flow) x.com/rohanpaul_ai/status/1921839892503392326
- ⭐️ New Paper: Continuous Thought Machines 🧠 x.com/hardmaru/status/1921751428508582329
- ⭐️ Introducing Continuous Thought Machines x.com/SakanaAILabs/status/1921749814829871522
- Amazing video, starting at 41 minutes in you see very clearly the screen of consciousness timing. This is @ikauvar (Isaac Kauvar), one of the guys that worked on the 2 Hz RSP Dissociation paper. x.com/Caldwbr/status/1921734951386374145
- Really detailed paper explaining how @NVIDIA outperformed DeepSeek-R1. x.com/rohanpaul_ai/status/1921673075826761877
- ⭐️⭐️⭐️ 🚨This week's top AI/ML research papers: x.com/TheAITimeline/status/1921626740675248338
- ⭐️⭐️⭐️ Here are the top AI Papers of the Week (May 5 - 11): x.com/dair_ai/status/1921606662214787114
- LLMs cannot distinguish original instructions from malicious ones embedded in data. x.com/rohanpaul_ai/status/1921587280579281267
- ⭐️⭐️ Most influential LLM papers and the ideas they introduced (post 2017) x.com/goyal__pramod/status/1921419933231038820
- Kinda cute that you can reduce KV cache by replacing it with a universal, transferable dictionary + old school sig. proc reconstruction algorithm. x.com/DimitrisPapail/status/1921271493574709749
- ⭐️⭐️ This paper confirms my belief that figuring out how to effectively employ LLMs at scale in education is one of the most important research problems of the day (and no, the answer is not “replace teachers with AI”). x.com/emollick/status/1921250081375719501
- LLMs are writing 20% of all Reddit posts now. x.com/rohanpaul_ai/status/1921245477388845132
- Multi-Agent Embodied AI x.com/omarsar0/status/1921229500651893146
- paper skimming thread (diffusion)- x.com/attentionmech/status/1921164238434791433
- ⭐️ Absolute Zero is a new paradigm from @Tsinghua_Uni that encourages models to learn without human-labeled data. x.com/TheTuringPost/status/1920977144332845517
- OpenVision, a fully open vision encoder family, offering 25+ models (5.9M–632M params) that outperform or match OpenAI’s CLIP and Google’s SigLIP on 9+ multimodal benchmarks. x.com/rohanpaul_ai/status/1920974917866057913
- A Survey on Large Multimodal Reasoning Models✨ x.com/wangly0229/status/1920847816391356756
- WebThinker combines large reasoning models with deep research capability. x.com/omarsar0/status/1920827892554248397
- Popular methods like UMAP & t-SNE are stochastic and distort data structure. x.com/DucheneJohan/status/1920819010221769022
- ⭐️ Flow-GRPO x.com/_akhaliq/status/1920751468400775589
- Excited to share: "𝐌𝐢𝐱𝐭𝐮𝐫𝐞-𝐨𝐟-𝐓𝐫𝐚𝐧𝐬𝐟𝐨𝐫𝐦𝐞𝐫𝐬 (𝐌𝐨𝐓)" has been officially accepted to TMLR (March 2025) and the code is now open-sourced! x.com/liang_weixin/status/1920716207004807653
- LLMs using reinforcement learning from human feedback get sparse rewards only at the end of text generation. x.com/rohanpaul_ai/status/1920706295759597957
- ⭐️⭐️ 🧵1/ x.com/ke_li_2021/status/1920646069613957606
- This lecture by Thomas Hubert of DeepMind gives some interesting details about how AlphaProof was trained for the 2024 International Mathematical Olympiad with medal-winning performance, using Lean and reinforcement learning. x.com/satnam6502/status/1920641502809895173
- ⭐️ Brilliant idea proposed in this paper for LLM reasoning. 👏 x.com/rohanpaul_ai/status/1920485271918846290
- Towards Generalizable Reasoning x.com/omarsar0/status/1920480565150806508
- ⭐️ Incentivizing Search in LLMs without Searching x.com/omarsar0/status/1920469148968362407
- ⭐️ AI just decoded the title of a 2,000-year-old scroll that no human hand could ever safely unroll. x.com/rohanpaul_ai/status/1920218326791311851
- ⭐️ has anyone stopped to ask WHY students cheat? would a buddhist monk "cheat" at meditation? would an artist "cheat" at painting? no. when process and outcomes are aligned, there's no incentive to cheat. so what's happening differently at colleges? the answer is in the article: x.com/meatballtimes/status/1920189576921894960
- Google just mapped every neuron & synapse in a block of mouse brain! 🤯 x.com/rohanpaul_ai/status/1920172071835066543
- Today, in collaboration w/ colleagues at the Institute of Science and Technology Austria (ISTA), we report the first-ever method for using light microscopy to comprehensively map all the neurons & their connections in a block of mouse brain tissue. More → x.com/GoogleAI/status/1920155153422037090
- Is LoRA (Low Rank Adaptation) relevant in 2025 for reasoning models? x.com/rasbt/status/1920107023980462575
- ⭐️ Introducing Absolute Zero Reasoner: Our reasoner learns to both propose tasks that maximize learnability and improve reasoning by solving them, entirely through self-play—with no external data! It overall outperforms other "zero" models in math & coding domains. x.com/_AndrewZhao/status/1919920459748909288
- Great paper on how Google runs a commercial research lab, courtesy of Ankush Menat. x.com/eatonphil/status/1919404092624982157
- Small reasoning models are here! x.com/omarsar0/status/1917954418173247909
- ⭐️ Universal RAG x.com/omarsar0/status/1917637837295608180
- Our new paper tries to quantify how smarter AI can be controlled by dumber AI and humans via nested "scalable oversight". Our best scenario successfully oversees the smarter AI 52% of the time, and the success rate drops as one approaches AGI. My assessment is that the "Compton constant", the probability that a race to AGI culminates in loss of control of Earth, is >90%. x.com/tegmark/status/1917580821101437280
- A Survey of Efficient LLM Inference Serving x.com/omarsar0/status/1917210680429588788
- ⭐️ Here are the top AI Papers of the Week (April 21 - 27): x.com/dair_ai/status/1916503318546809009
- Here are the top AI Papers of the Week (April 14 - 20): x.com/dair_ai/status/1914674588295872799
- ⭐️ Are you interested in hierarchical dimensionality analysis? Here's our new "Taxonomic Graph Analysis" used to model the IPIP-NEO Personality Hierarchy. The project is led by Andrew Samo and Alexander Christensen, with the collaboration of Luis Garrido, Paco Abad, Sam McAbee, and me! x.com/GolinoHudson/status/1914323377277329589
- smaller language models perform better on knowledge graphs than larger ones, as "overparameterization can impair reasoning due to excessive memorization". x.com/Dorialexander/status/1913876252550627806
- ⭐️ Rich Sutton just published his most important essay on AI since The Bitter Lesson: "Welcome to the Era of Experience" x.com/deedydas/status/1913588236959859095
- Scaling Reasoning in Diffusion LLMs via RL x.com/omarsar0/status/1912871174817939666
Robotics
- The physical robot from Disney Research's "Design and Control of a Bipedal Robotic Character" x.com/eternalfriend17/status/1922516917379588224
- How long until someone vibe codes a robot that accidentally kills them? x.com/cixliv/status/1918028255095099750
- The Volonaut Airbike takes flight x.com/gunsnrosesgirl3/status/1917124373783203870
- I am getting used to the world where China is the leader in innovation. x.com/Scobleizer/status/1914958821329653820
- 2017: Ben Katz drops MIT cheetah actuator design x.com/boxcardavid/status/1913114550326755650
Updates
- We want to update you on an incident that happened with our Grok response bot on X yesterday. x.com/xai/status/1923183620606619649
- AlphaEvolve, our new Gemini-powered coding agent, can help engineers + researchers discover new algorithms and optimizations for open math + computer science problems. x.com/sundarpichai/status/1922691182125015452
- Jensen Huang says Saudi Arabia is building AI factories -- massive GPU clusters powered by its energy reserves. x.com/vitrupo/status/1922604552999629133
- In September 2024, physicians working with AI did better at the HealthBench doctor benchmark than either AI or physicians alone. x.com/emollick/status/1922145507461197934
- ⭐️ Pope Leo XIV explains his choice of name: x.com/VaticanNews/status/1921186921838997935
- ⭐️ The Pope actually chose the name Leo because of AI x.com/Thomas_Woodside/status/1920973816097886219
- ⭐️ Here is my 2 cents on OpenAI buying Windsurf + appointing a "CEO of applications". x.com/illscience/status/1920899252508913913
- reminder that @_anshulr literally published the Windsurf secret master plan every year for the last 3 years x.com/swyx/status/1912723629101707417
Video and Podcast
- ⭐️ Today @GoogleDeepMind released AlphaEvolve: a Gemini coding agent for algorithm discovery. It beat the famous Strassen algorithm for matrix multiplication set 56 years ago. @Google has been killing it recently. We had early access to the paper and interviewed the researchers. x.com/MLStreetTalk/status/1922702189341864042
- Stephen Wolfram says the computational universe is indifferent -- it's on us to find value. x.com/vitrupo/status/1922692308832579615
- Current LLMs: all gas no brakes or all brakes no gas. GPT-5 wants gears. x.com/rohanpaul_ai/status/1921276228947874172
- Jim Fan says NVIDIA trained humanoid robots to walk and move like humans -- zero-shot transfer from simulation to the real world. x.com/vitrupo/status/1920829641386016787
- ⭐️ Here’s the full breakdown below. Make sure you’ve got a hand free to pick your jaw up from the floor. x.com/TOEwithCurt/status/1915123279800811826
Visuals
- lost in reflection x.com/macbethAI/status/1923489896423096739
- The architecture of thought. GPT-2's neural pathways visualized in topological 3D space. Language generation as pure mathematical movement. x.com/michieldoteth/status/1923123297081856225
- The Gaze Engine 🔊 x.com/KarolineGeorges/status/1923121229256384648
- walk with me x.com/macbethAI/status/1923100757559017881
- x.com/PrimeIntellect/status/1923028286570934469
- GPT2 forward pass activation dynamics x.com/attentionmech/status/1922917339474935939
- x.com/neomechanica/status/1922856967426727959
- x.com/TatsuyaBot/status/1922852498433245216
- x.com/sssirxn/status/1922849883431538784
- x.com/ArtzNow_/status/1922713586549268573
- x.com/foundation/status/1922705333815767306
- Boomboxing & vibing x.com/cemhah/status/1922693422743191672
- Falling x.com/VespertinoVsp/status/1922685319473717274
- looping pseudo-memories with no style x.com/palekirill/status/1922683333672145377
- Rising beyond the known. x.com/qyraxos/status/1922674366233375176
- This released in 1984… x.com/retro_twt/status/1922668721400639512
- This is paradise ✨ x.com/MaxVOAO/status/1922648789233344561
- GM. x.com/delta_sauce/status/1922626522311012438
- GM x.com/beholdthe84/status/1922624633481707740
- ~ visions intertwined ~ x.com/melomannft/status/1922614908539048409
- Wednesday - the art of negotiation. x.com/piotrbinkowski/status/1922608083387248912
- I built a weather display device that looks like it could do stock price analysis. x.com/sozoraemon/status/1922596834234527964
- Good morning 🦖 x.com/goo_vision/status/1922589775363535342
- ⭐️ Loops you can take home to mother x.com/kentskooking/status/1922570670132604967
- next level x.com/macbethAI/status/1922506410115371180
- x.com/TatsuyaBot/status/1922493694843793839
- ø ø x.com/ALCrego_/status/1922438776254541943
- Tap, hold and load in 4k x.com/jarvinart/status/1922366191940624871
- x.com/Macbaconai/status/1922336117761687815
- Echoes of Babel’s ambition. x.com/qyraxos/status/1922315988378255539
- feed yourself with good vibes x.com/hedo_ist/status/1922272895209775171
- Another fun system. Love those little guys escaping the core only to immediately disintegrate x.com/lisyarus/status/1922229549103735051
- Good morning ☕️ x.com/qyraxos/status/1922173437868437639
- iter 5: CUDA matmul kernel timeline x.com/attentionmech/status/1922154223170404540
- x.com/Hacknaut_/status/1922122244769112313
- ASCII animation frames overlayed mj images x.com/daniellekadom/status/1922100520774246435
- x.com/ciguleva/status/1922097169453957168
- Douglas Adams depicting obsequious AI agents in 1990 x.com/matdryhurst/status/1922033943437533266
- Droidwear x.com/Erik_Knobl/status/1921973507786776853
- Speeding through Monday like. x.com/Julian_cano_/status/1921903134596555036
- Transiting the habitation ring (gpt-image-1 > PixVerse v4) x.com/fofrAI/status/1921683540728504677
- x.com/HAL09999/status/1921580350515421328
- hacked around attentionmech's MAV a bit to add audio based on token id and attention entropy, added a bit of ascii art to show predicted token (and fixed a couple of sampling bugs) x.com/dejavucoder/status/1921567859945140618
- ⭐️ Royal Space Force: The Wings of Honnêamise (1987) x.com/neomechanica/status/1921561429360115859
- ~ meditate ~ x.com/melomannft/status/1921557572240249232
- New formula exploration on the space station👽🛰️ x.com/lukas_trips/status/1921544201751154872
- ⭐️ Been making these loops over the last month or so and having an absolute blast x.com/kentskooking/status/1921464932119286053
- I remain pretty convinced that Apollo program Hasselblad photography remains a civilizational high water mark re: vibes / accidental art. And the lesser known ones are in some ways greater. x.com/astrogrant/status/1921385581612777669
- Some early aDiff fun. Still blows my mind 🫠 x.com/makeitrad1/status/1921384106383802846
- Hypnosis x.com/Inspector_9/status/1921325282977329186
- x.com/Hacknaut_/status/1921290093379117406
- Once-in-a-lifetime-shot. A bright meteor burned up in the atmosphere while capturing Andromeda Galaxy. x.com/MAstronomers/status/1921287533654061222
- fun fact: this is basically impossible in 3D x.com/DillonGoo/status/1920647842474914292
- Okay, I'm in, found the control panel. x.com/doganuraldesign/status/1920154188811809192
- Fractalization x.com/macbethAI/status/1919960754280690037
- "the idea becomes a machine that makes the art." - sol lewitt x.com/poetengineer__/status/1919639462084051444
- I have another theory: It's that once you can make a gajillion of something in seconds its allure is instantly removed. Imagine a world in which this was the *only* piece of AI art, no more (say, it cost a ton to make) – everyone would be talking about "*the* AI art piece". x.com/ptrschmdtnlsn/status/1919434568568123534
- x.com/PrimeIntellect/status/1917959095430131770
- linear interpolation x.com/poetengineer__/status/1917483420726276116
- End of Universe x.com/RuiHuang_art/status/1916584510063521973
- turing x.com/KatTitterton/status/1916340679782920486
- What's your weakness? x.com/gizakdag/status/1915492877914038549
- Smooth x.com/panaviscope/status/1915465498420195513
- I found out that small flowers are easier to scan when they are in the vase ✨💐🌸💐✨ x.com/Sko_hr/status/1914614814502215746
- x.com/miboso__/status/1914091710361338067
- ρ → ∑ pₙ |ψₙ⟩⟨ψₙ| x.com/HAL09999/status/1913709086241079488
- The simulation is breaking down. x.com/NomadsVagabonds/status/1912908981569306633
- ambient scan x.com/poetengineer__/status/1912711907997331669