AI Dinner 10.0 at Greycroft, May 21st
Another AI dinner at the Greycrofthttps://www.greycroft.com/ office. The focus this month will be on A2A, top 30 Ilya Sutskever papers, Alpha Evolve, and the latest in AI — we'll use this blog post yo
Another AI dinner at the Greycrofthttps://www.greycroft.com/ office. The focus this month will be on A2A, top 30 Ilya Sutskever papers, Alpha Evolve, and the latest in AI — we'll use this blog post yo
There are 2 major conferences happening next month, to decide which one to go I've scraped the speakers and events from both and put that in a google sheet, you can check it here: https://docs.google.
Google I/O 2025 doubled-down on Gemini-powered agents, AI-first Android, and a dash of new hardware—the clearest signal yet that “Google does the Googling” for you. Key Highlights AI Mode for Search –
AlphaEvolve was received positively after its 14 May 2025 reveal. Powered by Gemini-2 models, the evolutionary coding agent discovers and refines algorithms that are already saving compute, speeding u
Codename codex-1, is a specialized evolution of our o3 reasoning model fine-tuned for software engineering tasks. Key Highlights Parallel Tasking: Executes writing features, bug fixes, tests, and code
The pope actually choose the name Leo XIV because of AI: https://x.com/VaticanNews/status/1921186921838997935.
Sam Altman and Jony Ive hint on a new personal AI product: https://x.com/sama/status/1925242282523103408https://x.com/sama/status/1925242282523103408.
Top models according open routerhttp://openrouter.com, notable how Gemini 2.5 is climbing the ladder, while anthropic 3.7 is slowly going down. Companies are overfitting their model to the benchmarks.
Absolute Zero: Reinforced Self-Play Reasoning with Zero Data, AI learns to reason by inventing and solving its own Python coding challenges, using RL, no human data needed. Author explanation: https:/
Flow-GRPO
ZeroSearch: incentivizing search in LLMs without searching. ZeroSearch is a curriculum-based RL framework that teaches LLMs to retrieve information using self-generated documents: https://x.com/omarsa
Continuous Thought Machines CTM Sakana proposes a new neural architecture CTM built from the ground up to use neural dynamics as a core representation for intelligence. Using neural dynamics as a firs
OpenAI acquires Windsurf for $3B, completing the hilarious pattern of an Ouroboroshttps://en.wikipedia.org/wiki/Ouroboros. Are we in an AI bubble? Insights on OpenAI buying Windsurf + appointing a CEO
Insights on why the AI wave is differenthttps://x.com/puneetiitm/status/1918204246056448095, Cursor's rise from $100m to $300m ARR in a few months, thesis for why: AI wave is different than the cloud
Zeki Data Reporthttps://zekidata.com/the-us-to-become-a-net-exporter-of-ai-talent-in-2025/ shows that AI tools disrupting the traditional hiring. The below zero hiring this year, means we had more lay
How LLM do arithmetics — lol https://x.com/andrew\n\carr/status/1913603612430983665
Rich Sutton on AI alignment and Decentralization \15 min video\ "The short version is that I don't agree with AI-safety folks about what question we should be asking. Rather than asking how we can con
https://x.com/goyal\\pramod/status/1921944575842644206
When to use an OpenAI model? Finally OpenAI published a guide that explains when to use which modelhttps://help.openai.com/en/articles/11165333-chatgpt-enterprise-models-limits. Very useful at least u
Vesuvius Challenge found the title of a scroll for the first time! This one was about "On Vices, Book 1" by Philodemus. Read morehttps://scrollprize.substack.com/p/60000-first-title-prize-awarded?trie
The intelligence Curse, in the April release of the Socratic AIhttps://site.flowai.xyz/ai-socratic-apr-2025/ we examined ai-2027.com and AI 2045. This blog post similarly to the others is an explorati
GPT model stopped learning Croatian 🇭🇷, nobody could figure out why, turns out Croatian users HRLF were more prone to downvote messages. Lol. Read Morehttps://x.com/georgejrjrjr/status/1917722125668
TikTok, Google, Meta can run human experiments at scale, is that good or bad? Read any famous psychological experiment, sample size is 40 people, meanwhile ByteDance has a sample size of 2B people. Re
Json uses much more tokens than alternatives solutions. Read morehttps://x.com/mattpocockuk/status/1915036580168728587.
replace the loops with downloading the videos https://x.com/kentskooking/status/1922570670132604967 https://x.com/kentskooking/status/1921464932119286053
Search across events, members, and blog posts