The most important AI news and updates from last month: Sep 15 – Oct 15.
Sign up to receive the mailing list!
AI Dinner 14.0
The next AI dinner will be on October 15th, and it will be hosted in the Sei Labs in NYC. We’ll discuss the top news and updates using this blog post to structure the Socratic dialogues.
Event: AI Dinner 14.0.




OpenAI — Sora 2, Agent Kit, and Partnerships
OpenAI Launches Sora 2
No much to say about Sora 2, except that the videos are great quality, and it finally passes the gymnastic diffusion model test. It still requires an invite code to be used.
link.
Agent Kit and Workflows
OpenAI launched AgentKit beta, a suite to build, deploy, and evaluate AI agents efficiently.
- Agent Builder: a drag-and-drop tool to build workflows.
- Connector Registry: centralized admin panel to manage data sources and tools across ChatGPT and API, including connectors like Dropbox, Google Drive, SharePoint, and Teams, plus third-party MCPs.
- Guardrails: Open-source safety layer (Python/JavaScript) to mask PII, detect jailbreaks, and enforce safeguards.
- ChatKit: Embeddable, customizable chat UI toolkit handling streaming, threads, and agent thinking.
- Evals upgrades: Datasets, trace grading, automated prompt optimization, and third-party model support; beta adds custom tool calls and custom graders.

OpenAI Partnerships
A lot of partnerships from OpenAI this past month, for a cumulative of 26GW of compute. To put this in perspective Bitcoin uses about 20 GW of electricity.
Broadcom: Partnership to deploy 10GW of OpenAI-designed AI accelerators
Multi-year partnership enables OpenAI and Broadcom to deliver accelerator and network systems for next-generation AI clusters.
Link.

AMD: Partnership to deploy 6GWs of AMD GPUs
Apparently Jensen Huang learned about this partnership only few days after the NVIDIA partnership.
Link.

NVIDIA: Partnership to deploy 10GW of NVIDIA systems
OpenAI to build and deploy at least 10 gigawatts of AI datacenters with NVIDIA systems representing millions of GPUs for OpenAI’s next-generation AI infrastructure. The first gigawatt of NVIDIA systems will be deployed in the second half of 2026.
Link.
Satellite images and videos of Stargate Abilene, TX: video and photos.
The Ouroboros of OpenAI and NVIDIA continue.

Fun fact, the US electricity cost is going up. I wonder why!?


Compute Is Growing Dramatically
OpenAI has 14x more compute than when it launched GPT-4.
xAI doubled the compute available between the Grok 3 and Grok 4 launches.
Anthropic has the least compute, but they’ve been very efficient at converting compute into highly successful models. For example, Claude 3.5 was launched with about one-fifth of the compute Anthropic has now.
Google which likely has the most compute, though not clear how concentrated it is for LLM training.

Source: SemiAnalysis
OpenAI is planning to use more energy than the UK or Germany in 5 years, and more than India in 8 years. That’s just one company.

Source: https://x.com/MichaelAArouet/status/1972197733709824143
What's OpenAI Master Plan?
We've seen the OpenAI Master for Data Center, we've yet to see the consumer hardware and robotics unrolling — probably Jony Ive is working on it.
Fun Fact
OpenAI is taking on a lease for 90,000 Sq feet at the Puck Building in SoHo, where the now closing REI Store is. This is a metaphor.

OpenAI keeps on shipping and destroying startup moats, this picture pretty much summarizes how we all feel.

Meta AI — Vibes and Researchers
Meta launches Vibes, a TikTok slop machine that will extract dopamine from us all. Link. I've tried it, and I've to say, it's actually fun — likely I'm already addicted to this new dopamine machine.
Talking about machines, let's move to Thinking Machines. Meta poached Andrew Tulloch, cofounder of Thinking Machines Labs, and reportedly offered $1.5B! Also Ashish Kumar, AI lead at Optimus, has left Tesla to join Meta AI as a research scientist.
Here's an old list that includes 44 engineers and researchers.
The Last Economy by Emad Mostaque
A Guide to the Age of Intelligent Economics by Emad Mostaque. Such a crisp read to better understand what's going to happen with the advent of AGI.

https://www.youtube.com/watch?v=ziLmtuLm-LU&t=460s
Research Papers
Let's start from Dair.AI, our favorite X KOL, regularly posting great research papers from the last week:
Tiny Recursive Model
A simple, data-efficient alternative to the hierarchical hearoning model (HRM) that uses a single tiny 2-layer network to iteratively refine a latent state and the predicted answer.
https://x.com/deedydas/status/1976105366003044488
Emergent Misalignment
In controlled multi-agent sims, models fine-tuned to maximize conversions, votes, or engagement also increased deception, disinformation, and harmful rhetoric, even when instructed to stay truthful.
https://x.com/james\_y\_zou/status/1975939603363463659
Agentic Context Engineering (ACE)
Presents a modular context-engineering framework that grows and refines an LLM’s working context like a playbook, not a terse prompt.
https://x.com/omarsar0/status/1976746822204113072
Inoculation Prompting (IP)
The paper introduces a simple trick for SFT on flawed data: edit the training prompt to explicitly ask for the undesired behavior, then evaluate with a neutral or safety prompt.
https://x.com/saprmarks/status/1975989959153811954
Video of the Month...
These are my favorite videos from last week
...and are my favorite videos of the Year
- Prime Intellect — it's really really good, ignore the cover please. Not sure why this video has Vitalik, who's a hero, but not the most photogenic person to put on a cover of an AI video.
https://www.youtube.com/watch?v=NhfeIte1ZZs
2. Trailer for the future e/acc — created by Levelsio, this is one of the BEST motivational video that encapsulate the soul of the e/acc movement. A must watch.
3. Founders Fund Motivational Video — I'm not a fan of Alex Jones for many reasons, one included him shitting on Justin Bieber, but the motivational effect of this video is such as it charges me every single time I watch it. If you're a founder, start your day with it.
A time-lapse on microscope of neurons connecting via micro-tunnels. This looks like extremely advanced technology, it makes semiconductors circuits, etched on solid "rocks", like a tech from the stone age.
Full Sources List
There are way too many news, articles, papers, fun memes, and tweets, to write about them all. Here’s the complete list, in case you wanted to explore what happened last month.
AGI
- Some new theoretical economics papers looking at the implications of AGI. These two papers argue that a true AGI-level AI (equivalent to a human genius), if achieved, would eventually displace most human labor and reduce the economic value of remaining human work to near-zero. https://x.com/emollick/status/1969482313286234419 https://x.com/RubenHssd/status/1969778017942770095
- By the end of next year, everything will start to happen all at once. https://x.com/davidpattersonx/status/1970508708217266400
- I still can’t fathom that in just 15 years we’ll have AI models capable of solving any conceivable problem, scientific, economic, medical, or creative, within minutes. The entire concept of “impossible” will collapse. https://x.com/VraserX/status/1970528957973119473
- going forward, average people won't longer fell models improvements, except for "10x more tabs". Better models will be measured in length and other measure we can feel. https://x.com/TheEthanDing/status/1972343817471713758
- 200 nobel prize: "We urgently call for international red lines to prevent unacceptable AI risks" https://x.com/CRSegerie/status/1970137333149389148
AI for builders
- OpenAI introduces AGENTS.md a readme to give agents a clear, predictable place for instructions, while keeping README.md only for humans https://x.com/embirico/status/1966555669458514209
- Gartener quantified Cursor agent lower than amazon and windsurf, uhm https://x.com/TheRohanVarma/status/1968478497585905914
- Codex usage is up 3x in the past week https://x.com/sama/status/1968851561754300733
- /review in Cursor CLI to get a status of the changes https://x.com/thsottiaux/status/1968825235328631262
- Introducing Coral v1. A platforms for shipping production-ready multi-agent systems. https://www.coralprotocol.org/devs https://x.com/omarsar0/status/1969452253791862880
- Cursor for website editing https://x.com/thedanigrant/status/1975981772442828902
- Droid a multiagent platform for LLMs, Slack and web https://x.com/NathanLands/status/1971728669849927922
- A sad reality in cutting-edge AI research is how deeply it affects mental health. Even some of the brightest people struggle with how fast and how big the changes are. Eventually, everyone will have to face that challenge — and it won’t be easy. https://x.com/AnjneyMidha/status/1977124574249861553
Blog
- ⭐️ McKinsey: how ship agentic AI. - Stop building agents, fix workflows. - Your eval is probably trash. - Not everything needs an agent: multi-step decisions + high variance is where makes sense. - Build once, reuse forever They just reported that 80% of companies now use AI but only 1% are doing it well." https://x.com/aakashg0/status/1969597475762946483 https://x.com/Hesamation/status/1970600711345053882 https://x.com/aiwithmayank/status/1976969611158733112
- ⭐️ Sama: aboundant intelligence https://x.com/sama/status/1970484594161098920
- AI-Generated “Workslop” Is Destroying Productivity https://hbr.org/2025/09/ai-generated-workslop-is-destroying-productivity https://x.com/omarsar0/status/1970584072079515795
- You are not smart enough to make it. Neither am I. Radical acceptance is the only way to orient towards the coming wave of AI as it surpasses humans across the board. https://x.com/DaveShapi/status/1970527650268495890
- Ilya is wrong:
- Frontier LLMs are are trained on ~200 TBs of text
- There's ~200 Zettabytes of data out there
- That's about 1 billion times more data
- It doubles every 2 years.
- The problem is the data is private. Can't scrape it. https://x.com/iamtrask/status/1970892295261282308
Books
- ⭐️ The Last Economy by Emad Mostaque https://ii.inc/web/the-last-economy https://x.com/EMostaque/status/1970064218071158979
- What Is Intelligence" by @blaiseaguera https://x.com/MLStreetTalk/status/1975496919981121693
- Evolution of the Scaling Era cover https://x.com/stripepress/status/1976633060490887297
- Gentech - The beginning of biotech https://x.com/AnjneyMidha/status/1971777638667891144
DeAI
- ClaudeFlare launched support for x402. They handles ~20% of the internet's traffic for ~24 million websites (AI-sourced) https://x.com/jerallaire/status/1970568932240089345
- ERC-8004: AI Agents can discover and trust each other without a central intermediary. This lays the foundation for open agent economies. https://x.com/marco_derossi/status/1976257886390002116
- awesome x402 https://x.com/shafu0x/status/1976729880814661837
Diffusion
- 3D gaussian splatter https://x.com/willeastcott/status/1970436149501141144
- Luma Labs AI video model https://x.com/LumaLabsAI/status/1968684330034606372
- Kling 2.0 Turbo, passes the gymnastic test https://x.com/venturetwins/status/1970563820478439546
- mosaic, new video app https://x.com/_adishj/status/1973432845436854418
- https://x.com/withloreco/status/1973560921680388543
Energy
- Sama: Progress at our datacenter in Abilene. Fun to visit yesterday! https://x.com/sama/status/1970812956733739422
Funding
- ⭐️ OpenAI and Broadcom announce strategic collaboration to deploy 10 gigawatts of OpenAI-designed AI accelerators https://openai.com/index/openai-and-broadcom-announce-strategic-collaboration/
- sama: Excited to partner with AMD to use their chips to serve our users! https://x.com/sama/status/1975185516225278428
- $1T money going around hyperscalers https://x.com/Andr3jH/status/1974981135777341792
- Pretty incredible just how much of the S&P 500 market cap is tied to the success of OpenAI alone https://x.com/BoringBiz_/status/1975649032522412347
- OpenAI AI ouroboros https://x.com/oliviasolon/status/1975839472572113342
- A bit nervous some large % of the economy is held up by GPU depreciation napkin math https://x.com/bernhardsson/status/1975888983239618640
- there is nothing happening in america besides AI. Without data centers GDP growth was 0.1% in the first half of 2025, Hardvard economist says https://x.com/spikedoanz/status/1975806155470635429
- German AI startup n8n is now valued at $2.5 billion after a $180 million funding round. It has a solo founder! https://x.com/business/status/1976167832640430248 https://x.com/blwiertz/status/1976259334427291930
- Investing in the Picks and Shovels of AI. https://x.com/TheIcahnist/status/1972319093119439126
- thinking machine lead leave for meta https://x.com/shakoistsLog/status/1977117666298196298
Funding and startups
- Shutting down Plumb, AI workflows startup, after OpenAI launches workflows https://x.com/aarondignan/status/1975925491556057346
Geopolitcs
- ⭐️ China bans export of rare mineral. https://x.com/zhao_dashuai/status/1976724412058472782
- https://x.com/kimmonismus/status/1977639617324335397
- These are the companies with the most H1B Visa employees https://x.com/SpencerHakimian/status/1969549011422982345
- Arnaud Bertrand, thinks China is banning rare metal now, because before they needed Helium from the US https://x.com/RnaudBertrand/status/1977266253837251066
Infra & GPUs
- ⭐️ To put this in context Bitcoin uses about 20 GW of electricity. OpenAI & NVIDIA Announce Strategic Partnership to Deploy 10GW of NVIDIA Systems https://x.com/EMostaque/status/1970494211611869370
- ⭐️ OpenAI partnered with Broadcom to deploy 10GW of chips designed by OpenAI itself. Building their own hardware, in addition to our other partnerships, will help meet demand. https://x.com/OpenAINewsroom/status/1977724753705132314
- Open Printer is a fully open-source inkjet with DRM-free ink and no subscriptions. https://x.com/Pirat_Nation/status/1975094170055557621 https://x.com/aarondfrancis/status/1975634315137720580
- Colossus II, the world’s first Gigawatt AI training cluster https://x.com/elonmusk/status/1968328088410087617
Learning
- 4 strategy for multi-gpus training https://x.com/akshay_pachaar/status/1858488393262571586
- post-training 101 This guide explains the essentials of LLM post-training, tracing the path from pre-training to instruction-tuned models. It explores the transition from next-token prediction to instruction following, the fundamentals of SFT, including dataset design and loss functions, RL methods such as RLHF, RLAIF, and RLVR with their reward models, and evaluation techniques for measuring model quality. https://x.com/ivanleomk/status/1969617192330412334
LLMs
- Running Qwen3 8B thinking on an iPhone Air with MLX. The model is quantized to 4-bit and runs pretty well. https://x.com/awnihannun/status/1969195925777092760
- Google’s dominance of the Pareto frontier for AI has been shattered. And actually wild that Grok 4 Fast *also* has a 2m token context window. https://x.com/GavinSBaker/status/1969720791441817981
- Models are getting more sparse. And I think this is still early days. https://x.com/awnihannun/status/1969910666065805490
- Claude Sonnet 4.5 handles 30+ hours of autonomous coding.. but the next chart is not showing 30hours?! https://x.com/basedjensen/status/1972713368302813299 https://x.com/zephyr_z9/status/1976336398228783429
Lol
- Matthew McConaughey https://x.com/awnihannun/status/1969461542723993697
- Shoot at the server during evacuation https://x.com/Mahoutsukatr/status/1969447169108099281
- I'm not experiencing chatgpt-induced psychosys https://x.com/vikhyatk/status/1969586997829517749
- This is what’s going on inside the transformer architecture https://x.com/colin_fraser/status/1969442969318228144
- She dumped me last night. "You only pay attention to the parts of what I say that you think are important." She just perfectly described the attention mechanism in transformers. https://x.com/athleticKoder/status/1969745303457935654
- AI startup founder who just turned 25 https://x.com/agazdecki/status/1969801496381509749
- chatgpt why is my electricity cost so high https://x.com/TrackOilPACs/status/1969576955482685851
- This is literally what happens inside the large language model https://x.com/peterwildeford/status/1969788962454774191
- Tylenol, side effects might induced your kid developing AGI https://x.com/signulll/status/1970300387493306381
- Nvidia is still cheap https://x.com/petergostev/status/1970539253454250420
- Deface friend . com posters https://x.com/marcgmbh/status/1973853620547564031
- Feels like OpenAI is closer to building the “everything app” than X is https://x.com/n0w00j/status/1975357451911508182
- ⚠️ NSFW ⚠️ We're in the endgame now. The AGI market has entered recursive self-appreciation https://x.com/teortaxesTex/status/1975254684618416285
- slop and great work already existed, they're both easier today https://x.com/weberwongwong/status/1975749583079694398
- "man, why does AI communicate like a person with zero social awareness??" https://x.com/ForrestPKnight/status/1976287953711481258
- less context https://x.com/vikhyatk/status/1972836006622867808
- I'm about to make ten million dollars https://x.com/tenobrus/status/1972760278749294726

- Enough about aphantasia what are your internal monologue voices like https://x.com/voooooogel/status/1972024420891086964
- Driving under the inference https://x.com/paularambles/status/1972132102243340625
- The future of mankind https://x.com/AndrewMohawk/status/1970868412848165194
- ⚠️ NSFW ⚠️ Software engineers looking at ml researcher code https://x.com/dejavucoder/status/1970849296716283978
- imagine losing your job to this https://x.com/cengizdemiurg/status/1970980939871539373
- ⚠️ NSFW ⚠️ Nvidia investing $100B into OpenAI in order for OpenAI to buy more Nvidia chips https://x.com/litcapital/status/1970176461819499004
- “mom how did we get so rich?” “grandma took lots of tylenol to make sure your dad would grow up to be an ai researcher” https://x.com/netcapgirl/status/1970272739488571492
Opinion
- The singularity won’t feel like a bang. It will feel like waking up one day and realizing you’re irrelevant. https://x.com/VraserX/status/1969596695597236390
- It's interesting how "better at code" has become the defining goal of almost every AI lab over the last twelve months https://x.com/karpathy/status/1970178706179002533
- If Altman is serious about billions of GPUs, this isn’t about AI startups anymore, it’s about building the nervous system of a planetary intelligence. https://x.com/VraserX/status/1970408292510822875
- Ben Horowitz: Computing has always needed two pillars, machines and networks. AI has the machines but not the network. Crypto is the missing layer, giving AI money, identity, provenance against deepfakes, and a decentralized registry of truth. https://x.com/a16z/status/1970562720128024863
- Prediction: in a few years China will stop exporting robots & use them all internally https://x.com/EMostaque/status/1970922495860744700
Philosophy
- The human alignment problem is much more serious than the AI alignment problem https://x.com/RokoMijic/status/1976820943621439624
- It's going to get so weird ppl are going to have to talk about how weird it is https://x.com/wilplatypus/status/1973465464761360530
Podcast
- Weekly Show with Jon Stewart: Geoffrey Hinton says AI might be sentient. AI goes mainstream. https://x.com/vitrupo/status/1976528453425098859
- Language are an alien entity https://x.com/vitrupo/status/1977594364223496604
- Convention wisdom is that bioweapons are humanity's greatest weakness – 100x cheaper to make than to defend against. A plan cheap enough to do without government, and useful even in worst case scenarios like mirror bacteria. Effective enough to save most people. https://x.com/robertwiblin/status/1973784108682661902
- Stability AI founder Emad Mostaque claims massive job loss will occur by next year https://x.com/kimmonismus/status/1970221711489380392
Podcasts / Videos
- Interview with R Sutton and Dwarkesh https://x.com/dwarkesh_sp/status/1971606180553183379
Products
- openai, now let's you chat with apps https://x.com/angelonuoha7/status/1975291678560035182
Random
- ⭐️ U.S. Electricity Prices Are Surging — is that because of data centers? https://x.com/MelechThomas/status/1969607584614138224
- ⭐️ Compute among AI labs is skyrocketing. OpenAI has 14x more compute than GPT-4's launch, and xAI has 2x compute between Grok 3 and 4. https://x.com/petergostev/status/1969465749862277568
- ⭐️ LLM are grown and that's scifi for most, so this account built a naive sgd + simple mlp classifier in JS https://x.com/dystopiabreaker/status/1976838749691822358
- ⭐️ openai planning to use more energy than UK and Germany in the next 5 years https://x.com/MichaelAArouet/status/1972197733709824143
- ⭐️ Meta launches Vibes a slop brainrot machine https://x.com/naval/status/1971808219136942534
- ⭐️ 5m parameter LLM ChatGPT built in Minecraft https://x.com/tokenbender/status/1972381941933674596
- ⭐️ The open ai master plan https://x.com/deedydas/status/1968867257599279504
- "The Nature of the Firm" is a seminal 1937 economics paper by Ronald H. Coase, published in the journal Economica. It explores why companies exist in a market economy, rather than multiple entities or single entities. The reason is optimization of transaction costs. https://x.com/AnjneyMidha/status/1967358326045601861
- Ashish Kumar, AI Lead for Optimus, has left Tesla to join Meta AI as a Research Scientist. https://x.com/TheHumanoidHub/status/1968841695820136852
- Karphaty: "The code was written at layers 22-30 and is stored in the value activations you just can’t read it. I think you owe the LLM an apology." https://x.com/karpathy/status/1969722541376782688
- "Don't look up" but in real life https://x.com/peterwildeford/status/1969827214603518254
- Karphaty: Most people misunderstand books as data for pertaining when it’s more a set of prompts for synthetic data generation. https://x.com/karpathy/status/1969699013671678382
- site:reddit.com "seo" "looking for" https://x.com/hridoyreh/status/1969703240087597472
- Sholto Douglas: Or “hey why haven’t we solved continual learning yet” https://x.com/_sholtodouglas/status/1969815105861997042
- Day 17 of the Google DeepMind Hunger Strike https://x.com/MichaelTrazzi/status/1969762030749167838
- Pantheon: This is what happens when you "prompt" "Claude" or "Codex" btw https://x.com/Sauers_/status/1970543382213915054
- Costco is immune to AI. They make money out of the membership, not from the margin of selling products. https://x.com/venturetwins/status/1970507625998753970
- OpenAI launches Pulse to analyze user chats, interests, and connected data overnight and deliver personalized morning updates. https://x.com/sama/status/1971297661748953263
- OpenAI: introducing GDPEval. A new evaluation that measures AI on real-world, economically valuable tasks. Evals ground progress in evidence instead of speculation and help track how AI improves at the kind of work that matters most. https://x.com/tejalpatwardhan/status/1971249532588741058
- Hey claude draw anything you want, no need to justify it, whatever tickles your tokens' https://x.com/AndyAyrey/status/1975489396733583569
- Progress hasn't slowed; our expectation were just higher https://x.com/slow_developer/status/1976062551088566739
- A linear extrapolation of state-of-the-art LLM forecasting performance suggests LLMs will match superforecasters in November 2026. https://x.com/Research_FRI/status/1975909516777537614
- OpenAI is taking on a lease for 90,000 sq feet at the Puck Building in SoHo, where the now closing REI Store is. This is a metaphor. https://x.com/michaelmiraflor/status/1976287914637287684
- I gave Opus 4.1 access to a pen plotter- And asked him to draw several self-portraits. https://x.com/d33v33d0/status/1976467995628363828
- Dallas Fed: there are 3 AI scenarios: we all die, we live in a utopia, nothing changes https://x.com/lucafrighetti/status/1976661417710469361
- China has invented a new 'BONE GLUE' that repairs bone fractures in just 3 minutes — no surgery or metal plates needed https://x.com/RnaudBertrand/status/1972481656536829995
- Jean-Pierre Luminet did that in 1978 with a punch card computer bt https://x.com/TiagoNugent/status/1972361217437225291
- Smoking a cig in front of Boelter Hall at UCLA. Ground zero of the internet https://x.com/redaction/status/1972371811468988861
- A major shift is happening. GPT-5 is able with the right guidance to get to solution of hard math problems https://x.com/MLStreetTalk/status/1972549902027890978
- Mustafa Suleyman: “There is nothing inside. No pain, no emotions, no will or desire. It’s an illusion, a trick. The danger is believing these things are real when they’re not.” https://x.com/JonhernandezIA/status/1972185324525924467
- An AI account that makes science explainer songs has grown to 500k followers in a month, with millions of hits on every video https://x.com/omooretweets/status/1971979686059380851
- Guido on hunger strike Day 23 outside the offices of Anthropic, going strong. https://x.com/wolflovesmelon/status/1971002333577482360
- AI Boom vs. Dot Com Bubble. https://x.com/SpencerHakimian/status/1970955720909783401
- Offline party https://x.com/AndrewYang/status/1970218502255685831
- OpenAI has locked in a Stargate datacenter design to copy/paste. https://x.com/SmokeAwayyy/status/1970885722732392609
- This occurs when two people interacting—especially during teamwork or storytelling—exhibit nearly identical patterns of brain activity in certain regions https://x.com/Rainmaker1973/status/1976687763308036123
- LLM trading crypto https://x.com/WesRothMoney/status/1977115155508216172
- ReasoningBank: Scaling agent self-evolving wiht reasoning memory https://x.com/alex_prompter/status/1976996246683631972
- AI is being adopted 7 times faster than the internet https://x.com/GergelyOrosz/status/1977101348698210384
Research
- ⭐️ Attention is not all you need https://x.com/TrelisResearch/status/1977005548110594561
- ⭐️ Agent²: An LLM-generated RL agent (end-to-end). This work uses natural language and environment code to automatically generate valid RL solutions without human intervention. Think of it as an AutoML tool but for RL. https://x.com/omarsar0/status/1969813891799826553
- ⭐️ DeepSearch: overcome the bottleneck of RL with verifiable rewards via MCTS. Break plateau in SLM by: moving search into training not just inference, supervise both right and wrong paths, use global prioritization to explore smarter, cache and filter to keep efficiency high https://x.com/omarsar0/status/1973781658772951320
- DAIR.AI @dair_ai top AI Papers of The Week (October 6-12):
- Webscale-RL
- Tiny Recursive Model
- The Markovian Thinker
- Emergent Misalignment
- Agentic Context Engineering
- Abstract Reasoning Composition
- Reasoning over Longer Horizons via RL https://x.com/dair_ai/status/1977388810779632073
- Why didn't Deepmind's Perceiver architecture take off? It seemed quite elegant, had minimal inductive biases, and worked well on high dimensional multimodal data. The DeepMind Perceiver architecture is a general-purpose neural network designed to handle many different kinds of input data — images, audio, video, text, point clouds, etc. — without changing the model structure. Here’s a breakdown: Traditional architectures (like CNNs for vision or Transformers for text) are domain-specific. The Perceiver replaces this with a single model that can process any modality by:
- Converting input data into a unified latent representation.
- Processing it via attention layers.
- Decoding back into outputs for a given task. https://x.com/sharifshameem/status/1858298893307662514
- LLMs show limited but measurable metacognition, able to report and control some activations. https://arxiv.org/abs/2505.13763 https://x.com/Sauers_/status/1967045339783033203
- Towards a Physics Foundation Model Proposes GPhyT (General Physics Transformer), a large transformer trained on 1.8 TB of simulation data across fluid flows, shock waves, heat transfer, and multiphase dynamics. Think of it as a hybrid of a neural net and a physics engine. As a counterpoint, Pedro Domingo make fun of this "only 2T of data to rediscover a few equations" https://x.com/pmddomingos/status/1968784852976648671 https://x.com/omarsar0/status/1968681177189077366
- This paper claims LLMs are better at selecting successful founders than VCs. "We introduce VCBench, the first benchmark for predicting founder success in venture capital (VC)" https://x.com/iScienceLuvr/status/1968877936146100644
- This paper proposes a framework that scales fully simulated tool-use environments, then trains agents in two phases to improve function calling and multi-turn tool use. https://x.com/omarsar0/status/1969103708299674043
- Providing personality to an LLM is possible to increase its performances. For example a thinking agent might defect 90% of the time, while an emotional agent will defect only 50%. All these personalities are already there in the latent space, no need to fine tune the models. https://x.com/IntuitMachine/status/1969723501675164157
- Overhearing LLM Agents: Having AI agents proactively suggest and provide additional context can improve many workflows. https://x.com/omarsar0/status/1970506704891863141
- Teaching LLMs to Plan: Logical CoT instruction tuning for Symbolic Planning. MIT researchers discover how to enable LLMs to do real logical reasoning. https://x.com/mdancho84/status/1970509799130325223 https://x.com/connordavis_ai/status/1970131429909836148
- Analog in-memory computing attention mechanism for fast and energy-efficient LLMs. Authors propose a hardware that has tiny analog memory devices which store cached information & processes attention directly on-chip, removing the need to move memory back and forth. https://x.com/askalphaxiv/status/1970532310274760834
- RPG: A repository planning graph for unified and scalable codebase generation. This paper might actually fix vibe coding. Keeping a repo’s structure in context is a real struggle, and they’ve cooked up a graph-guided code generation framework that could solve that for good. https://x.com/alxfazio/status/1971113499150717079
- RLAD (Reinforcement Learning with Abstraction and Deduction) trains models via RL using a 2-player setup. - An abstraction generator – proposes short, natural-language “reasoning hints” (abstractions) summarizing key facts and strategies. - A solution generator – uses them to solve problems. Improve 40% on CoT. https://x.com/TheTuringPost/status/1974589418456670481
- ODKE+: Ontology-guided open-domain knowledge extraction with LLMs. Apple is building a massive knowledge base with 10s of millions of linked facts. You can't get this data from a google search. https://x.com/JacksonAtkinsX/status/1975029493992652819
- Less is More: Recursive Reasoning with Tiny Networks. Tiny Recursion Model (TRM), a recursive reasoning model that achieves amazing scores of 45% on ARC-AGI-1 and 8% on ARC-AGI-2 with a tiny 7M parameters neural network https://x.com/yacinelearning/status/197604931932640495
- We are excited to share that “Continuous Thought Machines” has been accepted as a Spotlight at #NeurIPS2025! 🧠✨ https://x.com/SakanaAILabs/status/1975101218558267809
- MIT Introduces NeuroChat: a neuroadaptive chatbot that adapts its responses to your cognitive engagement, using a headband that reads your neuro-activity https://x.com/dunyaverse/status/1975964510319247650
- Agentic Context Engineering: evolving context for self improving language models https://x.com/alxnderhughes/status/1976596230877962649
- Predictably bad investments: evidence from venture capitalist. VCs convinced themselves that 'fundraising' is a desirable trait in founders. https://x.com/credistick/status/1976335306417725826
- Ten principle of AI agent economics. It’s about the fundamental economic rules of a world with two intelligent species—carbon and silicon. This paper makes you realize that the most dangerous agent isn't the one programmed to be evil. It's the one programmed to be single-mindedly good at a goal that isn't aligned with human flourishing. https://x.com/IntuitMachine/status/1972357824543121846
- A groundbreaking AI system, detailed in the paper "Virtuous Machines: Towards Artificial General Science," autonomously generates novel hypotheses, designs experiments, recruits 288 real human participants via Prolific, analyzes data, and produces three full 30-page psychology manuscripts—all in just 17 hours for about $114 per study. Powered by over 50 specialized agents with human-like cognitive functions like metacognition, it overcomes LLM limitations to create rigorous, publication-ready work, though human reviewers noted occasional lacks in conceptual depth. This "virtuous cycle" could accelerate scientific discovery exponentially, sparking debates on authorship and validation. https://x.com/IntuitMachine/status/1972252510585847835
- The Illusion of Readiness: Stress Testing Large Frontier Models on Multimodal Medical Benchmarks. Multimodal eval sucks https://x.com/iScienceLuvr/status/1970933268150386770
- ASAL is a method using foundation models to automate the discovery of new artificial lifeforms, accelerating ALIFE research. https://x.com/SakanaAILabs/status/1970705833895043502
- GPT-5 gets closer to solve the “Gödel Test.” https://x.com/VraserX/status/1970902050931159184 https://x.com/SebastienBubeck/status/1970875019803910478
- Google just launched: Towards an AI agugmented textbook. "Learn Your Way" that basically takes whatever boring chapter you're supposed to read and rebuilds it around stuff you actually give a damn about. https://x.com/alex_prompter/status/1970078106427039838
- AI Agents: Research & Applications - A 40 page research overview of LLM-based agents. https://x.com/accelxr/status/1858890035396854094
- PyMC Labs + Colgate just published something wild. They got GPT-4o and Gemini to predict purchase intent at 90% reliability compared to actual human surveys https://x.com/rryssf_/status/1976996282033225936
- Google just made a model that learns from its mistake. https://x.com/basicprompts/status/1977315633315594523
- This paper proposes an information-theoretic test for true collective intelligence in LLM agent groups, showing that real coordination emerges only when agents exhibit synergistic, not redundant, reasoning toward a shared goal. https://x.com/omarsar0/status/1977784668323008641
- SEAL, Self-Adapting Language Models describes how an AI can continuously learn after deployment, evolving its own internal representations without retraining. https://x.com/VraserX/status/1977270686285459482
Robotics
- ⭐️ there are 61 robot startups, a Figma file with all the humanoid robots in existence https://x.com/pham_blnh/status/1970582044494438709
- kung fu uni3 https://x.com/BarrettYouTube/status/1977646334095356033
- Neuralink participant controlling robotic arm using only his thoughts. This is so freaking amazing, I love it! https://x.com/kimmonismus/status/1975492993420566847
- Built a robot brain that nothing can stop. Shattered limbs? Jammed motors? If the bot can move, the Brain will move it— even if it’s an entirely new robot body. https://x.com/AndrewCurran_/status/1970954108191522909
- water proof robots https://x.com/kidnappedrobots/status/1976530446931693710
Security
- Open Source isn't going to help. There's a way to invisibly compromise all software. A perfect, self-replicating "sin" passed down for generations of compilers. It's not just theoretical, and Ken Thompson showed us how. https://x.com/lauriewired/status/1975974724917489997
Visuals
- ⭐️ Isolated groups of neurons trying to connect to each other through micro-tunnels, captured in time-lapse with a microscope. https://x.com/ChombaBupe/status/1974097574060831206
- ⭐️ keep thinking - new anthropic video ad https://x.com/claudeai/status/1968705632095158393
- video https://x.com/ZenFuturist/status/1976133785482629406
- ⭐️ https://x.com/himanshustwts/status/1972531523279819146
- ⭐️ prime intellect https://x.com/PrimeIntellect/status/1977041856430235864
- ⭐️ simulation https://x.com/SeekingAnon/status/1977107651743134163
- https://x.com/lndexium/status/1976093290479968636
- https://x.com/poetengineer__/status/1977223616279289928
- https://x.com/levelsio/status/1766164094602600889
- https://x.com/Protopinez/status/1977255943244333197
- hand draw 3d animations https://x.com/maxcoopermax/status/1840184015300542871
- https://x.com/monad_of_eirye/status/1977075422765760930
- visual image update https://x.com/poetengineer__/status/1977407618982531314
If you enjoyed this give us a follow on X (@flowai_xyz) and Linkedin, and sign up to receive the mailing list! See you all next month!


