Nvidia Just Partnered With the Lab Founded to Surpass AlphaGo. What Ineffable Is Building With Reinforcement Learning Changes the Picture.

David Silver designed the AI that defeated the world Go champion Lee Sedol in 2016. He then designed the AI that defeated every human and AI Go player without ever learning from human games. He then designed AlphaStar, which reached Grandmaster level in StarCraft II. He spent years at Google DeepMind building systems that learn to master complex, strategic environments through self‑play and reinforcement learning, without being taught by humans what good moves look like.
In 2025, Silver founded Ineffable Intelligence to apply those methods to the most general and significant strategic environment of all: real‑world intelligence itself. In April 2026, the company raised $1.1 billion in seed funding at a $5.1 billion valuation from Sequoia Capital and Lightspeed Venture Partners, with backing from Google and other investors. At the time of the raise, it was the largest seed round in Silicon Valley history, held for approximately one week before Recursive Superintelligence announced comparable figures.
On May 13, 2026, Tech in Asia reported that Nvidia had entered into a partnership with Ineffable Intelligence to collaborate on the development of reinforcement learning systems. The specifics of the partnership have not been publicly detailed. What is publicly documented is the technical direction Ineffable is pursuing and why Nvidia's involvement makes commercial sense for both parties.
Reinforcement learning is the family of techniques behind the systems that beat humans at Go, chess, StarCraft, and a growing list of domains that reward strategic multi-step reasoning. It is fundamentally different from the supervised learning that powers most commercial AI applications today. Supervised learning trains on labeled data: show the model millions of examples with correct answers, and it learns to produce correct answers. Reinforcement learning trains through experience: give the model an environment, a set of possible actions, and a reward signal, and let it discover through trial and error which sequences of actions produce the best outcomes.
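That loop, environment, actions, reward, trial and error, can be made concrete in a few lines. The sketch below is tabular Q-learning on a toy five-cell corridor, chosen purely for illustration; it has nothing to do with Ineffable's actual systems, but it is the same basic mechanism: the agent starts with no knowledge of the task and discovers a good policy purely from the reward signal.

```python
import random

# Toy environment: agent starts at position 0 on a 1-D corridor of 5 cells.
# Actions: 0 = move left, 1 = move right. Reaching cell 4 yields reward 1.
N_STATES, GOAL = 5, 4

def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    return (nxt, 1.0, True) if nxt == GOAL else (nxt, 0.0, False)

# Q-table: estimated future reward for each (state, action) pair, all zero at start.
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.5, 0.9, 0.1  # learning rate, discount, exploration rate

random.seed(0)
for episode in range(200):
    state, done = 0, False
    while not done:
        # Explore randomly sometimes; otherwise exploit current estimates.
        if random.random() < epsilon:
            action = random.randrange(2)
        else:
            action = 0 if Q[state][0] > Q[state][1] else 1
        nxt, reward, done = step(state, action)
        # Temporal-difference update: nudge Q toward reward + discounted future value.
        Q[state][action] += alpha * (reward + gamma * max(Q[nxt]) - Q[state][action])
        state = nxt

# The learned greedy policy: which way to move from each cell.
policy = ["left" if q[0] > q[1] else "right" for q in Q]
print(policy)  # the agent has discovered it should always move right
```

No example of a correct move was ever shown to the agent; the preference for moving right emerges entirely from experienced rewards, which is the distinction from supervised learning in miniature.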
The difference matters because reinforcement learning, when it works, produces systems capable of behaviors that no human could teach through labeled examples. AlphaGo's winning moves were not moves humans would have played. They were discovered through the AI's own exploration of the game's possibility space. Silver's hypothesis, embedded in Ineffable's founding thesis, is that this self‑discovery mechanism is the path to genuinely general intelligence rather than the sophisticated pattern matching that characterizes current large language models.
The partnership positions Nvidia at the technical frontier of this approach. RL systems, particularly the large-scale self-play training that Silver pioneered at DeepMind, are computationally intensive in ways that differ from the large-batch, large-model training that Nvidia's GPUs are optimized for. RL training involves many short episodes with frequent parameter updates, a compute pattern that calls for different optimizations than the long-horizon, batch-processed gradient descent of supervised learning. Collaborating with a lab at Ineffable's technical level gives Nvidia both the research insight and the early hardware relationship to optimize for RL workloads as they become commercially significant.
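The shape of that compute pattern can be sketched with a standard actor/learner split, a structure common in large-scale RL systems generally, not a description of Nvidia's or Ineffable's actual stack. Several actors generate short episodes concurrently while a single learner drains them and applies one small update per episode, in contrast to a supervised job's long pass over a static dataset:

```python
import queue
import random
import threading

# Hypothetical actor/learner sketch: many actors produce short episodes in
# parallel; one learner consumes them and applies frequent, small updates.
episode_queue = queue.Queue()
N_ACTORS, EPISODES_PER_ACTOR = 4, 25

def actor(actor_id):
    """Simulate rollouts: each episode is a short list of (state, reward) steps."""
    rng = random.Random(actor_id)
    for _ in range(EPISODES_PER_ACTOR):
        episode = [(rng.random(), rng.choice([0.0, 1.0]))
                   for _ in range(rng.randint(3, 10))]
        episode_queue.put(episode)

threads = [threading.Thread(target=actor, args=(i,)) for i in range(N_ACTORS)]
for t in threads:
    t.start()

# Learner loop: one parameter update per episode as it arrives, rather than
# one large gradient step per big batch.
updates = 0
for _ in range(N_ACTORS * EPISODES_PER_ACTOR):
    episode = episode_queue.get()            # block until an actor delivers
    total_return = sum(r for _, r in episode)  # stand-in for computing a gradient
    updates += 1                               # stand-in for one small update step

for t in threads:
    t.join()

print(f"applied {updates} parameter updates from {N_ACTORS} concurrent actors")
```

The hardware implication is in the access pattern: lots of small, latency-sensitive transfers between episode generation and the learner, rather than the sustained, throughput-bound matrix multiplies of batch training.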
The competitive field around Ineffable is sharply differentiated. Anthropic and OpenAI both use reinforcement learning from human feedback as a core component of their post-training processes, but this is a constrained application of RL rather than the open-ended self-improvement that Silver built his career on. Recursive Superintelligence, founded by Richard Socher's team, aims to automate the entire AI development pipeline rather than focusing specifically on RL as the mechanism. Safe Superintelligence takes a safety-first approach to similar long-term goals. AMI Labs, backed by Yann LeCun, pursues world models as the foundation for general intelligence.
Ineffable's specific bet, building the reinforcement learning infrastructure that makes open-ended self-improvement possible, is both a technical focus and a competitive position. Silver knows this problem at a depth that very few researchers match, and the $1.1 billion raised in April gives the company the compute runway to pursue it at the scale that frontier results require.
More at ineffable.ai