How Two Harvard Dropouts Turned a Nearly Dead Chip Startup Into a $5 Billion Nvidia Rival

In 2023, Gavin Uberti and Robert Wachen could not get a meeting. Their startup, Etched, was burning through its remaining cash, investors were not interested, and the entire AI chip market was consumed by a single obsession: training ever‑larger models. The idea of building specialized silicon just for inference, the computation that happens every time a user sends a prompt and waits for a response, was considered a niche bet at best and a dead end at worst.
Two years later, Etched has raised 800 million dollars in total, is valued at 5 billion dollars, and has booked more than 1 billion dollars in signed customer contracts for a chip that has not yet shipped at scale. TSMC, the world's most advanced contract chipmaker, has already produced Etched's first chip on its 4‑nanometer manufacturing process. The company is now testing full systems with customers and plans to ship its first rack‑scale deployments this summer.
The investors who would not take the meeting in 2023 have been replaced by a different kind of backing entirely. The 500 million dollar round, which closed in December 2025 and was led by Stripes at a 5 billion dollar post‑money valuation, included Jane Street, Hudson River Trading, Two Sigma, Ribbit Capital, VentureTech Alliance (a venture firm with a strategic partnership with TSMC), and Peter Thiel. The angel roster includes Andrej Karpathy, Geoffrey Hinton, Fei‑Fei Li, and billionaire Stanley Druckenmiller.
The Bet That Almost Did Not Survive
Etched was founded in 2022. Uberti serves as CEO and Wachen as president. Both left Harvard to start the company and both became Thiel Fellows, recipients of a fellowship that backs young founders who skip or leave higher education to build companies full time. Uberti had previously spent time at Nvidia, giving him a close‑up view of how general‑purpose GPU architectures were designed and where their limits lay. Wachen brought complementary engineering depth, and the two shared a conviction that the transformer architecture, which underpins virtually every major large language model from GPT to Llama to Gemini, was not going to be displaced anytime soon.
That conviction became the design specification for their chip. Rather than building flexible silicon that could handle training, inference, and other workloads across different model architectures, Etched built Sohu to do exactly one thing: run transformer‑based inference as fast as physically possible.
This kind of chip is called an ASIC, or application‑specific integrated circuit. The trade‑off is fundamental and intentional. Sohu cannot train models. It cannot handle workloads that fall outside the transformer architecture. What it can do is run inference faster than a general‑purpose GPU and at a fraction of the energy cost, because every transistor on the chip is dedicated to that single task rather than spread across general‑purpose compute capabilities that may never be used for AI inference at all.
The problem in 2023 was that nobody particularly cared about inference efficiency yet. The market spotlight was on training runs, model size, and the hardware arms race to build the largest clusters. Etched's pitch landed in a room where the audience was not ready for the question.
What Changed
The shift came gradually and then very quickly. As major AI companies began deploying models to millions of users simultaneously, inference became the dominant line in their operating costs. Serving a large language model to customers at scale is not a one‑time training cost. It is a perpetual, recurring, per‑token expense that grows with every new user and every additional product feature. For companies trying to build sustainable AI businesses, inference economics became the number that mattered most.
That shift is why the global AI inference market, valued at 106 billion dollars in 2025, is projected to reach 255 billion dollars by 2030. It is also why Etched's 2023 problem became a 2025 and 2026 opportunity.
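For a sense of what that projection implies, the two market figures above can be turned into an annual growth rate. The sketch below is only back‑of‑the‑envelope arithmetic: it assumes smooth compound growth over the five years from 2025 to 2030, and the function name is illustrative rather than drawn from any market report.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start value, end value, and horizon."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article (US dollars).
market_2025 = 106e9
market_2030 = 255e9

# Assumption: growth compounds evenly across the five-year span.
growth = implied_cagr(market_2025, market_2030, years=5)
print(f"Implied annual growth: {growth:.1%}")
```

Under those assumptions the projection works out to roughly 19 percent growth per year.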
Sohu's performance claims are aggressive by any measure. The company says a single eight‑chip Sohu server can process around 500,000 tokens per second running Meta's Llama 70B model. More strikingly, it claims one Sohu‑equipped server can replace up to 160 Nvidia H100 GPUs for inference workloads. If those figures hold under real production conditions, the cost and power implications for large‑scale AI deployment are significant enough to justify the attention Etched has been getting from some of the most sophisticated buyers of compute in the world.
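It helps to break those headline claims into per‑chip terms. The sketch below simply divides the company's stated figures; it assumes the 500,000 tokens per second and the 160‑GPU comparison describe the same workload, which the claims do not spell out, and the variable names are illustrative.

```python
# Figures as claimed by Etched (not independently verified).
SERVER_TOKENS_PER_SEC = 500_000   # eight-chip Sohu server, Llama 70B
CHIPS_PER_SERVER = 8
H100S_REPLACED = 160              # claimed H100 equivalence for one server

# Throughput each Sohu chip would need to deliver.
tokens_per_chip = SERVER_TOKENS_PER_SEC / CHIPS_PER_SERVER

# Number of H100s each Sohu chip is claimed to stand in for.
h100s_per_chip = H100S_REPLACED / CHIPS_PER_SERVER

# H100 throughput implied if both claims describe the same workload.
implied_tokens_per_h100 = SERVER_TOKENS_PER_SEC / H100S_REPLACED

print(f"Tokens/sec per Sohu chip:  {tokens_per_chip:,.0f}")
print(f"H100s replaced per chip:   {h100s_per_chip:.0f}")
print(f"Implied tokens/sec per H100: {implied_tokens_per_h100:,.0f}")
```

Taken at face value, the claims amount to about 62,500 tokens per second per chip, with each Sohu chip standing in for 20 H100s. That implied baseline of 3,125 tokens per second per H100 is the figure a skeptical buyer would want to check against their own measurements.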
Etched sells what it calls frontier inference clusters, complete systems that include the chips, custom‑designed racks, and the software layer that runs on top of them. The current systems already support DeepSeek, Qwen, Meta's Llama, and Mamba models. The investment from VentureTech Alliance, the TSMC‑affiliated venture firm, is more than a financial endorsement; it suggests that backers close to the world's leading chipmaker have enough conviction in Etched's technology to take an ownership stake in a company whose product TSMC already manufactures.
The Competitive Landscape It Is Walking Into
Nobody in this market is confused about who the dominant player is. Nvidia's Blackwell platform and its CUDA software ecosystem represent decades of compounding advantage in AI hardware. Nvidia recently projected more than 500 billion dollars in cumulative data center sales by the end of 2026, a figure that reflects just how deeply embedded its hardware is across every tier of AI infrastructure.
Etched is not trying to unseat Nvidia across all of those use cases. Its thesis is narrower and, because of that, potentially more defensible. By giving up flexibility entirely and committing to transformers, Sohu achieves efficiency gains that a general‑purpose GPU cannot match on the specific workload it is designed for. The risk is that this bet on the transformer architecture's longevity proves wrong, either because a genuinely superior architecture emerges and displaces transformers, or because the efficiency gap between specialized and general‑purpose hardware narrows faster than expected.
The competition is not only Nvidia. Cerebras completed its IPO earlier in 2026 and is pursuing a different approach to inference efficiency through wafer‑scale chips. Groq raised 650 million dollars chasing the same latency‑focused inference market. Amazon, Google, and Microsoft all build proprietary inference chips for their cloud platforms. OpenAI announced its own custom chip built with Broadcom. And in December 2025, Nvidia reached a 20 billion dollar licensing agreement with Groq in a move that brought over most of Groq's engineering team.
What distinguishes Etched in this crowded field is the totality of its specialization. Most competitors build chips that are more efficient than Nvidia at inference while still maintaining some degree of architectural flexibility. Etched made no such concession. Sohu is hardwired to transformers, which is either the boldest technical bet in AI hardware right now or the most dangerously narrow one depending on how the architecture landscape evolves over the next several years.
What the Billion in Contracts Actually Means
The 1 billion dollar contract figure is the headline that has drawn the most attention, and it deserves some context. These are forward contracts, meaning customers have committed to purchase Etched systems that have not yet been delivered. The commitment reflects genuine demand from buyers who have done their own analysis of the efficiency economics and concluded the potential upside is worth the execution risk of buying from a startup shipping a new chip architecture.
That execution risk is real. The semiconductor industry has seen a long history of startups producing impressive benchmark results and then struggling with the gap between demonstrated lab performance and what is achievable at scale in real production environments. Chip yield, software compatibility, customer integration complexity, and the supply chain demands of rack‑scale systems all represent points where promising hardware can fall short of its projections.
Etched's answer to that risk, beyond the TSMC validation of its silicon and the early customer testing underway, is the quality of the investors who have already stress‑tested the thesis. Jane Street is one of the most compute‑intensive trading firms in the world and is not known for investing in companies it has not evaluated thoroughly from an engineering perspective. Geoffrey Hinton and Fei‑Fei Li bring academic and research credibility that is difficult to manufacture. The presence of both financial and technical validators on the cap table does not guarantee that the chips perform as advertised, but it does suggest that serious people have looked closely at the claims and found them credible enough to back with substantial capital.
The summer 2026 shipment window is now the variable that matters most for Etched's short‑term trajectory. If the first rack‑scale systems reach customers and perform as promised, the company's path from a seed‑stage startup that was nearly dead in 2023 to one of the best‑funded hardware companies in AI will be among the more remarkable comebacks the technology industry has produced in recent years.





