Fractile Raised $220 Million to Solve the Problem Nobody Talks About: AI Models Take a Month to Think. It Plans to Compress That to a Day.

There is a moment in the history of every transformative technology when the constraint shifts. In the early internet era, the constraint was bandwidth: getting data from server to browser was slow, and the entire user experience was shaped by that single bottleneck. The industry solved it with fiber and CDNs and compression algorithms, and the apps that became possible afterward could not have been imagined by the people navigating dial‑up connections.
AI is approaching a similar inflection. The current constraint is not model quality. The foundational models that exist today are remarkable. The constraint is how long it takes a model to generate useful output, and what it costs to produce that output at production scale.
Fractile, the London and Bristol‑based AI chip startup founded in 2022 by Oxford‑trained engineer Walter Goodwin and co‑founder Yuhang Song, has raised $220 million to address this specific problem. The Series B round was co‑led by Accel, Factorial Funds, and Peter Thiel's Founders Fund, with participation from Conviction, Gigascale, O1A, Felicis, Buckley Ventures, and 8VC. Gigascale is a fund backed by former Meta chief technology officer Mike Schroepfer. Former Intel CEO Pat Gelsinger, who invested in Fractile in January 2025, joined the Series B as an angel investor and operating adviser.
The UK government responded to the announcement with characteristic official enthusiasm. AI Minister Kanishka Narayan called the deal "a strong vote of confidence in British AI," adding that it shows "UK companies at the cutting edge are pulling in global investment while anchoring high value jobs and expertise here at home." The UK's national AI ambitions have been a recurring policy theme, and a British chip startup raising $220 million from American tier‑one venture capital is that ambition materializing in its most commercial form.
The 40‑Tokens‑Per‑Second Problem
Goodwin's framing of why Fractile exists is precise and mathematically specific. Current frontier AI models run at approximately 40 tokens per second on standard GPU hardware. A single token is roughly three‑quarters of a word. At 40 tokens per second, generating one million tokens takes approximately seven hours. Generating ten million tokens takes three days. Some advanced AI workloads, including complex multi‑step reasoning tasks and long‑context document analysis, already require tens of millions of tokens to produce useful outputs. At current hardware speeds, those workloads take weeks or months to complete.
As Goodwin wrote in the announcement post: "At the roughly 40 tokens per second at which these models tend to run on existing chips, a single output of this length takes a month to complete. The technical and economic limits on inference speed, above all from memory bandwidth that has failed to scale on current architectures, are what is constraining progress."
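The arithmetic is easy to verify. Here is a minimal sketch (Python, using only the figures quoted above) that reproduces the timescales:

```python
# Wall-clock time to generate N tokens at a fixed decode rate.
# The 40 tokens/second figure is the article's; the token counts match
# the workloads it describes.

def generation_time_hours(num_tokens: int, tokens_per_second: float) -> float:
    """Hours of wall-clock time to emit num_tokens at a steady rate."""
    return num_tokens / tokens_per_second / 3600

for n in (1_000_000, 10_000_000, 100_000_000):
    hours = generation_time_hours(n, tokens_per_second=40)
    print(f"{n:>11,} tokens: {hours:6.1f} h  (~{hours / 24:.1f} days)")

# Output:
#   1,000,000 tokens:    6.9 h  (~0.3 days)
#  10,000,000 tokens:   69.4 h  (~2.9 days)
# 100,000,000 tokens:  694.4 h  (~28.9 days)
```

Run the division the other way and a month of continuous generation at 40 tokens per second works out to roughly 100 million tokens, which suggests that is the output length Goodwin's "takes a month to complete" refers to.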
The root cause of this bottleneck is architectural. Conventional AI accelerators, including Nvidia's H100, H200, and Blackwell GPUs, store the parameters of AI models in high‑bandwidth memory chips physically separated from the processor. Every time the chip performs a computation, it must read data from memory across a physical connection that has a fixed maximum bandwidth. As models have grown larger and context windows have expanded, the amount of data that must be transferred per computation has grown faster than memory bandwidth has improved. The chip is fast. The wire between the chip and the memory is the binding constraint.
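The bandwidth constraint can be made concrete with a back‑of‑envelope roofline estimate: at small batch sizes, every model weight must be read from memory once per generated token, so the memory bus alone caps throughput. The figures below, a 70‑billion‑parameter model in 16‑bit precision on a GPU with roughly 3.35 TB/s of high‑bandwidth memory, are illustrative assumptions, not numbers from the announcement:

```python
# Back-of-envelope estimate of memory-bandwidth-bound decode speed.
# Assumed, illustrative figures -- not Fractile's or Nvidia's published specs.

params = 70e9               # parameters in a hypothetical frontier-scale model
bytes_per_param = 2         # FP16/BF16 weights
hbm_bytes_per_s = 3.35e12   # ~3.35 TB/s, in the range of a modern HBM3 GPU

weight_bytes = params * bytes_per_param              # ~140 GB of weights
seconds_per_token = weight_bytes / hbm_bytes_per_s   # all weights read per token
print(f"upper bound: ~{1 / seconds_per_token:.0f} tokens/s")   # ~24 tokens/s
```

No amount of extra arithmetic throughput lifts that ceiling; only moving the weights closer to the compute does.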
Fractile's solution is to eliminate that wire by performing computations directly inside the memory cells. Its in‑memory compute architecture performs matrix multiplications within on‑chip SRAM, co‑located with the compute logic, removing most of the dependence on external DRAM that currently drives inference cost. The result, according to Fractile's benchmarks: chips that run frontier models 25 to 100 times faster than current GPU setups, at approximately one‑tenth the cost per token.
The company's target is 1,200 tokens per second, a 30‑fold increase over today's roughly 40, which would compress workloads that currently take a month into a single day. Whether those numbers hold under production conditions is the key technical question the $220 million is designed to answer by building and delivering the first chips.
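The compression figure is simple to check, again using only the article's numbers:

```python
# The compression claim in one line: 1,200 vs. 40 tokens/second is a 30x
# speedup, so a 30-day generation job shrinks to about one day.
speedup = 1_200 / 40                      # Fractile's target vs. today's rate
print(f"{speedup:.0f}x -> {30 / speedup:.0f} day(s) for a month-long workload")
```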
The Anthropic Signal
The most commercially significant detail in this week's coverage of Fractile came not from the funding announcement but from a separate report. Earlier in May, The Information reported that Anthropic had held discussions with Fractile regarding the purchase of the startup's inference chips when the hardware becomes available in 2027.
If that customer relationship materializes, it would be commercially extraordinary: Fractile's most important potential customer for inference chips is the same company whose recent $1 trillion secondary market valuation and $2.1 billion drug discovery partnership with Isomorphic Labs are dominating the same week's AI headlines. Anthropic pays hundreds of millions of dollars annually for the compute that generates its Claude models' responses. A chip that delivers inference at one‑tenth the cost per token is commercially compelling for any buyer at that scale. The talks are early and unconfirmed, but the direction they point is clear.
The $220 Million Deployment Plan
Fractile announced in February 2026 that it would invest £100 million, approximately $135 million, to bolster UK operations over three years, expanding its London and Bristol sites and creating a new hardware engineering facility in Bristol. The Series B funds accelerate that commitment alongside the development and commercialization of its first silicon chips and compute systems for enterprise customers.
The company is currently hiring across London, Bristol, San Francisco, and Taipei, a geographic footprint that reflects both its UK‑first identity and the global talent and manufacturing ecosystem required to produce competitive semiconductor hardware. The Taipei office places Fractile in Taiwan, the hub of the world's most advanced contract chip manufacturing, suggesting its production plans involve TSMC or comparable foundries.
The competitive landscape Fractile enters is not empty. Cerebras, approaching its Nasdaq IPO with $510 million in revenue and a $24.6 billion contracted backlog, has demonstrated that wafer‑scale chip designs can reach commercial deployment with hyperscaler customers. Groq has deployed inference hardware that has won developer adoption through speed and predictable performance. AMD and Intel are each investing in inference optimization. Nvidia's Blackwell architecture includes its own inference efficiency improvements.
Fractile's differentiation is architectural in a way none of these competitors fully replicates: computing in memory rather than adjacent to memory. Whether that approach delivers the claimed performance gains at production scale, and whether total cost of ownership (the chip, the power, and the supporting infrastructure) comes in at the fraction of Nvidia's equivalent costs that Fractile's benchmarks suggest, is the technical bet the $220 million is funding.
More at fractile.ai





