Anthropic in talks to use Microsoft’s custom AI chips for inference workloads
The Claude maker is diversifying its silicon diet beyond NVIDIA, with Microsoft's in-house chips entering the conversation alongside a $30B Azure commitment.
Anthropic, the AI safety company behind Claude, is in active negotiations with Microsoft to deploy the tech giant’s custom-designed AI chips for inference workloads. The talks signal a broader strategic shift in how leading AI labs think about the hardware powering their models, and it has ripple effects that extend well beyond Silicon Valley.
The discussions were first reported by The Information, and they come on the heels of a sweeping partnership between Anthropic, Microsoft, and NVIDIA that already includes a staggering $30B commitment from Anthropic to purchase Azure computing resources.
What the deal looks like
Here’s the thing about running large language models: training them is expensive, but inference, the part where the model actually responds to user queries, is where the ongoing costs pile up. Every time you ask Claude to help draft an email or summarize a research paper, that’s inference. And at scale, those costs add up fast.
Anthropic’s interest in Microsoft’s in-house silicon is squarely focused on that inference side of the equation. The goal is straightforward: drive down the per-query cost of running Claude across millions of users.
This doesn’t mean Anthropic is abandoning NVIDIA. Far from it. The company’s compute infrastructure will continue to lean heavily on NVIDIA’s Grace Blackwell and Vera Rubin systems, with both companies co-optimizing models to run efficiently on that hardware. But relying on a single chip supplier, even one as dominant as NVIDIA, creates concentration risk. Anthropic appears to be hedging.
Microsoft CEO Satya Nadella has framed the relationship as increasingly symbiotic. The two companies will “increasingly serve each other as customers,” with Claude being distributed through Microsoft’s product ecosystem. In English: Microsoft gets access to one of the most capable AI models on the market, and Anthropic gets preferential access to Azure’s massive cloud infrastructure and now, potentially, its custom silicon.
The silicon diversification play
Anthropic isn’t just looking at Microsoft’s chips. The company has also been exploring AI inference chips from Fractile, a UK-based startup building specialized hardware for exactly these kinds of workloads. The pattern is clear: Anthropic wants options.
This mirrors what other hyperscalers have been doing for years. Google has its TPUs. Amazon has its Trainium and Inferentia chips. Microsoft has been developing its own AI accelerators, including the Maia 100 series, specifically to reduce its own dependence on NVIDIA’s GPUs and offer customers a cost-competitive alternative.
For Anthropic, the calculus is simple. NVIDIA’s top-tier GPUs are powerful but expensive, and supply has been tight for years. Every major AI lab, cloud provider, and sovereign AI initiative is competing for the same chips. By adding Microsoft’s custom silicon and potentially Fractile’s hardware to its toolkit, Anthropic can negotiate from a stronger position and build redundancy into its infrastructure.
Look, this is a company that just committed $30B to Azure. That’s not a casual purchase order. It’s the kind of number that buys you a seat at the table when custom chip allocations are being discussed.
Why crypto and Web3 should pay attention
On the surface, a deal between two AI companies over chip procurement has nothing to do with crypto. But the second-order effects matter.
The AI infrastructure wars directly impact the economics of decentralized computing networks. Projects like Render, Akash, and io.net have built their value propositions around offering cheaper, more accessible GPU compute as an alternative to centralized cloud providers. If Microsoft and Anthropic successfully drive down inference costs through custom silicon, the cost advantage that decentralized GPU marketplaces offer gets squeezed.
That’s not a death sentence for those projects, but it does raise the bar. Decentralized compute networks will need to compete not just on price against NVIDIA GPUs rented through traditional cloud providers, but against purpose-built inference chips that major players are designing specifically to undercut GPU costs.
There’s also the Azure angle. Microsoft Azure has become a critical infrastructure layer for a growing number of Web3 projects and AI-crypto hybrids. Anthropic’s deepening integration with Azure, distributing Claude through Microsoft’s solutions and running on Azure compute, reinforces the platform’s dominance as the default backend for enterprise AI workloads. Any Web3 project building on or integrating with Azure-hosted AI models should be watching how this partnership evolves.
For investors in AI-adjacent crypto tokens, the key metric to track is inference cost trends. If custom chips from Microsoft, Google, and Amazon continue to push inference costs lower, the tokenomics of projects that price their services based on GPU compute margins will need to adjust. The projects that survive will be those offering something centralized providers can’t: censorship resistance, permissionless access, or geographic distribution that cloud giants won’t serve.
The broader takeaway for anyone at the intersection of AI and crypto is that the hardware layer is consolidating around a small number of very large players. Anthropic’s chip diversification strategy is rational, but it’s diversifying within the walled gardens of Big Tech. That dynamic creates both risk and opportunity for decentralized alternatives, depending entirely on whether those alternatives can find use cases that the Microsofts of the world aren’t interested in serving.
Earn with Nexo