Technology & Science
Microsoft Launches Maia 200 Inference Accelerator, Opens SDK to External Users
On 26 Jan 2026, Microsoft released its second-generation Maia 200 AI chip into production data centres and invited third-party developers to build on it, marking the company's first broad rollout of in-house silicon for large-scale inference.
Focusing Facts
- Maia 200 contains roughly 144 billion transistors and delivers 10 petaFLOPS at FP4 (5 petaFLOPS at FP8), backed by 216 GB of HBM3e memory with 7 TB/s of bandwidth.
- Microsoft says a single Maia 200 node draws 750 W and cuts inference costs by 30% compared with any competing AI chip, including Nvidia's Blackwell B200.
- Clusters can link up to 6,144 Maia 200s via an on-die Ethernet NIC offering 2.8 TB/s of bidirectional bandwidth, yielding roughly 61 exaFLOPS of aggregate compute (see the arithmetic check after this list).
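The 61-exaFLOPS figure is consistent with the per-chip FP4 peak quoted above; a minimal back-of-the-envelope check, assuming every chip contributes its full 10 petaFLOPS with no interconnect or scaling losses:

```latex
% Aggregate peak compute, assuming full per-chip FP4 throughput
% and ideal (lossless) scaling across all 6,144 chips:
6{,}144 \;\text{chips} \times 10 \;\tfrac{\text{PFLOPS (FP4)}}{\text{chip}}
  = 61{,}440 \;\text{PFLOPS} \approx 61.4 \;\text{EFLOPS}
```

Real-world utilisation would land below this theoretical peak; the headline number is simply the FP4 peak multiplied across the maximum cluster size.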
Perspectives in this article
- Microsoft corporate communications and affiliated outlets
- Skeptical tech trade press
- Business and financial news outlets