Technology & Science

Microsoft Launches Maia 200 Inference Accelerator, Opens SDK to External Users

On 26 January 2026, Microsoft released its second-generation Maia 200 AI chip into production data centres and invited third-party developers to build on it, marking the company's first broad rollout of in-house silicon for large-scale inference.

By Priya Castellano

Focusing Facts

  1. Maia 200 contains roughly 144 billion transistors and delivers 10 petaFLOPS at FP4 (5 petaFLOPS at FP8), backed by 216 GB of HBM3e memory supplying 7 TB/s of bandwidth.
  2. Microsoft says a single Maia 200 node draws 750 W and cuts inference costs by about 30% compared with competing AI chips, including Nvidia's Blackwell B200.
  3. A single cluster can link up to 6,144 Maia 200 chips via an on-die Ethernet NIC offering 2.8 TB/s of bidirectional bandwidth, yielding roughly 61 exaFLOPS of aggregate FP4 compute (see the sketch after this list).
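
A quick back-of-the-envelope sketch, using only the headline figures above, shows where the ~61 exaFLOPS number comes from: the peak FP4 throughput per chip multiplied by the maximum cluster size. Both constants are taken from the article; nothing else is assumed.

```python
# Sanity-check the aggregate-compute claim in Focusing Fact 3.
# Both constants come straight from the article's headline figures.
PFLOPS_PER_CHIP_FP4 = 10      # peak petaFLOPS per Maia 200 at FP4
CHIPS_PER_CLUSTER = 6_144     # maximum chips linked into one cluster

aggregate_pflops = PFLOPS_PER_CHIP_FP4 * CHIPS_PER_CLUSTER
aggregate_eflops = aggregate_pflops / 1_000   # 1 exaFLOPS = 1,000 petaFLOPS

print(f"{aggregate_eflops:.2f} exaFLOPS")     # 61.44 exaFLOPS, i.e. the ~61 EFLOPS claimed
```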

Perspectives in this article

  • Microsoft corporate communications and affiliated outlets
  • Skeptical tech trade press
  • Business and financial news outlets