
Huawei Unveils CloudMatrix 384 to Challenge Nvidia’s GB200 NVL72 in AI Infrastructure Race

Huawei Debuts CloudMatrix 384 at WAIC 2025

Huawei officially launched its CloudMatrix 384, a rack-scale AI computing system built to compete with Nvidia’s GB200 NVL72, at the World Artificial Intelligence Conference (WAIC) in Shanghai on July 26, 2025. The unveiling drew wide attention from market observers and represents the company’s most aggressive step yet in the international AI infrastructure race.

The architecture combines 384 Ascend 910C processors to deliver up to 300 PFLOPs of dense BF16 compute, roughly 1.7 times the throughput of Nvidia’s GB200 NVL72, which uses 72 Blackwell GPUs to produce approximately 180 PFLOPs.
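
As a quick sanity check on those headline figures, the arithmetic below is a rough sketch using only the numbers quoted above; the per-chip values are derived from the totals, not official specs:

# Back-of-the-envelope comparison using the article's quoted dense BF16 totals.
cloudmatrix_chips = 384        # Ascend 910C processors
cloudmatrix_pflops = 300       # quoted dense BF16 total

nvl72_chips = 72               # Blackwell GPUs
nvl72_pflops = 180             # quoted dense BF16 total

print(cloudmatrix_pflops / nvl72_pflops)        # ~1.67x system-level throughput
print(cloudmatrix_pflops / cloudmatrix_chips)   # ~0.78 PFLOPs per Ascend 910C (derived)
print(nvl72_pflops / nvl72_chips)               # 2.5 PFLOPs per Blackwell GPU (derived)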


Specs and Performance: Brute Force Meets Design Innovation

Although each Ascend 910C chip delivers a little over 60% of the inference performance of Nvidia’s H100, Huawei compensates through system design. As SemiAnalysis calculates, the CloudMatrix 384’s performance comes from horizontal scaling: applying more chips and better interconnects to work around each chip’s individual limitations.

The platform occupies 16 racks, comprising 12 compute racks and 4 network racks, and contains 6,912 800G optical transceivers. This enables a supernode architecture with all-to-all mesh connectivity, delivering 5.3x the scale-out bandwidth and 2.1x the memory bandwidth of Nvidia’s NVL72.
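
To put the transceiver count in perspective, a naive upper bound on raw optical capacity can be computed as below; this ignores topology, redundancy, and protocol overhead, so it is an illustration rather than a usable-bandwidth figure:

# Naive aggregate of the quoted optics: 6,912 transceivers at 800 Gbps each.
transceivers = 6_912
gbps_each = 800

total_pbps = transceivers * gbps_each / 1_000_000   # petabits per second
print(total_pbps)   # ~5.5 Pbps of raw optical capacity across the 16 racks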

Supernode Architecture and Network Breakthroughs

Huawei’s supernode architecture is a departure from traditional Von Neumann models. It uses peer-to-peer design and optical interconnects to reduce latency and boost throughput. Latency drops from 2 microseconds to just 200 nanoseconds, while bandwidth increases 15-fold, enabling faster token generation for large language models like DeepSeek and Qwen.
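
A simple latency-plus-bandwidth (alpha-beta) transfer model illustrates why those interconnect numbers matter for token generation. The 64 KiB message size and 50 GB/s baseline bandwidth below are illustrative assumptions, not published specs, while the latency and 15-fold figures are those quoted above:

# Alpha-beta model: transfer time = latency + message_size / bandwidth.
def transfer_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    return latency_s + size_bytes / bandwidth_bytes_per_s

msg = 64 * 1024                   # 64 KiB activation slice (illustrative)
baseline_bw = 50e9                # assumed 50 GB/s baseline link
improved_bw = baseline_bw * 15    # the claimed 15-fold bandwidth increase

before = transfer_time(msg, 2e-6, baseline_bw)     # 2 microsecond latency
after = transfer_time(msg, 200e-9, improved_bw)    # 200 nanosecond latency
print(before / after)   # roughly 11x faster for small, frequent transfers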

The architecture also supports expert parallelism, allowing each chip to host a unique expert model, which is particularly beneficial for Mixture-of-Experts (MoE) workloads.
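
As a minimal sketch of the idea (purely illustrative, not Huawei’s actual scheduling code), expert parallelism routes each token to the device that holds its selected expert:

# Toy expert-parallel routing: one device per expert, tokens sent to the
# device that owns the expert the gating function picks for them.
NUM_DEVICES = 8    # stand-in for chips, each hosting one expert

def route(tokens):
    per_device = {d: [] for d in range(NUM_DEVICES)}
    for tok in tokens:
        expert = hash(tok) % NUM_DEVICES   # stand-in for a learned gating network
        per_device[expert].append(tok)     # token travels to that expert's device
    return per_device

print(route([f"token_{i}" for i in range(16)]))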

Strategic Push for Chip Independence

Huawei’s CloudMatrix 384 is more than a technical achievement—it’s a strategic response to U.S. export restrictions. Since being placed on the U.S. Entity List, Huawei has faced severe limitations in accessing advanced chips and manufacturing tools.

The Ascend 910C chips are manufactured on SMIC’s N+2 7nm-class process, with some components reportedly obtained through workaround channels involving third-party companies. Despite these constraints, Huawei has still managed to ship hundreds of thousands of Ascend chips to the domestic market.

The system’s launch comes against the backdrop of China’s long-term push to reduce its reliance on U.S. technology and build an independent AI ecosystem, especially given that Nvidia’s H100 and GB200 chips remain shut out of China.

Market Context: Nvidia’s Dominance and the Challenge for Huawei

Nvidia remains the global leader in AI infrastructure, with its GB200 NVL72 offering superior power efficiency, software ecosystem, and liquid cooling. The NVL72 consumes 145 kW, while Huawei’s CloudMatrix 384 draws 559 kW, making it 2.3x less power-efficient per FLOP.
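
That efficiency figure follows directly from the quoted numbers; a quick check of watts per PFLOP:

# Watts per PFLOP of dense BF16 compute, from the quoted power and throughput.
cm_watts, cm_pflops = 559_000, 300      # CloudMatrix 384
nvl_watts, nvl_pflops = 145_000, 180    # GB200 NVL72

cm_w_per_pflop = cm_watts / cm_pflops       # ~1,863 W per PFLOP
nvl_w_per_pflop = nvl_watts / nvl_pflops    # ~806 W per PFLOP
print(cm_w_per_pflop / nvl_w_per_pflop)     # ~2.3x, matching the article's figure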

Moreover, Nvidia’s CUDA ecosystem and widespread adoption in Western markets give it a significant edge. Huawei’s CANN framework is still maturing, and its software compatibility remains a challenge for global deployment. However, in China, where energy is cheaper and access to Nvidia chips is limited, Huawei’s brute-force approach may be more viable.

Implications for Global AI Infrastructure

Huawei’s CloudMatrix 384 marks a turning point in global AI infrastructure dynamics. It presents an attractive alternative for nations under U.S. export restrictions or seeking supply chain diversification, and it has already been deployed in Anhui and Inner Mongolia in China.

The system has limitations, however. Its high power consumption, high cost (estimated at $8 million per unit), and software ecosystem gaps may stifle adoption outside of China.

Still, Huawei’s entry into the high-performance AI computing space adds pressure on Nvidia and reshapes the geopolitical tech competition, especially as AI becomes central to national strategies.
