
Fenghua No.3: China’s Bold Bid to Build a CUDA-Class GPU
Introduction
On September 22, 2025, Innosilicon unveiled the Fenghua No.3 GPU in Zhuhai, marking a significant leap in China’s quest for self-sufficiency in advanced GPU technology. This launch aims to provide a competitive alternative to NVIDIA and AMD’s GPUs, targeting applications in artificial intelligence (AI), high-performance computing (HPC), visualization, and medical imaging.
Key Features of Fenghua No.3
Architecture
The Fenghua No.3 features a hybrid architecture that integrates a RISC-V CPU core with a “full-function GPU.” This design enables it to handle traditional graphics as well as modern AI computing workloads effectively.
Software Compatibility
One of Innosilicon’s standout claims is compatibility with a “CUDA-class framework and Triton operator.” This suggests the GPU is built to work with popular frameworks and APIs such as PyTorch and OpenCL. The actual level of compatibility remains to be seen, however. It may resemble AMD’s ROCm/HIP approach, which provides a CUDA-like API and porting tools, rather than full CUDA support.
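To make the ROCm/HIP comparison concrete, here is a toy Python sketch of source-level renaming, the general style of porting associated with HIP. The CUDA and HIP function names in the table are real, but the `port_source` helper and the sample snippet are hypothetical illustrations. Nothing here reflects how Innosilicon's stack works, since no SDK has been published.

```python
import re

# A handful of real CUDA runtime names and their HIP counterparts.
# A real porting tool covers thousands of symbols; this table is a toy.
CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

# Match whole identifiers only, so "cudaMemcpy" does not fire inside
# "cudaMemcpyHostToDevice".
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, CUDA_TO_HIP)) + r")\b")


def port_source(cuda_source: str) -> str:
    """Hypothetical helper: rename known CUDA identifiers to HIP ones."""
    return _PATTERN.sub(lambda m: CUDA_TO_HIP[m.group(1)], cuda_source)


snippet = (
    "cudaMalloc(&buf, n); "
    "cudaMemcpy(buf, host, n, cudaMemcpyHostToDevice); "
    "cudaFree(buf);"
)
print(port_source(snippet))
```

Renaming calls is the easy part. A "CUDA-class" stack also needs a compiler, runtime, and tuned libraries behind those names, which is where compatibility claims are usually tested.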
Memory Specifications
Perhaps the most eye-catching claim is more than 112GB of HBM per card. On paper, that capacity positions the card for training or running inference on models with 32 billion to 72 billion parameters on a single GPU. Users could also potentially scale to models with 671 billion to 685 billion parameters across eight GPUs in a single system.
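A quick back-of-the-envelope check helps put these figures in context. The sketch below estimates weight-only memory at a few common precisions. The 112GB figure is treated as a decimal-gigabyte lower bound, and the precision table and helper names are assumptions for illustration. It ignores KV cache, activations, and optimizer state, so real requirements are higher, especially for training.

```python
# Weight-only estimate: parameters x bytes per parameter.
# Assumptions: decimal gigabytes, 112 GB per card (the stated minimum).
GB = 10**9
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_gb(params_billion: float, dtype: str) -> float:
    """Gigabytes needed to hold the model weights alone."""
    return params_billion * 1e9 * BYTES_PER_PARAM[dtype] / GB


def fits(params_billion: float, dtype: str, cards: int = 1,
         gb_per_card: float = 112.0) -> bool:
    """True if the weights alone fit in the combined card memory."""
    return weight_gb(params_billion, dtype) <= cards * gb_per_card


for size, cards in [(32, 1), (72, 1), (671, 8), (685, 8)]:
    for dtype in BYTES_PER_PARAM:
        need = weight_gb(size, dtype)
        verdict = "fits" if fits(size, dtype, cards) else "does not fit"
        print(f"{size}B @ {dtype}: {need:.1f} GB, {verdict} in {cards} x 112 GB")
```

Under these assumptions, a 32-billion-parameter model fits on one card at 16-bit precision, while the 72-billion and 671 billion to 685 billion cases only fit at 8-bit precision or lower.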
Precision Support
The GPU is designed to support single, double, and mixed-precision math, which is critical for demanding AI and HPC applications that require flexibility and precision in processing.
Special Features
Fenghua No.3 goes beyond general computing; it includes native DICOM support tailored for healthcare applications, enabling medical-grade grayscale imaging. The GPU is also capable of processing YUV444 video and includes hardware ray tracing capabilities, enhancing its versatility.
Display Capabilities
The GPU can drive up to six 8K displays simultaneously at a refresh rate of 30Hz, making it suitable for high-resolution visualization applications.
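For a sense of scale, the raw pixel bandwidth implied by that claim can be estimated as follows. This is a rough sketch that assumes 8K means 7680×4320 and 24 bits per pixel, and it ignores blanking intervals and any display stream compression. None of these details come from Innosilicon.

```python
def raw_gbit_per_s(width: int, height: int, hz: int,
                   bits_per_pixel: int = 24) -> float:
    """Uncompressed pixel data rate in gigabits per second."""
    return width * height * hz * bits_per_pixel / 1e9


# Assumed 8K UHD resolution at the stated 30Hz refresh rate.
per_display = raw_gbit_per_s(7680, 4320, 30)
total = 6 * per_display

print(f"per display: {per_display:.1f} Gbit/s")
print(f"six displays: {total:.1f} Gbit/s")
```

That works out to roughly 23.9 Gbit/s per display and about 143 Gbit/s across all six outputs, before any link-level overhead.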
Market Context
Innosilicon, an established player in China’s semiconductor scene, previously developed the Fenghua 1 and 2 GPUs and has access to various licensed CPU and GPU IP. The unveiling of Fenghua No.3 comes amid stringent U.S. export controls that restrict China’s access to leading accelerators such as NVIDIA’s A100/H100 and AMD’s MI300. Those restrictions raise the stakes for Fenghua No.3 as a strategic move toward national sovereignty in AI infrastructure.
Chinese media have heavily promoted the GPU’s memory capacity and its CUDA-like compatibility, presenting both as evidence of China’s growing independence in high-performance computing hardware.
Challenges and Concerns
Benchmarking
Despite the ambitious features that Fenghua No.3 promises, no independent benchmark results have been released so far. Key figures such as FLOPS, TOPS, MLPerf scores, and inference benchmarks are missing, which raises doubts about its real-world performance and competitiveness against established players.
Software Ecosystem
While the GPU is marketed as “CUDA-class,” the actual support infrastructure is still unclear. There are currently no public software development kits (SDKs), no GitHub or Gitee repositories, and no downloadable backends for Triton. To gain traction within the developer community, Innosilicon would need to produce robust equivalents of key software libraries such as cuBLAS, cuDNN, and NCCL, and to ship stable drivers for popular frameworks like PyTorch and TensorFlow.
HBM Supply and Packaging
The claim of over 112GB of HBM raises questions regarding sourcing and manufacturing. The production of true HBM3 or HBM3E memory generally requires partnerships with major manufacturers such as Samsung, SK Hynix, or Micron, along with cutting-edge packaging solutions. No specifics have been disclosed about the generation of memory used or the packaging technologies employed.
Ecosystem Maturity
Questions about multi-GPU scalability, driver performance, and how well existing CUDA-based workflows will carry over in the long term remain open. Without a mature ecosystem that can effectively rival NVIDIA’s CUDA stack, the Fenghua No.3 may struggle to win over developers despite its promising features.
Conclusion
The Fenghua No.3 represents a pivotal moment in China’s ongoing push for technological sovereignty. While the specifications and potential applications are impressive, the true measure of its success will depend on independent benchmarks, a supportive software ecosystem, and reliable sourcing of advanced memory. As global competition in AI and high-performance computing heats up, all eyes will be on Fenghua No.3 and whether it can alter the GPU landscape.