From air cooling to liquid cooling, AI drives industrial innovation
Electronic devices generate heat because part of the electrical energy they consume is converted into thermal energy. Heat dissipation solutions address thermal management in high-performance computing devices: by removing heat directly from the surface of chips or processors, they optimize device performance and extend service life. As chip power consumption has risen, heat dissipation technology has evolved from the linear heat spreading of one-dimensional heat pipes, to the planar heat spreading of two-dimensional vapor chambers (VC), to the integrated heat spreading of three-dimensional VC (3D VC), and finally to liquid cooling.

3D VC offers efficient cooling, uniform temperature distribution, and fewer hotspots. It addresses the heat dissipation bottleneck of high-power devices and the need for heat spreading in areas of high heat flux density, and it also supports stronger overclocking performance and system stability after overclocking. In a conventional design, heat is conducted between multiple separately assembled heat pipes and vapor chambers, which adds contact thermal resistance on top of the thermal resistance of the copper itself. A 3D VC instead connects these parts into a single three-dimensional cavity, where the phase change of the internal working fluid and vapor diffusion carry chip heat directly and efficiently to the far end of the fins for dissipation.
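
The effect of removing contact resistances can be illustrated with a simple series thermal-resistance model. The sketch below is only an illustration: the function name and every resistance, power, and ambient value are hypothetical and do not come from measurements of any real heat pipe or 3D VC product.

```python
def chip_temperature_c(power_w, ambient_c, resistances_c_per_w):
    """Steady-state chip temperature for thermal resistances in series."""
    total_resistance = sum(resistances_c_per_w)
    return ambient_c + power_w * total_resistance


# Hypothetical resistances in degC/W, chosen only for illustration.
# Assembled path: contact, copper conduction, contact, fins to air.
assembled_path = [0.02, 0.03, 0.02, 0.05]
# Integrated 3D VC path: one connected chamber, then fins to air.
integrated_path = [0.02, 0.05]

t_assembled = chip_temperature_c(500, 35, assembled_path)
t_integrated = chip_temperature_c(500, 35, integrated_path)
print(f"assembled: {t_assembled:.1f} degC, integrated: {t_integrated:.1f} degC")
```

Under these assumed numbers, the shorter resistance chain lowers the chip temperature at the same power, which is the qualitative advantage described above.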

Cooling technology falls into two types: air cooling and liquid cooling. Within air cooling, heat pipes and VC have relatively low heat dissipation capacity, while 3D VC extends the upper limit to 1000 W; all of them still require a fan. Air cooling is simple, inexpensive, and suitable for most devices. Liquid cooling offers higher cooling efficiency and comes in two forms: cold plate and immersion. Cold plate cooling is an indirect method with moderate initial investment, lower operation and maintenance costs, and relatively mature technology; the NVIDIA GB200 NVL72 adopts a cold plate liquid cooling solution. Immersion cooling is a direct method with high technical requirements and high operation and maintenance costs.

Training and deploying large AI models demands more computing power from chips and raises the power consumption of each chip. Chip temperature affects performance: when the operating temperature approaches 70-80 ℃, every further 2 ℃ increase reduces chip performance by about 10%. Higher per-chip power consumption therefore increases the demand for heat dissipation. Three further factors favor liquid cooling. First, the NVIDIA B200 consumes more than 1000 W, which is close to the upper limit of air cooling. Second, policies such as "dual carbon" (carbon peaking and carbon neutrality) and "East Data, West Computing" impose strict PUE (power usage effectiveness) requirements on data centers, and the average PUE of liquid cooling is lower than that of air cooling. Third, in terms of total cost of ownership (TCO), the initial investment for cold plate liquid cooling is close to that of air cooling, while its subsequent operating cost is lower.
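
The temperature rule of thumb and the PUE metric can be made concrete with a short sketch. Several points here are assumptions rather than statements from the text: the 70 ℃ threshold is taken from the lower end of the quoted range, the 10% loss is treated as compounding per 2 ℃ step, and the function names are hypothetical. PUE follows its standard definition, total facility energy divided by IT equipment energy.

```python
def relative_performance(temp_c, threshold_c=70.0, step_c=2.0, loss_per_step=0.10):
    """Fraction of nominal performance at temp_c.

    Assumes the ~10% loss per 2 degC compounds above the threshold.
    """
    if temp_c <= threshold_c:
        return 1.0
    steps = (temp_c - threshold_c) / step_c
    return (1.0 - loss_per_step) ** steps


def pue(total_facility_kwh, it_equipment_kwh):
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


for temp in (70, 72, 74, 80):
    print(f"{temp} degC -> {relative_performance(temp):.2f} of nominal performance")

# Illustrative energy figures only, not taken from any real data center.
print(f"PUE = {pue(1200.0, 1000.0):.2f}")
```

A PUE closer to 1 means less energy is spent on cooling and other overhead, which is why a lower average PUE counts in favor of liquid cooling.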

Single-phase immersion liquid-cooled cabinet: the liquid-cooled servers sit inside a tank, and the tank is connected to the CDU (coolant distribution unit) by pipelines. The lower pipeline delivers low-temperature coolant into the tank, where it absorbs heat from the servers. The warmed coolant then flows back to the CDU, which carries the heat away. This structure achieves full liquid cooling of the server, and the fanless design yields higher power density and lower PUE than air cooling. However, the technical difficulty is high and the penetration rate is still relatively low.
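
The coolant loop between the tank and the CDU obeys a simple energy balance, Q = m × cp × ΔT, so the flow the CDU must deliver can be estimated from the heat load. The sketch below uses hypothetical values: the 50 kW tank load, the specific heat of 2000 J/(kg·K), and the 10 K temperature rise are illustrative and do not describe any particular coolant or product.

```python
def required_mass_flow_kg_s(heat_load_w, specific_heat_j_per_kg_k, delta_t_k):
    """Coolant mass flow needed to carry heat_load_w at a given temperature rise."""
    if specific_heat_j_per_kg_k <= 0 or delta_t_k <= 0:
        raise ValueError("specific heat and temperature rise must be positive")
    return heat_load_w / (specific_heat_j_per_kg_k * delta_t_k)


# Hypothetical tank: 50 kW of servers, coolant warms by 10 K across the tank.
flow = required_mass_flow_kg_s(
    heat_load_w=50_000,
    specific_heat_j_per_kg_k=2000,  # assumed dielectric coolant property
    delta_t_k=10,
)
print(f"required coolant flow: {flow:.2f} kg/s")
```

Allowing a larger temperature rise across the tank reduces the required flow, at the cost of warmer coolant around the servers.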

Two-phase immersion: this approach has high technical requirements but can significantly increase system power density. Because the main chip in a server has high power, its surface needs an enhanced-boiling treatment that increases the number of nucleation sites, improving phase-change heat transfer efficiency and allowing a maximum heat dissipation density of over 100 W/cm².
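
The heat dissipation density figure translates into chip power through the boiling surface area. The following sketch is a rough calculation only: the function names and the 8 cm² area are hypothetical, and the 100 W/cm² value is simply the density quoted above.

```python
def max_power_w(heat_flux_w_per_cm2, boiling_area_cm2):
    """Heat a boiling surface can remove at a given heat flux density."""
    return heat_flux_w_per_cm2 * boiling_area_cm2


def min_area_cm2(power_w, heat_flux_w_per_cm2):
    """Smallest boiling surface that can remove power_w at the given flux."""
    if heat_flux_w_per_cm2 <= 0:
        raise ValueError("heat flux density must be positive")
    return power_w / heat_flux_w_per_cm2


# Hypothetical 8 cm^2 treated surface at the quoted 100 W/cm^2.
print(f"max power: {max_power_w(100, 8):.0f} W")
# Area needed for a 1000 W chip at the same density.
print(f"min area: {min_area_cm2(1000, 100):.1f} cm^2")
```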

Driven by the growth of AI computing power and by PUE policy requirements, cooling technology must keep upgrading to control the operating temperature of electronic devices. Chip-level heat dissipation will shift from heat pipes and VC to more efficient 3D VC and cold plate solutions, driving continuous innovation in chip cooling technology.






