The evolution of data center cooling

Jason Matteson, director of product strategy at Iceotope, gives his views on the journey to higher power in the rack

Liquid cooling is fast becoming critical for delivering power and energy efficiencies while responding to increased processing and storage requirements. Iceotope’s Jason Matteson explains what has changed the game.

Today’s data center rack solutions can eclipse 30, 35, or even 45 kilowatts (kW). However, whether you are a colocation provider, an enterprise, or an edge solutions provider, it has been a long journey to delivering higher power within the rack, with a significant upward trajectory in the last couple of years.

I have been in the electronics industry since 1997, with the first 17 years at IBM, within their x86 server division, as a principal cooling architect and engineer for the company’s ultra-dense servers. Following an IBM divestiture in 2014, I spent three years at Lenovo within its R&D group, where I was responsible for identifying and driving disruptive, differentiating technologies.

Certainly, I saw that the move away from air-cooling would be necessary, and it has been a long time coming. As early as 2011, some analysts were already highlighting the advantages of liquid cooling.

When I first started, it was simply about using bigger fans and larger heat sinks for the then sub-30 watt (W) processors. However, by the early 2000s, we were architecting solutions for CPUs of up to approximately 80W. The struggle for power density had begun in earnest.

Most engineers saw this struggle as a facility cooling problem. Instead of filling a 42U rack with 1U servers, they would half-populate it with around 20 1U servers because they were running out of power and cooling. These were thought of as high-density air-cooled solutions. We were continuously challenged to spin the fans faster and faster, even though fan power consumption rises with the cube of fan speed (RPM). We were already pushing up against the practical limits of air-cooling, especially in dense form factors, running into mechanical, power, and acoustic constraints.
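The cube relationship between fan speed and fan power comes from the fan affinity laws, and it is why faster fans quickly stop being a viable answer. A minimal sketch of the arithmetic (the baseline figures below are illustrative assumptions, not measurements from any specific server):

```python
# Fan affinity laws: airflow scales linearly with fan speed,
# while fan power consumption scales with the cube of fan speed.
# Baseline operating point (10 W at 5,000 RPM) is an assumed example.

def fan_power(base_power_w: float, base_rpm: float, new_rpm: float) -> float:
    """Estimate fan power at new_rpm from a known operating point."""
    return base_power_w * (new_rpm / base_rpm) ** 3

# Spinning the example fan 50% faster yields 50% more airflow,
# but costs roughly 3.4x the power: 10 * 1.5**3 = 33.75 W.
print(fan_power(10.0, 5000, 7500))
```

The same arithmetic explains the acoustic and power ceilings mentioned above: each increment of cooling capacity from airflow alone costs disproportionately more energy.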

In addition, cooling algorithms and system design complexity were at the forefront of every server being developed. By the time we moved to Lenovo in 2014, the cooling algorithms that had once taken less than 10 percent of our design and test time were taking most of it. Air-cooling became more complex as densities accelerated.

While at IBM, we had begun developing and delivering liquid-cooled platforms, primarily targeting high-performance computing (HPC) and government-lab supercomputers. However, the industry as a whole was, and to a large extent still is, demanding air-cooled solutions. At the time, liquid cooling meant water within the technical space and required new infrastructure, so it was viewed as disruptive.

Not long after that, energy efficiency became a prominent issue. US government reports began to appear on how data centers were consuming increasing amounts of global energy.

That was the precursor to the paradigm shift.

In addition to the regulatory push for energy savings, new high-performance applications were being developed for HPC or encompassing artificial intelligence (AI), pushing cooling limits at the chip level. Intel, Nvidia, and AMD were launching higher-density CPU and GPU solutions with growing regularity.

The need for high performance compute in all sectors

Today, liquid cooling, once an option, is now essential. Data-driven applications are demanding more high-performance processors at the top end of the market. It is all but certain that CPU power will increase from 150W to more than 400-500W over the next three to five years. Last year, Intel’s chief architect, Raja Koduri, presented a roadmap to increase transistor density by a factor of 50, possibly by 2030. If so, there is an obvious convergence point with industry sustainability and net zero targets.

New applications are the real catalyst in my view, and this is just a taste of what is to come. We are now seeing silicon chip manufacturers specifying CPUs as liquid-cooled only.

All data centers, including colocation facilities, can increase density and efficiency, and the benefits extend across sectors. In banking and finance, if firms can shave a nanosecond off an electronic trade with an AI/ML algorithm, it can save them hundreds of thousands, if not millions, of dollars over a relatively short period. They have always invested in the best and fastest silicon they can get their hands on.

Healthcare is another great example. We’re already seeing AI-powered solutions addressing routine, repetitive and largely administrative tasks on a daily basis. In the next 10 years AI will access multiple sources of data to reveal patterns in disease and aid treatment and care. By then healthcare systems will be able to predict an individual’s risk of certain diseases and suggest preventative measures.

Oil exploration typically operates in harsh conditions, yet is the classic HPC environment, requiring sifting through massive amounts of data as fast as possible. With edge networking supported by advanced processing and maximum-efficiency cooling, these high-performance applications can be located on the rigs allowing analytics to be carried out at the data source.

Edge compute is also critical for smart cities, where data is generated, collected, consumed and analysed across an internet of things (IoT). For example, autonomous vehicles need IoT sensors to learn from the driver, as well as the car and hazards on the road, communicating with other sensors linked to traffic signals, central control or emergency services in real time. Data processing at the edge requires secure, efficient cooling technology with minimal maintenance.

Efficiency, sustainability and the next era of cooling

By taking advantage of liquid cooling, owners and operators can achieve very high heat recovery alongside an energy-efficient solution. Some companies just haven’t gotten there yet. To drive compute densities up, you are going to need to drive up efficiency as well. Being more efficient, with higher heat recovery, will eventually allow companies to put useful energy back into the grid.

The conversations about the limits of air-cooling for the next era of computing are happening now, regardless of politics or geography. This is resonating with businesses, governments, ecologists and the public, as sustainability is where it should be – top of the agenda.

Knowing the types of liquid cooling is essential: direct-to-chip liquid cooling offers the highest cooling performance at chip level but still requires air cooling, while chassis-level precision immersion minimises the liquid required compared with tank immersion solutions. Iceotope’s chassis-level precision immersion solution, which can include our water direct-to-chip cold plates, mounts into any industry-standard ITE rack. It provides a sealed, self-contained solution that isolates all server components from airborne contaminants for additional protection, whether in a data center, on premises, in a hospital or at roadside edge locations. Chassis-level precision immersion cooling enables HPC with maximum heat recovery and brings game-changing reductions in cost, energy, water and space, while offering a familiar form factor and simplified serviceability.

The shift to liquid cooling has begun in earnest and with key hyperscalers leading the charge, industry wide adoption is inevitable.
