The Incredible Power of Power-Efficient Storage: How Modern SSDs Are Transforming the Data Center

Solidigm high-density data storage is making the data center more power efficient

AI has its limits, and energy is chief among them. The insatiable demand for AI compute is stretching power grids to their breaking point. Five years ago, when the last operating reactor at the Three Mile Island nuclear plant was retired, no one could have predicted it would come back to life to help power a single company's data centers. But that is exactly what has occurred with Microsoft's recent power purchase agreement, and Microsoft is not alone in facing extreme AI energy challenges.

Today's data center architects understand that every watt and square foot matters when it comes to deploying new AI applications. Enterprises cannot run AI on the hardware of yesterday, and storage is no exception. Choosing more energy- and space-efficient solid-state drives (SSDs) can free up the power and space needed for more AI model training and inferencing.

AI power and data are growing

No conversation about data center power efficiency can happen without comprehending the extreme growth in compute power and data over the past ten years. Back in 2014, an average processor required about 100W of power and cooling. By 2024, that average had grown more than fivefold,1 with current NVIDIA H100 SXM GPUs requiring 700W of cooling.2

Average rack power requirements have seen a corresponding increase. Rack power in 2014 averaged about 4 to 5kW, whereas in 2024 it has grown to 10 to 14kW,3 with GPU-based compute racks calling for much more. At the recent OCP Summit, both Microsoft and Google said they had working rack designs scaling from hundreds of kilowatts up to 1MW.

"We would probably build out bigger clusters than we currently can if we could get the energy to do it."

Mark Zuckerberg, Meta4

In addition, GenAI and other AI applications devour ever more data to deliver better models, driving a massive increase in data volume; for example, 3 to 5 billion new pages are added to the Common Crawl each month.5 We have also seen some AI model data sets more than double in size every two years.6

Storage is an underappreciated element of AI power consumption

The challenges in delivering sufficient power and cooling for GPU infrastructure grab today’s headlines, but when power is limited, every watt matters. Beyond compute, storage represents a significant portion of data center energy usage. 

For instance, published data from Meta shows that legacy hard disk drive (HDD) storage consumes 35% of AI recommendation engine cluster power.7 Data from Microsoft indicates that storage accounts for 33% of an Azure solution's overall operational emissions, which correlate with energy consumption.8 In power-constrained environments, a watt used for storage is, in effect, one less watt for compute.

Data storage designed with high-capacity SSDs enables you to store more data in fewer devices than legacy storage. More to the point, all things being equal, fewer drives consume less energy, require fewer servers and less space, and as a result can reduce overall cooling requirements. The industry's highest-capacity data center SSD, the Solidigm D5-P5336, available in capacities up to 61.44TB, can store massive data sets in a smaller operational power footprint than today's highest-capacity HDDs.9
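To make the drive-count math concrete, here is a minimal back-of-the-envelope sketch in Python. The drive capacities and mirroring assumptions are the ones used in the comparison below; the helper function itself is ours, for illustration only.

```python
import math

def drives_needed(usable_tb: float, drive_tb: float, replication: int) -> int:
    """Drives required to hold usable_tb once mirroring copies are included."""
    return math.ceil(usable_tb * replication / drive_tb)

# 16PB (16,000TB) of usable data, per the comparison below:
print(drives_needed(16_000, 61.44, 2))  # 521 QLC SSDs at two-way mirroring
print(drives_needed(16_000, 24.0, 3))   # 2,000 HDDs at three-way mirroring
# The hybrid configuration below keeps 10% of the data on a TLC cache tier,
# which is why its HDD count lands at 1,800 rather than 2,000.
```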

We find the data capacity used per AI rack (4 DGX servers) ranges from roughly 0.5 to 2.0PB for text-based AI applications up to around 16PB for vision-based AI applications. Moreover, multiple vendors are showcasing up to 32PB per AI rack. To accurately depict power savings in the comparison below, we elected to use 16PB of data per compute rack, but note that SSD power savings scale almost linearly with how much data is needed.

For our comparison, we host the 16PB of data either on a TLC SSD cache with an HDD backend or on an all-Solidigm QLC SSD solution.

16PB of Data Storage per Compute Rack

Storage config 1: TLC cache with HDD backend

Data locality: split

  • 10% (1.6PB) in TLC NAND as cache
  • 90% (14.4PB) in HDDs as bulk storage

Storage rack space: ~3 racks (78U)

  • Cache: 18U (209 TLC SSDs at 7.68TB each) in 12-SSD/1U servers
  • Bulk storage: 60U (1,800 HDDs at 24TB each, assuming three-way mirroring) in 90-drive/3U JBODs

Storage power: 18.9kW

  • Cache: 1.3kW, assuming 209 TLC SSDs at 11% duty cycle, 18W active power, and 5W idle power
  • Bulk storage: 17.6kW, assuming 1,800 HDDs at 100% duty cycle and 9.8W average active power

Support power and rack space: 10.5kW (3.5kW per rack for 3U of PSUs plus 3U of networking) and 18U (6U per rack)

Total: 29.4kW and 96U over 3 racks

Storage config 2: All-Solidigm QLC SSD

Data locality: all data in QLC NAND

  • 100% (16PB) as bulk storage; no cache required for all-SSD storage

Storage rack space: ~0.5 rack (21U)

  • Bulk storage: 21U (521 SSDs at 61.44TB each, assuming two-way mirroring) in one 12-SSD/1U server plus two 32-drive/1U JBOFs per 3U, or 76 SSDs per 3U of rack space

Storage power: 3.7kW

  • Bulk storage: 3.7kW, assuming 521 QLC SSDs at 11% duty cycle, 24W active power, and 5W idle power

Support power and rack space: 3.5kW (3U of PSUs plus 3U of networking) and 6U of rack space

Total: 7.2kW and 27U over 1 rack
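For readers who want to check the arithmetic, below is a minimal sketch in Python of the power model behind the comparison above. The duty-cycle formula and all input values are the assumptions stated in the comparison; the helper function is ours, for illustration only.

```python
def fleet_power_kw(drives: int, duty: float, active_w: float, idle_w: float) -> float:
    """Fleet power in kW: drives * (duty * active_W + (1 - duty) * idle_W)."""
    return drives * (duty * active_w + (1 - duty) * idle_w) / 1000

tlc_cache = fleet_power_kw(209, 0.11, 18, 5)     # ~1.3kW
hdd_bulk = fleet_power_kw(1800, 1.00, 9.8, 9.8)  # ~17.6kW (100% duty cycle)
qlc_bulk = fleet_power_kw(521, 0.11, 24, 5)      # ~3.7kW

# Support power (PSUs plus networking) adds 3.5kW per rack: 3 racks vs. 1.
hybrid_total = tlc_cache + hdd_bulk + 3 * 3.5    # ~29.5kW; the comparison's
all_qlc_total = qlc_bulk + 1 * 3.5               # 29.4kW rounds each line first
print(f"{hybrid_total:.1f}kW vs. {all_qlc_total:.1f}kW")  # 29.5kW vs. 7.2kW
```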

The bottom line

Deploying an all-Solidigm D5-P5336 QLC SSD array would save the data center up to 22.2kW of power and more than 1.6 racks of space for 16PB of AI data. Your mileage may vary, but this is broadly the power and space savings you can realize by deploying QLC SSDs over legacy storage for a single rack of AI compute.

Saving 22.2kW of power may not seem like much when an NVIDIA DGX H100 server consumes 10.2kW, but it could mean deploying two more of them for AI applications in the data center. And the power savings only increase as more data per compute rack is needed for AI.
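A quick sanity check on that equivalence, using only the figures above:

```python
# Storage power reclaimed per 16PB, expressed in NVIDIA DGX H100 servers
print(22.2 / 10.2)  # ~2.18, i.e., roughly two additional 10.2kW DGX servers
```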

We would be remiss if we didn't mention that there is a cost differential to consider here. HDDs have historically cost less to purchase than SSDs on a $/TB basis, so acquisition costs may be higher for all-QLC SSD storage.

Nonetheless, for power-constrained retrofits or even greenfield data center deployments with limited power, being able to save watts can be a make-or-break factor in bringing new AI applications online.

When it comes to power and space efficiency, today's enterprise Solidigm QLC SSDs are transforming the modern data center. Choosing energy- and space-efficient SSD storage can help you realize a fuller return on your AI infrastructure investments.


About the Authors

Dave Sierra is a Product Marketing Analyst at Solidigm, where he focuses on solving the infrastructure efficiency challenges that face today's data centers.

Ace Stryker is the Director of Market Development at Solidigm, where he focuses on emerging applications for the company’s portfolio of data center storage solutions, with a special expertise in AI workloads and solutions.

Notes

1. Average rack power and power segmentation

2. Source: https://resources.nvidia.com/en-us-tensor-core/nvidia-tensor-core-gpu-datasheet

3. Source: https://www.idc.com/getdoc.jsp?containerId=US50554523

4. Source: https://www.techerati.com/news-hub/energy-constraints-holding-back-ai-data-centres-says-mark-zuckerberg

5. Source: https://commoncrawl.org/

6. Source: https://epochai.org/trends#data

7. Source: https://engineering.fb.com/2022/09/19/ml-applications/data-ingestion-machine-learning-training-meta/

8. Source: "A Call for Research on Storage Emissions," Carnegie Mellon University and Microsoft Azure, https://hotcarbon.org/assets/2024/pdf/hotcarbon24-final126.pdf

9. Source: https://www.solidigm.com/products/data-center/d5/p5336.html