Martin Brody, police chief of Amity Island, utters the immortal words "You're gonna need a bigger boat" at a key moment in 1975's Jaws. He has just seen the killer shark his team is hunting up close for the first time and, in a moment of terror, informs Captain Sam Quint that his rig is wholly inadequate for the job.
Two years into the artificial intelligence boom, the tech industry is having its own Jaws moment. AI, it turns out, uses a lot of data.
According to FutureTech, the median size of an AI model training set in 2021 was 42 billion data points. That more than doubled in 2022 to 105 billion. Then it grew more than sevenfold in a single year, reaching 750 billion data points for the median AI training data set in 2023.
One solution is to throw more storage devices at the problem. Hard drives are widely available in capacities up to 24 terabytes (TB) today. If you need another 10 petabytes (PB) of storage to train your latest model, just add 417 more of them!
The problem with this approach, of course, is that 417 hard drives consume a lot of power and rack space. Not to mention the extra work associated with cooling that many drives and replacing them as they fail.
The other solution: a bigger boat. Solidigm recently announced that its highest capacity SSD, the D5-P5336, will be available in Q1'25 at a capacity of 122.88TB. That's more than 5x the space of the aforementioned hard drive. You'd need just 82 to store 10PB.
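The drive counts above are a straightforward ceiling division. A quick sanity check, assuming decimal units (1 PB = 1,000 TB) and raw rather than formatted capacity; real deployments would add drives for redundancy:

```python
import math

PB_IN_TB = 1_000          # decimal units, matching the article's figures

target_tb = 10 * PB_IN_TB  # 10 PB of training data to store
hdd_tb = 24                # largest widely available hard drive
ssd_tb = 122.88            # Solidigm D5-P5336

hdd_count = math.ceil(target_tb / hdd_tb)
ssd_count = math.ceil(target_tb / ssd_tb)

print(hdd_count)  # 417
print(ssd_count)  # 82
```

The same arithmetic scales linearly: doubling the per-drive capacity roughly halves the fleet size, which is the whole density argument in miniature.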
How much storage is 122.88TB? Roughly enough for 4K-quality copies of every movie theatrically released in the 1990s, 2.6 times over. Or the entire content of The Beatles’ song catalog, more than 144 thousand times over. Or the collected works of William Shakespeare, more than 17 million times. All in about the size of a deck of cards.
This kind of density doesn’t happen by chance. Let’s review a few key technical innovations that paved the way.
To understand how Solidigm developed the capability for such massive SSDs, let’s detour briefly to 2018. That’s when Intel released the SSD 660p, the world’s first storage device based on QLC, or quad-level-cell, NAND. Solidigm emerged from the sale of Intel’s storage group to SK hynix in 2021, and continued to build on this QLC technology.
QLC works by storing four bits of information per NAND cell, a 33% increase over the TLC media widely used both then and now. Although that first product was built for personal computers – not data centers – and topped out at 2TB, it laid the groundwork for increasingly dense drives.
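The density math behind cell types is simple, but it hides an engineering trade-off: each extra bit per cell doubles the number of distinct voltage states the cell must reliably distinguish, which is why QLC endurance and reliability required the media-generation work described below. A short illustration:

```python
# Bits stored per NAND cell type, and the voltage states each must distinguish.
cell_types = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}

for name, bits in cell_types.items():
    states = 2 ** bits  # distinct charge levels the cell must hold apart
    print(f"{name}: {bits} bit(s)/cell, {states} voltage states")

# QLC vs. TLC: one extra bit per cell is the article's 33% density gain
gain = (4 - 3) / 3
print(f"QLC over TLC: {gain:.0%} more bits per cell")  # 33%
```

Note the asymmetry: the capacity gain from TLC to QLC is 33%, but the voltage-sensing problem doubles (8 states to 16), which is where the controller and media innovation goes.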
Fast-forward to today. Solidigm is shipping its fourth generation of QLC NAND-based SSDs. Thanks to technical breakthroughs since 2018, including the introduction of D5-P4326, D5-P5316, and of course D5-P5336, the 100TB single-device barrier is history. Each media generation has brought leaps in not just density but also performance, endurance, and reliability.
The D5-P5336 SSD embodies high-density innovation at its core. Every design aspect was meticulously optimized to achieve the highest storage density within a minimal footprint, both in size and power consumption. Solidigm’s architects and engineers dedicated countless hours to craft the most densely packed SSD on the market, setting a new industry standard in storage density and efficiency.
The D5-P5336 achieves this unprecedented capacity scaling through several groundbreaking innovations:
Media Bandwidth Expansion: While most SSD vendors rely on active switches that continuously sample the bus on the SoC side and retransmit on the NAND side, consuming power even when idle, Solidigm designed a high-speed NAND bus multiplexer – a passive switch that consumes almost zero power. This advancement allows the D5-P5336 to maximize energy efficiency, which is crucial as SSD density and the number of NAND components grow.
NAND and Board Design: Our industry insights led us to introduce folded printed circuit boards (PCBs) for high-density U.2 SSD designs, a critical innovation for the 122TB SSD. This design fits a high number of NAND sites alongside memory, controller, and other components on the board. With the industry's smallest NAND package, Solidigm can integrate more packages per board. By transitioning to a finer ball pitch, we reduced controller size from 23x23 mm to 19x19 mm. Additionally, the use of aluminum electrolytic capacitors, which have a smaller footprint than polymer tantalum capacitors, significantly reduced the space required for power loss protection.
Power Adaptability: Beyond the standard 25W, the Solidigm D5-P5336 series SSD also supports low-power modes. Thanks to a cutting-edge algorithm that staggers NAND component usage, the D5-P5336 122TB delivers outstanding performance at high power efficiency. It allows customers to adapt power and performance to their own storage system requirements and allocate power budgets effectively across compute, networking, and storage resources.
The 122.88TB SSD from Solidigm could not have come to market at a better time.
AI content demands are exploding on both the training and inference sides. In the race for ever more capable and sophisticated models, developers are feeding increasingly gigantic data sets into AI clusters. Just a few years ago, a model with a parameter count in the millions was impressive; today, we are well into the territory of hundreds of billions of parameters – and beyond. High-density QLC takes years to wear out under typical workloads. Early estimates show that a D5-P5336 122.88TB drive running 4KB or 32KB random writes at 100% duty cycle, 24/7, will not wear out within its five-year warranty life. In effect, that means unlimited random-write endurance for the warranty period.
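The endurance claim can be framed as a back-of-envelope calculation. The sustained write throughput below is an illustrative assumption, not a published D5-P5336 specification; the point is simply that at 122.88TB, even writing continuously yields very few full drive writes per day:

```python
capacity_tb = 122.88

# Illustrative assumption -- NOT a published D5-P5336 specification:
assumed_write_gbps = 2.0       # sustained random-write throughput, GB/s
seconds_per_day = 24 * 3600

# At full duty cycle, total data written in one day, in TB (decimal units)
tb_written_per_day = assumed_write_gbps * seconds_per_day / 1_000

# Drive writes per day: how many times the full capacity is overwritten daily
dwpd = tb_written_per_day / capacity_tb

print(f"{tb_written_per_day:.1f} TB/day -> {dwpd:.2f} DWPD")
```

Under these assumptions, a host writing flat out around the clock overwrites the drive fewer than 1.5 times per day; the same throughput against a smaller drive would cycle its media many times faster. Capacity itself is an endurance lever.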
Deployed models need that space too. Generative models for images and videos often deal with individual inputs and outputs that are hundreds of times the size of text ones. And in many cases, those models are churning through thousands or even millions of requests a week. Retaining all that data for archive or re-training purposes is a significant undertaking.
And AI is increasingly happening in places outside the data center. Edge inference and fine-tuning are growing practices among enterprises who would rather move their infrastructure toward their data, as opposed to the reverse. There are lots of good reasons to want to do this – latency and security among them – and storage density becomes even more critical at the edge, where on-site servers and endpoint devices have much tighter space constraints.
More than just the need to store all this data, though, is a central challenge to all AI progress: energy efficiency. See our AI landing page for more detail on how ultra-dense SSDs can relieve this problem. In short, the more each drive can store, the fewer drives you need to power.
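To make the efficiency point concrete, here is a rough fleet-power comparison for 10PB of raw capacity. The per-drive wattages are illustrative assumptions (a typical nearline HDD figure and the D5-P5336's standard 25W envelope cited above), not measured comparisons:

```python
import math

target_tb = 10_000  # 10 PB, decimal units

# Per-drive power figures are illustrative assumptions, not vendor specs:
hdd = {"tb": 24, "watts": 10.0}       # assumed typical nearline HDD draw
ssd = {"tb": 122.88, "watts": 25.0}   # D5-P5336 standard power envelope

for name, d in (("HDD", hdd), ("SSD", ssd)):
    count = math.ceil(target_tb / d["tb"])
    print(f"{name}: {count} drives, ~{count * d['watts']:.0f} W total")
```

Even though each SSD draws more power than a single hard drive, needing roughly one-fifth as many devices flips the fleet-level total in the SSD's favor – before counting the cooling, rack space, and replacement labor noted earlier.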
Ace Stryker is the director of market development at Solidigm, where he focuses on emerging applications for the company’s portfolio of data center storage solutions.
Yuyang Sun is a senior manager of product marketing at Solidigm, with over a decade of experience in SSD storage design, business planning, marketing, and strategy.
Nothing herein is intended to create any express or implied warranty, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, or any warranty arising from course of performance, course of dealing, or usage in trade.
The products described in this document may contain design defects or errors known as “errata,” which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Contact your Solidigm representative or your distributor to obtain the latest specifications before placing your product order.