
AI is supposed to be smart…

But it is dumb and silent without storage.

Andy Marken

AI can provide a multitude of benefits to countries, companies and individuals. We love AI because it is driving the dramatic success of an industry we’ve been active in for years… storage. That’s right, without storage, AI couldn’t develop anything because it wouldn’t have any data to rely on or any place to store its work. Yeah, it’s digital and needs a place to be saved when its work is done, or it just evaporates, never to be seen or heard from again.

“When I was a kid, I used to see men go off on this kind of jobs… and not come back.” Dick, “The Wages of Fear,” Compagnie Industrielle et Commerciale Cinématographique (CICC), 1955

A few months ago, Jon Peddie Research (JPR) put out a beautiful, very comprehensive 350+ page 2025 AI Processor Report, and people who wanted to talk like they’re “in the know” (Wall Street brokers, chip manufacturers, IT/power/facility production companies and AI wannabe kings or queens) bought a copy.

Some probably bought multiple copies… just in case. 

We know Jon pretty well, but when we read the announcement, the only thing we said was “Piffle.” His understanding of the technologies, his research and his analysis have always been good; but we’re not AI fanboys, and we know that without a constant flow of data from anywhere/everywhere, those processors don’t even make good fishing lures. More importantly, once the chip has gorged itself and regurgitated new “believable” data, the content can’t simply keep circling the globe. It has to go somewhere to rest and be kept until it’s needed by another AI chip.

In other words, without storage devices, AI processors (AIPs) have nothing coming in, nothing going out and nothing to do! That’s right, AI is zip, zero, nada without tape, hard drives, optical and solid-state devices that do one thing – keep your (and everyone else’s) information so it doesn’t just disappear.

That’s why there are more than 11,000 data centers around the globe, with more being built every day.

That makes all of the AIP producers (and their investors) extremely happy. Sure, the data centers have rack upon rack of AI chips from Nvidia, the folks with the huge lead in the CPU, GPU and AI accelerator arena, as well as AMD (very respectable player), Qualcomm, Huawei, IBM, Hailo, AWS, Samba and 13-plus other companies around the globe, all in a rush to get a big chunk of what Peddie projects to be a $378B market.

This includes all of the AI companies – OpenAI, Anthropic, Nvidia, Google (Alphabet), Microsoft, Amazon, Meta, Apple and others – that are racing to be the first to deliver artificial general intelligence (AGI).

Until they get there, they’re releasing half-baked products, like ChatGPT, that create text, images, code and stuff based on content that has been produced by humans.

Now they’re pushing new tools that will make great video and change the face of the video entertainment industry. You know – Google’s Veo 3, Meta’s Vibes, Midjourney’s AI video generator and OpenAI’s Sora 2. Sure, they promise they won’t steal professional IP, but we don’t quite believe them.

A few people have computers fast enough and powerful enough to do all of this in their home and office, along with the high-speed, high-capacity storage devices to rapidly and accurately send, receive and store all of the workload. However, most folks rely on those data centers to do all the work. They don’t call them data centers, though. Instead, they say all the work is done in the cloud. Sounds so nice and friendly, but it’s acres and acres of buildings with rack after rack of high-powered AIPs. Oh yeah, and mile after mile of storage devices – tape, hard drives and solid-state drives – strung together on those racks.

Data center

(Source: Deeznutz1, Pixabay)

Without storage, all those expensive and highly touted chips could do is consume energy and get hot. All of the AI data centers require access to large datasets ranging from GB to PB. Smaller facilities have petabytes of storage, and hyperscale operations like those run by Google and Microsoft reach exabyte-scale (millions of TB) storage capacity, while Meta’s lowly AI Research SuperCluster only has about 175PB of storage. Oh, and a single rack can hold over 1,000 TB (1 PB).
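For anyone who wants to see what those capacities mean in floor space, here is a minimal back-of-the-envelope sketch. It assumes decimal units and takes the roughly 1 PB-per-rack figure above as a round number; the function name `racks_needed` is ours, made up for illustration.

```python
# Decimal storage units, expressed in terabytes (TB).
TB_PER_PB = 1_000
TB_PER_EB = 1_000_000


def racks_needed(capacity_tb: int, rack_tb: int = 1_000) -> int:
    """Return how many racks it takes to hold capacity_tb, rounding up.

    rack_tb defaults to the ~1 PB-per-rack figure quoted in the text,
    which is an assumption, not a vendor specification.
    """
    # Integer ceiling division: a partly filled rack still counts as a rack.
    return -(-capacity_tb // rack_tb)


# Meta's ~175 PB cluster and a single exabyte, counted in racks.
meta_racks = racks_needed(175 * TB_PER_PB)
exabyte_racks = racks_needed(TB_PER_EB)
```

Under those assumptions, 175 PB works out to 175 racks and one exabyte to 1,000 racks, before any allowance for redundancy or spares.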

Usually, the AI cloud data centers have a mix of storage devices – tape, HD, SSD – to efficiently and effectively deliver and store the processed data. They all have different features, different capabilities, different capacities and different price points.

The storage industry has made a lot of advances since Univac introduced the first tape drive back in 1951 and IBM rolled out its 3 MB-plus, 1,700-pound RAMAC hard drive in 1956. It’s a good thing storage device manufacturers have increased the storage capacity and reliability in addition to being able to reduce the cost of today’s drives, because few of us could afford to rent IBM’s RAMAC (it was only available with a lease), as it cost $3,200/month ($38,500 in today’s dollars).

All those AI “cloud” data centers, AIPs and other data processing centers are projected to produce about 181 ZB of data this year, according to storage industry research firms like IDC, and the storage industry is struggling to keep pace with demand, with about 200 ZB of storage capacity this year. And there are no signs of companies and people slowing down their creation/storage any time soon. According to the latest estimates from Statista, we (all of us and AI) create, capture, copy and consume about 402.8 million TB daily. We’ll give you a hint: very little of that is deleted because folks always save it… just in case.
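To put the daily figure on the same scale as the yearly one, a quick conversion helps. This is a rough sketch that assumes decimal units (1 ZB = one billion TB) and a flat 365-day year; `yearly_zettabytes` is a name we made up for the example.

```python
TB_PER_ZB = 1_000_000_000  # 1 ZB = 10**9 TB in decimal units


def yearly_zettabytes(daily_million_tb: float, days: int = 365) -> float:
    """Convert a daily volume given in millions of TB into ZB per year."""
    daily_tb = daily_million_tb * 1_000_000
    return daily_tb * days / TB_PER_ZB


# The Statista daily estimate quoted above, scaled to a full year.
per_year = yearly_zettabytes(402.8)
```

At 402.8 million TB a day, the total comes to roughly 147 ZB a year, the same order of magnitude as the 181 ZB projection.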

Data storage didn’t really become a thing until the 1960s with the early computers, then databases in the ’70s-’80s, then the Internet and Web in the ’80s-’90s, then big data in the 2000s, then social media, mobile devices, IoT and, finally, AI – and there’s no turning back.

massive bookshelves

(Source: Cincinnati Library Archives)

Before that, data and images were chiseled on rocks, written on papyrus, printed on paper and burned onto glass plates, copper sheets, light-sensitive plates and cellulose acetate film. In other words, only a small fraction of our documented history/data, from the time when Homo sapiens started walking upright, has been digitized: less than 10 percent of all books, according to a survey of European archives… don’t ask.

But as Peddie indirectly noted, the combination of LLM inference, AI proliferation and memory-bound workloads will keep storage device technology and production busy for years. The industry has come a long way since Jugi Tandon (founder of JT Storage) tried to figure out why the drives produced in the morning in his start-up hard-drive factory in India worked flawlessly, while those manufactured in the afternoon were complete messes. He flew from SoCal to India and discovered that when workers took their noon break, they ate in the clean rooms, which were air-conditioned… sticky hands turned out lousy drives.

Then there was Colorado-based MiniScribe that was caught shipping bricks out at the end of the month to make their financial numbers (probably where the phrase bricking a HD came from). It was creative but marked the end of the company. Fortunately, most of the industry – tape, HD, SSD – has been, and is, more technically competent and professional.

A great percentage of today’s data is stored in the large cloud data centers spread around the globe, in racks of NAS/SAN (network-attached storage/storage area network) tape, high-performance, high-capacity hard drives and solid-state drives, all managed by storage management software. The racked storage is supported by RAID (redundant array of independent disks) in RAID 0 to 5 configurations, which store data across multiple drives for speed, data reliability and security. If you’re really interested in the dumbed-down explanation, go here.
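For the curious, the reliability trick behind parity-based RAID levels such as RAID 5 can be sketched in a few lines. This is a toy illustration only, assuming one stripe of equal-sized blocks and a single parity block; real controllers rotate parity across drives and do far more. The function names are ours.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Compute the parity block that would sit on the extra drive."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks, parity):
    """Recover the one lost block from the survivors plus the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three data "drives" holding one block each, plus one parity block.
stripe = [b"AI__", b"data", b"here"]
parity = make_parity(stripe)

# Pretend the middle drive died; rebuild its block from what is left.
recovered = rebuild_missing([stripe[0], stripe[2]], parity)
```

Because XOR-ing a value with itself cancels out, combining the surviving blocks with the parity block leaves exactly the missing one. Lose two drives in the same stripe, though, and this simple scheme cannot help.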

RAID is necessary because storage devices do fail – and usually at the worst possible time – making data unavailable, and we all expect our data to be available 99.999 percent of the time.
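That “five nines” figure is easier to appreciate as minutes. A small sketch, assuming a 365-day year and treating any unavailability as downtime; `allowed_downtime_minutes` is a hypothetical helper, not part of any library.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the minutes per year a service may be down at this availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


five_nines = allowed_downtime_minutes(99.999)
three_nines = allowed_downtime_minutes(99.9)
```

At 99.999 percent, the budget is a little over five minutes of downtime a year; at 99.9 percent, it grows to nearly nine hours.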

(Source: Coughlin Associates)

Note the above data life is theoretical and depends a lot on stable temperature and humidity, which are carefully maintained in today’s large AI/cloud data centers. The facilities don’t look for the cheapest storage type in their racks because they want their data to be available as long as possible. But drives do fail even in the miles of racks housed in the “ideal” environment.

One of the differences between their storage and yours is that they don’t go through that screaming/hollering phase before buying a new drive, swapping it in and hoping there’s a backup copy of everything (in the cloud). In fact, they don’t even try to find the one drive that has failed in the miles of racks. They simply add another rack of drives, and “when it’s convenient,” find the deceased drive, swap it out and keep processing/creating data.

It goes without saying, the storage device manufacturers are continuing to improve the data life and performance of their drives; but they’re also hell-bent on developing a faster, more reliable and higher-capacity technology… DNA data storage.
 
Scientists and engineers have been working to develop and refine the technology for years to enable fast, reliable, economical production.

The biological storage advantages are numerous:
• DNA can stably last for hundreds of years.
• It is very compact, requiring a lot less storage space.
• It can be easily replicated to create backups.
• It won’t become obsolete as a technology.

One gram of DNA can store 700TB of data, and at the far higher theoretical densities researchers cite, a few kilograms could store all of the data in today’s world.

And they’re still… working on it.

Today, with the widespread availability of the Internet, mobile devices and high-performance, economical computing, plus the driving push of AI techies, people are eagerly using general-purpose AI “solutions” to do all the work for them and have all the fun with it, even if they don’t work that well. The general-purpose AI tools may or may not deliver the right results, and if they don’t “know” the right answer, they fake it until they make it, delivering hallucinations and errors. Most folks accept the results without much question and store them in the cloud data centers, creating more hallucinations and errors.

A recent MIT study found that 95 percent of companies that tested and used general-purpose AI found little or no return on their investment, but still, hope springs eternal.

And things like chat bots and cute video generators do support the seemingly unquenchable demand for more storage, which creates bigger targets for cybercriminals and others seeking fast financial gain. Sure, we have a backup in the cloud – somewhere – but we also maintain our own private storage system to keep all the “best stuff.”

We were always impressed with filmmaker John Putch’s approach. He’s worked across the industry for years but is also an ardent independent filmmaker, perhaps best known for his production of Mojave Phone Booth, a film about a long-abandoned, bullet-riddled, graffiti-covered phone booth in the middle of the Mojave desert. The phone booth has since disappeared (last we heard), but it’s a cool film he funded himself. He had a rule that none of his films should cost more than a new car, and when a film was completed, he stored it on a fresh hard drive that he would keep in a cool, dry place and then copy to a fresh drive every couple of years.

Yeah, never any worry about it being pilfered or lost in the inner reaches of a monstrous AI data center, and then he would sell DVD originals to folks who ordered them. That keeps personal creative work personal, private and away from the prying/manipulative reach of general-purpose/data-hungry AI tools/products.
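If you want to follow the same copy-to-a-fresh-drive habit, it is worth checking that the copy really matches the original. Here is a minimal sketch using only Python’s standard library; the function names and the idea of comparing SHA-256 checksums are our assumptions, not a description of how Putch does it.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MB chunks so large video files never fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination: Path) -> str:
    """Copy source to destination, then confirm both files hash the same."""
    shutil.copy2(source, destination)  # copy2 also preserves timestamps
    expected = sha256_of(source)
    actual = sha256_of(destination)
    if expected != actual:
        raise OSError(f"Copy of {source} does not match the original")
    return actual
```

Calling `copy_and_verify(Path("film.mov"), Path("/new_drive/film.mov"))` (both paths are placeholders) returns the checksum on success, which is handy to jot down for the next migration.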

It’s a lot like the position Mario had in The Wages of Fear when he said, “When someone else is driving, I’m scared.”

We’re really glad so many AI chips and massive numbers of storage devices are being sold but are still a little concerned that the development and distribution of AI products aren’t more closely aligned with personal and human interests and needs. You know, the stuff that advances science, medicine, education and things like that, rather than mimicking famous people and cats.

Who knows, it might even solve the environmental damage the AI data centers are producing because we will have plenty of storage available to do the work. 
 