AI has a problem. A very large problem that will impede its deployment for the foreseeable future. But that didn’t seem to dampen the enthusiasm for AI that I saw at Computex. Yesterday I attended the AI forum that was part of this year’s Computex 2024 show in Taipei. There was a general sense, among the presenters and many of the attendees I talked to, that computing is at an inflection point. There’s talk of another industrial revolution coming. As I’ll report below, though, I think AI has a massive reality check coming.
Since most of the hardware used to build AI systems is made in Taiwan, the emphasis at Computex was naturally on hardware design and fabrication. For the hardware folks, AI is just one big train barreling down the tracks. There was little talk about how AI would be used, or about its downsides, its social implications, or its moral implications. It’s a modern version of the Oklahoma land rush at the moment: there’s an enormous amount of money sloshing around but no profits today. History looks like it’s about to repeat (like it always does) the pattern of the 1890s (massive overbuilding of railroads) and the late 1990s (massive overbuilding of fiber optic networks), this time with a massive build-out of AI, with the result that 90% of the companies involved will be bankrupt within 5 years, just as happened in the 1890s and 1990s.
Here’s what the stakeholders are saying. The first session of the morning began with Marc Hamilton of Nvidia giving a rousing pep-rally talk about Nvidia’s philosophy of IBTG (Infrastructure, Build, Train, Go) and about Nvidia’s NIMs, or Nvidia Inference Microservices. Basically, NIMs are various cloud-based AI models tuned for specific applications. The real star of the show was his demonstration of the Blackwell GPU: how it works, and how multiple Blackwell ‘blades’ communicate within a rack. While he mentioned that Nvidia’s approach to integrating the Blackwell GPUs with their CPUs supposedly saves some power, he didn’t seem to bat an eye when mentioning that a single rack with 72 Blackwell GPUs requires 100 kW of power to run. Therein lies the single largest roadblock to wider deployment of AI “factories” (they don’t call them “data centers” anymore): there’s just not enough power to go around.
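To put that rack figure in perspective, here’s a quick back-of-the-envelope calculation using only the numbers quoted in the talk; the 1,000-rack facility at the end is my own illustrative assumption, not something Nvidia stated.

```python
# Back-of-the-envelope math from the figures quoted above: 100 kW per rack,
# 72 Blackwell GPUs per rack. The per-GPU number lumps in CPUs, NVLink
# switches, and the rest of the rack, so treat it as a rough upper bound.
RACK_POWER_W = 100_000      # 100 kW per rack, as quoted
GPUS_PER_RACK = 72

watts_per_gpu_slot = RACK_POWER_W / GPUS_PER_RACK
print(f"~{watts_per_gpu_slot:,.0f} W per GPU slot")            # ~1,389 W

# Hypothetical facility size, purely for illustration: 1,000 such racks.
RACKS_PER_FACILITY = 1_000
facility_mw = RACKS_PER_FACILITY * RACK_POWER_W / 1e6
print(f"{RACKS_PER_FACILITY:,} racks ~ {facility_mw:.0f} MW")  # ~100 MW, before cooling losses
```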
Other presenters talked about how they were using AI in their businesses. Google’s John Solomon did a lot of hand-waving about Google’s generic AI offerings. I think Google is seriously worried about how they are going to remain relevant once AI takes over their search functions.
Then there are the true believers, such as Tom Anderson from Synopsys. Synopsys makes software tools that help semiconductor companies design their chips better, as it has for many years. He claims that in less than 5 years 70% of mobile applications will be created by AI, and that AI is the only way forward.
Xia Zhang from AWS did a lot of hand-waving and didn’t give much information. Basically, AWS is trying to figure out what to do with (as in how to monetize) the massive amounts of data that AI is generating.
The afternoon sessions started off with Praveen Vaidyanathan from Micron Technology. He talked a little bit about his favorite topic, HBM (high-bandwidth memory), and showed some slides. But he did say one interesting thing, and I’m quoting him here: “AI is for accumulating knowledge, humans are for getting wisdom.” That definitely needs to be explored in much more depth.
Eddie Ramirez from ARM was the first to mention the elephant in the room when he stated that there is not enough power generation on the planet to run all the proposed AI factories. So his pitch was how ARM is going to save AI by reducing power consumption in the CPU part of the GPU/CPU lash-up. Nvidia will probably be the biggest beneficiary of those efforts, although Google claims to be using some ARM CPUs in its data centers. For inference in data centers, the claim is that ARM can deliver 2.5 to 3 times the tokens per watt versus x86-based systems. ARM has also taken an interesting approach to solving the power problem: optimizing some CPUs to improve inference performance on low-end systems so that GPUs would not be needed for such systems at all.
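To make that claim concrete, here’s a rough sketch. Only the 2.5x–3x multiplier comes from ARM’s claim; the baseline x86 tokens-per-watt figure and the 100 kW power budget are assumptions of mine, picked purely for illustration.

```python
# Hypothetical baseline: only the 2.5x-3x multiplier comes from ARM's claim;
# the x86 tokens-per-watt figure and the power budget below are invented
# purely for illustration.
X86_TOKENS_PER_SEC_PER_WATT = 1.0          # made-up baseline
ARM_MULTIPLIER_LOW, ARM_MULTIPLIER_HIGH = 2.5, 3.0
POWER_BUDGET_W = 100_000                   # same 100 kW rack budget as above

x86_tokens_per_sec = X86_TOKENS_PER_SEC_PER_WATT * POWER_BUDGET_W
arm_low = x86_tokens_per_sec * ARM_MULTIPLIER_LOW
arm_high = x86_tokens_per_sec * ARM_MULTIPLIER_HIGH
print(f"x86: {x86_tokens_per_sec:,.0f} tokens/s for the power budget")
print(f"ARM: {arm_low:,.0f} to {arm_high:,.0f} tokens/s for the same budget")
# Equivalently: the same throughput for roughly 1/3 to 1/2.5 of the power.
```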
Rosalina Hiu from Seagate Technology explained some of the technical work on new ways to increase hard drive capacity. She made a good point about the need to preserve all of the data used to train an AI model in order to “preserve the source of truth”. Because the TCO of SSDs for data centers is about 6X that of HDDs, data centers still store about 90% of their data on HDDs, so hard drives are not going away any time soon. Storage density keeps increasing, and the cost per gigabyte is half that of SSDs. And unlike AI, the power consumption of hard drives is actually decreasing: Seagate’s new 30 TB drive supposedly consumes only 0.32 W/TB.
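Running the quoted figures: a 30 TB drive at 0.32 W/TB works out to under 10 W per drive. The exabyte-scale archive in the second half of the sketch is my own hypothetical, just to show how modest HDD power looks next to the GPU racks.

```python
# Seagate's quoted figures: 30 TB per drive at 0.32 W/TB.
DRIVE_CAPACITY_TB = 30
POWER_PER_TB_W = 0.32

drive_power_w = DRIVE_CAPACITY_TB * POWER_PER_TB_W
print(f"One 30 TB drive: ~{drive_power_w:.1f} W")             # ~9.6 W

# Hypothetical archive size, purely for illustration: 1 exabyte of
# "source of truth" training data kept on these drives.
ARCHIVE_TB = 1_000_000                                        # 1 EB = 1,000,000 TB
drives_needed = ARCHIVE_TB / DRIVE_CAPACITY_TB
total_power_kw = drives_needed * drive_power_w / 1_000
print(f"1 EB: ~{drives_needed:,.0f} drives drawing ~{total_power_kw:,.0f} kW")
# -> roughly 33,333 drives at ~320 kW, i.e. about three Blackwell racks' worth of power
```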
Jeff Wittich from Ampere again made the point that power consumption is the most challenging constraint on the expansion of AI today. If the present growth rate continues, by 2026 data centers worldwide will consume 1,050 TWh per year. That’s more than the entire country of Japan consumes. In 2022, data center power consumption was 460 TWh. Even if that much energy could be generated, the power grid can’t handle that much of a capacity increase. So after the doom-and-gloom speech we got the Ampere marketing presentation. Ampere, like ARM, is attacking the power problem by developing processors that use less power than x86.
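Working through the figures he quoted, the implied growth rate and the equivalent continuous power draw look like this (the conversion to an average draw is my own arithmetic, not something from the presentation):

```python
# Figures quoted in the talk: 460 TWh in 2022, 1,050 TWh projected for 2026.
TWH_2022 = 460
TWH_2026 = 1_050
YEARS = 2026 - 2022

cagr = (TWH_2026 / TWH_2022) ** (1 / YEARS) - 1
print(f"Implied growth rate: ~{cagr:.1%} per year")           # ~22.9% per year

# Convert the 2026 figure to an average continuous power draw.
HOURS_PER_YEAR = 8_760
avg_power_gw = TWH_2026 * 1_000 / HOURS_PER_YEAR              # TWh/yr -> GWh/yr -> GW
print(f"Average continuous draw: ~{avg_power_gw:.0f} GW")     # ~120 GW around the clock
```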
The last presenter, K.S. Pua from Phison, seemed to be the most realistic of the hardware presenters. He posed a question: will generative AI be profitable in the mid-term? He thinks not. Obviously it’s not profitable now, in the early term. He also had concerns about models being trained with sensitive (personal) information. For cloud-based AI systems the costs to train models have no limits, while for an edge-based server right now training costs are too high. He thinks that unless edge-based model training can become reasonably affordable (at least for businesses), AI is heading for a “bubble”. So Pua’s goal is to create affordable hardware that’s powerful enough to do AI tasks. He says we should own all of our own data, train our own models, and especially run our own inference engines.
To summarize, his concerns are:
- Ownership of generative AI is unclear
- Great risk of sensitive data leaking
- Cloud systems not suitable for sensitive data users
- Permanent subscription fees
- Unpredictable long term costs
But if you own your own AI system and (marketing plug) use his software to help create your models, he claims the benefits are:
- You own your own AI
- You own your own data
- No leak of sensitive data
- Suitable for sensitive data users
- One time and predictable costs
While Nvidia is the clear front-runner today in terms of hardware design for AI “factories”, its future growth is likely to be stifled by the lack of power needed to run them. It looks like widespread deployment of AI is headed for a rough patch.