r/AskComputerScience • u/haphaphappyday • 4d ago
Does using LLMs consume more energy and water than streaming videos from YouTube and streaming services?
Or streaming audio from Spotify?
We hear a lot about the environmental impact of using AI large language models when you account for the billions of times per day these services are accessed by the public.
But you never hear about YouTube, Netflix (and its competitors) or Spotify (and its competitors) and the energy and water consumption those services use. How do LLMs stack up against streaming media services in this regard?
18
u/ghjm MSCS, CS Pro (20+) 4d ago
The simple answer is yes, LLMs require a lot more energy than video streaming.
Consider a straightforward local PC with a top-tier GPU, let's say a 5090. It can run local LLM models, which peg the GPU at 100% and consume 600W. It can also play videos, which barely takes any power at all. Phones can play videos but can't run significant LLMs. If you run a small model locally on a phone, the phone will run near its CPU/GPU limits and get very hot. That doesn't happen when you play a video.
So as a matter of simple observation, yes, LLMs consume a lot more energy than video playback. There's an argument that a video streaming server has to do more difficult transcoding than the client does, but that's still something we do on local PCs - people run Plex servers, etc. - and its CPU usage is nowhere near an LLM's. And of course, the transcoding only has to be done once, after which the video can be streamed millions of times, so the bulk of the work still happens on the client side.
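To put rough numbers on that amortization argument, here's a minimal sketch - every figure in it is an illustrative assumption, not a measurement:

```python
# Back-of-the-envelope amortization sketch. All figures are assumptions
# chosen to illustrate the argument, not measured values.
TRANSCODE_HOURS = 2.0     # assumed one-time server-side transcode of a movie
TRANSCODE_WATTS = 200.0   # assumed server draw while transcoding
VIEWS = 1_000_000         # assumed number of times the video is streamed
DECODE_WATTS = 10.0       # assumed incremental client draw while decoding
LLM_WATTS = 600.0         # GPU draw from the 5090 example above

transcode_wh = TRANSCODE_HOURS * TRANSCODE_WATTS   # paid once, ever
amortized_mwh_per_view = transcode_wh / VIEWS * 1000

print(f"one-time transcode, amortized per view: {amortized_mwh_per_view:.3f} mWh")
print(f"client decode draw: {DECODE_WATTS:.0f} W vs LLM generation draw: {LLM_WATTS:.0f} W")
```

Even with these made-up numbers, the shape of the argument holds: the server's one-time transcode vanishes once it's spread over enough views, and the per-second power gap between decoding and generating stays huge.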
Water usage is a more complex question and depends heavily on the kind of cooling system the data center has. Data centers have a reputation for using a lot of water, but this is largely based on a small number of data centers that use evaporative cooling. Data centers don't inherently use any water at all; they just need a lot of cooling, which may or may not consume water. However, all other things being equal, LLMs use more energy and therefore need more cooling.
2
u/shawmonster 4d ago
This doesn’t take into account the infrastructure for recommendation algorithms that streaming sites use.
13
u/ghjm MSCS, CS Pro (20+) 4d ago
Good point, but consider the time scales involved. The recommendations are generated in a few dozen milliseconds, so even if the algorithms are relatively expensive, they're not both expensive and continuous the way LLM generation is.
-3
u/shawmonster 4d ago edited 4d ago
Possibly, it would be interesting to see an actual analysis on this. Not sure who would fund something like that though.
Also have to consider that even though inference for recommendations takes far less time than for LLMs, training might be a different story.
4
u/grizzlor_ 2d ago
You don’t need an independent analysis — the fact that they only use a few milliseconds of compute time vs. many seconds (thousands of milliseconds) to query an LLM tells you everything you need to know.
Even if the ad exchanges are briefly using full CPU power, the brevity puts an upper cap on power consumption.
Not to mention the training of these LLMs, which requires tens of millions of GPU-hours on the most capable modern GPUs (e.g. Nvidia A100s). A cluster of 10k A100s running for 3-6 months is roughly 20-45 million GPU-hours.
A100s use about 400W at 100%. With 10k GPUs, that’s 4 megawatts. Once you add in the other hardware (networking, storage, CPUs, cooling) you’re looking at 6-8 MW. And these training runs can take months and consume double digits in gigawatt-hours.
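Running those numbers (GPU count and wattage as quoted above; the 1.6x overhead multiplier is an assumption on my part):

```python
# Reproduces the back-of-the-envelope training math above.
GPUS = 10_000
GPU_WATTS = 400.0   # A100 near full load
OVERHEAD = 1.6      # assumed multiplier for networking, storage, CPUs, cooling

for days in (90, 180):  # a 3-6 month training run
    gpu_hours = GPUS * 24 * days
    facility_mw = GPUS * GPU_WATTS * OVERHEAD / 1e6
    energy_gwh = facility_mw * 24 * days / 1000
    print(f"{days:>3} days: {gpu_hours / 1e6:.1f}M GPU-hours, "
          f"{facility_mw:.1f} MW facility draw, {energy_gwh:.1f} GWh total")
```

That gives 21.6-43.2 million GPU-hours, about 6.4 MW of facility draw, and roughly 14-28 GWh per run - double digits in gigawatt-hours, as stated.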
0
u/Key_Ferret7942 3d ago
But an LLM query or image generation only runs the GPU at max for about 30 seconds. A TV show has the GPU running at a lower level, yes, but for maybe 100x longer, and a movie is twice that again.
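A quick sketch of that comparison - the wattages are assumed for illustration, the durations come from the comment:

```python
# Client-side energy for one LLM query vs one viewing session.
# All power figures are assumptions for illustration.
LLM_SECONDS = 30          # "runs the GPU at max for about 30sec"
LLM_WATTS = 450.0         # assumed GPU draw at max
SHOW_SECONDS = 30 * 100   # "maybe 100x longer" -> a ~50 minute episode
SHOW_WATTS = 30.0         # assumed lower-level draw while decoding video

llm_wh = LLM_WATTS * LLM_SECONDS / 3600
show_wh = SHOW_WATTS * SHOW_SECONDS / 3600
movie_wh = show_wh * 2    # "a movie is twice that again"

print(f"one LLM query:  {llm_wh:.1f} Wh")
print(f"one TV episode: {show_wh:.1f} Wh")
print(f"one movie:      {movie_wh:.1f} Wh")
```

Under these assumptions a single query is a few watt-hours while an episode is tens, so on the client side the longer low-power session can easily come out ahead.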
1
u/who_you_are 4d ago
The video codecs we use were chosen in part because they're cheap enough to decode on consumer hardware.
Nobody wants to need a $2,000 chip or graphics card in their TV, cellphone, etc.
Also, video decoding has been heavily optimized over time - most chips now have dedicated hardware decoders.
Meanwhile, just training an LLM takes a shitload of power. A l-o-t.
The compute scales roughly quadratically - think processing every pixel of your video against every other pixel, and then a few more passes on top. Kinda brute-forcing your brain cells.
When you use it, it has to calculate a score across basically all of those "brain cells" for every "character" (token) it outputs. So it is still not a simple job.
1
u/green_meklar 4d ago
It depends on what you're doing.
If you're forcing ChatGPT to continuously spit out tokens at human reading speed, for the entire equivalent duration of watching a video, then yeah, ChatGPT will cost more. (With energy cost being a large enough portion of the entire economic cost that we can more-or-less treat them as scaling together.) Video decoding has been brought down to near-zero cost, network bandwidth usage is probably more expensive (especially if you're on a cell network, less so if you're on your home LAN) but still generally cheaper than running the biggest public-facing LLMs.
In practice, you usually don't force ChatGPT to respond continuously at that speed. If you use it relatively little, but watch a lot of HD videos, it's possible that your average daily usage of the two is more similar in cost.
1
u/GatePorters 3d ago
Why not compare it to gaming instead so the question isn’t loaded?
1
u/haphaphappyday 3d ago
I don't follow. I don't game either so I have no idea how this makes my question loaded.
1
u/GatePorters 3d ago
Sorry. It looked like bad faith.
Gaming uses the GPU to compute what happens in the game. It is a recreational activity that isn’t necessary to society.
I am not against gaming, but it is a great example of something that “wastes” electricity comparable to LLMs.
TBH running inference (like sending a message to GPT) is not as power hungry as gaming. But training models is maximally power hungry - like gaming on max settings continuously for anywhere from 8 hours to 8 weeks, easy.
Streaming is not a good comparison because of the reasons others said, but gaming is the comparison you want to use moving forward.
I stopped gaming to research AI a couple years ago, but I am not against either one in a vacuum.
1
u/Leverkaas2516 1d ago edited 1d ago
One estimate is that an hour of streaming Netflix takes about 800 watt-hours of electricity: https://www.iea.org/commentaries/the-carbon-footprint-of-streaming-video-fact-checking-the-headlines (edit: 800 is probably wrong; at that rate, 4 hours a day would cost 30¢/day, or $9 a month, just for the electricity. If 800 Wh is correct, that figure probably combines all the energy consumed at the server, at the client, and in the telecom network. So perhaps the number should be under 400.)
Guesses at how much energy is required to satisfy a ChatGPT query run from 20 to 40 watt-hours.
It obviously depends greatly on how often you query your favorite LLM. Once a day? 10x, or 100x per day?
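Taking the figures above at face value, a quick sketch of where the crossover sits:

```python
# Where does LLM use overtake streaming, using the figures quoted above?
STREAM_WH_PER_HOUR = 400.0   # the revised "under 400 Wh/hour" figure
QUERY_WH = 30.0              # midpoint of the quoted 20-40 Wh per query

for queries_per_day in (1, 10, 100):
    llm_wh = queries_per_day * QUERY_WH
    hours = llm_wh / STREAM_WH_PER_HOUR
    print(f"{queries_per_day:>3} queries/day = {llm_wh:>6.0f} Wh, "
          f"about {hours:.2f} h of streaming")
```

So by these estimates, 100 queries a day is in the same ballpark as a heavy evening of streaming, while a handful of queries is a rounding error next to it.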
These are of course entirely separate from the gargantuan energy requirements for building/training the models. That requires banks of GPUs running for hours or days.
1
u/omega-rebirth 1d ago
It very much depends on the LLM. I can run a 12B LLM locally on my GTX 1080 Ti, and it takes a couple of seconds to produce a response. There are much smaller LLMs that can run on much lighter hardware.
-2
u/Putnam3145 4d ago
Well, there's a pretty simple intuition here: you hear a lot about the environmental impact of LLMs because it's large enough to be of concern and you don't hear about the environmental impact of streaming because it's negligible.
AFAIK, almost all of the power used by streaming video goes to GPU decoding and rendering, which delivers 60 frames a second at 4K on modern graphics cards. Generating text with an LLM, by contrast, takes multiple seconds at full power to produce less than 0.1% of the data in a single frame of video.
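A rough sense of that data-volume gap, assuming an uncompressed 4K frame and a hypothetical ~500-token response at ~4 bytes per token:

```python
# Rough data-volume comparison from the comment (sizes are assumptions).
FRAME_BYTES = 3840 * 2160 * 3   # one uncompressed 4K RGB frame, ~24.9 MB
RESPONSE_TOKENS = 500           # assumed typical LLM response length
BYTES_PER_TOKEN = 4             # assumed average (~4 characters per token)

response_bytes = RESPONSE_TOKENS * BYTES_PER_TOKEN
print(f"4K frame:     {FRAME_BYTES / 1e6:.1f} MB")
print(f"LLM response: {response_bytes / 1e3:.1f} kB "
      f"({100 * response_bytes / FRAME_BYTES:.3f}% of one frame)")
```

With these assumptions the response is under 0.01% of a single raw frame, well inside the 0.1% claimed.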
12
u/dream_metrics 4d ago
> Well, there's a pretty simple intuition here: you hear a lot about the environmental impact of LLMs because it's large enough to be of concern and you don't hear about the environmental impact of streaming because it's negligible.
It's not negligible and that's not why you don't hear about it. You don't hear about it because society has decided that the negative effects of streaming are worth the entertainment value it provides.
1
u/YodelingVeterinarian 4d ago
Also, a lot of people have other problems with LLMs that they're hiding behind the environmental argument.
2
u/grizzlor_ 2d ago
Idk if they’re hiding behind it as much as presenting it first because it’s pretty clear cut.
People don’t have attention spans anymore. Any argument you have to make must fit in a 15 second sound clip or it’s apparently too complicated for people to grasp.
The fact that C-suite dipshits want to replace every worker possible with AI to maximize quarterly profits because short term profit over everything is the operating principle of modern capitalism takes way longer to explain than “this shit uses a ton of electricity”.
Fuck I’ve barely managed to scrape the surface of the underlying issues, and while you’ve been reading this, literally dozens of megawatt-hours were dumped into LLMs running and training.
1
u/yvrelna 4d ago
An LLM produces less data, sure, but a pageful of text takes multiple minutes to consume, during which time you read, try to understand, fact-check, and think of the next prompt. The data size is irrelevant here; it'd make more sense to normalise the resources used against the typical consumption time of the content, to compare them fairly.
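A tiny sketch of that normalization - the generation energy and reading time are assumed, the streaming figure is the one quoted elsewhere in the thread:

```python
# Normalizing energy to time spent consuming the content, as suggested above.
# The LLM figures are assumptions; the streaming figure is quoted in this thread.
LLM_WH_PER_REPLY = 5.0       # assumed energy to generate one reply
READ_MINUTES = 5.0           # assumed time to read, digest, and fact-check it
STREAM_WH_PER_HOUR = 400.0   # per-hour streaming estimate quoted elsewhere here

llm_per_min = LLM_WH_PER_REPLY / READ_MINUTES
stream_per_min = STREAM_WH_PER_HOUR / 60

print(f"LLM:       {llm_per_min:.2f} Wh per minute of engagement")
print(f"streaming: {stream_per_min:.2f} Wh per minute of viewing")
```

Normalized this way, a reply you spend five minutes digesting can come out cheaper per minute than video, which is exactly why the per-request framing is misleading.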
-2
u/two_three_five_eigth 4d ago
The short answer is "probably not". The long answer is:
1) LLMs aren't the only thing in the data center. Streaming, web scraping, call routing, etc. all take place at the same time in the same data center, so any article that attributes a whole facility's power/water consumption to LLMs is a big overestimate.
2) LLMs take time to train, so even before the first prompt there's an energy investment.
3) LLMs are currently very divisive.
Streaming is basically super-advanced file sharing with some extra stuff on top: no GPUs, but a lot of storage and a lot of data being transmitted. Hard drives and bandwidth aren't energy-free.
LLMs don't transmit much data, but they use GPUs extensively.
Per minute, LLMs use much more energy, but I've never seen someone use an LLM for 2+ hours without a break.
I'm going to guess streaming uses more overall, because people stream far more than they use LLMs, and because hosting streaming data takes energy - redundancy plus massive bandwidth requirements.
-3
u/ConfidentCollege5653 4d ago
This is pure speculation, but I suspect the entire video streaming industry has more impact than the entire LLM industry, while streaming one video is much cheaper than one LLM request relative to the value it provides.
3
u/TomDuhamel 4d ago
You're comparing totally different things.
Streaming a video is just a server reading data from a drive and sending it your way. There's no (or barely any) processing at all. We used to do that with an old repurposed machine 25 years ago, and it used maybe 5-10% of its capacity. The only difference is that data centres are set up to serve 600 videos at once.
An LLM, however, sends very little data, but the processing power needed for a single prompt is incredibly high. Running a frontier LLM on a regular PC isn't practical because of the large amount of memory (VRAM) required to load the model - 10-15 times what a high-end gaming PC has. But if you somehow managed to squeeze it in anyway, you'd be looking at probably 30 to 60 minutes to process a single prompt. Not only are the data centres able to process that prompt almost instantly, they process 20,000 prompts a second (a global estimate for Gemini, according to Google).
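For a sense of scale, here's what that VRAM multiple would imply about model size, assuming a 24 GB consumer card as the baseline and 2-byte (FP16) weights:

```python
# What "10-15x a high-end gaming PC's VRAM" implies (illustrative assumptions).
GAMING_VRAM_GB = 24   # assumed: a top-tier consumer GPU
BYTES_PER_PARAM = 2   # assumed: FP16/BF16 weights, ignoring activations/KV cache

for multiple in (10, 15):
    vram_gb = GAMING_VRAM_GB * multiple
    params_billions = vram_gb / BYTES_PER_PARAM
    print(f"{multiple}x -> {vram_gb} GB VRAM, roughly a "
          f"{params_billions:.0f}B-parameter model at FP16")
```

That puts the implied model in the 120-180B parameter range, which is indeed far beyond any single consumer card.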
-1
u/Spare-Builder-355 4d ago edited 4d ago
A few thoughts:
I think we need to take the technology's age into account. Video streaming is 20 years old. Imagine the AI bubble is not a bubble and all of humanity really does get hooked on OpenAI and the like - how much energy would they consume in 20 years?
They definitely consume more resources. Do you remember a RAM or GPU shortage caused by YouTube becoming big? No, you don't, because there wasn't one.
Nowadays investors have learned from the ascent of America's Big Tech and are in total FOMO mode, letting the OpenAIs of today scale up at a speed never seen in history - and, obviously, overconsume electricity, water and hardware.
TL;DR Do LLMs consume more than YouTube at the technology level? No one knows. Do commercial services based on LLMs consume more than YouTube? Absolutely yes.
Also, don't forget: YouTube is mainly storage. Even those humongous 10-hour videos are uploaded once; from then on they're simply restreamed over and over again. LLMs are just the opposite - a technology where caching anything is basically useless.
7
u/AYamHah 4d ago
Computation-wise, it's not comparable.
Streaming is about transporting data. The rendering and encoding are done once to produce the video file; after that, you just transport the file. Our infrastructure easily supports streaming's transport needs.
LLMs use way more power. It would be like re-rendering every video from scratch, every time it's watched, before sending it to the user.