What is the environmental impact of LLM use on the customer's side? Battle ChatGPT VS DeepSeek

Introduction

a. Context

Who hasn’t ever used DeepSeek, ChatGPT, Copilot, Gemini, Mistral…? LLMs (Large Language Models) are becoming an essential part of everyday life, whether you’re a schoolchild, a developer or just a web surfer.

As this Deloitte study points out, these generative AIs have a high environmental cost, particularly on the server side. They currently account for 1.4% of global electricity consumption, a figure expected to triple by 2030. Because the models are very heavy, they consume a lot of power when processing the data and generating the response. For LLM-based generative AI, consumption is therefore high on the server side. It is nevertheless worth measuring the consumption and environmental impact on the user terminal side as well because, for a traditional digital service, the impact is generally much greater on the client side.

This trend is corroborated by other studies, including an analysis by McKinsey published in October 2024, which estimates that demand for AI-enabled data center capacity will increase by an average of 33% per year between 2023 and 2030.

Recently, DeepSeek made a splash in the LLM world, boasting performance similar to the best with a lighter model. But what about on the client side? We propose to measure and compare the performance of DeepSeek applications and ChatGPT, the most popular generative AI today, based on the same user scenario.

b. Measurement perimeter

All the measurements were taken using Greenspector Studio.

All these measurements were carried out on a real terminal, in this case a Samsung Galaxy S10 running Android 12, which corresponds to an entry-level smartphone today.

These measurements were carried out on:

DeepSeek version 1.0.8

ChatGPT version 1.2025.028.

c. Methodology and scenario

To measure the two applications, we used a common user journey for the 2 applications tested:

Opening the application

Logging in to an account (compulsory for DeepSeek and recommended for ChatGPT in order to access all the features)

Write an initial ‘simple prompt’: ‘I’m looking for an internship in digital sustainability. Explain to me in one sentence what this is all about’.

Waiting for the response to the simple prompt

Writing a prompt asking for a 500-word response, called the ‘500-word prompt’: ‘Now go into more detail and explain what digital sustainability is in 500 words without searching the web.’

Waiting for the 500-word prompt response

Writing a prompt asking the same thing as the 500-word prompt but with a web search, called the ‘web prompt’: ‘Now go into more detail about what digital sustainability is in 500 words by searching the web.’

Waiting for the web prompt response

Download a CV from the Internet

Insert a file in the LLM

Note that these different prompts are linked in a single discussion.

Additional methodological note:

We have only included the measurements in which the application actually returned a result. During the measurements, DeepSeek often failed to respond to user requests, probably because the servers were too busy. We therefore had to perform more measurements on DeepSeek to obtain usable results. We also removed one part of the journey initially tested, the response to the file prompt, because DeepSeek did not respond to the request to generate a covering letter.

I. Performance, consumption and environmental costs

a. Application size on the smartphone

The applications are not the same size: DeepSeek is 32.33 MB compared with 76.36 MB for ChatGPT, more than twice the weight of its rival. The same applies to the APK files: DeepSeek’s are twice as light as ChatGPT’s.

Given the number of installations and updates of these two applications, their size is a significant factor in the impact they generate. We’re talking about more than 10 million downloads on the Play Store for DeepSeek and more than 100 million for ChatGPT.

b. Discharge speed

ChatGPT discharges the battery faster when writing a prompt: 102.37 µAh/s compared with 85.51 µAh/s for DeepSeek, figures taken from the simple prompt. This trend holds true for all prompts. Given that writing a prompt takes the same amount of time in both applications, DeepSeek consumes on average 24% less than ChatGPT when writing prompts.
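As a quick check of how such percentages are derived, the sketch below computes the relative difference for the simple prompt only, using the two discharge speeds quoted above. The function name is ours. The 24% average covers all prompts, whose individual values are not reproduced in the text, so it is not expected to match this single-prompt figure.

```python
def relative_saving(reference: float, other: float) -> float:
    """Return how much lower `other` is than `reference`, as a fraction of `reference`."""
    return (reference - other) / reference

# Discharge speeds while writing the simple prompt, in µAh/s (figures from the article).
chatgpt_rate = 102.37
deepseek_rate = 85.51

saving = relative_saving(chatgpt_rate, deepseek_rate)
print(f"Simple prompt: DeepSeek discharges the battery {saving:.1%} more slowly")
```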

DeepSeek discharges the smartphone less quickly than ChatGPT when using the application. This is also confirmed in the prompt response, where on average DeepSeek discharges the smartphone 67% more slowly than its American counterpart.

As is often the case, whether it’s a web or mobile application, one of the most power-hungry pages is the home page/screen, or at least the main page. As can be seen from the table above, when the application is opened, DeepSeek consumes around 42% more than ChatGPT. However, once on the main page, ChatGPT in turn consumes 42% more than DeepSeek.

For the other pauses, there is a variation of between 2% and 7% in favour of one or the other, so there is no significant difference there.

c. Response delay

On both applications, the response time for the same prompt varies greatly from one iteration to the next. For example, on ChatGPT, the measurements range from 14.15 s to 44.66 s for loading the web prompt. DeepSeek, on the other hand, takes between 48.68 s and 1 min 10 s for the same prompt. In general, ChatGPT is significantly quicker to respond, whether for the simple prompt or the 500-word prompt. ChatGPT responds at least twice as quickly for a simple prompt.

We can therefore conclude that, on average, ChatGPT responds to our use cases 2.5 times faster than DeepSeek.

d. Energy consumption

Combining response time and discharge speed, the table above shows that in terms of pure energy, ChatGPT consumes less even though it consumes much more per unit of time. Because its response times are significantly shorter, ChatGPT spends less energy than DeepSeek over the entire journey on the user device: on average 34% less.
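The reasoning here is that the charge drawn during a step is the discharge speed multiplied by the duration of the step. The sketch below illustrates this with hypothetical values chosen only to show the effect. They are not measurements from this study.

```python
def energy_uah(rate_uah_per_s: float, duration_s: float) -> float:
    """Charge drawn over a step: discharge speed multiplied by duration."""
    return rate_uah_per_s * duration_s

# Hypothetical values for illustration only, not measurements from this study.
fast_rate, fast_duration = 150.0, 20.0   # higher discharge speed, shorter response
slow_rate, slow_duration = 100.0, 50.0   # lower discharge speed, longer response

fast_app = energy_uah(fast_rate, fast_duration)
slow_app = energy_uah(slow_rate, slow_duration)
print(f"fast app: {fast_app:.0f} µAh, slow app: {slow_app:.0f} µAh")

# The faster app consumes less overall as long as its discharge-speed ratio
# stays below the response-time ratio.
rate_ratio = fast_rate / slow_rate
time_ratio = slow_duration / fast_duration
print("faster app consumes less:", rate_ratio < time_ratio)
```

In other words, an application that answers 2.5 times faster can discharge the battery up to 2.5 times faster per second before it loses its overall advantage.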

e. Data flow on the network

ChatGPT exchanges no data with the servers while the first prompt of a discussion is being written. It then exchanges several kilobytes with the servers for the 500-word prompt and the file prompt. Finally, it is particularly data-hungry when writing the web prompt, going up to 62 KB.

DeepSeek, on the other hand, exchanges only a few kilobytes (5.2 KB) while the first prompt is being written and then exchanges less data with the servers than ChatGPT when writing the following prompts, on average 90% less.

We can see from the measurements above that, overall, DeepSeek uses more data to answer than ChatGPT. The Chinese AI tends to write longer answers than ChatGPT for the same question. The exception is the web prompt, where ChatGPT uses more network data.

Figure: DeepSeek, response to the 500-word prompt (data sent, data received)

Figure: ChatGPT, response to the 500-word prompt (data sent, data received)

If we look a little more closely at how data is transmitted while a response is being received, we can see from the graph above that several packets arrive over the course of the response, as it is displayed. Each time a packet is received, the application sends one in turn to acknowledge it. This places regular demands on the smartphone’s radio module and on the update mechanisms of the response page, which consumes a lot of energy.

If we study the JSON files of the packets received, we can see that ChatGPT’s JSONs are much lighter, due to the larger size of the tokens. Each token contains up to 42 characters, compared with a maximum of just 5 characters for DeepSeek.
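To see why larger tokens lead to lighter JSON, the sketch below streams the same text in chunks of at most 5 and at most 42 characters, wrapping each chunk in a small JSON envelope. The envelope fields (`index`, `delta`) are hypothetical, since the actual packet formats of the two services are not described here. Only the 5 and 42 character limits come from the measurements.

```python
import json

def streamed_size(text: str, chunk_chars: int) -> int:
    """Total bytes needed to stream `text` as JSON-wrapped chunks of `chunk_chars` characters."""
    total = 0
    for index, start in enumerate(range(0, len(text), chunk_chars)):
        # Hypothetical envelope: the real field names used by either service are unknown.
        packet = {"index": index, "delta": text[start:start + chunk_chars]}
        total += len(json.dumps(packet).encode("utf-8"))
    return total

# Sample text standing in for a long answer (illustrative, not a real response).
answer = "Digital sustainability aims to reduce the footprint of digital services. " * 40

small_chunks = streamed_size(answer, 5)    # at most 5 characters per chunk
large_chunks = streamed_size(answer, 42)   # at most 42 characters per chunk
print(small_chunks, large_chunks, round(small_chunks / large_chunks, 2))
```

The fixed cost of the envelope is paid once per chunk, so the smaller the chunks, the larger the share of the traffic that carries no text at all.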

For DeepSeek, launching the application requires more data than for ChatGPT. However, once the application has fully launched, there are only small, ‘reasonable’ data flows. DeepSeek therefore manages data flows better than ChatGPT.

ChatGPT is much more demanding in terms of data. It consumes less when the application is opened, but then during each pause (display without interaction) that follows a response, there is an abnormally high flow for a pause stage.

f. CPU

We’ve noticed that DeepSeek discharges the battery less quickly than ChatGPT, and we’ve also noticed that it consumes less CPU. As can be seen from the graph above, ChatGPT is more CPU-intensive than DeepSeek for each response. The difference is particularly significant on the web prompt: ChatGPT uses 23.8% of the CPU, whereas DeepSeek uses only 8.9% for the same action.

However, as explained earlier, ChatGPT uses this CPU for less time, which allows us to draw the same conclusion as for energy. ChatGPT is more demanding in terms of CPU per unit of time, but over the whole response it consumes less than DeepSeek. In fact, using more CPU means higher battery consumption, so the two issues are linked.

g. Environmental impact

In our case, around 8% of ChatGPT’s carbon footprint is on the client side.

Methodology for projecting environmental impacts

By projecting these flow metrics over a perimeter that does not consider the impact on the Datacenter side, but only the impact on the network and client workstations, we obtain the following data:

As we can see, the energy impact is to ChatGPT’s advantage (10%). However, whether for ChatGPT or DeepSeek, the impact remains very high for just three responses, with an average of more than one gram of CO2 per response. This corresponds to a video lasting around 2 minutes for ChatGPT and 2 minutes 30 seconds for DeepSeek.

Daily environmental impact on smartphones

However, on a large scale, ChatGPT is significantly less polluting. For example, for 100 million uses, i.e. 300 million responses, DeepSeek has an impact that is 40 tonnes greater on the customer side. According to OpenAI, ChatGPT receives 1 billion requests per day, 48% of which are on mobile phones. On average, therefore, ChatGPT emits 560 tonnes of CO2 per day on the smartphone side alone.
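The 560 tonnes can be reproduced from figures given in this article, as sketched below. The calculation assumes that each daily request corresponds to one response with the average client-side impact of our journey (3.5 gEqCO2 for three responses). This is our reading of the projection, not a documented formula.

```python
# Figures taken from the article.
requests_per_day = 1_000_000_000   # ChatGPT requests per day, according to OpenAI
mobile_share = 0.48                # share of requests made on mobile phones
journey_g_co2e = 3.5               # client-side impact of the measured journey (gEqCO2)
responses_per_journey = 3          # simple, 500-word and web prompts

# Assumption: every request costs the average impact of one response in our journey.
g_per_response = journey_g_co2e / responses_per_journey
daily_tonnes = requests_per_day * mobile_share * g_per_response / 1_000_000  # grams to tonnes

print(f"{g_per_response:.2f} g per response, {daily_tonnes:.0f} tonnes per day")
```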

II. Accessibility and privacy

a. Accessibility and inclusion

Visually, the two applications are similar, with buttons of the same size and screens laid out in the same way. One criticism is that the contrast is sometimes too low, and the tap targets are sometimes too small.

The difference lies in accessibility for the visually impaired or blind. During the automation process, we noticed that the various elements of the DeepSeek layout did not yet have a description, identifier or any other element that would allow us to distinguish them when reading the page. As well as making it more difficult to automate the application, this is above all a problem for the visually impaired. Their assistive software relies on this content to provide a description of the page and enables them to use these applications. This is a bad practice that should be banned to enable the inclusion of as many people as possible.

Regarding the inclusion of older versions of Android, DeepSeek requires at least version 5.0 of Android, which covers 99.7% of potential users worldwide, and ChatGPT requires version 6.0 of Android, which covers 98.4% of potential users.

In terms of accessibility, DeepSeek is lighter and available to slightly more users than ChatGPT, but poses a major problem for the visually impaired.

b. Suspicious authorisations

Both applications offer features that rely on the device’s camera and microphone, which justifies requesting these permissions.

More surprisingly, ChatGPT requires access to location, contacts and even the calendar. By exploring the application’s APK files, we can see that it also detects the taking of screenshots (when the application is open) and that it also requests access to the Bluetooth connection.

DeepSeek is more reasonable: apart from the camera and microphone, it doesn’t ask for any additional permission.

Conclusion

As we have seen throughout our analysis, DeepSeek discharges the battery less quickly, uses less CPU per unit of time and transmits less data, but has much higher response times. These response times are the main reason why it uses more energy than ChatGPT over the whole of the journey. DeepSeek’s lack of accessibility for the visually impaired is clearly a problem for its use.

In the observed use case, it is clear that the environmental impact of AI is greater on the server side. To give an order of magnitude, the value for ChatGPT for the journey tested is 45.48 gEqCO2, according to the EcoLogits site, whereas on the user terminals it is only 3.5 gEqCO2, or about 8%.
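The order of magnitude of this share can be checked with the two values quoted above. The sketch computes it both relative to the server-side figure and as a share of the combined footprint, since the text does not specify which denominator is used.

```python
server_g = 45.48   # server-side estimate for the journey (gEqCO2, figure quoted from EcoLogits)
client_g = 3.5     # user terminal and network side (gEqCO2)

ratio_to_server = client_g / server_g               # client impact relative to the server figure
share_of_total = client_g / (server_g + client_g)   # client share of the combined footprint

print(f"{ratio_to_server:.1%} of the server figure, {share_of_total:.1%} of the total")
```

Both readings give between 7% and 8%, consistent with the rounded figure of around 8%.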

We can therefore deduce that, at present, at the level of the user terminal (and network), ChatGPT has a lower environmental impact than DeepSeek in terms of greenhouse gas emissions. ChatGPT’s slight advantage is due to its lower energy consumption. However, we need to keep an eye on the development of DeepSeek’s response times, because if the Blue Whale application improves on this point, it could easily consume less energy than ChatGPT at the local device level and have less impact on the client side.

Summary of figures:

For more articles on AI: What is the environmental impact of local AI on our smartphones?
