Do you have a source for that?

gitremote · 2025-08-09T14:49:50 1754750990

When an end user asks ChatGPT a question, the chatbot application sends the system prompt, user prompt, and context as input tokens to an inference API, and the LLM generates output tokens for the inference API response.

GPT API inference cost (for developers) is per token (sum of input tokens, cached input tokens, and output tokens per 1M used).

https://openai.com/api/pricing/

https://azure.microsoft.com/en-us/pricing/details/cognitive-...

(Inference cost is charged per token even for free models like Meta LLaMa and DeepSeek-R1 on Amazon Bedrock. https://aws.amazon.com/bedrock/pricing/ )

ChatGPT Pro subscription pricing (the chatbot for end users) is $200/month

https://openai.com/chatgpt/pricing/

"insane thing: we are currently losing money on openai pro subscriptions!

people use it much more than we expected."

- Sam Altman, January 6, 2025

https://xcancel.com/sama/status/1876104315296968813

Again, this means that the average ChatGPT Pro end user's chattiness cost OpenAI too much inference (too many input and output tokens sent and received, respectively, for inference) per month than would be balanced out by OpenAI receiving $200/month in revenue from the average Pro user.

The analogy is like Netflix losing money on their subscriptions because their users watch too much streaming, so they ban account sharing, causing many users to cancel their subscriptions, but this actually helps them become profitable, because the extra users using their service too much generated more costs than revenue.