I don't think this quite captures the problem: even if the code is functional and proven to work, it can still be bad in many other ways.
The submitter should understand how it works and be able to 'own' and review modifications to it. That's cognitive work submitters skip by offloading the understanding to an LLM, and it's the actual hard work that reviewers and future programmers have to do instead.
I wonder how much their revenue really ends up contributing towards covering their costs.
In my mind, they're hardly making any money compared to how much they're spending. They're relying on future model and efficiency gains to eventually bring costs down, while pursuing user growth and engagement above all else -- the more queries they get, the more data they get, and the bigger a data moat they can build.
It almost certainly is not. Until we know what the useful life of NVIDIA GPUs is, it's impossible to determine whether this is profitable or not.
The depreciation schedule isn't as big a factor as you'd think.
The marginal cost of an API call is small relative to what users pay, and utilization rates at scale are pretty high. You don't need perfect certainty about GPU lifespan to see that the spread between cost-per-token and revenue-per-token leaves a lot of room.
And datacenter GPUs have been running inference workloads for years now, so companies have a good idea of rates of failure and obsolescence. They're not throwing away two-year-old chips.
> The marginal cost of an API call is small relative to what users pay, and utilization rates at scale are pretty high.
How do you know this?
> You don't need perfect certainty about GPU lifespan to see that the spread between cost-per-token and revenue-per-token leaves a lot of room.
You can't even speculate about this spread without at least a rough idea of cost-per-token. Right now, any cost-per-token figure is pure paper math.
> And datacenter GPUs have been running inference workloads for years now,
And inference resource intensity is a moving target. What happens when a new model comes out that requires 2x the resources?
> They're not throwing away two-year-old chips.
Maybe, but they'll be replaced either (a) by a higher-performance GPU that can deliver the same results with less energy, less physical space, and less cooling, or (b) when the extended support costs become financially untenable.
>> "In my mind, they're hardly making any money compared to how much they're spending"
> everyone seems to assume this, but its not like its a company run by dummies, or has dummy investors.
It has nothing to do with their management or investors being "dummies" but the numbers are the numbers.
OpenAI has data center rental costs approaching $620 billion, a figure expected to rise to $1.4 trillion by 2033.
Annualized revenue is expected to be "only" $20 billion this year.
$1.4 trillion is 70x current revenue.
So unless they execute their strategy perfectly, hit all of their projections, and get lucky enough that neither the stock market nor the economy collapses, making a profit in the foreseeable future is highly unlikely.
They are drowning in debt and keep resorting to more and more ridiculous schemes to raise money.
--- start quote ---
OpenAI has made $1.4 trillion in commitments to procure the energy and computing power it needs to fuel its operations in the future. But it has previously disclosed that it expects to make only $20 billion in revenues this year. And a recent analysis by HSBC concluded that even if the company is making more than $200 billion by 2030, it will still need to find a further $207 billion in funding to stay in business.
--- end quote ---
To me it seems that they're banking on it becoming indispensable. Right now I could go back to pre-AI and be a little disappointed but otherwise fine. I figure all of these AI companies are in a race to make themselves part of everyone's core workflow in life, like clothing or a smart phone, such that we don't have much of a choice as to whether we use it or not - it just IS.
That's what the investors are chasing, in my opinion.
It'll never be literally indispensable, because open models exist - either served by third-party providers, or even run locally in a homelab setup. A nice thing that's arguably unique about the latter is that you can trade scale for latency - you get to run much larger models on the same hardware if they can chug on the answer overnight (with offload to fast SSD for bulk storage of parameters and activations) instead of just answering on the spot. Large providers don't want to do this, because keeping your query's activations around is just too expensive when scaled to many users.
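To make the offload idea concrete, here's a toy sketch of layer-by-layer streaming from SSD; the file layout and names are made up, and real offloading runtimes are far more sophisticated than this:

```python
import numpy as np

def run_offloaded(x: np.ndarray, layer_files: list[str]) -> np.ndarray:
    """Keep only one layer's weights in RAM at a time, paging the rest from SSD."""
    for path in layer_files:
        w = np.load(path, mmap_mode="r")  # weights stay on disk, paged in on demand
        x = np.maximum(x @ w, 0.0)        # one dense layer + ReLU
        del w                             # release the mapping before the next layer
    return x

# Usage (hypothetical): a model far larger than RAM still runs, it just takes hours.
# out = run_offloaded(embedded_tokens, [f"layer_{i:03d}.npy" for i in range(96)])
```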
I haven't seen things work like this in practice, where heavy AI users end up generating a solution and then later grasping it and learning from it with any kind of effectiveness or deep understanding.
It's like reading the solution to a math proof instead of proving it yourself. Or reading a summary of a book instead of the book itself. The effort of exploring the design space and choosing a particular solution doesn't exist; you only see the result, not the other ways it could've been. You don't get a feedback loop to learn from either, since that'll be AI generated too.
It's true there's nothing stopping someone from going back and trying to solve it themselves to get the same kind of learning, but learning the bugfix (or whatever change) by studying it once in place just isn't the same.
And things don't work like that in practice any more than "we'll add tests later" ends up being followed through on with any regularity. If you fix a bug, the next thing for you to do is to fix another bug, or build another feature, write another doc, etc., not dwell on work that was already 'done'.
Ironically, AI is really good at the adding tests later thing. It can really help round out test coverage for a piece of code and create some reusable stuff that can inspire you to test even more.
I’m not a super heavy AI user, but I’ve vibe coded a few things for the frontend with it. It has helped me understand a little better how you lay out React apps and how the Legos that React gives you work. Probably far less than if I had done it from scratch and read a book, but sometimes a working prototype is so much more valuable to a product initiative than learning a programming language that you would be absolutely burning time and value not to vibe code the prototype.
> Instead of spending three hours figuring out which API to use, they spend twenty minutes evaluating options the AI surfaced
This really isn't the case from what I've seen. It's that they use Cursor or other code generation tools integrated into their development environment to generate code, and if it's functional and looks from a fuzzy distance like 'good' code (in the 'code in the small' sense), they send an oversized PR, and it's up to the reviewer to actually do the thinking.
This. I have seen MRs with generated OpenCV LUT mapping in them because a junior didn't understand that what they needed was a simple interpolation function.
The crux is always that you don't know what you don't know. AI doesn't fix this.
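To make the contrast concrete, here's a hypothetical sketch (the actual MR isn't shown, so the curve and names are invented) of the one-liner that was needed next to the kind of LUT detour that got generated:

```python
import numpy as np

# Calibration curve defined by a few control points (invented for illustration).
control_x = np.array([0.0, 0.25, 0.5, 1.0])
control_y = np.array([0.0, 0.10, 0.45, 1.0])

# What was actually needed: piecewise-linear interpolation, one call to numpy.
def calibrate(values: np.ndarray) -> np.ndarray:
    return np.interp(values, control_x, control_y)

# Roughly what the generated MR did instead: bake the same curve into a
# 256-entry table and push an 8-bit image through OpenCV's LUT machinery.
# import cv2
# lut = (np.interp(np.arange(256) / 255.0, control_x, control_y) * 255).astype(np.uint8)
# mapped = cv2.LUT(image_u8, lut)
```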
There was Chainer, which originated the define-by-run model that characterized PyTorch’s effectiveness. It was developed by a much smaller, much less influential company in Japan. Early PyTorch is transparent about the debt owed to Chainer.
Thanks. Yes, I remember Chainer, but only vaguely. I kinda remember looking at it, but not actually using it.
My recollection is that when I looked at Chainer back then, it didn't offer a comprehensive library of preexisting components for deep learning. When I tried PyTorch, on the other hand, I vividly remember it as already having lots of prebuilt components (common layers, activation functions, etc.) in `torch.nn`, so it was easier and faster to get going.
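For context, this is roughly the sort of prebuilt component I mean; a minimal sketch using today's `torch.nn`, not a faithful reproduction of the 2016/2017 API:

```python
import torch
import torch.nn as nn

# Common layers and activations ship with the library, so a small model is
# just a few lines of composition.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(256, 10),
)

logits = model(torch.randn(32, 784))  # forward pass on a dummy batch
print(logits.shape)                   # torch.Size([32, 10])
```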
Yes, exactly—not many people know about Chainer nowadays. Back in 2016, PyTorch's interface was actually inferior to Chainer's, and I think Chainer's design was really ahead of its time.
There's a bunch of milestones; for me the standout one was managers starting to abuse marking tickets for large events as "secret" to stop people from reading about their screwups. Someone leaked that the cause of some large AWS outage was someone oopsing some CLI command, and it seemed to trigger a pretty large shift.
We are remote. He’s been told at least a few times.
I can do this in a remote 1:1, but it definitely doesn’t feel the same.
We have a team onsite in a few weeks. I could do it then. I still think it’s more likely than not that things will go sour, especially since the rest of the team will actually be there, and I’ll be letting him know a lot of the team has had issues with him.
The rest of the team is extremely passive and conflict-averse. It’ll be awkward then.
I can try a remote 1:1 since I think there are more ways to keep a cap on this.
The very fact that I have to tiptoe around this makes me realize this guy is super volatile and angry. Even reducing our 1:1 cadence, which I did for all team members simply due to too much meeting load on me, led him to cancel them altogether.
Keep calm. You have time and the team on your side, if not management. Don’t let him drag you down; he will only continue to sabotage himself. All you need to do is mitigate risk and update management on his misdoings.
Frankly, I think if you asked the difficult engineer, he'd say we're also stuck in a culture of live and let live vs. excellence, and that he's on the side of excellence, and his teammates just aren't good enough.
I’m planning on making it clear we should do this. I’m expecting push-back. I’m just not going to entertain it; I’ll rely on my manager to handle scheduling while I take part in technical review. I’m exhausted from having to fight this guy to do what’s right.
It probably won’t be him who writes it but he’ll have to be involved at least as a code reviewer for the AIs. I’m sure he’s going to complain and try to make it look unnecessary. I’m mostly done trying to make things work with this guy. I’ll move up to my skip if I keep getting hassled like this.