
GPT-3 was a 175-billion-parameter model. All the big boys are now doing trillions of parameters without a substantial increase in chip efficiency. So we are talking about thousands of tons of carbon per model, repeated every year or two or however fast they become obsolete. On top of that we need to add the embedded carbon in the entire hardware stack and datacenter; it quickly adds up.

If it's just a handful of companies doing it, fine, it's negligible versus the benefits. If it starts to chase the marginal cost of the resources it requires, so that every mid-to-large company feels that a few million dollars spent training a model on their own dataset buys them a competitive advantage, then it quickly spirals out of control, hence the cryptocoin analogy. That's exactly what many AI startups are proposing.
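
To put a rough number on the "thousands of tons" claim, here is a back-of-envelope sketch. Every input below (GPU count, power draw, training time, grid carbon intensity) is an illustrative assumption, not a measured figure for any particular model:

```python
# Back-of-envelope estimate of the carbon footprint of one large training run.
# All inputs are illustrative assumptions, not measured values.

gpu_count = 10_000          # assumed number of accelerators
gpu_power_kw = 0.7          # assumed average draw per accelerator, kW
pue = 1.2                   # assumed datacenter power usage effectiveness
training_days = 90          # assumed wall-clock training time
kg_co2_per_kwh = 0.4        # assumed grid carbon intensity

energy_kwh = gpu_count * gpu_power_kw * pue * training_days * 24
co2_tons = energy_kwh * kg_co2_per_kwh / 1000

print(f"{energy_kwh / 1e6:.1f} GWh, {co2_tons:,.0f} t CO2")
# -> 18.1 GWh, 7,258 t CO2 under these assumptions,
#    i.e. squarely in the "thousands of tons per model" ballpark.
```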



AI models don’t care if the electricity comes from renewable sources. Renewables are cheaper than fossil fuels at this point and getting cheaper still. I feel a lot better about a world where we consume 10x the energy but it comes from renewables than one where we only consume 2x but the lack of demand limits investment in renewables.


It's also a great load to support with renewables because you can always do training as "bulk operations" on the margins.

Just do them when renewable supply is high and demand is low; that energy can't be stored and would have been wasted anyway.
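
A minimal sketch of what training "on the margins" could look like: checkpoint and pause when grid carbon intensity (or price) is high, and resume when renewable surplus pushes it down. The carbon-intensity feed, thresholds and helper names here are hypothetical placeholders, not a real API:

```python
import random
import time

# Hypothetical grid-carbon-intensity feed (gCO2/kWh); a real deployment would
# query the local grid operator's data service instead of this simulation.
def get_grid_carbon_intensity() -> float:
    return random.uniform(50, 500)

PAUSE_ABOVE = 300    # assumed threshold: grid is dirty, stop drawing power
RESUME_BELOW = 150   # assumed threshold: renewable surplus, resume training

def carbon_aware_training(train_one_step, save_checkpoint, poll_seconds=600):
    """Run training steps only while the grid is clean; idle otherwise."""
    paused = False
    while True:
        intensity = get_grid_carbon_intensity()
        if not paused and intensity > PAUSE_ABOVE:
            save_checkpoint()     # persist state before idling the cluster
            paused = True
        elif paused and intensity < RESUME_BELOW:
            paused = False        # cheap, clean power is back
        if paused:
            time.sleep(poll_seconds)
        else:
            train_one_step()
```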


This is a complete fantasy, as the depreciation cost of the hardware outweighs the price of the electricity it runs on.

Again, look at bitcoin mining: the miners will happily pay any carbon tax to run 24/7. It's better to run the farm, cover the electricity bill and make a few pennies, than to keep it off and still eat the depreciation cost.


Especially if one were to only run the servers during the daytime, when they can be powered directly from photovoltaics.


Which isn't going to happen, because you want to amortize these cards over 24 hours per day, not just when the renewables are shining or blowing.


We currently don't live in a world where renewable energy is available in excess.


This is a dangerous fantasy. Everything we know about the decarbonization of the grid suggests that conservation is a key strategy for the coming decades. There is no credible scenario towards 100% renewables. Storage is insanely expensive, and green load-smoothing capacity such as hydro and biomass is naturally limited. So a substantial part of the production when renewables drop will be handled by natural gas, which seems to have emissions comparable to coal once you factor in leaked methane, fracked methane in particular.

In addition, even if 100% renewable were attainable, it would still require massive infrastructure investment, resource use and associated emissions, since most of the corresponding industries, such as concrete and steel production, aluminum and copper ore mining and refining, etc., are very far from net zero and will stay that way for decades.

To throw into this planet-sized bonfire a large uninterruptible consumer, whose standby capital depreciation on things like state-of-the-art datacenters far exceeds the cost most industries are willing to pay for energy, all predicated on the idea that "demand spurs renewable investment", is frankly idiotic.


Sure there is: nuclear is zero-emission and renewable, and powers 70% of the French electric grid. Uranium in the oceans is thought to replenish itself, but even if that turned out not to be true, it should be good for at least a few hundred thousand years. It would require a massive infrastructure investment, so now would be a good time to get started.


Sounds like we'll have to adjust the price of non-renewables to reflect total cost, not just extraction, transportation, and generation cost.


The average American family is responsible for something like 50 tons of CO2 per year. The carbon of one family for a decade is nothing compared to the benefits. The carbon of 1,000 families for a decade is also approximately nothing compared to the benefits. It's just not relevant in the scheme of our economy.

There aren't that many base models, and fine-tunes take very little energy to perform.
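
For scale, here's the arithmetic behind that comparison, taking the per-family figure above and a training-run footprint in the "thousands of tons" range cited upthread (both rough assumptions):

```python
family_tons_per_year = 50      # rough US per-household figure cited above
training_run_tons = 7_000      # assumed footprint of one large training run

print(1_000 * family_tons_per_year * 10)         # 500,000 t: 1,000 families over a decade
print(training_run_tons / family_tons_per_year)  # 140.0: one run ~ 140 family-years of emissions
```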


> GPT-3 was a 175-billion-parameter model. All the big boys are now doing trillions of parameters without a substantial increase in chip efficiency.

It's likely not the model size that's bigger, but the training corpus (see the 15T tokens for Llama 3). I doubt anyone has a model with "trillions" of parameters right now; one trillion maybe, as rumored for GPT-4. But even for GPT-4 I'm skeptical about the rumors, given the inference cost of super-large models and the fact that the biggest lesson we got since Llama is that increasing the training corpus alone is enough for a performance increase, at a reduced inference cost.

Edit: that doesn't change your underlying argument though: whether it's the parameter count that increases while staying at a "Chinchilla-optimal" level of training, or the training time that increases, there's still a massive increase in energy spent on training.
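
One way to see why corpus size matters as much as parameter count: dense-transformer training compute is commonly approximated as ~6·N·D FLOPs for N parameters and D training tokens. A quick sketch using the figures mentioned in this thread (the Llama 3 token count as cited above; GPT-3's from its paper):

```python
def train_flops(params: float, tokens: float) -> float:
    """Common ~6 * N * D approximation of dense-transformer training FLOPs."""
    return 6 * params * tokens

gpt3 = train_flops(175e9, 300e9)       # GPT-3: 175B params, ~300B tokens
llama3_70b = train_flops(70e9, 15e12)  # Llama 3 70B: fewer params, ~15T tokens

print(f"GPT-3:       {gpt3:.2e} FLOPs")       # -> 3.15e+23
print(f"Llama 3 70B: {llama3_70b:.2e} FLOPs") # -> 6.30e+24, ~20x more despite 2.5x fewer params
```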



