Not that it invalidates anything you said, but it was 750ms vs 200 microseconds....

pjvsvsrtrxc · on June 16, 2022

> But here's the thing: it's cheaper to waste thousands of CPU cores on bad performance than to have an engineer spend a day optimizing it.

No, it really isn't. It's only cheaper for the company making the software (and only if they don't use their software extensively, at that).

cgriswald · on June 16, 2022

Exactly. Users are subsidizing the software provider with CPU cycles and employee time.

Assume it costs $800 for an engineer-day. Assume your software has 10,000 daily users and that the wasted time cost is 20 seconds (assume this is actual wasted time when an employee is actively waiting and not completing some other task). Assume the employees using the software earn on average 1/8 of what the engineer makes. It would take less than 4 days to make up for the employee's time. That $800 would save about $80,000 per year.

Obviously, this is a contrived example, but I think it's a conservative one. I'm overpaying the engineer (on average) and probably under-estimating time wasted and user cost.

knorker · on June 16, 2022

If humans wait, yes. If you can just buy another server: no.

I 100% agree on saving human time. Human time is expensive. CPU time is absolutely not.

pjvsvsrtrxc · on June 16, 2022

Servers are expensive, too. Humans waiting on servers to process something is even more expensive. No software runs in a vacuum; someone is waiting on it somewhere.

Adding more servers doesn't generally make things faster (latency). It only raises capacity (bandwidth). It does, however, generally cost quite a bit on development. Just about the only thing worse than designing a complex system is designing a complex distributed system.

knorker · on June 17, 2022

I'm of course aware of all this.

If you don't want to take the advise of running the numbers that's up to you.

E.g. if end user latency is 10ms (and it's not voip or VR or something) then that's fast enough. Doesn't matter if it's optimizable to 10 us.

If this is code running on your million CPU farm 24/7, then yeah. But always run the numbers first.

Like I said, the vast majority of code optimization opportunities are not worth taking. Some are, but only after running the numbers.

On the flip side optimizing for human time is almost always worth it, be it end users or other developers.

But run the numbers for your company. How much does a CPU core cost per hour of it's lifetime? Your developers cost maybe $100, but maybe $1000 in opportunity cost.

Depending on what you do a server may cost you as much as one day of developer opportunity time. And then you have the server for years. (Subject to electricity)

Latency and throughput may be better solved by adding machines.

loup-vaillant · on June 17, 2022

> Like I said, the vast majority of code optimization opportunities are not worth taking. Some are, but only after running the numbers.

Casey Muratori said it best: there are 3 philosophies of optimisation. You're talking about the first: actual optimisation where you measure and decide what to tackle. It's rarely used, and with good reason.

The second philosophy however is very different: it's non-pessimisation. That is, avoid having the CPU do useless work all the time. That one should be applied in a fairly systematic basis, and it's not. To apply it in practice you need to have an idea of how much time your algorithm requires. Count how many bytes are processed, how many operations are made… this should give a nice upper bound on performance. If you're within an order of magnitude of this theoretical maximum, you're probably good. Otherwise you probably missed something.

The third philosophy is fake optimisation: heuristics misapplied out of context. This one should never be used, but is more frequent than we care to admit.

knorker · on June 17, 2022

I'm actually also talking about the second.

> avoid having the CPU do useless work all the time

It's not worth an engineer spending 1h a year even investigating this, if it's less than 20 CPU cores doing useless work.

The break even for putting someone full time on this is if you can expect them to save about fourty thousand CPU cores.

YMMV. Maybe you're a bank who has to have everything under physical control, and you are out of DC floor space, power budget, or physical machines.

There are other cases too. Maybe something is inherently serial, and the freshness of a pipeline's output has business value. (e.g. weather predictions for tomorrow are useless the day after tomorrow)

But if you're saying that this second way of optimizing is that things should be fast for its own sake, then you are not adding maximum value to the business, or the mission.

Performance is an instrumental goal of an effort. It's not the ultimate goal, and should not be confused for it.

loup-vaillant · on June 18, 2022

In the specific case of batch processing, I hear you. Machine time is extremely cheap compared to engineer time.

Then there are interactive programs. With a human potentially waiting on it. Someone's whose time may be just as valuable as the engineer's time (morally that's 1/1, but even financially the difference is rarely more than a single order of magnitude). If you have as few as 100 users, shaving off seconds off their work is quickly worth a good chunk of your time.

Machine time is cheap, but don't forget that user's time is not.

knorker · on June 18, 2022

I'm up there on the mound preaching the same thing, trust me.

inkblotuniverse · on June 17, 2022

You should, however, not pessimize. People make cargo-cult architecture choices that bloat their codebase, make itnless readable, and make it 100x slower.

knorker · on June 17, 2022

Maybe.

Using actual numbers vetted by actual expenses in an actual company, if you can save 100 CPU cores by spending 3h a year keeping it optimized, then it is NOT worth it.

It is cheaper to burn CPU, even if you could spend one day a year making it max out one CPU core instead of 100.

It can be better for the business to cargo cult.

Not always. But you should remember that the point of the code is to solve a problem, at a low cost. Reducing complexity reduces engineer cost in the future and may also make things faster.

Put it this way: Would you hire someone at $300k doing nothing but optimizing your pipeline so that it takes one machine instead of one rack, or would you spend half that money (TCO over its lifetime) just buying a rack of machines?

If you wouldn't hire them to do it, then you shouldn't spend current engineers time doing it.

inkblotuniverse · on June 18, 2022

I wasn't talking about optimization! I was talking about non-pessimization, which includes not prematurely abstracting/generalizing your code.

I've seen people making poor decisions at the outset, and having code philosophies that actively make new code 100x slower without any clear gain. Over-generalization, 100 classes and subclasses, everything is an overriden virtual method, dogmatic TDD (luckily, nobody followed that.)

The dogma was to make things more complicated and illegible, 'because SOLID'.

knorker · on June 16, 2022

It depends.

Run the lifetime cost of a CPU, and compare it to what you pay your engineers. It's shocking how much RAM and CPU you can get for the price of an hour of engineer time.

And that's not even all! Next time someone reads the code, if it's "clever" (but much much faster) then that's more human time spent.

And if it has a bug because it sacrificed some simplicity? That's human hours or days.

And that's not even all. There's the opportunity cost of that engineer. They cost $100 an hour. They could spend an hour optimizing $50 worth of computer resources, or they could implement 0.1% of a feature that unlocks a million dollar deal.

Then having them optimize is not just a $50 loss, it's a $900 opportunity cost.

But yeah, shipped software like shrinkwrapped or JS running on client browsers, that's just having someone else pay for it.

(which, for the company, has even less cost)

But on the server side: yes, in most cases it's cheaper to get another server than to make the software twice as fast.

Not always. But don't prematurely optimize. Run the numbers.

One thing where it really does matter is when it'll run on battery power. Performance equals battery time. You can't just buy another CPU for that.

gpderetta · on June 17, 2022

> Next time someone reads the code, if it's "clever" (but much much faster) then that's more human time spent.

yet piles and piles of abstractions are considered acceptable and even desirable while having significant negative effects on code readability.

knorker · on June 17, 2022

Yeah, it doesn't have a simple answer that works for all cases.

Say you need to do some data processing from format A to B. There's already a maintained codebase for converting from A to C, C to D, and a service that converts individual elements from D to A. All steps require storing back onto disk.

For a one-time thing it'll be MUCH cheaper to do it the naive way reusing existing high level blocks, and going to lunch (or vacation), and let it run.

For a recurring thing, or a pipeline with latency requirements, maybe it's worth building a converter from A to B.

Or… it could be cheaper to just shard A and run it on 20 CPUs.

Let's say you have the expensive piles of abstraction, and creating huge waste. At my company one HOUR of engineer time costs about the same as 20 CPUs running for A YEAR.

This means that if you reduce CPU use by 20 cores, forever, then ROI takes a full year. Including debugging, productionizing, and maintenance you pretty much can't do anything in 1h.

Likely your A-to-B converter could take 1h of human time just in ongoing costs like release management.

And to your point about code readability: Sometimes the ugly solution (A-C-D-B) is the one with less code. If you needed the A->C, C->D, D->A components anyway, then writing an A->B converter is just more code, with its potential readability problems.

On the flip side of this: It's been a trend for a long time in web development to just add layers of frameworks and it's now "perfectly normal" for a website to take 10s to load. Like what the fuck, blogspot, how do you even get to the point where you realize you need a "loading" animation, and instead of fixing the problem you actually do add one.

Human lifetimes have been spent looking at just blogspot's cogs spinning.

lmm · on June 17, 2022

If faster software was worth anything to people surely they'd pay for it.

Dylan16807 · on June 17, 2022

That only applies to homo economicus.

knorker · on June 17, 2022

And ones with a choice.

Given the choice between program X and program X plus higher speed at higher cost, some will choose the latter.

But that's never the choice. All else is not equal.

corrral · on June 17, 2022

We shouldn't let people obtain CS degrees until they've had to write at least one fairly-complex program on a platform with little enough RAM that the amount of code in the program starts to be something they have to optimize (because the program itself takes up space in memory, not just the data it uses, which is something we hopefully all know but rarely think about in practice on modern machines). Tens or low hundreds of KB of memory. Get 'em questioning every instruction and every memory allocation.

I'm only half-joking.

[EDIT] For extra lulz let them use a language with a bunch of fancy modern language features so they get a taste of what those cost, when they realize they can't afford to use some of them.

knorker · on June 17, 2022

It's not far fetched. Microcontroller programming should not be seen as magic.

And microcontrollers will never get abundant capacity because smaller and more efficient means less battery, no matter the tech level.

So it's not like "everyone should know the history of the PDP-11" which I would disagree with.

During my schooling we built traffic lights and stuff on tiny machines, and even in VHDL, even though desktop machines were hundreds of MHz. They both have a place still.

rot13xor · on June 16, 2022

Regarding chrome, browsers are basically operating systems nowadays. A standards compliant HTML5 parser is at the bare minimum millions of lines of code. Same for the renderer and Javascript engine.

knorker · on June 17, 2022

That's true. I'm not saying a browser solves a small and simple problem. But on the other hand Chrome takes much more RAM than the operating system (including desktop environment).

Even after closing all tabs, since tabs (and extensions) are basically programs in this operating system.

gpderetta · on June 17, 2022

Why have an engineer spend a day optimizing the program when you can have it spend a month implementing features nobody asked for?

knorker · on June 17, 2022

Why not both?