> All push load balancing algorithms try to somehow predict how busy downstreams are.
They KNOW how busy they are. They are the ones tracking and forwarding connections to them. That's why leastconn works in the first place.
But HAProxy, for example, has an option to feed weights back directly from the app via health checks, so the app can signal back-pressure even under RR balancing.
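For illustration, here is a rough sketch of what the app side of that feedback could look like, assuming the mechanism meant is HAProxy's agent check: HAProxy opens a TCP connection to an agent port and reads a short ASCII reply such as `75%`, which scales the server's configured weight. The names `agent_reply`, `current_in_flight`, `CAPACITY`, and port 9999 are made up for this example, and the load-to-weight mapping is just one possible policy.

```python
import socketserver

CAPACITY = 100  # hypothetical: max concurrent requests this app instance wants


def current_in_flight() -> int:
    """Hypothetical hook: return the number of requests currently in progress."""
    return 0  # placeholder; a real app would read its own counter here


def agent_reply(in_flight: int, capacity: int) -> str:
    """Map current load to a weight percentage reply, clamped to 1..100."""
    if capacity <= 0:
        return "1%\n"
    percent = (capacity - in_flight) * 100 // capacity
    percent = max(1, min(100, percent))  # keep the percentage positive
    return f"{percent}%\n"


class AgentHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # Send one reply line per connection, then let the connection close.
        reply = agent_reply(current_in_flight(), CAPACITY)
        self.request.sendall(reply.encode("ascii"))


if __name__ == "__main__":
    # Listen on the (assumed) agent port.
    with socketserver.TCPServer(("0.0.0.0", 9999), AgentHandler) as srv:
        srv.serve_forever()
```

On the HAProxy side, the backend's `server` line would need `agent-check` plus an `agent-port` (and optionally `agent-inter`) pointing at this listener; the exact line depends on your setup.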
> They KNOW how busy they are. They are the ones tracking and forwarding connections to them.
Well, they do when they're the only ones sending work to the workers.
The article uses a literal black box for the load balancer, but there are workloads that are too heavy for a single machine, so in those cases (and others, like HA) you have to have a pool of load balancers. You can try to make those load balancers know everything about what's happening in the whole system but it can be hard and expensive.
Or, you can have them operate on less-than-perfect knowledge. This is what the round-robin strategy does, and it has its tradeoffs (much simpler, worse 95th-percentile latency).
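To make the partial-knowledge problem concrete, here is a toy simulation (not modeled on any particular load balancer) of several balancers running least-connections. Each balancer either sees the true connection counts or only the connections it opened itself. The assumptions are loud ones: connections never close during the run, ties go to the lowest-numbered worker, and each balancer receives one request per round.

```python
def least_conn(view):
    """Index of the worker with the fewest connections (lowest index on ties)."""
    return min(range(len(view)), key=lambda w: view[w])


def simulate(num_balancers, num_workers, rounds, shared_view):
    """Return the true per-worker connection counts after the run."""
    true_load = [0] * num_workers
    # Each balancer only counts the connections it forwarded itself.
    local_views = [[0] * num_workers for _ in range(num_balancers)]
    for _ in range(rounds):
        for lb in range(num_balancers):
            view = true_load if shared_view else local_views[lb]
            worker = least_conn(view)
            true_load[worker] += 1
            local_views[lb][worker] += 1
    return true_load


if __name__ == "__main__":
    print(simulate(4, 4, 1, shared_view=True))   # perfect knowledge
    print(simulate(4, 4, 1, shared_view=False))  # each balancer sees only its own connections
```

With perfect knowledge the four requests spread across four workers; with local-only views every balancer thinks worker 0 is idle and all four pile onto it. Real balancers soften this with random tie-breaking and connection churn, but the blind spot is the same.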
All of this is assuming the load balancers and workers servicing connections are the only things running on those machines. In real world usage there can often be other loads on the same hardware, belonging to tenants your team doesn't even have a relationship with, which can complicate things quite a bit.
That's interesting, I haven't heard of this in HAProxy. I've seen that it has a static weight in the config file for how many requests to send to a backend; what is the functionality for dynamic weights called?
And how do you expose that?