No-downtime deployment is table stakes in 2025. I can't look anyone in the eye and tell them that our product is going to go down for a bit every time we deploy (it'd also be atrocious friction for frequent deployment).
> - Systemd using socket activation (same as Docker compose, it holds HTTP connections while the HTTP service restarts)
Nit: it holds the TCP connections while the HTTP service restarts. Anything in flight at the HTTP level (for example, a request the old process had already accepted) has to be retried by the client. But that’s true of every “zero downtime” system I’m aware of.
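To make the nit concrete, here is a minimal sketch of the mechanism, not of systemd itself: as long as the listening socket stays open, the kernel completes TCP handshakes and buffers incoming bytes even when no process is calling `accept()`. The function name `queued_request_survives_restart` and the loopback setup are made up for illustration; the listener here just stands in for the socket systemd would hold.

```python
import socket

REQUEST = b"GET / HTTP/1.0\r\n\r\n"

def queued_request_survives_restart() -> bytes:
    # Stand-in for the listening socket that stays open across restarts.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(8)
    listener.settimeout(5)
    port = listener.getsockname()[1]

    # "Service is restarting": nobody is calling accept(), yet the kernel
    # completes the TCP handshake and buffers the bytes the client sends.
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    client.sendall(REQUEST)

    # "New service instance is up": it accepts the queued connection and
    # reads the request that was waiting in the kernel buffer.
    conn, _ = listener.accept()
    conn.settimeout(5)
    data = b""
    while not data.endswith(b"\r\n\r\n"):
        chunk = conn.recv(1024)
        if not chunk:
            break
        data += chunk

    for s in (conn, client, listener):
        s.close()
    return data

if __name__ == "__main__":
    print(queued_request_survives_restart())
```

This only covers connections that arrive while nothing is accepting. A request the old process had already accepted and was halfway through answering is still lost when that process exits.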
Being successful enough that any amount of downtime is an existential risk is a great problem to have. 99.99% of products don't have that problem; even huge successful businesses can survive unplanned downtime (see: recent major outages).
It's far from table stakes and you can absolutely overengineer your product into the ground by chasing it.
"0 downtime" systems << antifragile systems with low MTTR.
Something can always break even if your system is "perfect". Utilities, local disasters, cloud dependencies.
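Rough arithmetic behind the MTTR point, using the standard steady-state formula availability = MTBF / (MTBF + MTTR). Both systems and all their numbers are hypothetical, picked only to show the shape of the trade-off.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: uptime / (uptime + repair time)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical system A: breaks about once a month, recovers in 5 minutes.
quick_recovery = availability(mtbf_hours=720, mttr_hours=5 / 60)
# Hypothetical system B: breaks about once a year, takes 8 hours to recover.
slow_recovery = availability(mtbf_hours=8760, mttr_hours=8)

print(f"A: {quick_recovery:.4%}  B: {slow_recovery:.4%}")
```

Under these assumed numbers, A fails twelve times as often and still comes out ahead (roughly 99.99% versus 99.91%), because its recovery time is so much shorter.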