Systemd is actually pretty damn good *and* it's GPL licensed free software. I un...

Repulsion9513 · on March 30, 2024

> By the way, all the stuff you mentioned is not really part of the actual init system, namely PID 1

Except it literally is. I once had a systemd system suddenly refuse to boot (kernel panic because PID1 crashed or so) after a Debian upgrade, which I was able to resolve by... wait for it... making /etc/localtime not be a symlink.

Why does a failure doing something with the timezone make you unable to boot your system? What is it even doing with the timezone? What is failing about it? Who knows, good luck strace'ing PID1!

matheusmoreira · on March 30, 2024

Turns out you're right and my knowledge was outdated. I seriously believed the systemd service manager was separate from its PID 1 but at some point they even changed the manuals to say that's not supported.

I was also corrected further down in the thread, with citations from the maintainers even:

https://news.ycombinator.com/item?id=39871735

As it stands I really have no idea why the service manager has not been split off from PID 1. Maintainer said that PID 1 was "different" but didn't really elaborate. Can't find much reliable information about said differences either. Do you know?

Repulsion9513 · on March 30, 2024

I have no idea, lol. Maybe the signal handling behavior? You can't signal PID1 (unless the process has installed its own signal handler for that signal). Even SIGKILL won't usually work.

That's my entire problem with systemd though: despite the averred modularity, it combines far too many concerns for anyone to understand how or why it works the way it does.

matheusmoreira · on March 30, 2024

Yeah the signal handling thing is true, PID 1 is the only process the kernel shields from SIGKILL, maybe even SIGSTOP. The systemd manual documents its handling of a ton of signals but there's nothing in there about either of those otherwise unmaskable signals. So I don't really see how systemd is "relying" on anything. It's not handling SIGKILL, is it?

The other difference is PID 1 can't exit because Linux panics if it does. That's actually an argument for moving functionality out of PID 1.

There are other service managers out there which work outside PID 1. Systemd itself literally spawns non-PID 1 instances of itself to handle the user services. I suppose only the maintainers can tell us why they did it that way.

Maybe they are relying on the fact PID 1 traditionally reaps zombies even though Linux has a prctl for that:

https://www.man7.org/linux/man-pages/man2/prctl.2.html

  PR_SET_CHILD_SUBREAPER

What if the issue is just that nobody's bothered to write the code to move the zombie process reaping to a separate process yet? Would they accept patches in that case?

Ludicrously, that manual page straight up says systemd uses this system call to set itself up as the reaper of zombie processes:

> Some init(1) frameworks (e.g., systemd(1)) employ a subreaper process

If that's true then I really have no idea what the hell it is about PID 1 that they're relying on.

Edit: just checked the source code and it's actually true.

https://github.com/systemd/systemd/blob/main/src/core/main.c...

https://github.com/systemd/systemd/blob/main/src/basic/proce...

So they're not relying on the special signals handling and they even have special support for non-PID 1 child subreapers. Makes no sense to me. Why can't they just drop those PID == 1 checks and make a simpler PID 1 program that just spawns the real systemd service manager?

Edit: they already have a simple PID 1 in the code base!

https://github.com/systemd/systemd/blob/main/src/nspawn/nspa...

It's only being used inside namespaces though! Why? No idea.

Repulsion9513 · on March 31, 2024

> The other difference is PID 1 can't exit because Linux panics if it does. That's actually an argument for moving functionality out of PID 1.

I actually kinda think that can be an advantage for a service manager. If your service manager crashes an automatic reboot is nice, in a way. I doubt that's why they did it though.

matheusmoreira · on March 31, 2024

> If your service manager crashes an automatic reboot is nice, in a way.

I don't think it's gonna do that! I saw it in the source code: when it's running as PID 1, systemd installs a crash handler that freezes itself in a desperate attempt to avoid the kernel panic! It's pretty amazing. They could have written it so that PID 1 watches over the service manager and just restarts it if it ever crashes. I mean, systemd already supports soft-rebooting the entire user space which is pretty much exactly what would happen if PID 1 restarted a separate service manager.
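
To make the "PID 1 watches over the service manager and restarts it" idea concrete, here is a minimal sketch in Python. It is not how systemd is structured; `supervise` and `max_restarts` are names made up for illustration, and a real PID 1 would also have to reap orphans and set up signal handling.

```python
import subprocess
import sys


def supervise(argv, max_restarts=3):
    """Run argv; restart it whenever it exits abnormally.

    Hypothetical helper: returns the list of exit codes that were seen.
    """
    codes = []
    for _ in range(max_restarts + 1):
        # Block until the supervised program terminates.
        code = subprocess.call(argv)
        codes.append(code)
        if code == 0:
            # Clean exit: nothing to restart.
            break
    return codes


if __name__ == "__main__":
    # A stand-in "service manager" that always crashes with status 3.
    crashing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(supervise(crashing, max_restarts=2))
```

The point of the split is that only the tiny loop has to stay alive. A crash in the supervised program turns into a restart instead of a kernel panic.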

Know what else I found in the source code? Various references to /proc/1. I'm starting to think that's the true reason why they want to be PID 1...

pessimizer · on March 29, 2024

People are complaining that it's too big, labyrinthine, and arcane to audit, not that it doesn't work. They would prefer other things that work, but don't share those characteristics.

Also, the more extensive the remit (of this init), the more complexly interconnected the interactions between the components; the fewer people understand the architecture, the fewer people understand the code, the fewer people read the code. This creates a situation where the codebase is getting larger and larger at a rate faster than the growth of the number of man-hours being put into reading it.

This has to make it easier for people who are systemd specialists to put in (intentionally or unintentionally) backdoors and exploitable bugs that will last for years.

People keep defending systemd by talking about its UI and its features, but that completely misses the point. If systemd were replaced by something comprehensible and less internally codependent, even if the systemd UI and features were preserved, most systemd complainers would be over the moon with happiness. Red Hat invests too much into completely replacing linux subsystems, they should take a break. Maybe fix the bugs in MATE.

dralley · on March 29, 2024

>the more complexly interconnected the interactions between the components

This is a bit of a rich criticism of systemd, given the init scripts it replaced.

> Red Hat invests too much into completely replacing linux subsystems, they should take a break. Maybe fix the bugs in MATE.

MATE isn't a Red Hat project. And nobody complains about Pipewire.

Repulsion9513 · on March 30, 2024

A shell script with a few defined arguments is not a complexly interconnected set of components. It's literally the simplest, most core, least-strongly-dependent interconnection that exists in a *nix system.

Tell us you never bothered to understand how init worked before drawing a conclusion on it without telling us.

dralley · on March 30, 2024

Have you ever seen the init scripts of a reasonably-complex service that required other services to be online?

Repulsion9513 · on March 31, 2024

Yep.

    depend(){
        need net localmount
        after bootmisc
    }
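
For anyone unfamiliar with that syntax, here is a rough Python sketch of how declarations like these could be turned into a start order. It is an illustration only, not OpenRC's implementation. It assumes `need` pulls a service in and orders after it, while `after` only orders when the other service is being started anyway. `start_order` and `myservice` are hypothetical names.

```python
from graphlib import TopologicalSorter


def start_order(services, enabled):
    """services maps name -> {"need": [...], "after": [...]}."""
    # Collect the enabled services plus everything they "need", transitively.
    wanted, stack = set(), list(enabled)
    while stack:
        name = stack.pop()
        if name not in wanted:
            wanted.add(name)
            stack.extend(services.get(name, {}).get("need", []))

    # Build predecessor sets: "need" always orders, "after" only if wanted.
    graph = {}
    for name in wanted:
        spec = services.get(name, {})
        preds = set(spec.get("need", []))
        preds |= {s for s in spec.get("after", []) if s in wanted}
        graph[name] = preds
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    services = {
        "myservice": {"need": ["net", "localmount"], "after": ["bootmisc"]},
    }
    print(start_order(services, ["myservice", "bootmisc"]))
```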

matheusmoreira · on March 29, 2024

> Red Hat invests too much into completely replacing linux subsystems, they should take a break.

They should do whatever they feel is best for them, as should we. They're releasing free as in freedom GPL Linux software, high quality software at that. Thus I have no moral objections to their activities.

You have to realize that this is really a symptom of others not putting in the required time and effort to produce a better alternative. I know because I reinvent things regularly just because I enjoy it. People underestimate by many orders of magnitude the effort required to make something like this.

So I'm really thankful that I got systemd, despite many valid criticisms. It's a pretty good system, and it's not proprietary nonsense. I've learned to appreciate it.

_factor · on March 29, 2024

Let’s not get started on how large the kernel is. Large code bases increase attack surface, period. The only sensible solution is to micro service out the pieces and only install the bare essentials. Why does an x86 server come with Bluetooth drivers baked in?

The kernel devs are wasting time writing one offs for every vendor known to man, and it ships to desktops too.

ongy · on March 29, 2024

How is the service manager different from PID1/init?

matheusmoreira · on March 29, 2024

They are completely different things.

Init is just a more or less normal program that Linux starts by default and by convention. You can make it boot straight into bash if you want. I created a little programming language with the ultimate goal of booting Linux directly into it and bringing up the entire system from inside it.

It's just a normal process really. Two special cases that I can think of: no default signal handling, and it can't ever exit. Init will not get interrupted by signals unless it explicitly configures the signal dispositions, even SIGKILL will not kill it. Linux will panic if PID 1 ever exits so it can't do that.

Traditionally, it's also the orphaned child process reaper. Process descriptors and their IDs hang around in memory until something calls wait on them. Parent processes are supposed to do that but if they don't it's up to init to do it. Well, that's the way it works traditionally on Unix. On Linux though that's customizable with prctl and PR_SET_CHILD_SUBREAPER so you actually can factor that out to a separate process. As far as I know, systemd does just that, making it more modular and straight up better than traditional Unix, simply because this separate process won't make Linux panic if it ever crashes.
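
Here is a small sketch of the subreaper mechanism for a regular, non-PID 1 process on Linux. It calls prctl through ctypes. The constant value 36 comes from linux/prctl.h, and the function names are made up for illustration. The process marks itself as a subreaper, orphans a grandchild through a double fork, and then waits on that grandchild directly. That wait can only succeed if the orphan was reparented to us rather than to PID 1.

```python
import ctypes
import os
import time

PR_SET_CHILD_SUBREAPER = 36  # value from <linux/prctl.h>


def become_subreaper():
    """Ask the kernel to reparent orphaned descendants to this process."""
    libc = ctypes.CDLL(None, use_errno=True)
    zero = ctypes.c_ulong(0)
    if libc.prctl(PR_SET_CHILD_SUBREAPER, ctypes.c_ulong(1), zero, zero, zero) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def reap_orphaned_grandchild(exit_code=7):
    """Orphan a grandchild via double fork, then reap it ourselves."""
    become_subreaper()
    r, w = os.pipe()
    child = os.fork()
    if child == 0:
        # Intermediate child: fork the grandchild, report its PID, exit.
        os.close(r)
        grandchild = os.fork()
        if grandchild == 0:
            os.close(w)
            time.sleep(0.2)  # outlive the intermediate child
            os._exit(exit_code)
        os.write(w, str(grandchild).encode())
        os._exit(0)

    os.close(w)
    os.waitpid(child, 0)  # the intermediate child is gone now
    grandchild = int(os.read(r, 32))
    os.close(r)
    # Works only because the orphan now belongs to us.
    _, status = os.waitpid(grandchild, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(reap_orphaned_grandchild())
```

Without the prctl call the final waitpid would fail with ECHILD, because the grandchild would have been handed to init (or a nearer subreaper) instead.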

As for the service manager, this page explains process and service management extremely well:

https://mywiki.wooledge.org/ProcessManagement

Systemd does it right. It does everything that's described in there, does it correctly, uses powerful Linux features like cgroups for even better process management and also solves the double forking problem described in there. It's essentially a solved problem with systemd. Even the people who hate it love the unit files it uses and for good reason.

ongy · on March 30, 2024

I know the differences between them conceptually.

The thing that people usually complain about is systemd forcibly setting its process manager at pid=1. I.e. the thing "discussed" in https://github.com/systemd/systemd/issues/12843

There is a secondary feature to run per-user managers, though I'm unsure whether it runs without systemd as PID 1. Though it might only rely on logind.

matheusmoreira · on March 30, 2024

Wow, I remember reading that PID != 1 line years ago. Had no idea they changed it. I stand corrected then. Given the existence of user service managers as well as flags like --system and --user, I inferred that they were all entirely separate processes.

Makes no sense to me why the service manager part would require running as PID 1. The maintainer just says this:

> PID 1 is very different from other processes, and we rely on that.

He doesn't really elaborate on the matter though.

Every time this topic comes up I end up searching for those so called PID 1 differences. I come up short every time aside from the two things I mentioned above. Is this information buried deep somewhere?

Just asked ChatGPT about PID 1 differences. It gave me the aforementioned two differences, completely dismissed Linux's prctl child subreaper feature "because PID 1 often assumes this role in practice" as well as some total bullshit about process group leaders and regular processes not being special enough to interact with the kernel which is just absolute nonsense.

So I really have no idea what it is about PID 1 that systemd is supposedly relying on that makes it impossible to split off the service manager from it. Everything I have read up until now suggests that it is not required, especially on Linux where you have even more control and it's not like systemd is shy about using Linux exclusive features.