The Physics of Systemd: Dependency DAGs & Socket Activation

Why Linux boots in 2 seconds. The physics of parallel execution graphs, lazy socket activation, and the `sd_notify` heartbeat protocol.

Beginner, 40 min read

🎯 What You'll Learn

  • Visualizing the Systemd Dependency DAG (Directed Acyclic Graph)
  • Implementing logic for `Type=notify` and `sd_notify()`
  • Configuring Socket Activation (Lazy Loading)
  • Debugging Boot Chains with `systemd-analyze plot`
  • Mastering Systemd Timers (The Cron Killer)

Introduction

In the old days (SysVinit), Linux booted linearly. Start A. Wait. Start B. Wait. Start C. If C hung, everything queued behind it stalled.

Systemd changed the physics of booting. It treats the entire OS state as a Dependency Graph (a DAG, or Directed Acyclic Graph). If nothing orders Service A relative to Service B, they start in parallel, and the kernel is free to schedule them on different CPU cores.

This lesson explores the engineering behind this parallelism and the “Socket Activation” trick that makes it possible.


The Physics: Dependency DAGs

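Before starting anything, Systemd calculates a “Transaction”: the set of jobs needed to reach the requested state. It resolves the graph, checking the jobs for conflicts and ordering cycles.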

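  • Requires=: Strong dependency. Starting A also starts B. If B fails to start and A is ordered After=B, A is not started. If B is explicitly stopped or restarted, A is stopped or restarted too. (To stop A whenever B goes away for any reason, use BindsTo=.)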
  • Wants=: Weak dependency. If A wants B, A tries to start B, but if B fails, A survives.
  • After=: Ordering only. “Don’t start A until B has finished starting”.

Critical Physics: Requires does NOT imply After. If you say Requires=B, Systemd will start A and B simultaneously unless you also say After=B.
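
A minimal sketch of that rule, not Systemd's actual transaction logic: only After= contributes edges to the ordering graph, while Requires= merely pulls units into the transaction. The unit names and the `start_waves` helper are hypothetical, and the sketch assumes Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical units, for illustration only.
units = {
    "a.service": {"requires": ["b.service"], "after": []},
    "b.service": {"requires": [], "after": []},
    "c.service": {"requires": ["b.service"], "after": ["b.service"]},
}


def start_waves(units):
    """Group units into waves that could be started in parallel.

    Only After= creates an ordering edge. Requires= is deliberately
    ignored here: it decides what gets started, not when.
    """
    sorter = TopologicalSorter({name: spec["after"] for name, spec in units.items()})
    sorter.prepare()  # raises graphlib.CycleError on an ordering cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = start_waves(units)
print(waves)  # prints [['a.service', 'b.service'], ['c.service']]
```

a.service lands in the same wave as b.service despite Requires=b.service. c.service adds After=b.service, so it waits for the next wave.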


Socket Activation: The Lazy Loading Trick

How can we start Nginx, MySQL, and PHP-FPM in parallel if they depend on each other? Socket Activation.

  1. Systemd starts first. It creates the listening socket (Port 80) before Nginx starts.
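  2. Buffering: If a client connects, the kernel completes the TCP handshake and queues the connection (and any data the client sends) on the listening socket.
  3. Lazy Start: Systemd sees the socket become readable and then spawns Nginx.
  4. Handoff: Systemd passes the open listening socket to Nginx as file descriptor 3, announced through $LISTEN_FDS and $LISTEN_PID. Nginx accepts the queued connection and handles the request.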

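Result: Clients never see “connection refused” during startup, and services that reach each other through such sockets need no explicit ordering. The only wait is the first request, which blocks until the service is up.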
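
The four steps can be imitated inside a single Python process. This is a sketch under stated assumptions, not how Systemd is implemented: it uses a UNIX stream socket in a temporary directory instead of TCP port 80 so it runs without privileges, and `inherited_fds` and `demo` are hypothetical helper names. The `LISTEN_FDS` and `LISTEN_PID` variables and the starting descriptor 3 follow the socket-activation convention.

```python
import os
import socket
import tempfile

SD_LISTEN_FDS_START = 3  # first passed descriptor (the "FD 3" of step 4)


def inherited_fds(environ, pid):
    """Return the descriptors handed over by the manager, or [] if none apply."""
    if environ.get("LISTEN_PID") != str(pid):
        return []  # the variables were meant for a different process
    count = int(environ.get("LISTEN_FDS", "0"))
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def demo():
    """Play manager, client, and service in turn; return what the service read."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")

        # Step 1: the "manager" opens the listening socket before any service exists.
        listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        listener.bind(path)
        listener.listen(8)

        # Step 2: a client connects and sends data; nobody has called accept() yet.
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(path)
        client.sendall(b"ping")

        # Steps 3-4: the "service" adopts the descriptor and finds the queued connection.
        service_sock = socket.socket(fileno=os.dup(listener.fileno()))
        conn, _ = service_sock.accept()
        data = conn.recv(4)

        for s in (conn, client, service_sock, listener):
            s.close()
    return data
```

A real socket-activated service would skip the manager part and wrap each number returned by `inherited_fds(os.environ, os.getpid())` with `socket.socket(fileno=fd)`.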


Code: The sd_notify Protocol

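With Type=simple (the default when only ExecStart= is set), Systemd assumes the service is “ready” the moment its main process has been forked, before it has initialized anything. This is dangerous: units ordered After= it may start too early. Type=notify closes that gap by letting the service itself say when it is ready.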

The Physics of Notification

The service communicates with Systemd over a UNIX Domain Socket defined in $NOTIFY_SOCKET.
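
The exchange is small enough to write by hand. The sketch below is a hedged, minimal version that sends one datagram to the path in `$NOTIFY_SOCKET`; the `sd_notify` function here is a local stand-in rather than the library binding, and `demo_roundtrip` is a hypothetical helper that plays the manager's side so the example can run anywhere.

```python
import os
import socket
import tempfile


def sd_notify(state, environ=os.environ):
    """Send one notification datagram; return False if no socket is configured."""
    path = environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not started by a manager that expects notifications
    if path.startswith("@"):
        path = "\0" + path[1:]  # '@' marks a Linux abstract-namespace socket
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode("utf-8"), path)
    return True


def demo_roundtrip():
    """Stand in for the manager: bind a datagram socket and read one message."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "notify.sock")
        with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as manager:
            manager.bind(path)
            sd_notify("READY=1", {"NOTIFY_SOCKET": path})
            return manager.recv(64)
```

Outside a Systemd-managed service the variable is unset, so `sd_notify("READY=1")` simply returns False. The `systemd.daemon.notify()` call in app.py wraps the same kind of exchange.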

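# app.py
# Requires the python-systemd bindings (PyPI package: systemd-python).
import time

import systemd.daemon

# flush=True so the line reaches the journal right away instead of
# sitting in Python's stdout buffer.
print("Initializing database...", flush=True)
time.sleep(5)  # Simulate heavy loading

# TELL SYSTEMD WE ARE READY
# Only now does 'systemctl start' finish!
systemd.daemon.notify('READY=1')

# Heartbeat loop: ping the watchdog well inside the WatchdogSec= window.
while True:
    time.sleep(1)
    systemd.daemon.notify('WATCHDOG=1')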

The Unit File

# /etc/systemd/system/myapp.service
[Unit]
Description=Notify Type Service

[Service]
Type=notify
ExecStart=/usr/bin/python3 /opt/app.py
WatchdogSec=5s
Restart=always

[Install]
WantedBy=multi-user.target

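Physics: If the loop freezes for more than 5 seconds, Systemd receives no ping and treats the service as failed. It terminates the process with SIGABRT (the default WatchdogSignal=) and, because the unit sets Restart=always, starts it again. Self-healing infrastructure.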
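
Rather than hard-coding the 1-second sleep, a service can derive its ping interval from the environment. Systemd exports the configured timeout as `WATCHDOG_USEC` (in microseconds) and the target process as `WATCHDOG_PID`; pinging at half the timeout is the usual recommendation. The helper name `watchdog_interval` is an assumption made for this sketch.

```python
import os


def watchdog_interval(environ=os.environ, pid=None):
    """Return a ping interval in seconds, or None if no watchdog is requested."""
    pid = os.getpid() if pid is None else pid
    usec = environ.get("WATCHDOG_USEC")
    if not usec:
        return None  # WatchdogSec= is not configured for this service
    watchdog_pid = environ.get("WATCHDOG_PID")
    if watchdog_pid is not None and watchdog_pid != str(pid):
        return None  # the watchdog was configured for a different process
    return int(usec) / 1_000_000 / 2  # ping at half the timeout


# With WatchdogSec=5s the service would ping every 2.5 seconds.
example = watchdog_interval({"WATCHDOG_USEC": "5000000", "WATCHDOG_PID": "42"}, pid=42)
print(example)  # prints 2.5
```

In the heartbeat loop, `time.sleep(1)` would become `time.sleep(interval)` whenever the function returns a value.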


Timers: The Cron Killer

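Cron is dumb. It fires at 3AM sharp on every machine at once and forgets any run that fell while the machine was off. Systemd Timers are smart: they can add random jitter, catch up on missed runs, and launch the job as a regular unit with journal logging and dependencies.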

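# backup.timer (activates backup.service by default)
# Comments must sit on their own line; a trailing "# ..." would be read as part of the value.
[Timer]
OnCalendar=*-*-* 03:00:00
# Prevents thundering herd!
RandomizedDelaySec=15m
# Run if we were off at 3AM
Persistent=true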

[Install]
WantedBy=timers.target
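
What the three [Timer] settings mean can be modelled in a few lines. This is a simplified model for illustration, not Systemd's scheduler: it ignores time zones and DST, and the function names are hypothetical.

```python
import random
from datetime import datetime, time, timedelta

RUN_AT = time(3, 0)                # OnCalendar=*-*-* 03:00:00
MAX_DELAY = timedelta(minutes=15)  # RandomizedDelaySec=15m


def next_elapse(now):
    """Next 03:00 strictly after `now`."""
    candidate = datetime.combine(now.date(), RUN_AT)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def missed_while_off(last_trigger, boot_time):
    """Persistent=true idea: did a scheduled run fall between last trigger and boot?"""
    return next_elapse(last_trigger) <= boot_time


def jittered(elapse, rng):
    """Add a uniformly random delay in [0, MAX_DELAY] so hosts do not fire together."""
    return elapse + timedelta(seconds=rng.uniform(0, MAX_DELAY.total_seconds()))


last = datetime(2024, 1, 1, 3, 5)
print(missed_while_off(last, datetime(2024, 1, 2, 9, 0)))   # prints True
print(missed_while_off(last, datetime(2024, 1, 1, 22, 0)))  # prints False
```

A machine that was powered off over 3AM and boots at 9AM gets True and would run the job soon after boot; one that boots the same evening has missed nothing.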

Debugging: Analyze the Boot

You can visualize the exact timeline of your server’s boot.

# 1. Text Summary
systemd-analyze

# 2. Critical Chain (Who held up the boot?)
systemd-analyze critical-chain

# 3. Generate SVG Plot (one bar per unit on a timeline)
systemd-analyze plot > boot.svg

Practice Exercises

Exercise 1: The Watchdog (Beginner)

Task: Write a script that notifies READY=1 but never pings WATCHDOG=1. Action: Configure WatchdogSec=3s and Restart=always. Observation: Watch status with journalctl -f. See Systemd kill and restart your script roughly every 3 seconds.

Exercise 2: Socket Activation (Intermediate)

Task: Create echo.socket (ListenStream=9999) and echo.service. Action: Start echo.socket and make sure echo.service is stopped. Telnet to port 9999. Observation: The connection is accepted instantly, and systemctl status echo.service now shows “active”.

Exercise 3: Order vs Requirement (Advanced)

Task: Create Unit A and Unit B. Set A Requires=B but NOT After=B. Make B sleep for 5 seconds, using Type=oneshot so that B only counts as started once the sleep has finished. Observation: Verify they start simultaneously. Add After=B and verify A waits.


Knowledge Check

  1. What is the difference between Requires= and Wants=?
  2. Does Requires= imply execution order?
  3. What socket does sd_notify write to?
  4. What happens if a process fails to ping the watchdog?
  5. Why is systemd-analyze plot useful?

Answers
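  1. Strictness. With Requires=, a unit ordered After= a dependency that fails to start is not started, and it is stopped when the dependency is explicitly stopped. Wants= only tries to pull the other unit in.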
  2. No. Order is controlled ONLY by After and Before.
  3. The path in $NOTIFY_SOCKET. (A UNIX Domain Socket).
  4. SIGABRT. Systemd kills the process and marks the service failed; the Restart= setting decides whether it is started again.
  5. Performance. It visualizes boot bottlenecks in an SVG timeline.

Summary

  • DAG: Parallel startups.
  • Socket Activation: FD passing + Lazy loading.
  • Notify: Protocol for “I am ready”.
  • Watchdog: Protocol for “I am alive”.
