ADR-0240 accepted
Boot-safe LaunchDaemons for critical host services
Context
ADR-0239 tried to patch the reboot gap by keeping the critical host services as user LaunchAgents and adding a root-owned bridge that bootstrapped them into user/$UID whenever gui/$UID was absent.
That design did not hold up on Panda.
After installation, the bridge daemon itself loaded, but the real step that mattered kept failing:
launchctl bootstrap user/501 <plist>
Bootstrap failed: 5: Input/output error
So the machine still needed manual nohup recovery for the same critical surfaces after a headless reboot:
- com.joel.colima
- com.joel.k8s-reboot-heal
- com.joel.agent-secrets
- com.joel.system-bus-worker
- com.joel.gateway
- com.joel.typesense-portforward
- com.joelclaw.agent-mail
That is needless complexity. The safer shape is to run the host control plane as actual system services.
Decision
Replace the ADR-0239 bridge with boot-safe LaunchDaemons for the critical host services.
- Keep the canonical plist sources in infra/launchd/.
- Install the critical labels into /Library/LaunchDaemons/, not ~/Library/LaunchAgents/.
- Run the services in the system launchd domain.
- Use UserName=joel / GroupName=staff where the process should execute with Joel’s home, repo, auth, and filesystem context.
- Add infra/install-critical-launchdaemons.sh as the canonical root installer.
- Keep infra/install-headless-bootstrap.sh only as a compatibility wrapper that now delegates to the new installer.
- Remove the installed com.joel.headless-bootstrap system daemon and stop documenting the bridge as an active recovery path.
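To make the UserName / GroupName point concrete, here is a minimal sketch of what a system-domain plist for one label could look like, written as a shell function that prints it. It is not a copy of anything in infra/launchd/: the function name render_daemon_plist, the program path, and the RunAtLoad key are assumptions for illustration. Label, ProgramArguments, UserName, GroupName, and RunAtLoad are standard launchd.plist keys.

```shell
#!/bin/sh
# Sketch: render a minimal LaunchDaemon plist for one critical label.
# The program path is a placeholder and RunAtLoad is an assumed setting;
# neither is taken from the repo's real plists.
render_daemon_plist() {
  label="$1"      # e.g. com.joel.gateway
  program="$2"    # hypothetical absolute path to the service entry point
  cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>${label}</string>
  <key>ProgramArguments</key>
  <array>
    <string>${program}</string>
  </array>
  <key>UserName</key>
  <string>joel</string>
  <key>GroupName</key>
  <string>staff</string>
  <key>RunAtLoad</key>
  <true/>
</dict>
</plist>
EOF
}

# Example: print the plist for the gateway label (path is hypothetical).
render_daemon_plist com.joel.gateway /usr/local/bin/gateway-start
```

The plist file itself lives in /Library/LaunchDaemons/ and is loaded into the system domain, while UserName and GroupName tell launchd which account the process should run as.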
Why this
- Boot-safe by design: no cross-domain launchctl trickery, just real system services.
- Fewer moving parts: the bridge, periodic probing, and GUI/user handoff logic all disappear.
- Same runtime identity where needed: UserName=joel keeps the processes in Joel’s filesystem/auth context without requiring Aqua login.
- Cleaner recovery: the installer can also tear down stale user LaunchAgents and manual nohup fallbacks before bootstrapping the system daemons.
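The teardown-then-bootstrap order described in the last point can be sketched as a dry run that only prints the commands for one label. This is an illustration, not the contents of install-critical-launchdaemons.sh: the function name plan_migration, the `<label>.plist` file naming, and the use of install(1) to copy the file are assumptions. Killing manual nohup fallbacks is left out because the source does not say how those processes are identified.

```shell
#!/bin/sh
# Dry-run sketch of the migration order for one label. It only prints the
# commands; nothing here is copied from the real installer.
TARGET_UID="${TARGET_UID:-501}"       # UID from the failing bootstrap in Context
SRC_DIR="${SRC_DIR:-infra/launchd}"   # canonical plist sources
DST_DIR="/Library/LaunchDaemons"

plan_migration() {
  label="$1"
  plist="${label}.plist"              # assumed file naming: <label>.plist
  # 1. Unload any stale per-user copies so two instances never compete.
  echo "launchctl bootout gui/${TARGET_UID}/${label}"
  echo "launchctl bootout user/${TARGET_UID}/${label}"
  # 2. Install the repo copy as a root-owned daemon definition.
  echo "install -o root -g wheel -m 644 ${SRC_DIR}/${plist} ${DST_DIR}/${plist}"
  # 3. Load it into the system domain.
  echo "launchctl bootstrap system ${DST_DIR}/${plist}"
}

plan_migration com.joel.gateway
```

Printing the plan first makes it possible to review what would change before anything runs as root.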
Consequences
Positive
- Critical host services can start at boot without Aqua login.
- The installed runtime matches the repo-managed truth directly.
- The reboot story is simpler to inspect: launchctl print system/<label>.
Negative
- The installer still requires root once.
- Launchd assets must now remain valid for /Library/LaunchDaemons/ semantics, not just GUI LaunchAgents.
- Services that were previously recovered manually may see a brief restart during migration when the installer kills stale fallbacks and restarts them under launchd ownership.
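One way to guard the "valid for /Library/LaunchDaemons/ semantics" constraint is a small check that runs before installation. The sketch below is an assumption about what such a check might cover, not something the repo is known to contain. The function name lint_daemon_plist is hypothetical, and it uses plain text matching instead of a real plist parser. It flags a missing Label and the LimitLoadToSessionType key, which launchd.plist(5) documents as applying only to agents.

```shell
#!/bin/sh
# Sketch of a pre-install check for a plist headed to /Library/LaunchDaemons/.
# Text matching only; it does not replace a real plist parser.
lint_daemon_plist() {
  file="$1"
  status=0
  # Every launchd job definition needs a Label.
  if ! grep -q '<key>Label</key>' "$file"; then
    echo "missing Label: $file" >&2
    status=1
  fi
  # LimitLoadToSessionType is an agent-only key; a leftover usually means
  # the plist was written for ~/Library/LaunchAgents/.
  if grep -q '<key>LimitLoadToSessionType</key>' "$file"; then
    echo "agent-only key LimitLoadToSessionType: $file" >&2
    status=1
  fi
  return "$status"
}
```

A caller could run this over every file in infra/launchd/ and refuse to install if any of them fails.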
Implementation notes
The migrated critical labels are:
- com.joel.colima
- com.joel.k8s-reboot-heal
- com.joel.agent-secrets
- com.joel.system-bus-worker
- com.joel.gateway
- com.joel.typesense-portforward
- com.joelclaw.agent-mail
Canonical installer:
sudo ~/Code/joelhooks/joelclaw/infra/install-critical-launchdaemons.sh
Compatibility alias:
sudo ~/Code/joelhooks/joelclaw/infra/install-headless-bootstrap.sh
Follow-up
- Run the new installer on Panda and verify each critical label via launchctl print system/<label>.
- Remove or archive any stale local notes that still instruct operators to rely on ADR-0239’s bridge.
- Keep working the separate steering issue: agent-mail search reliability still needs repair so daily steering is based on real traffic.
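The verification step in the first follow-up item could be scripted roughly as follows. This is a sketch under assumptions: the function name verify_labels and the LAUNCHCTL override (there only so the loop can be exercised off the target machine) are hypothetical, and it relies on launchctl print returning a non-zero exit status when a service is not loaded in the given domain.

```shell
#!/bin/sh
# Sketch: check each migrated label in the system domain.
LABELS="com.joel.colima com.joel.k8s-reboot-heal com.joel.agent-secrets com.joel.system-bus-worker com.joel.gateway com.joel.typesense-portforward com.joelclaw.agent-mail"

verify_labels() {
  failed=0
  for label in $LABELS; do
    # LAUNCHCTL is a hypothetical override for testing; defaults to launchctl.
    if ${LAUNCHCTL:-launchctl} print "system/${label}" >/dev/null 2>&1; then
      echo "ok      ${label}"
    else
      echo "MISSING ${label}"
      failed=1
    fi
  done
  return "$failed"
}
```

Running `sudo sh -c '. ./verify.sh; verify_labels'` (the file name is hypothetical) after a headless reboot would show at a glance which labels came up without Aqua login.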