:PROPERTIES:
:ID: c87d8901-bbee-4397-bff1-af8432b4f66e
:END:

#+title: homey
#+filetags: :project: :knowledge:

** Architecture

NixOS homelab on Raspberry Pi 4 (8 GB). Domain: zakobar.com. Static IP: 192.168.1.100. nixpkgs pin: nixos-25.05.
Container runtime: Podman via =virtualisation.oci-containers=. All containers join the =homey= podman network (created by =podman-homey-network.service= in =common.nix=). Containers reach each other by container name via the network's DNS; Caddy proxies to them via =127.0.0.1:<port>=.
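A minimal sketch of the network setup described above, assuming the stock =oci-containers= module with the Podman backend. The =whoami= container, its image, and port 8080 are hypothetical placeholders; only the network name and the unit name come from these notes.

#+begin_src nix
{ pkgs, ... }:
{
  # Oneshot unit that creates the shared network if it does not exist yet.
  systemd.services.podman-homey-network = {
    wantedBy = [ "multi-user.target" ];
    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
    };
    script = ''
      ${pkgs.podman}/bin/podman network exists homey \
        || ${pkgs.podman}/bin/podman network create homey
    '';
  };

  virtualisation.oci-containers = {
    backend = "podman";
    containers.whoami = {
      image = "docker.io/traefik/whoami:latest"; # hypothetical example image
      # Published on loopback only, so Caddy on the host is the sole entry point.
      ports = [ "127.0.0.1:8080:80" ];
      extraOptions = [ "--network=homey" ];
    };
  };

  # The generated unit is named podman-<container>; start it after the network exists.
  systemd.services.podman-whoami = {
    after = [ "podman-homey-network.service" ];
    requires = [ "podman-homey-network.service" ];
  };
}
#+end_src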
Reverse proxy: Caddy with a Cloudflare DNS-01 wildcard cert for =*.zakobar.com=. Built via =pkgs.caddy.withPlugins= with =github.com/caddy-dns/cloudflare@v0.2.4= (resolved hash: =sha256-pRrLBlYRaAyMYwPXeTy4WqWNRu/L9K6Mn2src11dGh8==). Each vhost definition generates a pair: an HTTPS vhost plus an =http://= vhost for the cloudflared loopback. Authelia =forward_auth= uses =/api/authz/forward-auth= (the v4.38+ endpoint, not the legacy one).
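A rough sketch of the plugin build and a protected vhost, written against the stock =services.caddy= module rather than this repo's own =virtualHosts= wrapper. The =app= subdomain, the Authelia port 9091, and the upstream port 8080 are assumptions; the wildcard TLS and the paired =http://= vhost are left out.

#+begin_src nix
{ pkgs, homeyConfig, ... }:
{
  services.caddy = {
    enable = true;
    # Caddy rebuilt with the Cloudflare DNS provider compiled in.
    package = pkgs.caddy.withPlugins {
      plugins = [ "github.com/caddy-dns/cloudflare@v0.2.4" ];
      hash = "sha256-pRrLBlYRaAyMYwPXeTy4WqWNRu/L9K6Mn2src11dGh8=";
    };
    virtualHosts."app.${homeyConfig.domain}".extraConfig = ''
      # Ask Authelia before proxying; 9091 is an assumed listen port.
      forward_auth 127.0.0.1:9091 {
        uri /api/authz/forward-auth
        copy_headers Remote-User Remote-Groups Remote-Name Remote-Email
      }
      reverse_proxy 127.0.0.1:8080
    '';
  };
}
#+end_src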
Auth stack: OpenLDAP ← Authelia (TOTP 2FA). Authelia config is rendered entirely at build time from Nix strings. A =NIXOS_CONFIG_HASH= env var on the container forces a restart whenever the config changes (bind-mounts resolve symlinks at start, so the running container would otherwise keep the old config).
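One way the restart trigger could look, as a sketch only. The config text, the =/etc/authelia= path, and the image tag are placeholders; the idea is just that the hash of the rendered text lands in the container's environment, so the generated unit changes whenever the config does.

#+begin_src nix
{ ... }:
let
  # Placeholder; the real config is assembled from many Nix strings.
  autheliaConfigText = ''
    theme: auto
  '';
in
{
  # /etc entries are symlinks into the Nix store.
  environment.etc."authelia/configuration.yml".text = autheliaConfigText;

  virtualisation.oci-containers.containers.authelia = {
    image = "ghcr.io/authelia/authelia:4.38"; # hypothetical tag
    volumes = [ "/etc/authelia/configuration.yml:/config/configuration.yml:ro" ];
    # The mount path never changes, so the hash makes the unit differ per config
    # and NixOS restarts the container on switch.
    environment.NIXOS_CONFIG_HASH = builtins.hashString "sha256" autheliaConfigText;
  };
}
#+end_src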
Monitoring: Prometheus (9090) + node_exporter (9100) + systemd_exporter (9558) + Grafana (3002). The Node Exporter Full dashboard is pre-provisioned (ID 1860 rev 37; resolved fetchurl hash: =sha256-1DE1aaanRHHeCOMWDGdOS1wBXxOF84UXAjJzT5Ek6mM==). Grafana proxy auth: Authelia sets the Remote-User header → Caddy maps it to X-WEBAUTH-USER → Grafana signs the user in automatically. Everyone who reaches Grafana is already a confirmed admin (Authelia enforces two_factor + admins group).
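The header chain above could be wired roughly like this, assuming the native =services.grafana= and =services.caddy= modules. The =grafana= subdomain is a guess, and the Authelia =forward_auth= block that sets Remote-User is omitted for brevity.

#+begin_src nix
{ homeyConfig, ... }:
{
  services.grafana = {
    enable = true;
    settings = {
      server = {
        http_addr = "127.0.0.1";
        http_port = 3002;
      };
      # Trust the username header injected by the reverse proxy.
      "auth.proxy" = {
        enabled = true;
        header_name = "X-WEBAUTH-USER";
        header_property = "username";
        auto_sign_up = true;
      };
      auth.disable_login_form = true;
      users.auto_assign_org_role = "Admin";
    };
  };

  services.caddy.virtualHosts."grafana.${homeyConfig.domain}".extraConfig = ''
    reverse_proxy 127.0.0.1:3002 {
      # Map Authelia's Remote-User onto the header Grafana expects.
      header_up X-WEBAUTH-USER {http.request.header.Remote-User}
    }
  '';
}
#+end_src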
Uptime Kuma sync: a Python =oneshot= service runs after container start. Hash-gated — only re-syncs when the monitor JSON changes. Supports =keyword= (keyword monitor type) and =maxretries= fields beyond the basic =name/url/interval=.
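For illustration, monitor declarations might look like the following. The field names are the ones listed above; the list-of-attrsets shape, the subdomains, and the values are assumptions.

#+begin_src nix
{ homeyConfig, ... }:
{
  homey.monitoring.monitors = [
    {
      # Plain HTTP check using only the basic fields.
      name = "Grafana";
      url = "https://grafana.${homeyConfig.domain}";
      interval = 60;
    }
    {
      # Keyword monitor: the response body must contain the keyword.
      name = "Nextcloud status";
      url = "https://cloud.${homeyConfig.domain}/status.php";
      interval = 120;
      keyword = "installed";
      maxretries = 3;
    }
  ];
}
#+end_src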
Attic (self-hosted Nix binary cache): cache name =main=, public key =main:9SZt/6plBU7jjQzz90J7O011I13hmJvOMYouxNqExNQ==, endpoint =https://attic.zakobar.com/main=. Setup completed 2026-05-30. NAR content not backed up — reproducible from source. Tokens are stateless JWTs; regenerate with same =atticadm= command if lost.
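On the client side, using the cache comes down to two Nix settings. A minimal sketch with the endpoint and public key from above; authentication (the netrc injection handled in =attic.nix=) is not shown.

#+begin_src nix
{ ... }:
{
  nix.settings = {
    # Appended to the default substituters by the NixOS module system.
    substituters = [ "https://attic.zakobar.com/main" ];
    trusted-public-keys = [ "main:9SZt/6plBU7jjQzz90J7O011I13hmJvOMYouxNqExNQ=" ];
  };
}
#+end_src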
Eurovision Vote: Django app sourced from external flake =github:anerisgreat/eurovote=, wrapped by =modules/services/eurovote.nix=. Uses DynamicUser + StateDirectory so systemd owns =/var/lib/eurovote/=; no tmpfiles entry needed.
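A sketch of the DynamicUser + StateDirectory combination. The unit name and the stand-in command are hypothetical; the real service comes from the external flake's module.

#+begin_src nix
{ pkgs, ... }:
{
  systemd.services.eurovote-example = {
    wantedBy = [ "multi-user.target" ];
    serviceConfig = {
      # systemd allocates a transient UID/GID for each start.
      DynamicUser = true;
      # Creates /var/lib/eurovote and chowns it to the dynamic user.
      StateDirectory = "eurovote";
      WorkingDirectory = "/var/lib/eurovote";
      # Stand-in for the Django app's real start command.
      ExecStart = "${pkgs.python3}/bin/python3 -m http.server --bind 127.0.0.1 8000";
    };
  };
}
#+end_src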
Backup: Restic daily at 03:00 to Backblaze B2 over its S3-compatible API (bucket =zakobar-home-backup=). Pre-hook: Nextcloud maintenance mode on + pg_dump. Post-hook: maintenance mode off. Manual offload: =restic copy= to a local disk. NAR content and media are excluded.
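The schedule and hooks map onto =services.restic.backups= roughly as below. This assumes the native Nextcloud and PostgreSQL modules; the S3 endpoint host, the secret names, the database name, and every path are placeholders.

#+begin_src nix
{ config, pkgs, ... }:
let
  occ = "${config.services.nextcloud.occ}/bin/nextcloud-occ";
  pgDump = "${config.services.postgresql.package}/bin/pg_dump";
in
{
  services.restic.backups.b2 = {
    # Hypothetical endpoint host; substitute the bucket's real S3 endpoint.
    repository = "s3:https://s3.example.backblazeb2.com/zakobar-home-backup";
    passwordFile = config.sops.secrets."restic/password".path;
    # Env file providing AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY.
    environmentFile = config.sops.secrets."restic/s3-env".path;
    paths = [ "/mnt/data" ];
    exclude = [ "/mnt/data/media" "/mnt/data/attic" ]; # assumed locations
    timerConfig = {
      OnCalendar = "03:00";
      Persistent = true;
    };
    # Pre-hook: freeze Nextcloud, then dump its database.
    backupPrepareCommand = ''
      ${occ} maintenance:mode --on
      ${pkgs.util-linux}/bin/runuser -u postgres -- ${pgDump} nextcloud \
        > /mnt/data/backup/nextcloud.sql
    '';
    # Post-hook: bring Nextcloud back.
    backupCleanupCommand = ''
      ${occ} maintenance:mode --off
    '';
  };
}
#+end_src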
Reliability hardening in =hosts/pi-main/default.nix=:
- Hardware watchdog: =bcm2835_wdt= kernel module, systemd watchdog runtimeTime=300s / rebootTime=360s
- WiFi power save disabled: brcmfmac driver drops connections under low traffic; disabled via =iw= on interface up
- Network watchdog: timer every 2 min (starts 5 min after boot); pings the gateway and, on failure, restarts wpa_supplicant, then reboots if the link is still dead after 30s
- zramSwap: zstd, 25% RAM (~2 GB) — breathing room for PHP upload spikes
- Nix build-dir: =/mnt/data/nix-build= — avoids small tmpfs filling during large builds
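Most of the hardening items above translate into a few options. A sketch, not the actual host config: the gateway address and the =wpa_supplicant.service= unit name are assumptions, and the WiFi power-save tweak is left out.

#+begin_src nix
{ pkgs, ... }:
{
  # Hardware watchdog, fed by systemd.
  boot.kernelModules = [ "bcm2835_wdt" ];
  systemd.watchdog = {
    runtimeTime = "300s";
    rebootTime = "360s";
  };

  zramSwap = {
    enable = true;
    algorithm = "zstd";
    memoryPercent = 25;
  };

  # Build on the data disk instead of a small tmpfs.
  nix.settings.build-dir = "/mnt/data/nix-build";

  systemd.services.network-watchdog = {
    path = [ pkgs.iputils pkgs.systemd ];
    serviceConfig.Type = "oneshot";
    script = ''
      gateway=192.168.1.1 # assumed gateway address
      ping -c 3 -W 5 "$gateway" && exit 0
      systemctl restart wpa_supplicant.service
      sleep 30
      ping -c 3 -W 5 "$gateway" || systemctl reboot
    '';
  };

  systemd.timers.network-watchdog = {
    wantedBy = [ "timers.target" ];
    timerConfig = {
      OnBootSec = "5min";
      OnUnitActiveSec = "2min";
    };
  };
}
#+end_src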
Bootstrap: =pi-main-bootstrap= config builds an SD image (=sd-image-aarch64.nix=) for first flash.

** Conventions

=homeyConfig= specialArgs (passed to every module): =domain=, =organization=, =timezone=. Never hardcode domain strings.
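In a module this convention looks roughly like the following. The =status= subdomain and port 3001 are made up for the example; =homeyConfig= arrives as a module argument because =flake.nix= passes it via =specialArgs=.

#+begin_src nix
{ homeyConfig, ... }:
{
  time.timeZone = homeyConfig.timezone;

  # Hostname derived from the shared domain instead of a hardcoded string.
  services.caddy.virtualHosts."status.${homeyConfig.domain}".extraConfig = ''
    reverse_proxy 127.0.0.1:3001
  '';
}
#+end_src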
=users.mutableUsers = false= — all user config must be declared in Nix.
Secret injection pattern: =LoadCredential= stages the sops-decrypted file into =$CREDENTIALS_DIRECTORY= before any Exec* command runs; a shell script then reads and exports it. Ephemeral env files are written to =/run/= and cleaned up in =ExecStopPost= / =postStop=.

Authelia =accessControlRules=: sorted by =priority= at build time (lower = first). Authelia stops at the first match, so more-specific rules need a lower priority value than the broader rules they override. Ranges: 0=bypass, 10–19=blanket bypass, 20–49=admin two_factor+deny pairs, 50–64=one_factor open, 65–79=per-path (resources + subject combos).
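The build-time ordering can be illustrated with a standalone expression (evaluate with =nix-instantiate --eval --strict=). The three rules are invented examples; the real module may render them differently.

#+begin_src nix
let
  # Example rules, deliberately out of order.
  rules = [
    { priority = 50; domain = "wiki.example.com"; policy = "one_factor"; }
    { priority = 0; domain = "auth.example.com"; policy = "bypass"; }
    { priority = 20; domain = "grafana.example.com"; policy = "two_factor"; subject = [ "group:admins" ]; }
  ];

  # Lower priority value sorts first, matching Authelia's first-match evaluation.
  sorted = builtins.sort (a: b: a.priority < b.priority) rules;
in
# Drop the helper field so only Authelia's own keys remain.
map (rule: removeAttrs rule [ "priority" ]) sorted
#+end_src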
=accessControlRules= option is declared unconditionally (not inside =mkIf cfg.enable=) so any module can contribute rules even when Authelia is disabled. Same pattern for =homey.monitoring.monitors=.
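A skeleton of the unconditional-option pattern. The =homey.authelia= option path and the loose rule type are assumptions; the point is that =options= sits outside =mkIf= while only =config= is gated.

#+begin_src nix
{ config, lib, ... }:
let
  cfg = config.homey.authelia;
in
{
  # Declared unconditionally, so other modules can always set it.
  options.homey.authelia = {
    enable = lib.mkEnableOption "Authelia";
    accessControlRules = lib.mkOption {
      type = lib.types.listOf (lib.types.attrsOf lib.types.anything);
      default = [ ];
      description = "Access control rules contributed by any module.";
    };
  };

  # Only the implementation is gated on enable.
  config = lib.mkIf cfg.enable {
    # container definition and rendered config would consume cfg.accessControlRules here
  };
}
#+end_src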
Attic token generation: run =atticadm make-token= inside the container. Tokens are stateless; a lost token is simply regenerated with the same command.

DynamicUser services (Eurovision Vote): secrets must be mode =0444= (not =0400=) because DynamicUser gets a random UID that cannot be pre-assigned as owner.

Caddy Cloudflare plugin secrets: =LoadCredential= plus an =ExecStart= override (an empty string clears the list first, then the real start command is set) that exports =CLOUDFLARE_API_TOKEN= before exec-ing caddy.
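A sketch of that override. The credential name and the sops secret key are placeholders, and =lib.mkForce= is an assumption: it is likely needed because the upstream caddy module already defines =ExecStart=.

#+begin_src nix
{ config, lib, pkgs, ... }:
let
  # Wrapper that turns the staged credential into an env var, then execs caddy.
  caddyStart = pkgs.writeShellScript "caddy-start" ''
    export CLOUDFLARE_API_TOKEN="$(cat "$CREDENTIALS_DIRECTORY/cloudflare-token")"
    exec ${config.services.caddy.package}/bin/caddy run \
      --config ${config.services.caddy.configFile} --adapter caddyfile
  '';
in
{
  systemd.services.caddy.serviceConfig = {
    LoadCredential = [
      "cloudflare-token:${config.sops.secrets."cloudflare/api_token".path}"
    ];
    # The empty string resets any inherited ExecStart; the second entry is the real command.
    ExecStart = lib.mkForce [ "" "${caddyStart}" ];
  };
}
#+end_src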
** Gotchas

The hdparm APM udev rule was removed: USB-SATA bridges often don't support APM commands, and hdparm then hangs indefinitely, causing boot-time crashes. =hdparm= is still available as a package for manual use.
=storage.nix= config is gated on =lib.mkIf (cfg.device != "")= — if =homey.storage.device= is empty string, the mount and all tmpfiles rules are skipped. Useful during initial setup.
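The gate might be structured like this. The =/mnt/data= mount point is inferred from the build-dir path elsewhere in these notes; the filesystem type, mount options, and tmpfiles rule are assumptions.

#+begin_src nix
{ config, lib, ... }:
let
  cfg = config.homey.storage;
in
{
  options.homey.storage.device = lib.mkOption {
    type = lib.types.str;
    default = "";
    description = "Block device of the external disk; empty disables the mount.";
  };

  # Nothing below is applied while the device is left empty.
  config = lib.mkIf (cfg.device != "") {
    fileSystems."/mnt/data" = {
      device = cfg.device;
      fsType = "ext4";
      options = [ "nofail" ];
    };
    systemd.tmpfiles.rules = [ "d /mnt/data/nix-build 0755 root root -" ];
  };
}
#+end_src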
Grafana login form disabled (=disable_login_form = true=) — recovery requires re-enabling it in the Nix config. All proxy-auth users are auto-assigned Admin role (safe because Authelia already restricts to admins group).
Nextcloud preview generation: a separate =oneshot= service =nextcloud-generate-previews= (declared in =hosts/pi-main/default.nix=) fills in missing thumbnails after first start. It must be triggered manually or via a timer.

Authelia config bind-mount gotcha: the bind mount resolves the symlink to the Nix store path at container start. Without the =NIXOS_CONFIG_HASH= env var, a config change would not take effect until the container is restarted manually.

WiFi network name: =Zakobar=. sops secret key: =wifi/psk=. The secret file must contain exactly one line: =wifi_psk=<password>=. The =ext:wifi_psk= format is wpa_supplicant's literal substitution syntax, not an env var.
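Assuming sops-nix and the =networking.wireless.secretsFile= option available on this nixpkgs pin, the WiFi wiring could be as small as:

#+begin_src nix
{ config, ... }:
{
  # Decrypted file holds the single line: wifi_psk=<password>
  sops.secrets."wifi/psk" = { };

  networking.wireless = {
    enable = true;
    secretsFile = config.sops.secrets."wifi/psk".path;
    # ext:<name> makes wpa_supplicant look up <name> in the secrets file.
    networks."Zakobar".pskRaw = "ext:wifi_psk";
  };
}
#+end_src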
Attic: the TOML config is ephemeral (not a real file in the store). An =ExecStartPre= shell script writes it to =/run/attic-config.toml=, interpolating the JWT secret into the TOML at runtime.

** Key Files
- =flake.nix= — module list, =mkHost= builder, =homeyConfig= specialArgs, =rpi4Headless= hardware snippet
- =hosts/pi-main/default.nix= — enabled services, static IP, WiFi, reliability hardening, Attic substituter config
- =hosts/pi-main-bootstrap/default.nix= — SD card bootstrap image
- =modules/caddy.nix= — =virtualHosts= option, dual vhost generation, Authelia forward_auth snippet
- =modules/services/authelia.nix= — access control rule rendering, =accessControlRules= option (unconditional)
- =modules/services/uptime-kuma.nix= — =homey.monitoring.monitors= option (unconditional), sync script
- =modules/services/attic.nix= — Nix binary cache, JWT token config, netrc injection for Nix daemon
- =modules/services/attic-setup.md= — post-deploy steps, token commands, client config, setup history
- =modules/services/eurovote.nix= — DynamicUser wrapper for external flake module
- =modules/monitoring.nix= — Prometheus + Grafana, proxy auth wiring, Node Exporter Full dashboard
- =modules/common.nix= — Nix settings, podman network creation, sops global config
- =modules/storage.nix= — external HD mount, =extraDirs= option
- =modules/backup.nix= — Restic, pre/post hooks, =extraPaths= option