From e2ff0eb4281ee30a2a3b611a25e61b9efcbd253d Mon Sep 17 00:00:00 2001
From: Aner Zakobar
Date: Wed, 15 Apr 2026 17:20:35 +0300
Subject: [PATCH] Update AGENTS.md for NixOS port branch

---
 AGENTS.md | 460 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 236 insertions(+), 224 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 05d04c7..b3fd80c 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,255 +1,267 @@
 # AGENTS.md
 
-This is a Helm chart for deploying a self-hosted home environment (Homey) on Kubernetes.
+Self-hosted home server configuration for a Raspberry Pi 4 (8 GB), managed
+entirely through NixOS. Services run as podman containers under systemd.
+Remote access is via Cloudflare Tunnel; local access goes through Caddy
+with Let's Encrypt TLS (DNS-01, Cloudflare API).
 
-## Project Overview
+The original Kubernetes/Helm setup is preserved on the `main` branch.
+This branch (`nixos-port`) is the active NixOS port.
 
-- **Type**: Helm Chart (Kubernetes package manager)
-- **Language**: YAML + Go template syntax (Helm templating)
-- **Key Files**:
-  - `Chart.yaml` - Chart metadata
-  - `values.yaml` - Default configuration values
-  - `templates/` - Kubernetes manifest templates (auth.yaml, media.yaml, phpldapadmin.yaml, _definitions.yaml)
-  - `files/` - Configuration file templates (processed by Helm with `tpl` function)
+---
 
-## Build/Lint/Test Commands
+## Project Structure
 
-### Helm Validation
-```bash
-# Lint the Helm chart for errors
-helm lint .
-
-# Template rendering (dry-run install)
-helm template test-release . --debug
-
-# Install/upgrade in cluster
-helm upgrade --install homey . -n homey
-
-# Verify chart against Kubernetes API
-helm kubeval .
-
-# Check schema validation of values.yaml
-helm schema generate
+```
+flake.nix                     # Entry point — defines all hosts
+modules/
+  common.nix                  # Shared system config (nix, podman, sops, SSH)
+  storage.nix                 # External HD mount + per-service directory layout
+  caddy.nix                   # Caddy reverse proxy (DNS-01 ACME, forward_auth)
+  cloudflared.nix             # Cloudflare Tunnel for remote access
+  backup.nix                  # Restic daily backups
+  services/
+    openldap.nix              # OpenLDAP — central identity provider
+    authelia.nix              # Authelia — SSO gateway
+    gitea.nix                 # Gitea — Git server
+    nextcloud.nix             # Nextcloud + PostgreSQL
+    phpldapadmin.nix          # phpLDAPadmin — LDAP web UI
+    jellyfin.nix              # Jellyfin — media server (disabled by default)
+    transmission.nix          # Transmission — torrent client (disabled by default)
+hosts/
+  pi-main/
+    default.nix               # Service selection + host-specific overrides
+    hardware.nix              # Pi 4 boot, SD card labels, ARM platform
+secrets/
+  .sops.yaml                  # Age key configuration
+  secrets.yaml                # sops-encrypted secrets (commit only after encrypting)
+PORTING.md                    # Step-by-step migration guide from the old Helm setup
 ```
 
-### Manual Template Testing
-```bash
-# Render templates locally with custom values
-helm template homey . -f values.yaml --set homey.url=example.com
+## Services and URLs
 
-# Template with debug output
-helm template homey . --debug 2>&1 | less
+All services live under `home.zakobar.com`.
+
+| Service | URL | Auth |
+|---------|-----|------|
+| Authelia | `auth.home.zakobar.com` | Public (it is the auth portal) |
+| Gitea | `git.home.zakobar.com` | Authelia one_factor |
+| Nextcloud | `nextcloud.home.zakobar.com` | Nextcloud-native |
+| phpLDAPadmin | `ldapadmin.home.zakobar.com` | Authelia two_factor, admins only |
+| Jellyfin | `jellyfin.home.zakobar.com` | Authelia one_factor |
+| Transmission | `torrent.home.zakobar.com` | Authelia two_factor, admins only |
+
+Internal ports (all bound to `127.0.0.1`):
+
+| Container | Port |
+|-----------|------|
+| openldap | 389 |
+| authelia | 9091 |
+| gitea | 3000 |
+| nextcloud | 8080 |
+| nextcloud-postgres | 5432 |
+| phpldapadmin | 8081 |
+| jellyfin | 8096 |
+| transmission | 9092 (not 9091 — avoids clash with authelia) |
+
+## Storage Layout
+
+All persistent data lives on the external HD at `/mnt/data/`:
+
+```
+/mnt/data/
+  openldap/
+    etc-ldap-slapd.d/         → /etc/ldap/slapd.d in container
+    var-lib-ldap/             → /var/lib/ldap in container
+  authelia/config/            → /config
+  gitea/data/                 → /data
+  nextcloud/
+    html/                     → /var/www/html
+    db/                       → /var/lib/postgresql/data
+    db-dump/                  → pg_dump output (pre-backup)
+  jellyfin/config/            → /config
+  media/movies|tvshows|...    → shared media (read-only to jellyfin)
+  transmission/config/        → /config
+  restic-cache/               → restic local cache
 ```
 
-### Kubectl Validation
-```bash
-# Dry-run apply to validate manifests
-kubectl apply -f templates/auth.yaml --dry-run=server
+The drive device path is set per-host in `hosts//default.nix` via
+`homey.storage.device`. Use a `/dev/disk/by-id/` path for stability.
 
-# Get rendered template directly
-helm template homey . | kubectl apply --dry-run=server -f -
+## Build / Validate Commands
+
+```bash
+# Check flake structure and evaluate all hosts (no build)
+nix flake check
+
+# Dry-run: show what would change without applying
+sudo nixos-rebuild dry-activate --flake .#pi-main
+
+# Apply configuration
+sudo nixos-rebuild switch --flake .#pi-main
+
+# Build without switching (e.g. cross-compile on workstation)
+nix build .#nixosConfigurations.pi-main.config.system.build.toplevel
+
+# Show diff between running system and new config
+nvd diff /run/current-system $(nix build --no-link --print-out-paths .#nixosConfigurations.pi-main.config.system.build.toplevel)
 ```
 
+## Secret Management
+
+Secrets are managed with [sops-nix](https://github.com/Mic92/sops-nix) and
+age keys. The encrypted `secrets/secrets.yaml` is committed to the repo; the
+age private key lives on the Pi at `/var/lib/sops-nix/key.txt`.
+
+```bash
+# Edit secrets (decrypts, opens $EDITOR, re-encrypts on save)
+sops secrets/secrets.yaml
+
+# Encrypt a plaintext secrets.yaml for the first time
+sops --encrypt --in-place secrets/secrets.yaml
+
+# Add a new host key (after generating it on the new machine)
+# 1. Add the public key to secrets/.sops.yaml
+# 2. Run:
+sops updatekeys secrets/secrets.yaml
+
+# Generate a new age key on a host
+age-keygen -o /var/lib/sops-nix/key.txt
+age-keygen -y /var/lib/sops-nix/key.txt  # print public key
+```
+
+Secrets that must come from the old deployment (see `PORTING.md` for how to
+extract them from the old k8s cluster):
+
+- `openldap/admin_password`, `openldap/config_password`, `openldap/ro_password`
+- `gitea/admin_password`
+- `nextcloud/admin_password`, `nextcloud/postgres_password`
+
+Everything else (authelia JWT/session/encryption keys, gitea JWT tokens,
+restic password, Cloudflare tokens) can be generated fresh.
+
 ## Code Style Guidelines
 
-### YAML Structure
+### Nix
 
-1. **Document Separators**: Use `---` at the start of each YAML document
-   ```yaml
-   ---
-   apiVersion: v1
-   kind: ConfigMap
+1. **Module pattern** — every service is an opt-in module with an `enable` option:
+   ```nix
+   options.homey.myservice.enable = lib.mkEnableOption "My service";
+   config = lib.mkIf config.homey.myservice.enable { ... };
    ```
 
-2. **Indentation**: Use 2 spaces (not tabs)
-   ```yaml
-   spec:
-     containers:
-     - name: app
-       image: nginx
-   ```
+2. **`homeyConfig` specialArgs** — top-level site config (domain, org name,
+   timezone) is passed via `specialArgs` in `flake.nix` and accessed as
+   `homeyConfig` in every module. Do not read domain/org from hardcoded strings.
 
-3. **Trailing Commas**: Optional but preferred for multi-line lists
-   ```yaml
-   accessModes:
-     - ReadWriteMany
-     - ReadOnlyMany
-   ```
+3. **No secrets in the Nix store** — secrets are always read from sops-managed
+   files at runtime, never embedded in the built config. Use
+   `config.sops.secrets."key".path` to get the runtime path of a secret file.
 
-4. **Quotes**: Use quotes for strings that might be interpreted as other types
-   - Always quote: `.Values.homey.url | quote`
-   - Optional for simple strings like names
+4. **Secret injection pattern** — because `oci-containers` `environmentFiles`
+   is limited, use a `systemd ExecStartPre` script to write an ephemeral env
+   file at `/run/-secrets.env` and reference it via `EnvironmentFile`.
+   Clean it up in `postStop`.
 
-### Kubernetes Resources
+5. **`--network=host`** — all containers use host networking for simplicity on
+   a single-node setup. Services communicate via `127.0.0.1:`.
 
-1. **Labels**: Use Kubernetes recommended labels
-   ```yaml
-   labels:
-     app.kubernetes.io/name: openldap
-     app.kubernetes.io/component: auth
-   ```
-
-2. **Naming**: Use kebab-case for resource names
-   ```yaml
-   name: openldap-admin
-   name: nextcloud-postgres
-   ```
-
-3. **Storage**: Always specify `storageClassName: longhorn`
-   ```yaml
-   spec:
-     storageClassName: longhorn
-   ```
-
-### Helm Template Syntax
-
-1. **Variable Assignment**: Use `$_ := set` for complex assignments
-   ```yaml
-   {{- $_ := set $ "varname" (include "homey.lookuporgensecret" (merge (dict "secretname" "secret-name") $)) }}
-   ```
-
-2. **Include with Merge**: Always pass `$` as the last argument
-   ```yaml
-   {{ include "homey.randomsecret" (merge (dict "secretname" "secret-name" "secretval" $secretval) $) }}
-   ```
-
-3. **Quote Values from .Values**: Use `quote` filter
-   ```yaml
-   value: {{ .Values.homey.url | quote }}
-   ```
-
-4. **Template Definitions**: Define reusable templates in `_definitions.yaml`
-   - `homey.lookuporgensecret` - Look up existing secrets or generate random
-   - `homey.randomsecret` - Generate a random secret
-   - `homey.randHex` - Generate random hex string
-
-5. **Template Spacing**: Use whitespace control to avoid extra newlines
-   ```yaml
-   {{- "leading minus" -}}  # No newline before
-   {{ "trailing minus" -}}  # No newline after
-   ```
-
-### Secret Management
-
-1. **Annotations**: Always annotate managed secrets to prevent deletion
-   ```yaml
-   annotations:
-     "helm.sh/resource-policy": "keep"
-   ```
-
-2. **Secret Generation Pattern**:
-   ```yaml
-   # Check for existing secret, create if not exists
-   {{- $secretObj := (lookup "v1" "Secret" .Release.Namespace "secret-name") | default dict -}}
-   {{- $secretData := (get $secretObj "data") | default dict -}}
-   {{- $pass := (get $secretData "password") | default (randAlphaNum 32 | b64enc) -}}
-   ```
-
-3. **Never hardcode secrets** - Use the secret lookup pattern above
-
-### Config Files (files/ directory)
-
-1. **Go Templates in Configs**: Use `tpl` function to process config files
-   ```yaml
-   data:
-     config.yml: |-
-       {{ tpl (.Files.Get "files/authelia-config.yaml" | indent 4) . }}
-   ```
-
-2. **Accessing Variables**: Config files can access `.Values.*` and custom variables set in templates
-
-### Ingress Configuration
-
-1. **TLS**: Always specify TLS with proper hosts and secret
-   ```yaml
-   spec:
-     ingressClassName: {{ .Values.homey.ingress_class }}
-     tls:
-     - hosts:
-       - auth.{{ .Values.homey.url }}
-       secretName: {{ .Values.homey.certname }}
-   ```
-
-2. **Authelia Integration**: Use auth snippets for protected ingresses
-   ```yaml
-   annotations:
-     nginx.ingress.kubernetes.io/auth-url: http://authelia.{{ .Release.Namespace }}.svc.cluster.local:9091/api/verify
-     nginx.ingress.kubernetes.io/auth-signin: https://auth.{{ .Values.homey.url }}?rm=$request_method
-   ```
-
-### Resource Organization
-
-1. **File Structure**:
-   - `templates/_definitions.yaml` - Helper templates (secrets, utilities)
-   - `templates/auth.yaml` - Authentication services (OpenLDAP, Authelia, Gitea, Nextcloud, Radicale)
-   - `templates/media.yaml` - Media services (Jellyfin, Transmission)
-   - `templates/phpldapadmin.yaml` - LDAP admin interface
-
-2. **Manifest Order** (within a file):
-   - PersistentVolumeClaim
-   - Secrets
-   - ConfigMaps
-   - Deployments
-   - Services
-   - Ingress
-
-3. **Unused Resources**: Keep deprecated manifests in `unused/` directory
-
-### Environment Variables
-
-1. **Naming**: Use uppercase with underscores
-   ```yaml
-   - name: LDAP_ORGANISATION
-     value: {{ .Values.homey.organization }}
-   ```
-
-2. **Value Sources**: Prefer `valueFrom.secretKeyRef` over inline values
-   ```yaml
-   - name: PASSWORD
-     valueFrom:
-       secretKeyRef:
-         name: secret-name
-         key: password
-   ```
-
-### Volume Mounts
-
-1. **subPath**: Use `subPath` for shared PVCs
-   ```yaml
-   volumeMounts:
-   - mountPath: /data
-     subPath: service-name/data
-   ```
-
-2. **Read-only ConfigMaps**: Mark config mounts as read-only
-   ```yaml
-   readOnly: true
-   ```
-
-## Common Operations
+6. **Systemd ordering** — always express `after`/`requires` dependencies
+   explicitly. The external HD mount unit is `mnt-data.mount`; containers that
+   need storage must depend on it.
 
 ### Adding a New Service
 
-1. Add values to `values.yaml`
-2. Create/extend template in `templates/`
-3. Add PVC if persistent storage needed
-4. Add Ingress with appropriate annotations
-5. Test with `helm template .`
+1. Create `modules/services/.nix` following the existing module pattern.
+2. Add `homey..enable = false` as the default option.
+3. Import the new module in `flake.nix` (in the `modules` list inside `mkHost`).
+4. Enable it in `hosts/pi-main/default.nix`.
+5. Add a Caddy virtual host block in `modules/caddy.nix`.
+6. Add the service data directory to `modules/storage.nix` `tmpfiles.rules`.
+7. Add the data path to the `paths` list in `modules/backup.nix`.
+8. Add any new secrets to `secrets/secrets.yaml` (plaintext) and document them.
 
-### Updating Secrets
-
-Secrets are generated on first install. To regenerate:
-```bash
-kubectl delete secret -n homey
-helm upgrade --install homey . -n homey
-```
-
-### Debugging Templates
+### Updating or Regenerating Secrets
 
 ```bash
-# Show all template variables available
-helm template . --show-only templates/_helpers.tpl
+# Edit the encrypted file — sops opens $EDITOR
+sops secrets/secrets.yaml
 
-# Render single template
-helm template . --show-only templates/auth.yaml
+# Copy updated secrets to the Pi and rebuild
+rsync secrets/secrets.yaml admin@pi-main:/path/to/homey/secrets/
+ssh admin@pi-main 'sudo nixos-rebuild switch --flake /path/to/homey#pi-main'
 ```
+
+### Debugging Containers
+
+```bash
+# List all running containers
+podman ps
+
+# Follow logs for a service
+journalctl -fu podman-authelia.service
+
+# Drop into a running container
+podman exec -it authelia sh
+
+# Restart a single service
+sudo systemctl restart podman-gitea.service
+
+# Check why a service failed to start
+systemctl status podman-openldap.service
+journalctl -u podman-openldap.service --since "5 min ago"
+```
+
+---
+
+## Outstanding TODOs
+
+These items are known gaps that need to be addressed before the setup is
+production-ready:
+
+- [ ] **`caddy.nix` — fix `vendorHash`**: The Caddy build with the Cloudflare
+  DNS plugin uses `lib.fakeHash` as a placeholder. After the first `nix build`,
+  replace it with the hash Nix reports in the error message.
+
+- [ ] **`hosts/pi-main/default.nix` — fill in real values**:
+  - SSH public key in `users.users.admin.openssh.authorizedKeys.keys`
+  - External HD device path in `homey.storage.device`
+  - Backup repository URL in `homey.backup.repository`
+
+- [ ] **`secrets/secrets.yaml` — populate and encrypt**: Fill in all secret
+  values (old passwords from k8s + freshly generated ones), then run
+  `sops --encrypt --in-place secrets/secrets.yaml` before committing.
+
+- [ ] **`secrets/.sops.yaml` — add real age keys**: Replace both
+  `AGE-PUBLIC-KEY-*` placeholders with actual public keys (workstation + Pi).
+
+- [ ] **Cloudflare Tunnel**: Create the tunnel in the Zero Trust dashboard,
+  copy the tunnel token into secrets, and configure public hostnames. See
+  `modules/cloudflared.nix` and Phase 3 of `PORTING.md` for details.
+
+- [ ] **Gitea LDAP auth**: After first start, configure LDAP authentication
+  in Gitea's admin panel (Admin → Authentication Sources → Add LDAP source).
+  The old Helm chart had this commented out; it must be done manually once.
+  Relevant settings:
+  - Host: `127.0.0.1`, Port: `389`, Security: Unencrypted
+  - Bind DN: `cn=readonly,dc=home,dc=zakobar,dc=com`
+  - User search base: `ou=users,dc=home,dc=zakobar,dc=com`
+
+- [ ] **Nextcloud LDAP app**: After restoring the Nextcloud volume, verify
+  the LDAP Users and Contacts app is still configured correctly
+  (Admin → LDAP/AD Integration).
+
+- [ ] **`hosts/pi-main/hardware.nix` — verify SD card labels**: The file
+  assumes partition labels `NIXOS_SD` (root) and `FIRMWARE` (boot). Relabel
+  after flashing if they differ, or update the `fileSystems` entries.
+
+- [ ] **Second machine**: When ready, add `hosts/pi-secondary/` and uncomment
+  the `pi-secondary` entry in `flake.nix`. Services communicating cross-machine
+  should reference the primary Pi's LAN IP instead of `127.0.0.1`.
+
+- [ ] **Jellyfin and Transmission**: Both modules are written and importable
+  but disabled. Enable in `hosts/pi-main/default.nix` when ready:
+  ```nix
+  homey.jellyfin.enable = true;
+  homey.transmission.enable = true;
+  ```
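
Note on the TODO list: a minimal sketch of how the placeholder values it calls out might be filled in within `hosts/pi-main/default.nix`. The option names (`users.users.admin.openssh.authorizedKeys.keys`, `homey.storage.device`, `homey.backup.repository`, `homey.jellyfin.enable`, `homey.transmission.enable`) are taken from the patch. Every value shown is a hypothetical placeholder, and the sketch assumes the `homey.*` options accept plain strings and booleans, which the patch itself does not confirm.

```nix
# hosts/pi-main/default.nix (illustrative sketch only, not the file from this branch)
{ ... }:
{
  # SSH public key for the admin user. Placeholder value: replace with a real key.
  users.users.admin.openssh.authorizedKeys.keys = [
    "ssh-ed25519 REPLACE_WITH_PUBLIC_KEY admin@workstation"
  ];

  # External HD device. The patch recommends a /dev/disk/by-id/ path;
  # the identifier below is a placeholder, not a real device.
  homey.storage.device = "/dev/disk/by-id/REPLACE_WITH_DEVICE_ID";

  # Backup repository URL. Placeholder: the expected format is an assumption
  # and depends on how modules/backup.nix passes it to restic.
  homey.backup.repository = "REPLACE_WITH_REPOSITORY_URL";

  # Optional services that the patch describes as written but disabled.
  homey.jellyfin.enable = true;
  homey.transmission.enable = true;
}
```

Running `nix flake check` followed by `sudo nixos-rebuild dry-activate --flake .#pi-main`, both listed under Build / Validate Commands, should surface evaluation or type errors in values like these before anything is applied.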