# AGENTS.md
Self-hosted home server configuration for a Raspberry Pi 4 (8 GB), managed
entirely through NixOS. Services run as podman containers or native NixOS
services under systemd. Remote access is via Cloudflare Tunnel; local access
goes through Caddy with Let's Encrypt TLS (DNS-01, Cloudflare API).
The original Kubernetes/Helm setup is preserved on the `main` branch.
This branch (`nixos-port`) is the active NixOS port.

---
## Project Structure
```
flake.nix                   # Entry point — defines all hosts
modules/
  common.nix                # Shared system config (nix, podman, sops, SSH)
  storage.nix               # External HD mount + per-service directory layout
  caddy.nix                 # Caddy reverse proxy (DNS-01 ACME, forward_auth)
  cloudflared.nix           # Cloudflare Tunnel for remote access
  backup.nix                # Restic daily backups (S3 primary + manual offload)
  monitoring.nix            # Prometheus + Grafana (native NixOS services)
  services/
    openldap.nix            # OpenLDAP — central identity provider
    authelia.nix            # Authelia — SSO gateway + accessControlRules option
    gitea.nix               # Gitea — Git server
    gitea-runner.nix        # Gitea Actions runner
    nextcloud.nix           # Nextcloud + PostgreSQL
    phpldapadmin.nix        # phpLDAPadmin — LDAP web UI
    jellyfin.nix            # Jellyfin — media server (disabled)
    transmission.nix        # Transmission — torrent client (disabled)
    uptime-kuma.nix         # Uptime Kuma + homey.monitoring.monitors option
    ntfy.nix                # Ntfy — push notification server (native NixOS)
    mealie.nix              # Mealie — recipe manager
    paperless.nix           # Paperless-ngx — document management
    eurovote.nix            # Eurovision Vote — Django voting app
hosts/
  pi-main/
    default.nix             # Service selection + host-specific overrides
    hardware.nix            # Pi 4 boot, SD card labels, ARM platform
secrets/
  .sops.yaml                # Age key configuration
  secrets.yaml              # sops-encrypted secrets (commit only after encrypting)
PORTING.md                  # Step-by-step migration guide from the old Helm setup
```
## Services and URLs
All services live under `zakobar.com`.
| Service | URL | Auth | Runtime |
|---------|-----|------|---------|
| Authelia | `auth.zakobar.com` | Public (it is the auth portal) | container |
| Gitea | `git.zakobar.com` | Gitea-native (LDAP) | container |
| Nextcloud | `nextcloud.zakobar.com` | Nextcloud-native | container |
| Mealie | `mealie.zakobar.com` | Mealie-native (LDAP) | container |
| Paperless | `paperless.zakobar.com` | Authelia one_factor (SSO) | container |
| phpLDAPadmin | `ldapadmin.zakobar.com` | Authelia two_factor, admins only | container |
| Uptime Kuma | `uptime.zakobar.com` | Authelia two_factor, admins only | container |
| Grafana | `grafana.zakobar.com` | Authelia two_factor, admins only | NixOS |
| Ntfy | `ntfy.zakobar.com` | Bypass (ntfy token/password auth) | NixOS |
| Eurovision Vote | `eurovision-vote.zakobar.com` | Authelia one_factor (`/admin` two_factor) | NixOS |
| Jellyfin | `jellyfin.zakobar.com` | Jellyfin-native | container (disabled) |
| Transmission | `torrent.zakobar.com` | Authelia two_factor, admins only | container (disabled) |
## Networking
All containers join a private podman network named **`homey`**, created by the
`podman-homey-network` systemd service in `common.nix`. This provides:
- **DNS isolation** — containers reach each other by name (e.g. `openldap`,
`nextcloud-postgres`) without being exposed on the host network.
- **No port conflicts** — Caddy owns host ports 80/443; service containers map
only to `127.0.0.1:<port>`.
- **Defence in depth** — even if the firewall were misconfigured, services are
not bound to `0.0.0.0`.
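
A minimal sketch of how these three properties might look in Nix, assuming the standard `virtualisation.oci-containers` options. The shape of the network unit and the image tag are illustrative assumptions, not copied from `common.nix` or `mealie.nix`:

```nix
{ pkgs, ... }:
{
  # Oneshot unit that creates the private network if it does not exist yet.
  # Assumed shape; the real unit is defined in common.nix.
  systemd.services.podman-homey-network = {
    wantedBy = [ "multi-user.target" ];
    path = [ pkgs.podman ];
    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
    };
    script = ''
      podman network exists homey || podman network create homey
    '';
  };

  virtualisation.oci-containers.backend = "podman";
  virtualisation.oci-containers.containers.mealie = {
    image = "ghcr.io/mealie-recipes/mealie:latest"; # illustrative tag
    # Loopback only: host port 9093 -> container port 9000.
    ports = [ "127.0.0.1:9093:9000" ];
    # Join the private network so other containers can resolve "mealie".
    extraOptions = [ "--network=homey" ];
  };

  # Start the container only once the network exists.
  systemd.services.podman-mealie = {
    after = [ "podman-homey-network.service" ];
    requires = [ "podman-homey-network.service" ];
  };
}
```

The `127.0.0.1:` prefix in `ports` is what keeps the service off the LAN even if the firewall is open.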
Native NixOS services (not containers) listen on `127.0.0.1` directly:
| Service | Host port |
|---------|-----------|
| ntfy | 2586 |
| Eurovision Vote | 8007 |
| Prometheus | 9090 |
| Grafana | 3002 |
Container host-port mappings (all bound to `127.0.0.1`):
| Container | Host port | Container port |
|-----------|-----------|----------------|
| openldap | 389 | 389 |
| authelia | 9091 | 9091 |
| gitea | 3000 | 3000 |
| nextcloud | 8080 | 80 |
| nextcloud-postgres | 5432 | 5432 |
| phpldapadmin | 8081 | 80 |
| uptime-kuma | 3001 | 3001 |
| mealie | 9093 | 9000 |
| paperless | 8083 | 8000 |
| paperless-redis | (internal only) | 6379 |
| jellyfin | 8096 | 8096 |
| transmission | 9092 | 9091 |
Inter-container communication uses container names on the `homey` network
(e.g. authelia → `ldap://openldap:389`, nextcloud → `nextcloud-postgres:5432`).
Caddy (running on the host) proxies via `127.0.0.1:<host port>`.
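
As a rough illustration of the host-side proxy path, the vhost for an Authelia-protected service could expand to something like the following. This is a sketch only: the real vhosts are generated from `homey.caddy.virtualHosts`, the environment variable name `CLOUDFLARE_API_TOKEN` is an assumption, and the `forward_auth` URI depends on the Authelia version in use:

```nix
{
  services.caddy.virtualHosts."paperless.zakobar.com".extraConfig = ''
    # DNS-01 challenge through the Cloudflare plugin; the token is read
    # from the environment (variable name is an assumption).
    tls {
      dns cloudflare {env.CLOUDFLARE_API_TOKEN}
    }

    # Ask Authelia (host port 9091) before proxying the request.
    forward_auth 127.0.0.1:9091 {
      uri /api/authz/forward-auth
      copy_headers Remote-User Remote-Groups Remote-Email Remote-Name
    }

    # Paperless container, published on host port 8083.
    reverse_proxy 127.0.0.1:8083
  '';
}
```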
## Storage Layout
All persistent data lives on the external HD at `/mnt/data/`:
```
/mnt/data/
  openldap/
    etc-ldap-slapd.d/        → /etc/ldap/slapd.d in container
    var-lib-ldap/            → /var/lib/ldap in container
  authelia/config/           → /config
  gitea/data/                → /data
  nextcloud/
    html/                    → /var/www/html
    db/                      → /var/lib/postgresql/data
    db-dump/                 → pg_dump output (pre-backup)
  jellyfin/config/           → /config
  media/movies|tvshows|...   → shared media (read-only to jellyfin)
  transmission/config/       → /config
  uptime-kuma/               → /app/data
  mealie/data/               → /app/data
  paperless/
    data/                    → /usr/src/paperless/data (DB, index)
    media/                   → /usr/src/paperless/media (document files)
    consume/                 → /usr/src/paperless/consume (drop folder)
    export/                  → /usr/src/paperless/export
  ntfy/
    auth.db                  → ntfy user/token database (host path)
    cache.db                 → ntfy message cache (host path)
    attachments/             → file attachments (host path)
  restic-cache/              → restic local cache
```
Grafana and Prometheus use system state dirs (`/var/lib/grafana`,
`/var/lib/prometheus2`) and are not backed up: dashboards are provisioned by
Nix, and metrics are treated as disposable.
The drive device path is set per-host in `hosts/<name>/default.nix` via
`homey.storage.device`. Use a `/dev/disk/by-label/` or `/dev/disk/by-id/`
path for stability.
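
A sketch of how the device setting might flow into the mount, assuming `storage.nix` uses the standard `fileSystems` and `systemd.tmpfiles.rules` options. The disk label, filesystem type, and directory mode below are hypothetical:

```nix
{ config, ... }:
{
  # In hosts/pi-main/default.nix (label is a placeholder):
  homey.storage.device = "/dev/disk/by-label/homey-data";

  # Roughly what storage.nix could do with it. A mount at /mnt/data
  # produces the systemd unit mnt-data.mount.
  fileSystems."/mnt/data" = {
    device = config.homey.storage.device;
    fsType = "ext4";        # assumption
    options = [ "nofail" ]; # let the Pi boot even if the drive is absent
  };

  # Per-service directories created on the drive at boot.
  systemd.tmpfiles.rules = [
    "d /mnt/data/mealie/data 0750 root root -"
  ];
}
```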
## Build / Validate Commands
```bash
# Check flake structure and evaluate all hosts (no build)
nix flake check
# Dry-run: show what would change without applying
sudo nixos-rebuild dry-activate --flake .#pi-main
# Apply configuration
sudo nixos-rebuild switch --flake .#pi-main
# Build without switching (e.g. cross-compile on workstation)
nix build .#nixosConfigurations.pi-main.config.system.build.toplevel
# Show diff between running system and new config
nvd diff /run/current-system "$(nix build --no-link --print-out-paths .#nixosConfigurations.pi-main.config.system.build.toplevel)"
```
## Secret Management
Secrets are managed with [sops-nix](https://github.com/Mic92/sops-nix) and
age keys. The encrypted `secrets/secrets.yaml` is committed to the repo; the
age private key lives on the Pi at `/var/lib/sops-nix/key.txt`.
```bash
# Edit secrets (decrypts, opens $EDITOR, re-encrypts on save)
sops secrets/secrets.yaml
# Encrypt a plaintext secrets.yaml for the first time
sops --encrypt --in-place secrets/secrets.yaml
# Add a new host key (after generating it on the new machine)
# 1. Add the public key to secrets/.sops.yaml
# 2. Run:
sops updatekeys secrets/secrets.yaml
# Generate a new age key on a host
age-keygen -o /var/lib/sops-nix/key.txt
age-keygen -y /var/lib/sops-nix/key.txt # print public key
```
Secrets that must come from the old deployment (see `PORTING.md` for how to
extract them from the old k8s cluster):
- `openldap/admin_password`, `openldap/config_password`, `openldap/ro_password`
- `gitea/admin_password`
- `nextcloud/admin_password`, `nextcloud/postgres_password`
Everything else (authelia JWT/session/encryption keys, gitea JWT tokens,
restic password, Cloudflare tokens) can be generated fresh.
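
On the Nix side, the sops-nix wiring could look roughly like this. It is a sketch using the standard sops-nix options; the relative path to `secrets.yaml` assumes the snippet lives in `modules/common.nix`:

```nix
{ ... }:
{
  sops = {
    # Encrypted file committed to the repo.
    defaultSopsFile = ../secrets/secrets.yaml;
    # Age private key on the Pi, used for decryption at activation time.
    age.keyFile = "/var/lib/sops-nix/key.txt";

    # Each declared secret is decrypted to its own file at runtime
    # (by default under /run/secrets/<name>).
    secrets."openldap/admin_password" = { };
    secrets."nextcloud/postgres_password" = { };
  };
}
```

Modules then refer to `config.sops.secrets."openldap/admin_password".path` instead of the secret value itself.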
## Code Style Guidelines
### Nix
1. **Module pattern** — every service is an opt-in module with an `enable` option
(defaulting to `false` for optional services):
```nix
options.homey.myservice.enable = lib.mkEnableOption "My service";
config = lib.mkIf config.homey.myservice.enable { ... };
```
2. **`homeyConfig` specialArgs** — top-level site config (domain, org name,
timezone) is passed via `specialArgs` in `flake.nix` and accessed as
`homeyConfig` in every module. Do not hardcode domain/org strings.
3. **No secrets in the Nix store** — secrets are always read from sops-managed
files at runtime, never embedded in the built config. Use
`config.sops.secrets."key".path` to get the runtime path of a secret file.
4. **Secret injection pattern** — `oci-containers` `environmentFiles` is
limited (it takes whole `KEY=value` files, not individual secret values), so
use a systemd `ExecStartPre` script to write an ephemeral env file at
`/run/<service>-secrets.env`, reference it via `environmentFiles`, and remove
it in `postStop`.
5. **`--network=homey`** — all containers join the private `homey` podman
network. Inter-container traffic uses container names as hostnames; host
access is via explicit `ports` mappings to `127.0.0.1:<port>`.
6. **Systemd ordering** — always express `after`/`requires` dependencies
explicitly. The external HD mount unit is `mnt-data.mount`; containers that
need storage must depend on it.
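
Guidelines 1 and 3 to 6 can be sketched together in one module. Everything named `myservice` here is hypothetical: the image, ports, secret key, and environment variable are placeholders, and `preStart` is used as the usual NixOS way to add an `ExecStartPre` step:

```nix
{ config, lib, ... }:
let
  cfg = config.homey.myservice;            # hypothetical service
  envFile = "/run/myservice-secrets.env";  # ephemeral, never in the store
in
{
  options.homey.myservice.enable = lib.mkEnableOption "My service";

  config = lib.mkIf cfg.enable {
    sops.secrets."myservice/db_password" = { }; # hypothetical secret key

    virtualisation.oci-containers.containers.myservice = {
      image = "example.org/myservice:1.0";  # placeholder image
      ports = [ "127.0.0.1:8099:8080" ];    # placeholder ports
      extraOptions = [ "--network=homey" ];
      environmentFiles = [ envFile ];
    };

    systemd.services.podman-myservice = {
      # Explicit ordering: storage and network must be up first.
      after = [ "mnt-data.mount" "podman-homey-network.service" ];
      requires = [ "mnt-data.mount" "podman-homey-network.service" ];

      # Build the env file from the decrypted secret before the container starts.
      preStart = ''
        umask 077
        printf 'DB_PASSWORD=%s\n' \
          "$(cat ${config.sops.secrets."myservice/db_password".path})" \
          > ${envFile}
      '';

      # Remove the env file once the container has stopped.
      postStop = ''
        rm -f ${envFile}
      '';
    };
  };
}
```

Only the path of the secret ends up in the built configuration; the value is read when the unit starts.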
### Module Contribution Options
Several cross-cutting concerns are wired up via list options that any service
module can append to, rather than editing central files:
| Option | Declared in | Purpose |
|--------|-------------|---------|
| `homey.caddy.virtualHosts` | `caddy.nix` | Add a reverse-proxy vhost |
| `homey.storage.extraDirs` | `storage.nix` | Create tmpfiles dirs on the HD |
| `homey.backup.extraPaths` | `backup.nix` | Include a path in restic backups |
| `homey.monitoring.monitors` | `uptime-kuma.nix` | Add an Uptime Kuma HTTP monitor |
| `homey.authelia.accessControlRules` | `authelia.nix` | Add Authelia access-control rules |
Each service module declares its own entries. No central file edits needed.
**`homey.authelia.accessControlRules`** — each rule has:
- `priority` (int) — lower = earlier in the list. Authelia stops at the first
match, so more-specific rules (e.g. `subject: group:admins`) must precede
their catch-all counterparts. Assigned priority ranges by category:
  - `0` — auth bypass (Authelia itself)
  - `10–19` — blanket bypasses (e.g. ntfy)
  - `20–49` — admin-only two_factor + deny pairs
  - `50–64` — open one_factor services
  - `65–79` — per-path rules (resources + subject combinations)
- `domain` (list of strings)
- `policy` — `bypass` | `one_factor` | `two_factor` | `deny`
- `subject` (optional list) — e.g. `[ "group:admins" ]`
- `resources` (optional list) — URL path regexes
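
For example, an admin-only service could contribute a `two_factor` rule followed by a catch-all `deny`. This is a sketch under one assumption: that `homeyConfig` exposes the site domain as `homeyConfig.domain` (the attribute name is not confirmed by this document):

```nix
{ homeyConfig, ... }:
let
  host = "grafana.${homeyConfig.domain}"; # attribute name assumed
in
{
  homey.authelia.accessControlRules = [
    {
      # Admins are matched first and must pass two-factor auth.
      priority = 20;
      domain = [ host ];
      policy = "two_factor";
      subject = [ "group:admins" ];
    }
    {
      # Everyone else falls through to this rule and is denied.
      priority = 21;
      domain = [ host ];
      policy = "deny";
    }
  ];
}
```

Because Authelia stops at the first match, swapping the two priorities would deny admins as well.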
### Adding a New Service
1. Create `modules/services/<name>.nix` following the existing module pattern.
2. Import it in `flake.nix` (in the `modules` list inside `mkHost`).
3. Enable it in `hosts/pi-main/default.nix`.
4. Inside the module's `config = lib.mkIf cfg.enable { ... }` block:
   - **Caddy**: add `homey.caddy.virtualHosts = [{ subdomain = "…"; port = …; auth = true/false; }]`
   - **Storage**: add `homey.storage.extraDirs = [{ path = "…"; }]` for each HD directory
   - **Backup**: add `homey.backup.extraPaths = [ "${dataDir}/…" ]`
   - **Authelia**: add `homey.authelia.accessControlRules = [{ priority = …; domain = […]; policy = "…"; }]`
   - **Monitoring**: add `homey.monitoring.monitors = [{ name = "…"; url = "…"; interval = 60; }]`
5. Add any new secrets to `secrets/secrets.yaml` and document them.
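
Put together, the contribution options from step 4 might be filled in as below. This is a hypothetical `modules/services/myservice.nix`: the subdomain, port, and paths are placeholders, and `homeyConfig.domain` is an assumed attribute name. The container or native service definition itself is omitted:

```nix
{ config, lib, homeyConfig, ... }:
let
  cfg = config.homey.myservice;
  dataDir = "/mnt/data/myservice";
  host = "myservice.${homeyConfig.domain}"; # attribute name assumed
in
{
  options.homey.myservice.enable = lib.mkEnableOption "My service";

  config = lib.mkIf cfg.enable {
    # Reverse-proxy vhost, protected by Authelia.
    homey.caddy.virtualHosts = [
      { subdomain = "myservice"; port = 8099; auth = true; }
    ];

    # Directory on the external HD, included in restic backups.
    homey.storage.extraDirs = [
      { path = "${dataDir}/data"; }
    ];
    homey.backup.extraPaths = [ "${dataDir}/data" ];

    # Open one_factor service.
    homey.authelia.accessControlRules = [
      { priority = 50; domain = [ host ]; policy = "one_factor"; }
    ];

    # HTTP monitor, checked every 60 seconds.
    homey.monitoring.monitors = [
      { name = "My service"; url = "https://${host}"; interval = 60; }
    ];
  };
}
```

The host then opts in with `homey.myservice.enable = true;` in `hosts/pi-main/default.nix`.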
### Updating or Regenerating Secrets
```bash
# Edit the encrypted file — sops opens $EDITOR
sops secrets/secrets.yaml
# Copy updated secrets to the Pi and rebuild
rsync secrets/secrets.yaml admin@pi-main:/path/to/homey/secrets/
ssh admin@pi-main 'sudo nixos-rebuild switch --flake /path/to/homey#pi-main'
```
### Debugging Containers
```bash
# List all running containers (system containers run as root by default)
sudo podman ps
# Follow logs for a service
journalctl -fu podman-authelia.service
# Drop into a running container
sudo podman exec -it authelia sh
# Restart a single service
sudo systemctl restart podman-gitea.service
# Check why a service failed to start
systemctl status podman-openldap.service
journalctl -u podman-openldap.service --since "5 min ago"
```
---
## Outstanding TODOs
These items are known gaps that need to be addressed before the setup is
production-ready:
- [ ] **`caddy.nix` — fix `vendorHash`**: The Caddy build with the Cloudflare
DNS plugin uses `lib.fakeHash` as a placeholder. After the first `nix build`,
replace it with the hash Nix reports in the error message.
- [ ] **`monitoring.nix` — Grafana dashboard hash**: The Node Exporter Full
dashboard `fetchurl` hash is a placeholder. Run:
```bash
nix store prefetch-file --hash-type sha256 \
https://grafana.com/api/dashboards/1860/revisions/37/download
```
and replace the hash in `modules/monitoring.nix`.
- [ ] **`secrets/secrets.yaml` — populate and encrypt**: Fill in all secret
values, then run `sops --encrypt --in-place secrets/secrets.yaml` before
committing. Secrets needed:
  - From old k8s deployment: openldap passwords, gitea/nextcloud passwords
  - Fresh: authelia JWT/session/encryption keys, gitea JWT tokens
  - New services: `uptime-kuma/admin_password`, `ntfy/admin_password`,
    `grafana/secret_key`, `ntfy/web_push_private_key`
  - Backup: `restic/s3_access_key_id`, `restic/s3_secret_access_key`
  - WiFi: `wifi/psk`
- [ ] **Cloudflare Tunnel**: Create the tunnel in the Zero Trust dashboard,
copy the tunnel token into secrets, and configure public hostnames for all
enabled services. See `modules/cloudflared.nix` for details.
- [ ] **Cloudflare Tunnel — add new services**: After the initial tunnel is set
up, add public hostnames for: `uptime`, `ntfy`, `grafana`, `mealie`,
`paperless`, `eurovision-vote`.
- [ ] **Second machine**: When ready, add `hosts/pi-secondary/` and uncomment
the `pi-secondary` entry in `flake.nix`. Services communicating cross-machine
should reference the primary Pi's LAN IP instead of `127.0.0.1`.
- [ ] **Jellyfin and Transmission**: Both modules exist but are disabled.
Enable in `hosts/pi-main/default.nix` when ready:
```nix
homey.jellyfin.enable = true;
homey.transmission.enable = true;
```
- [ ] **Backup — offload script**: Write `scripts/offload-backup.sh` for
manually copying snapshots to a local disk. Uses `restic copy` to clone from
the S3 repo into a local restic repo. See `TODO.org` for design notes.
### After the Pi's First Boot
These items require the Pi to be built, flashed, and booted at least once.
- [ ] **`secrets/.sops.yaml` — add Pi age key**: After generating the age key
on the Pi (`age-keygen -o /var/lib/sops-nix/key.txt`), add the public key
to `.sops.yaml` alongside the existing PGP key, then run
`sops updatekeys secrets/secrets.yaml`.
- [ ] **`hosts/pi-main/hardware.nix` — verify SD card labels**: The file
assumes partition labels `NIXOS_SD` (root) and `FIRMWARE` (boot). Relabel
after flashing if they differ, or update the `fileSystems` entries.
- [ ] **Gitea LDAP auth**: After first start, configure LDAP authentication
in Gitea's admin panel (Admin → Authentication Sources → Add LDAP source).
Relevant settings:
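  - Host: `openldap` (the container name on the `homey` network; inside the
    Gitea container, `127.0.0.1` refers to the container itself), Port: `389`,
    Security: Unencrypted
  - Bind DN: `cn=readonly,dc=zakobar,dc=com`
  - User search base: `ou=users,dc=zakobar,dc=com`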
- [ ] **Nextcloud LDAP app**: After restoring the Nextcloud volume, verify
the "LDAP user and group backend" app is still enabled and configured
correctly (Admin → LDAP/AD Integration).
- [ ] **Ntfy VAPID keys**: Generate Web Push keys on the Pi:
```bash
sudo ntfy webpush keys
```
Set `homey.ntfy.webPushPublicKey` in `default.nix` and add the private key
to sops as `ntfy/web_push_private_key`.
- [ ] **Uptime Kuma monitors**: On first boot, `uptime-kuma-sync` will
automatically create all monitors declared via `homey.monitoring.monitors`.
Verify they appear correctly in the UI at `https://uptime.zakobar.com`.
- [ ] **Paperless admin token (iOS Shortcut)**: After first start, generate a
dedicated API token in the Paperless web UI (Profile → API Auth Token) for
the iOS Shortcut upload flow. The `/api/documents/post_document/` path
bypasses Authelia — the token is the only auth.