#+title: Homey
A home environment for everyone!
* NixOS Deployment (active branch: nixos-port)
** Prerequisites
Before building, make sure the following are set in the repo:
- =hosts/pi-main/default.nix= — SSH public key, static IP, WiFi SSID
- =secrets/secrets.yaml= — all secrets populated and sops-encrypted
- WiFi password secret formatted as =wifi_psk=YourPassword= (see below)
** Adding / updating secrets
#+begin_src bash
sops secrets/secrets.yaml
#+end_src
Opens your editor with the decrypted file. Save and quit to re-encrypt.
The WiFi password entry must use the =wifi_psk== prefix so wpa_supplicant
can look up the value by name:
#+begin_src yaml
wifi/psk: "wifi_psk=YourActualWifiPassword"
#+end_src
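A missing prefix would only surface later as a failed WiFi association, so it can help to check the value before saving it. The sketch below is one way to do that. =check_wifi_secret= is a hypothetical helper that is not part of this repo, and the 8 to 63 character bound is the usual WPA2 passphrase length, an assumption about your network rather than something this repo enforces.

#+begin_src bash
# Hypothetical helper: validate a WiFi secret value before putting it in sops.
# Returns 0 if the value looks like "wifi_psk=<passphrase>", 1 otherwise.
check_wifi_secret() {
  value=$1
  case "$value" in
    wifi_psk=*) ;;                              # required prefix is present
    *) echo "missing wifi_psk= prefix" >&2; return 1 ;;
  esac
  psk=${value#wifi_psk=}                        # strip the prefix
  # Assumption: WPA2 passphrases are 8 to 63 characters long.
  if [ "${#psk}" -lt 8 ] || [ "${#psk}" -gt 63 ]; then
    echo "passphrase must be 8-63 characters" >&2
    return 1
  fi
  return 0
}

# Example: check_wifi_secret "wifi_psk=YourActualWifiPassword" && echo ok
#+end_src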
** Phase 1 — Bootstrap image (flash this first)
The full =pi-main= config requires sops secrets, which require an age key
on the Pi — but the age key doesn't exist until after first boot. To
break the chicken-and-egg problem, flash a minimal bootstrap image first.
Before building, fill in the WiFi password in =flake.nix= in the
=pi-main-bootstrap= config (search for =WIFI_PASSWORD_HERE=):
#+begin_src nix
networks."Zakobar".psk = "your-actual-wifi-password";
#+end_src
Build the bootstrap SD image (requires =aarch64-linux= build capability —
either =boot.binfmt.emulatedSystems = ["aarch64-linux"]= on your
workstation, or an aarch64 remote builder):
#+begin_src bash
nix build .#nixosConfigurations.pi-main-bootstrap.config.system.build.sdImage \
  --system aarch64-linux
#+end_src
Find your SD card device, then flash (double-check =/dev/sdX=!):
#+begin_src bash
lsblk
zstdcat result/sd-image/nixos-sd-image-*.img.zst | \
  sudo dd of=/dev/sdX bs=4M status=progress conv=fsync
#+end_src
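Writing to the wrong device destroys its contents, so a guard in front of =dd= may be worthwhile. The following sketch assumes =lsblk= from util-linux is available on the workstation. =is_safe_flash_target= is a hypothetical name, not something this repo provides.

#+begin_src bash
# Hypothetical guard: succeed only if $1 is an unmounted whole disk.
is_safe_flash_target() {
  dev=$1
  # Must be an existing block device.
  [ -b "$dev" ] || { echo "$dev: not a block device" >&2; return 1; }
  # Must be a whole disk, not a partition such as /dev/sdX1.
  [ "$(lsblk -dno TYPE "$dev")" = "disk" ] || {
    echo "$dev: not a whole disk" >&2
    return 1
  }
  # Neither the disk nor any of its partitions may be mounted.
  if lsblk -no MOUNTPOINT "$dev" | grep -q .; then
    echo "$dev: has mounted partitions" >&2
    return 1
  fi
}

# Example:
#   is_safe_flash_target /dev/sdX && \
#     zstdcat result/sd-image/nixos-sd-image-*.img.zst | \
#     sudo dd of=/dev/sdX bs=4M status=progress conv=fsync
#+end_src

The guard does not know which disk is the SD card; it only rules out partitions and devices that are currently mounted.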
The Pi will boot at =192.168.1.100=, connect to =Zakobar= WiFi, and accept
SSH connections with your key. No services run yet.
** Phase 2 — Generate age key and re-encrypt secrets
#+begin_src bash
# SSH into the Pi
ssh admin@192.168.1.100
# Generate the age key
sudo age-keygen -o /var/lib/sops-nix/key.txt
# Print the public key — copy it
sudo age-keygen -y /var/lib/sops-nix/key.txt
#+end_src
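A copy-paste slip in the public key only shows up later, when the Pi cannot decrypt its secrets. The sketch below is a loose sanity check to run on the workstation before editing =secrets/.sops.yaml=. =looks_like_age_pubkey= is a hypothetical helper, and the expected shape (=age1= followed by 58 lowercase letters or digits) is an assumption about X25519 age keys. It does not verify the Bech32 checksum.

#+begin_src bash
# Hypothetical helper: loose format check for an age public key.
# Assumption: keys are "age1" followed by 58 lowercase letters/digits.
looks_like_age_pubkey() {
  printf '%s\n' "$1" | grep -Eq '^age1[a-z0-9]{58}$'
}

# Example:
#   key=$(ssh admin@192.168.1.100 'sudo age-keygen -y /var/lib/sops-nix/key.txt')
#   looks_like_age_pubkey "$key" || echo "unexpected key format: $key" >&2
#+end_src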
Back on your workstation, add the public key to =secrets/.sops.yaml=
alongside the existing PGP key:
#+begin_src yaml
keys:
  - &pi_main age1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
creation_rules:
  - path_regex: secrets/secrets.yaml$
    key_groups:
      - pgp:
          - 076AA297579A0064
        age:
          - *pi_main
#+end_src
Then re-encrypt so the Pi can decrypt its own secrets:
#+begin_src bash
sops updatekeys secrets/secrets.yaml
#+end_src
** Phase 3 — Deploy the full config
#+begin_src bash
nixos-rebuild switch \
  --flake .#pi-main \
  --target-host admin@192.168.1.100 \
  --build-host admin@192.168.1.100 \
  --use-remote-sudo
#+end_src
The Pi builds its own config natively (no cross-compilation). sops-nix
will now decrypt all secrets and start all services.
Alternatively, run the deploy command:
#+begin_src bash
homey-deploy-rpi-main
#+end_src
** Ongoing deploys from workstation
All future config changes follow the same pattern:
1. Edit files on workstation
2. Run:
#+begin_src bash
homey-deploy-rpi-main
#+end_src
NixOS activates the new config on the Pi immediately, with an automatic
rollback if activation fails.
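The contents of =homey-deploy-rpi-main= are not shown in this file. Purely as an illustration, a wrapper around the Phase 3 command could look like the sketch below. =deploy_pi_main= and the =DRY_RUN= switch are hypothetical names, and the real command may behave differently.

#+begin_src bash
# Hypothetical wrapper around the Phase 3 command; the real
# homey-deploy-rpi-main may differ.
# Set DRY_RUN=1 to print the command instead of running it.
deploy_pi_main() {
  target=${1:-admin@192.168.1.100}
  # Rebuild the argument list as the full nixos-rebuild invocation.
  set -- nixos-rebuild switch \
    --flake .#pi-main \
    --target-host "$target" \
    --build-host "$target" \
    --use-remote-sudo
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Example: DRY_RUN=1 deploy_pi_main
#+end_src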
* Post-deploy setup
Some services require manual one-time configuration after the first deploy.
** Ntfy — push notifications
Ntfy's admin user is created automatically from sops on first start.
*** Step 1 — Generate VAPID keys (Web Push)
Run on the Pi *before* the first full deploy:
#+begin_src bash
ssh admin@192.168.1.100 'sudo ntfy webpush keys'
#+end_src
This prints a public key and a private key.
- Copy the *public key* into =hosts/pi-main/default.nix=:
#+begin_src nix
homey.ntfy.webPushPublicKey = "<public-key>";
homey.ntfy.webPushEmail = "mailto:you@zakobar.com";
#+end_src
- Add the *private key* to sops:
#+begin_src bash
sops secrets/secrets.yaml
# add: ntfy/web_push_private_key: <private-key>
#+end_src
The private key is injected at boot and never lands in the nix store.
*** Step 2 — Subscribe via Safari PWA (recommended for iOS)
1. Visit =https://ntfy.zakobar.com= in Safari and log in with the admin
password (=ntfy/admin_password= in =secrets/secrets.yaml=).
2. Go to *Account → Access Tokens → Create token* — give it a name and
copy the value.
3. Log in with the token, then tap *Share → Add to Home Screen*.
4. Open the app from the Home Screen (must be launched from there, not
Safari, to get push permission).
5. Subscribe to the =alerts= topic and grant notification permission when
prompted.
Web Push via the PWA uses Apple's APNs directly and is more reliable on
iOS than the native ntfy app's upstream relay.
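To confirm the subscription works, a test message can be published to the =alerts= topic from the workstation. The sketch below assumes =curl= is installed and that the access token from step 2 is exported as =NTFY_TOKEN=. Both that variable name and =ntfy_publish= are hypothetical.

#+begin_src bash
# Hypothetical helper: publish a message to a topic on ntfy.zakobar.com.
# Assumes NTFY_TOKEN holds the access token created in the web UI.
ntfy_publish() {
  topic=$1
  message=$2
  [ -n "${NTFY_TOKEN:-}" ] || { echo "NTFY_TOKEN is not set" >&2; return 1; }
  # --fail turns HTTP errors (for example 401) into a non-zero exit status.
  curl --fail --silent --show-error \
    -H "Authorization: Bearer $NTFY_TOKEN" \
    -H "Title: Homey test" \
    -d "$message" \
    "https://ntfy.zakobar.com/$topic"
}

# Example: NTFY_TOKEN=<token> ntfy_publish alerts "hello from the workstation"
#+end_src

If the request succeeds but nothing appears on the phone, the PWA's notification permission is the first thing to check.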
** Uptime Kuma — notifications (two-deploy process)
Uptime Kuma monitors are created automatically by the sync script on first
deploy, but notification channels must be configured in the UI before they
can be attached to monitors. This requires two deploys:
*Deploy 1* — services are up, monitors exist, but no notifications assigned yet.
Then, in the Uptime Kuma UI (=https://uptime.zakobar.com=):
1. Go to *Settings → Notifications → Add Notification*.
2. Choose *ntfy* as the type and fill in:
- *Server URL*: =https://ntfy.zakobar.com=
- *Topic*: =alerts=
- *Token*: use the admin token (or create a dedicated one in ntfy)
3. Save — you do *not* need to manually assign it to any monitor.
*Deploy 2* — run =homey-deploy-rpi-main= again. The sync script will detect
the newly configured notification channel and attach it to every monitor
automatically.
Any notifications added to Uptime Kuma in the future will also be picked up
on the next deploy.
* Backing up
Backups use [[https://restic.net/][restic]] and run automatically via systemd on a daily schedule.
** Strategy — two tiers
1. *Primary (automatic)*: Daily backup to an S3-compatible bucket (Backblaze B2,
Wasabi, AWS S3, etc.). Restic deduplicates and encrypts before upload.
Retention: 7 daily, 4 weekly, 6 monthly snapshots.
2. *Offload (manual, planned)*: =scripts/offload-backup.sh --target /path/to/disk=
is intended to clone snapshots from the S3 repo onto a local disk (USB plugged
into the Pi, or a disk on your workstation) using =restic copy=, so deduplication
is preserved on the target. The script does not exist yet (see Key risks), so S3
is currently the only working backup tier.
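The retention policy maps onto restic's =forget= flags. How this repo passes them to the systemd job is not shown in this file, so the helper names below are hypothetical. The sketch only previews what the policy would remove.

#+begin_src bash
# Flags matching the stated retention: 7 daily, 4 weekly, 6 monthly.
retention_flags() {
  echo "--keep-daily 7 --keep-weekly 4 --keep-monthly 6"
}

# Preview which snapshots the policy would remove, without deleting anything.
# Assumes RESTIC_REPOSITORY, RESTIC_PASSWORD and the AWS_* variables are set.
preview_retention() {
  # Word splitting of the flags is intended here.
  restic forget --dry-run $(retention_flags)
}
#+end_src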
** What is backed up
All service data under =/mnt/data/=:
- =openldap/= — LDAP database and config
- =authelia/= — Authelia config and state
- =gitea/= — Gitea repositories and data
- =nextcloud/= — Nextcloud files + a =pg_dump= of the database
- =jellyfin/= — Jellyfin metadata (media files are excluded — re-downloadable)
- =transmission/= — Torrent client config
Before each backup, Nextcloud is placed into maintenance mode and its Postgres
database is dumped with =pg_dump=, so the snapshot is consistent.
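The ordering matters: maintenance mode goes on first, the dump and the backup run while it is on, and it is switched off again even if a step fails. The sketch below illustrates that sequence only. It is not the job this repo ships. The container names, database user, database name and dump path are taken from other sections of this file, while =run=, =occ_maintenance= and =backup_with_maintenance= are hypothetical names.

#+begin_src bash
DUMP_FILE=/mnt/data/nextcloud/db-dump/nextcloud.sql

# With DRY_RUN=1, print commands instead of executing them.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# Switch Nextcloud maintenance mode "on" or "off".
occ_maintenance() {
  run sudo podman exec -u www-data nextcloud php occ maintenance:mode "--$1"
}

# Run the given backup command between maintenance on/off, after a pg_dump.
backup_with_maintenance() {
  occ_maintenance on || return 1
  run sudo sh -c "podman exec nextcloud-postgres pg_dump -U postgres nextcloud_db > $DUMP_FILE" &&
    run "$@"
  status=$?
  occ_maintenance off            # always leave maintenance mode again
  return "$status"
}

# Example: DRY_RUN=1 backup_with_maintenance restic backup /mnt/data
#+end_src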
** First-time setup — initialize the repository
Restic requires a one-time =init= before the first backup can run. The
automated job will fail with "repository does not exist" until this is done.
Run on the Pi after the first deploy:
#+begin_src bash
# Note: use single quotes around the remote script to prevent local shell expansion
ssh admin@192.168.1.100 'sudo bash -c '"'"'
export AWS_ACCESS_KEY_ID=$(cat /run/secrets/restic/s3_access_key_id)
export AWS_SECRET_ACCESS_KEY=$(cat /run/secrets/restic/s3_secret_access_key)
export RESTIC_PASSWORD=$(cat /run/secrets/restic/password)
restic -r s3:https://s3.us-east-005.backblazeb2.com/zakobar-home-backup init
'"'"''
#+end_src
You only need to do this once. After =init= succeeds, the daily timer will
run normally. To trigger a backup immediately without waiting for 03:00:
#+begin_src bash
ssh admin@192.168.1.100 "sudo systemctl start restic-backups-homey.service"
#+end_src
** Configuration
Repository URL and credentials are set per-host:
#+begin_src nix
# hosts/pi-main/default.nix
homey.backup.repository = "s3:https://s3.us-west-002.backblazeb2.com/your-bucket";
#+end_src
S3 credentials live in =secrets/secrets.yaml= as =restic/s3_access_key_id= and
=restic/s3_secret_access_key=.
** Restore
#+begin_src bash
# List snapshots
restic -r s3:https://... snapshots
# Restore the latest snapshot. Snapshots store absolute paths (/mnt/data/...),
# so --target / puts the files back under /mnt/data.
restic -r s3:https://... restore latest --target /
# Restore a single service
restic -r s3:https://... restore latest --target / --include /mnt/data/gitea
#+end_src
* Disaster Recovery
Full recovery from total host failure (dead Pi, dead SD card), assuming this
git repo and your workstation PGP key (=076AA297579A0064=) survive.
** Step 1 — Flash and boot a new Pi
Follow Phase 1 above to build and flash a fresh bootstrap image, then SSH in.
** Step 2 — Regenerate the age key and re-encrypt secrets
The old Pi's age key is gone with the dead machine. Your workstation PGP key
is the fallback and can still decrypt =secrets/secrets.yaml=.
On the Pi:
#+begin_src bash
sudo age-keygen -o /var/lib/sops-nix/key.txt
sudo age-keygen -y /var/lib/sops-nix/key.txt # copy this public key
#+end_src
On the workstation — replace the old age key in =secrets/.sops.yaml= with the
new public key, then re-encrypt:
#+begin_src bash
sops updatekeys secrets/secrets.yaml
git add secrets/.sops.yaml secrets/secrets.yaml
git commit -m "replace Pi age key after host failure"
#+end_src
** Step 3 — Deploy the full NixOS config
#+begin_src bash
nixos-rebuild switch \
  --flake .#pi-main \
  --target-host admin@192.168.1.100 \
  --build-host admin@192.168.1.100 \
  --use-remote-sudo
#+end_src
This brings up the OS and mounts =/mnt/data=. Services will fail to start
until data is restored — that is expected.
** Step 4 — Restore data from restic
Credentials are in =secrets/secrets.yaml= (=restic/password=,
=restic/s3_access_key_id=, =restic/s3_secret_access_key=).
#+begin_src bash
ssh admin@192.168.1.100
export RESTIC_REPOSITORY="s3:https://s3.us-east-005.backblazeb2.com/zakobar-home-backup"
export RESTIC_PASSWORD="..." # restic/password from secrets
export AWS_ACCESS_KEY_ID="..." # restic/s3_access_key_id
export AWS_SECRET_ACCESS_KEY="..." # restic/s3_secret_access_key
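restic snapshots # verify repo is reachable
# sudo -E keeps the exported variables. Snapshot paths are absolute
# (/mnt/data/...), so --target / restores into /mnt/data.
sudo -E restic restore latest --target /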
#+end_src
If restoring from a USB offload disk instead of S3:
#+begin_src bash
# RESTIC_PASSWORD must be exported; snapshot paths are absolute, hence --target /
sudo -E restic -r /mnt/usb/homey-backup restore latest --target /
#+end_src
** Step 5 — Restore the Nextcloud database
The raw Postgres data dir is excluded from restic; only the =pg_dump= SQL file
is backed up. After the data restore you will have
=/mnt/data/nextcloud/db-dump/nextcloud.sql= but an empty database. Import it:
#+begin_src bash
sudo systemctl start podman-nextcloud-postgres
# Wait ~10 s for Postgres to be ready, then (containers live in root's podman
# context, so sudo is required):
sudo podman exec -i nextcloud-postgres \
  psql -U postgres nextcloud_db \
  < /mnt/data/nextcloud/db-dump/nextcloud.sql
#+end_src
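Instead of a fixed wait, the import can be gated on Postgres actually accepting connections. The sketch below defines a small generic retry helper. =retry= is a hypothetical name, and the usage example assumes the Postgres image ships =pg_isready=, which is not confirmed by this file.

#+begin_src bash
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example (assumes pg_isready exists inside the container):
#   retry 30 1 sudo podman exec nextcloud-postgres pg_isready -U postgres
#+end_src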
** Step 6 — Start services and verify
#+begin_src bash
sudo systemctl start podman-openldap podman-authelia podman-gitea podman-nextcloud
#+end_src
Manual checks after restart:
- *Gitea*: Admin → Authentication Sources — verify the LDAP source is present.
It lives in Gitea's database (restored from restic) so it should survive
automatically. Confirm by logging in with an LDAP user.
- *Nextcloud*: Admin → LDAP/AD Integration — confirm the LDAP app is still
configured. If not, re-enter the settings from the LDAP Configuration
section of this file.
** Key risks
| Risk | Consequence |
|------+-------------|
| External HD also fails | Restore all data from restic — Nextcloud files may be large |
| Workstation PGP key lost | Cannot decrypt =secrets/secrets.yaml= — passwords must be reset manually per service |
| USB offload not yet implemented | =scripts/offload-backup.sh= does not exist yet; S3 is the only working backup tier |
* Running commands in containers
All services run as podman containers. Use =podman exec= to run commands
inside them.
** General pattern
Containers are started by systemd as root, so they live in root's podman
context. All =podman= commands must be run with =sudo=.
#+begin_src bash
# List running containers
sudo podman ps
# Run a command in a container
sudo podman exec <container-name> <command>
# Run as a specific user
sudo podman exec -u <user> <container-name> <command>
# Interactive shell
sudo podman exec -it <container-name> sh
#+end_src
Container names match the service: =openldap=, =authelia=, =gitea=,
=nextcloud=, =nextcloud-postgres=, =jellyfin=, =transmission=.
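After a deploy or restore, it can be useful to see at a glance which of these containers is not running. The sketch below compares the names listed above against the output of =podman ps=. =missing_containers= is a hypothetical helper, not part of this repo.

#+begin_src bash
# Expected container names, as listed above.
EXPECTED="openldap authelia gitea nextcloud nextcloud-postgres jellyfin transmission"

# Read running container names on stdin (one per line) and print every
# expected name that is not among them.
missing_containers() {
  running=$(cat)
  for name in $EXPECTED; do
    # -x matches whole lines, so "nextcloud" does not match "nextcloud-postgres".
    printf '%s\n' "$running" | grep -qxF -- "$name" || echo "$name"
  done
}

# Example: sudo podman ps --format '{{.Names}}' | missing_containers
#+end_src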
** Nextcloud — running occ commands
=occ= must run as =www-data= inside the =nextcloud= container.
#+begin_src bash
# General form
sudo podman exec -u www-data nextcloud php occ <command>
# Examples
sudo podman exec -u www-data nextcloud php occ status
sudo podman exec -u www-data nextcloud php occ maintenance:mode --off
sudo podman exec -u www-data nextcloud php occ preview:generate-all -vvv
sudo podman exec -u www-data nextcloud php occ ldap:promote-group "admins"
#+end_src
Running without =-u www-data= will create files owned by root inside the
container, which breaks Nextcloud's file access.
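One way to avoid forgetting =-u www-data= is a small shell function on the Pi that always adds it. This is only a sketch. The =occ= function name and the =DRY_RUN= switch are hypothetical and not provided by this repo.

#+begin_src bash
# Hypothetical wrapper: always run occ as www-data in the nextcloud container.
# Set DRY_RUN=1 to print the command instead of running it.
occ() {
  set -- sudo podman exec -u www-data nextcloud php occ "$@"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Examples:
#   occ status
#   occ maintenance:mode --off
#+end_src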