#+title: Homey
A home environment for everyone!
* NixOS Deployment (active branch: nixos-port)
** Prerequisites
Before building, make sure the following are set in the repo:
- =hosts/pi-main/default.nix= — SSH public key, static IP, WiFi SSID
- =secrets/secrets.yaml= — all secrets populated and sops-encrypted
- WiFi password secret formatted as =wifi_psk=YourPassword= (see below)
** Adding / updating secrets
#+begin_src bash
sops secrets/secrets.yaml
#+end_src
Opens your editor with the decrypted file. Save and quit to re-encrypt.
The WiFi password entry must use the =wifi_psk== prefix so wpa_supplicant
can look up the value by name:
#+begin_src yaml
# sops-nix resolves the secret name wifi/psk as the nested key wifi -> psk
wifi:
  psk: "wifi_psk=YourActualWifiPassword"
#+end_src
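Before pasting the value into sops, the entry format can be sanity-checked with a small helper. This is a minimal sketch, not something the repo ships: the function name =check_wifi_secret= is hypothetical, and the 8 to 63 character limit is the standard WPA2 passphrase length rather than a rule this repo enforces.
#+begin_src bash
# Hypothetical helper (not part of this repo): checks that a secret value
# has the wifi_psk=<passphrase> shape described above.
check_wifi_secret() {
  local value="$1"
  case "$value" in
    wifi_psk=*) ;;  # required prefix is present
    *) echo "missing wifi_psk= prefix" >&2; return 1 ;;
  esac
  local pass="${value#wifi_psk=}"  # strip the prefix
  local len="${#pass}"
  # WPA2 passphrases are 8 to 63 characters long.
  if [ "$len" -lt 8 ] || [ "$len" -gt 63 ]; then
    echo "passphrase must be 8-63 characters (got $len)" >&2
    return 1
  fi
  echo ok
}
#+end_src
For example, =check_wifi_secret "wifi_psk=YourActualWifiPassword"= prints =ok=, while a value without the prefix is rejected.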
** Phase 1 — Bootstrap image (flash this first)
The full =pi-main= config requires sops secrets, which require an age key
on the Pi — but the age key doesn't exist until after first boot. To
break the chicken-and-egg problem, flash a minimal bootstrap image first.
Before building, fill in the WiFi password in =flake.nix= in the
=pi-main-bootstrap= config (search for =WIFI_PASSWORD_HERE=):
#+begin_src nix
networks."Zakobar".psk = "your-actual-wifi-password";
#+end_src
Build the bootstrap SD image (requires =aarch64-linux= build capability —
either =boot.binfmt.emulatedSystems = ["aarch64-linux"]= on your
workstation, or an aarch64 remote builder):
#+begin_src bash
nix build .#nixosConfigurations.pi-main-bootstrap.config.system.build.sdImage \
  --system aarch64-linux
#+end_src
Find your SD card device, then flash (double-check =/dev/sdX=!):
#+begin_src bash
lsblk
zstdcat result/sd-image/nixos-sd-image-*.img.zst | \
  sudo dd of=/dev/sdX bs=4M status=progress conv=fsync
#+end_src
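Because =dd= overwrites whatever device it is given, a pre-flight check can catch the most common mistakes. The sketch below is an illustration only: =check_target= is a hypothetical name, and it assumes =lsblk= from util-linux is available on the workstation.
#+begin_src bash
# Hypothetical pre-flight check (not part of this repo): refuses targets that
# are not block devices or that still have something mounted.
check_target() {
  local dev="$1"
  if [ ! -b "$dev" ]; then
    echo "not a block device: $dev" >&2
    return 1
  fi
  # lsblk prints one mountpoint per line for the device and its partitions;
  # any non-empty line means something is still mounted.
  if lsblk -no MOUNTPOINT "$dev" | grep -q .; then
    echo "refusing: $dev has mounted filesystems" >&2
    return 1
  fi
  echo "ok: $dev"
}
#+end_src
It could be chained in front of the flash command, for example =check_target /dev/sdX && zstdcat ... | sudo dd of=/dev/sdX ...=.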
The Pi will boot at =192.168.1.100=, connect to =Zakobar= WiFi, and accept
SSH connections with your key. No services run yet.
** Phase 2 — Generate age key and re-encrypt secrets
#+begin_src bash
# SSH into the Pi
ssh admin@192.168.1.100
# Generate the age key
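# (create the directory first; mkdir -p is a no-op if it already exists)
sudo mkdir -p /var/lib/sops-nix
sudo age-keygen -o /var/lib/sops-nix/key.txt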
# Print the public key — copy it
sudo age-keygen -y /var/lib/sops-nix/key.txt
#+end_src
Back on your workstation, add the public key to =secrets/.sops.yaml=
alongside the existing PGP key:
#+begin_src yaml
keys:
  - &pi_main age1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
creation_rules:
  - path_regex: secrets/secrets.yaml$
    key_groups:
      - pgp:
          - 076AA297579A0064
        age:
          - *pi_main
Then re-encrypt so the Pi can decrypt its own secrets:
#+begin_src bash
sops updatekeys secrets/secrets.yaml
#+end_src
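After =updatekeys=, it is worth confirming that the Pi's key really is among the recipients before deploying. The following is a rough sketch: =has_age_recipient= is a hypothetical helper, and it relies on sops recording each age recipient as a =recipient: age1...= line in the file's metadata.
#+begin_src bash
# Hypothetical check (not part of this repo): succeeds if the sops-encrypted
# file lists the given age public key among its recipients.
has_age_recipient() {
  local file="$1" pubkey="$2"
  # -F matches the key literally; -q only sets the exit status.
  grep -qF -- "recipient: $pubkey" "$file"
}
#+end_src
Usage would look like =has_age_recipient secrets/secrets.yaml age1... && echo "Pi can decrypt"=, with the public key copied from the Pi.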
** Phase 3 — Deploy the full config
#+begin_src bash
nixos-rebuild switch \
  --flake .#pi-main \
  --target-host admin@192.168.1.100 \
  --build-host admin@192.168.1.100 \
  --use-remote-sudo
#+end_src
The Pi builds its own config natively (no cross-compilation). sops-nix
will now decrypt all secrets and start all services.
You can also use the command:
#+begin_src bash
homey-deploy-rpi-main
#+end_src
** Ongoing deploys from workstation
All future config changes follow the same pattern:
1. Edit files on workstation
2. Run:
#+begin_src bash
homey-deploy-rpi-main
#+end_src
NixOS activates the new config on the Pi immediately, with an automatic
rollback if activation fails.
* Backing up
Backups use [[https://restic.net/][restic]] and run automatically via systemd on a daily schedule.
** Strategy — two tiers
1. *Primary (automatic)*: Daily backup to an S3-compatible bucket (Backblaze B2,
   Wasabi, AWS S3, etc.). Restic deduplicates and encrypts before upload.
   Retention: 7 daily, 4 weekly, 6 monthly snapshots.
2. *Offload (manual, planned)*: =scripts/offload-backup.sh --target /path/to/disk=
   is intended to clone snapshots from the S3 repo onto a local disk (USB plugged
   into the Pi, or a disk on your workstation) using =restic copy=, so
   deduplication is preserved on the target. The script does not exist yet (see
   Key risks under Disaster Recovery); until it does, S3 is the only working tier.
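On the restic command line, the retention policy above corresponds to the =forget= flags shown below. How this repo actually wires retention into the systemd job is not shown here, so treat this as a sketch of the equivalent CLI call; =retention_args= is a hypothetical helper name.
#+begin_src bash
# Hypothetical helper: prints the restic arguments for a daily/weekly/monthly
# retention policy. It only builds the argument string; nothing is deleted.
retention_args() {
  local daily="$1" weekly="$2" monthly="$3"
  printf 'forget --keep-daily %s --keep-weekly %s --keep-monthly %s --prune\n' \
    "$daily" "$weekly" "$monthly"
}
#+end_src
With the repository and password supplied through restic's environment variables, =restic $(retention_args 7 4 6)= would apply the policy by hand.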
** What is backed up
All service data under =/mnt/data/=:
- =openldap/= — LDAP database and config
- =authelia/= — Authelia config and state
- =gitea/= — Gitea repositories and data
- =nextcloud/= — Nextcloud files + a =pg_dump= of the database
- =jellyfin/= — Jellyfin metadata (media files are excluded — re-downloadable)
- =transmission/= — Torrent client config
Before each backup, Nextcloud is put into maintenance mode and its Postgres
database is dumped with =pg_dump=, so that files and database are captured in a
consistent state.
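The freeze-and-dump sequence might look roughly like the sketch below. The repo's real backup hook is not shown in this file, so every function name here (=run=, =maintenance=, =dump_db=, =with_nextcloud_frozen=) and the =DRY_RUN= switch are hypothetical; the container names, database user and database name are taken from the restore steps later in this file.
#+begin_src bash
DRY_RUN="${DRY_RUN:-1}"  # default: print commands instead of running them

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

maintenance() {  # $1 is --on or --off
  # occ output goes to stderr so it never ends up inside a redirected dump
  run sudo podman exec -u www-data nextcloud php occ maintenance:mode "$1" >&2
}

dump_db() {
  # Writes the SQL dump to stdout; the caller decides where it goes.
  run sudo podman exec nextcloud-postgres pg_dump -U postgres nextcloud_db
}

with_nextcloud_frozen() {
  local status=0
  maintenance --on || return 1
  "$@" || status=$?  # run the wrapped command and remember its result
  maintenance --off  # always leave maintenance mode again
  return "$status"
}
#+end_src
A real run would be along the lines of =DRY_RUN=0 with_nextcloud_frozen dump_db > /mnt/data/nextcloud/db-dump/nextcloud.sql=. Turning maintenance mode off even when the dump fails keeps a broken backup from leaving Nextcloud offline.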
** Configuration
Repository URL and credentials are set per-host:
#+begin_src nix
# hosts/pi-main/default.nix
homey.backup.repository = "s3:https://s3.us-west-002.backblazeb2.com/your-bucket";
#+end_src
S3 credentials live in =secrets/secrets.yaml= as =restic/s3_access_key_id= and
=restic/s3_secret_access_key=; the repository password is =restic/password=.
** Restore
#+begin_src bash
# List snapshots
restic -r s3:https://... snapshots
# Restore the latest snapshot. restic recreates the snapshot's full
# /mnt/data/... path under --target, so the target is / (not /mnt/data)
restic -r s3:https://... restore latest --target /
# Restore a single service
restic -r s3:https://... restore latest --target / --include /mnt/data/gitea
#+end_src
* Disaster Recovery
Full recovery from total host failure (dead Pi, dead SD card), assuming this
git repo and your workstation PGP key (=076AA297579A0064=) survive.
** Step 1 — Flash and boot a new Pi
Follow Phase 1 above to build and flash a fresh bootstrap image, then SSH in.
** Step 2 — Regenerate the age key and re-encrypt secrets
The old Pi's age key is gone with the dead machine. Your workstation PGP key
is the fallback and can still decrypt =secrets/secrets.yaml=.
On the Pi:
#+begin_src bash
sudo mkdir -p /var/lib/sops-nix   # no-op if the directory already exists
sudo age-keygen -o /var/lib/sops-nix/key.txt
sudo age-keygen -y /var/lib/sops-nix/key.txt # copy this public key
#+end_src
On the workstation — replace the old age key in =secrets/.sops.yaml= with the
new public key, then re-encrypt:
#+begin_src bash
sops updatekeys secrets/secrets.yaml
git add secrets/.sops.yaml secrets/secrets.yaml
git commit -m "replace Pi age key after host failure"
#+end_src
** Step 3 — Deploy the full NixOS config
#+begin_src bash
nixos-rebuild switch \
  --flake .#pi-main \
  --target-host admin@192.168.1.100 \
  --build-host admin@192.168.1.100 \
  --use-remote-sudo
#+end_src
This brings up the OS and mounts =/mnt/data=. Services will fail to start
until data is restored — that is expected.
** Step 4 — Restore data from restic
Credentials are in =secrets/secrets.yaml= (=restic/password=,
=restic/s3_access_key_id=, =restic/s3_secret_access_key=).
#+begin_src bash
ssh admin@192.168.1.100
export RESTIC_REPOSITORY="s3:https://s3.us-east-005.backblazeb2.com/zakobar-home-backup"
export RESTIC_PASSWORD="..." # restic/password from secrets
export AWS_ACCESS_KEY_ID="..." # restic/s3_access_key_id
export AWS_SECRET_ACCESS_KEY="..." # restic/s3_secret_access_key
restic snapshots # verify repo is reachable
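# -E keeps the exported variables (plain sudo drops them). restic recreates the
# snapshot's full /mnt/data/... path under --target, so the target is /
sudo -E restic restore latest --target /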
#+end_src
If restoring from a USB offload disk instead of S3:
#+begin_src bash
# Target is / because restic recreates the full /mnt/data/... path under it
sudo -E restic -r /mnt/usb/homey-backup restore latest --target /
#+end_src
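The ="..."= placeholders in the S3 commands above have to be filled in by hand. One way to read the values on the workstation is =sops --extract=, which takes a bracketed path expression. The helper below is a sketch under two assumptions: =sops_extract_expr= is a hypothetical name, and the secrets are stored as nested YAML keys (=restic= containing =password=, and so on).
#+begin_src bash
# Hypothetical helper: turns a secret path such as restic/password into the
# ["restic"]["password"] expression that sops --extract expects.
sops_extract_expr() {
  # 1) replace each / with "]["  2) prepend ["  3) append "]
  printf '%s\n' "$1" | sed 's|/|"]["|g; s|^|["|; s|$|"]|'
}
#+end_src
On the workstation, =sops -d --extract "$(sops_extract_expr restic/password)" secrets/secrets.yaml= would then print the repository password, which can be pasted into the =export= lines on the Pi.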
** Step 5 — Restore the Nextcloud database
The raw Postgres data dir is excluded from restic; only the =pg_dump= SQL file
is backed up. After the data restore you will have
=/mnt/data/nextcloud/db-dump/nextcloud.sql= but an empty database. Import it:
#+begin_src bash
sudo systemctl start podman-nextcloud-postgres
# Wait ~10 s for Postgres to be ready, then:
# (sudo is required: the container lives in root's podman context)
sudo podman exec -i nextcloud-postgres \
  psql -U postgres nextcloud_db \
  < /mnt/data/nextcloud/db-dump/nextcloud.sql
#+end_src
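Instead of waiting a fixed ten seconds, the import can be gated on Postgres actually accepting connections. This is a minimal sketch: =wait_until= is a hypothetical helper, and the usage example assumes the container image ships =pg_isready=, as the official Postgres images do.
#+begin_src bash
# Hypothetical retry helper (not part of this repo): runs a command once per
# second until it succeeds or the attempt budget is used up.
wait_until() {
  local attempts="$1"
  shift
  while [ "$attempts" -gt 0 ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempts=$((attempts - 1))
    if [ "$attempts" -gt 0 ]; then
      sleep 1
    fi
  done
  return 1
}
#+end_src
For example, =wait_until 30 sudo podman exec nextcloud-postgres pg_isready -U postgres= returns as soon as the database is ready, or fails after roughly 30 seconds.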
** Step 6 — Start services and verify
#+begin_src bash
sudo systemctl start podman-openldap podman-authelia podman-gitea podman-nextcloud
#+end_src
Manual checks after restart:
- *Gitea*: Admin → Authentication Sources — verify the LDAP source is present.
It lives in Gitea's database (restored from restic) so it should survive
automatically. Confirm by logging in with an LDAP user.
- *Nextcloud*: Admin → LDAP/AD Integration — confirm the LDAP app is still
configured. If not, re-enter the settings from the LDAP Configuration
section of this file.
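Before the manual checks, a quick loop over the units can confirm that everything started. The sketch below is illustrative: =check_units= and the =IS_ACTIVE= override are hypothetical names, and the default check is =systemctl is-active --quiet=.
#+begin_src bash
# Hypothetical helper: reports every unit that is not active and returns
# non-zero if any were found. IS_ACTIVE can be overridden for testing.
check_units() {
  local failed=0 unit
  for unit in "$@"; do
    if ! ${IS_ACTIVE:-systemctl is-active --quiet} "$unit"; then
      echo "not active: $unit" >&2
      failed=1
    fi
  done
  return "$failed"
}
#+end_src
On the Pi this would be called as =check_units podman-openldap podman-authelia podman-gitea podman-nextcloud=.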
** Key risks
| Risk | Consequence |
|------+-------------|
| External HD also fails | Restore all data from restic — Nextcloud files may be large |
| Workstation PGP key lost | Cannot decrypt =secrets/secrets.yaml= — passwords must be reset manually per service |
| USB offload not yet implemented | =scripts/offload-backup.sh= does not exist yet; S3 is the only working backup tier |
* Running commands in containers
All services run as podman containers. Use =podman exec= to run commands
inside them.
** General pattern
Containers are started by systemd as root, so they live in root's podman
context. All =podman= commands must be run with =sudo=.
#+begin_src bash
# List running containers
sudo podman ps
# Run a command in a container
sudo podman exec <container-name> <command>
# Run as a specific user
sudo podman exec -u <user> <container-name> <command>
# Interactive shell
sudo podman exec -it <container-name> sh
#+end_src
Container names match the service: =openldap=, =authelia=, =gitea=,
=nextcloud=, =nextcloud-postgres=, =jellyfin=, =transmission=.
** Nextcloud — running occ commands
=occ= must run as =www-data= inside the =nextcloud= container.
#+begin_src bash
# General form
sudo podman exec -u www-data nextcloud php occ <command>
# Examples
sudo podman exec -u www-data nextcloud php occ status
sudo podman exec -u www-data nextcloud php occ maintenance:mode --off
sudo podman exec -u www-data nextcloud php occ preview:generate-all -vvv
sudo podman exec -u www-data nextcloud php occ ldap:promote-group "admins"
#+end_src
Running without =-u www-data= will create files owned by root inside the
container, which breaks Nextcloud's file access.
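To make the =www-data= requirement hard to forget, the general form can be wrapped in a shell function. This is a convenience sketch, not something the repo provides: the function name =occ= and the =PODMAN= override (used only to make the wrapper testable) are assumptions.
#+begin_src bash
# Hypothetical wrapper: always runs occ as www-data inside the nextcloud
# container. PODMAN defaults to "sudo podman".
occ() {
  ${PODMAN:-sudo podman} exec -u www-data nextcloud php occ "$@"
}
#+end_src
With this defined in the shell on the Pi, =occ status= and =occ maintenance:mode --off= behave like the longer commands above.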