Working NixOS port: all core services operational

- Fix Caddy cfProxy helper for cloudflared http:// vhosts (X-Forwarded-Proto)
- Fix Authelia LDAP bind (readonly user ACL + password sync)
- Add gitea-admin-setup oneshot service to survive rebuilds
- Update Authelia forward_auth with header_up X-Forwarded-Proto https
- Update TODO.org with completed tasks and LDAP config details
- Remove old Helm/k8s artifacts (Chart.yaml, templates/, values/, scripts)
- Add result to .gitignore

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Aner Zakobar
2026-04-23 14:46:21 +03:00
parent 05619d12fc
commit 0b73d493d8
22 changed files with 1410 additions and 355 deletions
@@ -0,0 +1,321 @@
#+TITLE: Caddy, Cloudflare Tunnel & TLS Setup
#+DATE: 2026-04-23
#+AUTHOR: homey project
#+OPTIONS: toc:2 num:t
* Overview
This document describes the TLS and reverse-proxy architecture for the homey
self-hosted stack, the problems encountered while getting it working, and the
final configuration that resolved them. It is intended as a reference for
future debugging and for adding new services.
** Traffic flow
#+BEGIN_EXAMPLE
Browser
│ HTTPS (TLS terminated by Cloudflare edge, *.zakobar.com cert)
Cloudflare edge (anycast IP)
│ QUIC/HTTP2 tunnel (outbound from Pi, no open inbound ports)
cloudflared daemon on Pi (systemd: cloudflared-tunnel.service)
│ plain HTTP on loopback http://localhost:80
Caddy reverse proxy (systemd: caddy.service, port 80 + 443)
│ proxies to backend by Host header
Service container (podman, port on 127.0.0.1)
#+END_EXAMPLE
Key points:
- TLS to the browser is provided entirely by Cloudflare's Universal SSL cert
  (~*.zakobar.com~), not by the Pi's Let's Encrypt certs.
- The Pi's Let's Encrypt certs (issued per hostname via DNS-01) are used only
  for direct LAN access (bypassing the tunnel).
- The tunnel leg (cloudflared → Caddy) is plain HTTP on loopback. This is
  safe because both endpoints are on the same machine.
* Components
** Caddy (~modules/caddy.nix~)
Caddy runs as a NixOS service (~services.caddy~) using a custom build that
includes the ~caddy-dns/cloudflare~ plugin for DNS-01 ACME challenges.
*** Custom build
The nixpkgs ~caddy~ package does not include the Cloudflare DNS plugin by
default. It is built using the ~withPlugins~ passthru function (backed by
xcaddy):
#+BEGIN_SRC nix
caddyWithCloudflare = pkgs.caddy.withPlugins {
  plugins = [
    "github.com/caddy-dns/cloudflare@v0.2.4"
  ];
  hash = "sha256-...";
};
#+END_SRC
The ~hash~ is a fixed-output derivation hash that must be updated whenever
the plugin version changes. Use ~lib.fakeHash~ to trigger a build failure
that prints the correct hash, then substitute it.
*** API token injection
The Cloudflare API token is stored in sops (~cloudflare/api_token~) and
injected into the Caddy process via ~systemd LoadCredential~:
#+BEGIN_SRC nix
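# Fragment from inside systemd.services.caddy
serviceConfig = {
  # Expose the sops-managed secret to the unit as a systemd credential.
  LoadCredential =
    "cloudflare_api_token:${config.sops.secrets."cloudflare/api_token".path}";
  # The leading empty string clears the module's default ExecStart.
  ExecStart = lib.mkForce [
    ""
    (pkgs.writeShellScript "caddy-start" ''
      export CLOUDFLARE_API_TOKEN=$(cat "$CREDENTIALS_DIRECTORY/cloudflare_api_token")
      exec caddy run --environ --config /etc/caddy/caddy_config --adapter caddyfile
    '')
  ];
};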
#+END_SRC
*** Virtual hosts — dual HTTP/HTTPS entries
Each service has *two* Caddyfile vhost entries:
| Entry | Purpose |
|---|---|
| ~git.zakobar.com~ | HTTPS — for direct LAN access; Caddy handles TLS |
| ~http://git.zakobar.com~ | HTTP — for cloudflared on loopback; no redirect |
Caddy's default behaviour is to automatically redirect HTTP → HTTPS for any
hostname that has a matching HTTPS vhost. Explicitly defining an ~http://~
vhost suppresses that redirect, so requests arriving from cloudflared are
proxied to the backend instead of being answered with a redirect.
Without the ~http://~ vhost, access via the tunnel fails with
~ERR_TOO_MANY_REDIRECTS~ in the browser: Caddy answers each tunnelled request
with a 308 to ~https://~, the browser retries that URL through Cloudflare, and
the request reaches Caddy over plain HTTP again.
*** Global config
#+BEGIN_SRC caddyfile
{
    email admin@zakobar.com
    acme_dns cloudflare {env.CLOUDFLARE_API_TOKEN}
}
#+END_SRC
The ~acme_dns~ directive in the global block tells Caddy to use DNS-01
challenges for *all* HTTPS vhosts. This allows wildcard and multi-level
subdomain certs to be issued without any inbound port 80 requirement.
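As a rough sketch, the global block above could be expressed through the
NixOS ~services.caddy~ options as follows. This is an assumption about how
~modules/caddy.nix~ is wired, not a copy of it; ~lib.fakeHash~ is a
placeholder for the real plugin hash.
#+BEGIN_SRC nix
{ pkgs, lib, ... }:
let
  # Same custom build as in the "Custom build" section above.
  caddyWithCloudflare = pkgs.caddy.withPlugins {
    plugins = [ "github.com/caddy-dns/cloudflare@v0.2.4" ];
    hash = lib.fakeHash; # placeholder: substitute the hash printed by the build
  };
in
{
  services.caddy = {
    enable = true;
    package = caddyWithCloudflare;
    # Rendered into the global options block as "email ...".
    email = "admin@zakobar.com";
    # Extra lines for the global options block of the generated Caddyfile.
    globalConfig = ''
      acme_dns cloudflare {env.CLOUDFLARE_API_TOKEN}
    '';
  };
}
#+END_SRC
The ~{env.CLOUDFLARE_API_TOKEN}~ placeholder is resolved by Caddy at run
time, so the token itself never appears in the Nix store.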
** Cloudflare Tunnel (~modules/cloudflared.nix~)
cloudflared runs as a plain systemd service using the token-based tunnel
approach (~cloudflared tunnel run --token~). No local credentials file or
config file is needed — just the tunnel token from the Zero Trust dashboard.
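A minimal sketch of such a unit, assuming the tunnel token is stored in sops
under the hypothetical name ~cloudflare/tunnel_token~ and loaded the same way
as the Caddy API token. The actual ~modules/cloudflared.nix~ may differ.
#+BEGIN_SRC nix
{ config, pkgs, ... }:
{
  # Hypothetical secret name; adjust to match the real sops layout.
  sops.secrets."cloudflare/tunnel_token" = { };

  systemd.services.cloudflared-tunnel = {
    description = "Cloudflare Tunnel (token-based)";
    wantedBy = [ "multi-user.target" ];
    wants = [ "network-online.target" ];
    after = [ "network-online.target" ];
    serviceConfig = {
      LoadCredential =
        "tunnel_token:${config.sops.secrets."cloudflare/tunnel_token".path}";
      # cloudflared reads the token from TUNNEL_TOKEN when --token is not given,
      # which keeps the secret out of the process argument list.
      ExecStart = pkgs.writeShellScript "cloudflared-start" ''
        export TUNNEL_TOKEN="$(cat "$CREDENTIALS_DIRECTORY/tunnel_token")"
        exec ${pkgs.cloudflared}/bin/cloudflared tunnel --no-autoupdate run
      '';
      Restart = "on-failure";
    };
  };
}
#+END_SRC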
*** Tunnel configuration (Zero Trust dashboard)
One wildcard public hostname entry covers all services:
| Field | Value |
|---|---|
| Hostname | ~*.zakobar.com~ |
| Service | ~http://localhost:80~ |
| No TLS Verify | off (not needed for HTTP) |
| HTTP Host Header | (empty — cloudflared forwards the real Host header) |
| Origin Server Name | (empty — not needed for HTTP) |
cloudflared automatically forwards the incoming ~Host~ header (e.g.
~git.zakobar.com~) to Caddy, which uses it to select the correct vhost and
backend.
*** DNS records
A single wildcard CNAME record in Cloudflare DNS covers all subdomains:
#+BEGIN_EXAMPLE
*.zakobar.com CNAME <tunnel-id>.cfargotunnel.com (proxied, orange cloud)
#+END_EXAMPLE
This means new services require no DNS changes — only a new Caddy vhost.
*** Cloudflare SSL/TLS mode
Set to *Full (strict)* in the Cloudflare dashboard (SSL/TLS → Overview).
| Mode | Meaning |
|---|---|
| Off | No HTTPS to browser |
| Flexible | HTTPS to browser, HTTP to origin |
| Full | HTTPS to browser, HTTPS to origin (cert not validated) |
| Full (strict) | HTTPS to browser, HTTPS to origin (cert must be valid) |
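Full (strict) works here because Cloudflare terminates TLS at its own edge
using its Universal cert, and traffic then reaches the Pi through the tunnel
rather than over a direct edge-to-origin HTTPS connection. The scheme that
cloudflared uses towards Caddy (plain HTTP) is set by the tunnel's Service
field, so there is no origin certificate for Cloudflare to validate in this
architecture.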
* Problems Encountered & How They Were Resolved
** 1. ~caddy-dns/cloudflare~ rejected ~cfut_~ token format
*Symptom:*
#+BEGIN_EXAMPLE
provision dns.providers.cloudflare: API token 'cfut_...' appears invalid;
ensure it's correctly entered and not wrapped in braces nor quotes
#+END_EXAMPLE
*Cause:*
Cloudflare introduced new token formats with a ~cfut_~ (user token) or
~cfat_~ (account token) prefix. These tokens are 54 characters long. The
~caddy-dns/cloudflare~ plugin validated tokens with a regex whose length
quantifier was ~{35,50}~, so tokens longer than 50 characters were rejected
before any API call was made.
*Fix:*
The fix was merged into the plugin's master branch as commit ~a8737d0~ and
is included in the ~v0.2.4~ tag. The tag had previously been associated with
an older tree, but the proxy confirmed that ~v0.2.4~ now resolves to
~a8737d0~. Setting the ~hash~ in ~caddy.nix~ to ~lib.fakeHash~ forced a fresh
fetch of the corrected ~v0.2.4~ tree and made the build print its real hash:
#+BEGIN_SRC nix
plugins = [ "github.com/caddy-dns/cloudflare@v0.2.4" ];
hash = lib.fakeHash; # replace with hash from build error output
#+END_SRC
Run ~nix build .#nixosConfigurations.pi-main.config.system.build.toplevel~,
copy the ~got:~ hash from the error, substitute it, and rebuild.
** 2. cloudflared ~tls: internal error~ (SNI mismatch)
*Symptom:*
#+BEGIN_EXAMPLE
Unable to reach the origin service: remote error: tls: internal error
originService=https://localhost:443
#+END_EXAMPLE
*Cause:*
cloudflared connected to ~https://localhost:443~ without sending an SNI
(Server Name Indication) hostname in the TLS ClientHello. Caddy could not
match any vhost, had no certificate for ~localhost~, and aborted the
handshake with a TLS internal error.
Setting the ~HTTP Host Header~ override in the dashboard fixes the HTTP
layer but does *not* affect the TLS SNI, which is negotiated before HTTP
headers are exchanged.
Setting the ~Origin Server Name~ field does set the SNI, but for a wildcard
rule (~*.zakobar.com~) the dashboard only accepts a static value, not a
dynamic placeholder — so it cannot be used for a catch-all.
*Fix:*
Switch the tunnel service from ~https://localhost:443~ to
~http://localhost:80~. The internal leg does not need TLS (loopback
interface, same machine). Caddy's HTTP vhosts handle the requests directly.
** 3. Cloudflare edge TLS handshake failure (~*.home.zakobar.com~)
*Symptom:*
#+BEGIN_EXAMPLE
TLS connect error: error:0A000410:SSL routines::ssl/tls alert handshake failure
#+END_EXAMPLE
*Cause:*
The domain was originally configured as ~home.zakobar.com~ (base domain),
making all services two levels deep: ~git.home.zakobar.com~,
~auth.home.zakobar.com~, etc. Cloudflare's free Universal SSL certificate
covers only one level of wildcard: ~*.zakobar.com~. It does *not* cover
~*.home.zakobar.com~ (two levels). The Cloudflare edge had no certificate to
present to browsers for these hostnames, causing a TLS handshake failure
before the request ever reached the tunnel.
*Fix:*
Move all services to single-level subdomains under ~zakobar.com~
(~git.zakobar.com~, ~auth.zakobar.com~, etc.). In the NixOS config this
required only one line change — the ~domain~ field in ~flake.nix~:
#+BEGIN_SRC nix
domain = "zakobar.com"; # was "home.zakobar.com"
#+END_SRC
All modules reference ~homeyConfig.domain~ and updated automatically on
rebuild. Tunnel hostnames and DNS records in the Cloudflare dashboard were
updated to match.
** 4. ~ERR_TOO_MANY_REDIRECTS~ via tunnel
*Symptom:*
Browser shows ~ERR_TOO_MANY_REDIRECTS~ when accessing any service through
the Cloudflare tunnel.
*Cause:*
cloudflared was talking to Caddy over plain HTTP (~http://localhost:80~).
Caddy's default behaviour is to issue a 308 permanent redirect from HTTP to
HTTPS for any hostname that has a matching HTTPS vhost. The browser followed
the redirect to the ~https://~ URL, which again travelled through the tunnel
to ~http://localhost:80~ and was redirected again, indefinitely.
*Fix:*
Add explicit ~http://~ vhost entries in ~caddy.nix~ for every service. When
Caddy has an explicit HTTP vhost for a hostname, it serves it directly
without redirecting:
#+BEGIN_SRC nix
"git.${domain}" = {
  extraConfig = "reverse_proxy localhost:3000";
};
"http://git.${domain}" = { # ← suppresses HTTP→HTTPS redirect
  extraConfig = "reverse_proxy localhost:3000";
};
#+END_SRC
* Adding a New Service
To expose a new service through the tunnel:
1. Create ~modules/services/<name>.nix~ following the module pattern.
2. Add both a plain and ~http://~ vhost in ~modules/caddy.nix~:
#+BEGIN_SRC nix
"<name>.${domain}" = {
  extraConfig = "reverse_proxy localhost:<port>";
};
"http://<name>.${domain}" = {
  extraConfig = "reverse_proxy localhost:<port>";
};
#+END_SRC
3. No DNS or tunnel changes needed — the wildcard CNAME and wildcard tunnel
rule (~*.zakobar.com~) cover new subdomains automatically.
4. Rebuild and switch: ~sudo nixos-rebuild switch --flake .#pi-main~
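Because step 2 repeats the same pair of entries for every service, it could
be factored into a small helper. The sketch below is illustrative only:
~mkDualVhost~ is a hypothetical name (not the ~cfProxy~ helper from the commit
message), the ~example~ service and port 8080 are made up, and it assumes
~homeyConfig~ is passed to the module as an argument. The ~header_up~ line
follows the X-Forwarded-Proto fix mentioned in the commit message and can be
dropped if the backend does not need it.
#+BEGIN_SRC nix
{ homeyConfig, ... }:
let
  inherit (homeyConfig) domain;

  # Hypothetical helper: builds the HTTPS (LAN) and http:// (tunnel) vhosts
  # for one service listening on a loopback port.
  mkDualVhost = name: port:
    let
      upstream = "localhost:${toString port}";
    in
    {
      # Direct LAN access; Caddy terminates TLS here.
      "${name}.${domain}" = {
        extraConfig = "reverse_proxy ${upstream}";
      };
      # Tunnel access; the browser-facing scheme is HTTPS even though this
      # leg is plain HTTP, so tell the backend explicitly.
      "http://${name}.${domain}" = {
        extraConfig = ''
          reverse_proxy ${upstream} {
            header_up X-Forwarded-Proto https
          }
        '';
      };
    };
in
{
  services.caddy.virtualHosts =
    mkDualVhost "git" 3000
    // mkDualVhost "example" 8080;
}
#+END_SRC
With a helper like this, adding a service is one extra ~mkDualVhost~ line,
and the two entries cannot drift apart.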
* Certificate Details
** Let's Encrypt cert (LAN access)
- Issued per-hostname by Caddy via DNS-01 ACME using the Cloudflare API.
- Covers each hostname individually (e.g. ~git.zakobar.com~).
- Stored in ~/var/lib/caddy/.local/share/caddy/certificates/~.
- Used only when accessing services directly on the LAN (bypassing tunnel).
- Auto-renewed by Caddy.
** Cloudflare Universal SSL cert (tunnel / remote access)
- Issued by Google Trust Services for ~*.zakobar.com~.
- Managed entirely by Cloudflare — no action required on the Pi.
- Covers all single-level subdomains (~git.zakobar.com~, ~auth.zakobar.com~, etc.).
- Does *not* cover two-level subdomains (~git.home.zakobar.com~) — this was
the root cause of problem #3 above.
* Quick Reference: Debugging Checklist
| Symptom | Where to look | Command |
|---|---|---|
| 502 Bad Gateway | cloudflared logs | ~journalctl -u cloudflared-tunnel -n 50~ |
| 502 Bad Gateway | Caddy → backend | ~curl http://localhost:<port>/~ |
| TLS internal error | SNI / cert issue | ~curl -sv --resolve host:443:127.0.0.1 https://host/~ |
| Too many redirects | HTTP vhost missing | check ~http://~ entries in caddy.nix |
| Handshake failure at edge | Cloudflare cert scope | check SSL/TLS → Edge Certificates |
| Token appears invalid | plugin version | check ~caddy-dns/cloudflare~ version vs token format |
| Caddy won't start | token / config error | ~journalctl -u caddy --since "5 min ago"~ |