Deployment
Production deployment with Caddy + basic auth
For real-world operation, use the production stack under deployment/ in the repo. It puts a Caddy reverse proxy in front of the Rust service with:
- Automatic TLS via Let’s Encrypt (HTTP/2 and HTTP/3 on by default)
- Basic auth on the UI and API
- Postgres and ClickHouse on the internal docker network — no published ports
- ClickHouse memory-capped at ~2 GB (see deployment/clickhouse-config.xml)
Setup:
cd deployment
cp .env.example .env
$EDITOR .env # set domain, ACME email, bcrypt hash, DB passwords, KEK
docker compose up -d
deployment/README.md is the authoritative source for setup, user management, password rotation, backups, and troubleshooting.
Authentication boundary
The Rust service ships an in-binary auth stack (GitHub OAuth + opaque API tokens; magic-link sign-in is gated by config). The native auth is the boundary; a basic-auth layer in front of Caddy would double-prompt. Single-tenant deploys behave the same way — sign up as the first user and the operator surface is yours.
/healthz and /readyz are intentionally exposed without auth so
uptime probes, load balancers, and orchestrators can hit them.
/metrics on the public domain returns 404 — scrape it on the internal
docker network instead.
The public status page (/status, /status/*, /api/public/*,
/static/*, /robots.txt, /favicon.ico) is also unauthenticated by
design — see Public status surface below.
See Authentication for the in-binary flow.
Email provider (Resend)
Transactional email (invitations, magic-link sign-in) goes through the
EmailSender trait. Production uses Resend; dev
and test default to the log provider, which writes the action URL to
the tracing log so you can copy-paste it into a browser.
Setup:
- Create a Resend account and verify your sending domain. Resend will give you DKIM and DMARC records to add to DNS.
- Generate an API key with emails.send permission only.
- Configure the service:
  [email]
  provider = "resend"
  from_name = "Acme Status"
  from_address = "no-reply@status.acme.test"
  [email.resend]
  api_key = "re_…"
  Or via env: UPTIMEPAGE_EMAIL__PROVIDER=resend, UPTIMEPAGE_EMAIL__RESEND__API_KEY=re_….
- auth.public_base_url must be set to the externally reachable origin (e.g. https://status.acme.test); the value is embedded in the links the recipient receives.
The factory rejects boot when provider = "resend" is set without a
non-empty API key — fail-fast over send-time surprise.
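The fail-fast rule above can be sketched as a boot-time validation step. This is a minimal illustration, not the service's actual factory: the `EmailConfig`, `Provider`, and `ConfigError` types and the `validate_email_config` function are hypothetical names, and treating a whitespace-only key as empty is an assumption.

```rust
/// Hypothetical mirror of the `provider` setting; variant names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum Provider {
    Log,
    Resend,
}

/// Hypothetical, trimmed-down view of the `[email]` table.
#[derive(Debug, Clone)]
pub struct EmailConfig {
    pub provider: Provider,
    pub resend_api_key: Option<String>,
}

#[derive(Debug, PartialEq)]
pub enum ConfigError {
    MissingResendApiKey,
}

/// Rejects a Resend configuration without a usable API key, so the problem
/// surfaces at boot instead of on the first send.
pub fn validate_email_config(cfg: &EmailConfig) -> Result<(), ConfigError> {
    match cfg.provider {
        // The log provider needs no credentials.
        Provider::Log => Ok(()),
        Provider::Resend => {
            // Assumption: a whitespace-only key counts as empty.
            let key_present = cfg
                .resend_api_key
                .as_deref()
                .map(|key| !key.trim().is_empty())
                .unwrap_or(false);
            if key_present {
                Ok(())
            } else {
                Err(ConfigError::MissingResendApiKey)
            }
        }
    }
}
```

Calling such a check before constructing any sender keeps a misconfigured deployment from starting at all, which is the behaviour described above.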
Public status surface
The Caddyfile carries an @public matcher that short-circuits basic_auth for the public status paths and adds a per-IP rate limit (60 req/min) via the caddy-ratelimit plugin. The stock caddy:2-alpine image doesn’t include that plugin, so the production deployment uses a custom image, custom-caddy:2, built with xcaddy:
docker build -t custom-caddy:2 - <<'EOF'
FROM caddy:2-builder AS builder
RUN xcaddy build --with github.com/mholt/caddy-ratelimit
FROM caddy:2-alpine
COPY --from=builder /usr/bin/caddy /usr/bin/caddy
EOF
Then point the caddy service in deployment/docker-compose.yml at custom-caddy:2. Full procedure (including the opt-out path that drops the rate-limit block) is in deployment/README.md.
The same custom image carries two more per-IP zones: auth_endpoints (10/min on /auth/*, /api/v1/me, invitation accept) and org_creation (3 per 24 h on POST /api/v1/orgs). These are the edge tier; the per-org / per-user budgets the service enforces from each org’s plan are the Quotas & rate limits tier — complementary, since behind the proxy the app sees only the proxy as the peer.
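To make the idea of a per-IP zone concrete, here is a minimal fixed-window counter. It is a simplified illustration only: caddy-ratelimit has its own algorithm and configuration, and the `FixedWindowLimiter` type below is a hypothetical stand-in that mirrors none of the plugin's internals.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Per-IP fixed-window counter (illustrative, not the plugin's algorithm).
pub struct FixedWindowLimiter {
    max_events: u32,
    window: Duration,
    // Client address -> (start of its current window, events counted in it).
    windows: HashMap<IpAddr, (Instant, u32)>,
}

impl FixedWindowLimiter {
    pub fn new(max_events: u32, window: Duration) -> Self {
        Self { max_events, window, windows: HashMap::new() }
    }

    /// Returns true when a request from `ip` at time `now` fits the budget.
    pub fn allow(&mut self, ip: IpAddr, now: Instant) -> bool {
        let slot = self.windows.entry(ip).or_insert((now, 0));
        // Start a fresh window once the previous one has fully elapsed.
        if now.saturating_duration_since(slot.0) >= self.window {
            *slot = (now, 0);
        }
        if slot.1 < self.max_events {
            slot.1 += 1;
            true
        } else {
            false
        }
    }
}
```

An org_creation-style zone would correspond to `FixedWindowLimiter::new(3, Duration::from_secs(24 * 60 * 60))`. Budgets are tracked per address, which is why this tier belongs at the edge: behind the proxy, every request would appear to come from the same peer.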
Per-org subdomains (SaaS)
When tenancy.subdomain_public_routes = true, each org’s page is served at {slug}.{public_status.base_domain} (apex-wildcard shape). That needs:
- a wildcard DNS record *.{domain} pointing at the host (plus explicit A/AAAA records for any operator subdomain, such as app or mail, which take precedence over the wildcard);
- a wildcard TLS cert for *.{domain}. HTTP-01 can’t validate a wildcard, so the custom Caddy image also bundles caddy-dns/hetzner and solves the ACME DNS-01 challenge using a HETZNER_DNS_API_TOKEN (zone-edit scope) from .env. The operator host (app.{domain}) is kept on its own per-host HTTP-01 cert in a separate Caddyfile block so a wildcard-key compromise does not reach the operator surface.
The wildcard means a new org’s page works the moment its owner enables it — no per-org DNS or cert step. The end-to-end runbook (Hetzner zone setup, token scope, building the image, verifying the wildcard cert) is in deployment/README.md. The model — host routing, branding, opt-in gating, cookie scoping — is in Per-org status pages.
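The host-to-org mapping implied by the {slug}.{public_status.base_domain} shape can be sketched as follows. This is an assumption-laden illustration, not the service's router: `org_slug_for_host` and the `reserved` list of operator subdomains are hypothetical, and IPv6 literals in the Host header are not handled.

```rust
/// Extracts the org slug from a Host header value, or None when the host is
/// not exactly one label below `base_domain` or is a reserved operator name.
pub fn org_slug_for_host(host: &str, base_domain: &str, reserved: &[&str]) -> Option<String> {
    // Hostnames are case-insensitive; drop an optional ":port" and trailing dot.
    let host = host.to_ascii_lowercase();
    let host = host.split(':').next()?.trim_end_matches('.');
    let base = base_domain.to_ascii_lowercase();

    // Keep only what precedes ".{base_domain}"; requiring the dot stops
    // "evilstatus.example.test" from matching "status.example.test".
    let slug = host.strip_suffix(base.as_str())?.strip_suffix('.')?;

    // A wildcard cert covers a single label, so nested labels are rejected.
    if slug.is_empty() || slug.contains('.') || reserved.iter().any(|name| *name == slug) {
        return None;
    }
    Some(slug.to_string())
}
```

With a base domain of status.example.test and `["app", "mail"]` reserved, acme.status.example.test yields the slug acme, while app.status.example.test and the apex itself yield nothing.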
For the operator workflow (enabling components, narrating incidents, scheduling maintenance) see Public status page.
Docker
docker compose up -d brings up Postgres 17, ClickHouse 26.3, and the monitor on the same network. Compose env vars wire the monitor to the stack:
UPTIMEPAGE_STORAGE__POSTGRES__URL: postgres://monitor:monitor@postgres:5432/monitor
UPTIMEPAGE_STORAGE__CLICKHOUSE__URL: http://clickhouse:8123
UPTIMEPAGE_STORAGE__CLICKHOUSE__USER: monitor
UPTIMEPAGE_STORAGE__CLICKHOUSE__PASSWORD: monitor
UPTIMEPAGE_OBSERVABILITY__LOG_FORMAT: json
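The variable names above appear to follow a convention: an UPTIMEPAGE_ prefix, a double underscore between config sections, and upper-cased keys. The sketch below encodes that convention as inferred from the examples on this page; `env_to_config_path` is a hypothetical helper, and the service's real config loader may differ.

```rust
/// Maps an environment variable name to a dotted config path, following the
/// naming convention inferred from the examples (an assumption, not a spec).
pub fn env_to_config_path(var: &str) -> Option<String> {
    // Only variables carrying the service prefix are considered.
    let rest = var.strip_prefix("UPTIMEPAGE_")?;

    // "__" separates sections; single underscores stay inside a key name.
    let segments: Vec<String> = rest
        .split("__")
        .map(|segment| segment.to_ascii_lowercase())
        .collect();

    // Reject malformed names such as "UPTIMEPAGE_" or a trailing "__".
    if segments.iter().any(|segment| segment.is_empty()) {
        return None;
    }
    Some(segments.join("."))
}
```

Under this reading, UPTIMEPAGE_STORAGE__POSTGRES__URL corresponds to storage.postgres.url, and UPTIMEPAGE_EMAIL__RESEND__API_KEY to the api_key entry of the [email.resend] table.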
The runtime image is gcr.io/distroless/static-debian12:nonroot, chosen for its minimal attack surface: no shell and no glibc. It is built from a static musl binary via rust:1-alpine. The final image is 16 MB, and both the uptimepage and loadtest binaries fit in it.
Bind addresses
Defaults are loopback (127.0.0.1:8080 API, 127.0.0.1:9090 metrics). Override via env for non-loopback exposure:
UPTIMEPAGE_SERVER__API_BIND=0.0.0.0:8080 \
UPTIMEPAGE_SERVER__METRICS_BIND=0.0.0.0:9090 \
./uptimepage
There is no built-in auth on the API port. Front it with a proxy or keep it on a private network. The ready-made Caddy stack under deployment/ does this for you.
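A deployment script or startup check could flag a non-loopback bind before the port is exposed. The helper below is a hypothetical sketch using only the standard library; `reachable_beyond_loopback` is not part of the service.

```rust
use std::net::{AddrParseError, SocketAddr};

/// Returns Ok(true) when `bind` would accept connections from other hosts,
/// i.e. when the address is anything other than a loopback address.
pub fn reachable_beyond_loopback(bind: &str) -> Result<bool, AddrParseError> {
    // Accepts "ip:port" literals only; host names are not resolved.
    let addr: SocketAddr = bind.parse()?;
    Ok(!addr.ip().is_loopback())
}
```

The defaults (127.0.0.1:8080 and 127.0.0.1:9090) report false here, while 0.0.0.0:8080 reports true, which is the case where a proxy or private network is needed.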
Metrics shipping (Grafana Cloud)
The Prometheus /metrics endpoint can be shipped to Grafana Cloud by a
Grafana Alloy sidecar. It is opt-in: the compose stack only starts it
under the metrics profile (docker compose --profile metrics up -d),
so the default deployment is unchanged. Credentials are read from .env
(gitignored) and never written into deployment/config.alloy.
deployment/README.md (“Metrics”) is the authoritative setup, including
how to obtain the Grafana Cloud URL/token, the internal-network bind, the
ready-made dashboard, and how to verify ingestion.
Migrations
- Postgres: migrations/postgres/*.sql, applied at startup via sqlx::migrate! (tracked in _sqlx_migrations)
- ClickHouse: migrations/clickhouse/*.sql, applied idempotently via CREATE … IF NOT EXISTS at startup
No external migrator. The app owns its schema lifecycle symmetrically.
Resource sizing
- checker.max_concurrent_checks caps simultaneous in-flight checks
- Per-check memory: small (a tokio task + an in-flight hyper request + bookkeeping)
- The practical ceiling is set by file descriptors and ephemeral ports, not RAM
- At 50k concurrent checks against external targets, RSS sits around 200-400 MB depending on response sizes
- The optional metrics profile adds a Grafana Alloy container (~100 MB RSS plus a small bounded remote-write WAL volume); account for it when sizing the host if you enable it
Graceful shutdown
The binary listens for SIGINT and SIGTERM, cancels the scheduler and batcher via a shared CancellationToken, awaits both background tasks, and exits within 10 s. The batcher’s cancel branch drains any pending results before returning. A warning is logged if the deadline is exceeded.
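The cancel, drain, and deadline pattern described above can be illustrated with standard-library threads. This is a simplified analogue, not the binary's code: an `AtomicBool` stands in for the shared CancellationToken, a thread stands in for the async batcher task, the scheduler is omitted, and `run_batcher` and `shut_down` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{self, Receiver};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Batcher stand-in: counts results until cancelled, then drains the queue.
fn run_batcher(cancel: Arc<AtomicBool>, results: Receiver<u32>) -> usize {
    let mut flushed = 0;
    while !cancel.load(Ordering::Acquire) {
        // Short timeout so the cancel flag is re-checked regularly.
        if results.recv_timeout(Duration::from_millis(10)).is_ok() {
            flushed += 1;
        }
    }
    // Cancel branch: flush whatever is still queued before returning.
    while results.try_recv().is_ok() {
        flushed += 1;
    }
    flushed
}

/// Queues `pending`, signals cancellation, and waits up to `deadline`.
/// Returns the number of flushed results, or None if the deadline passed.
pub fn shut_down(pending: Vec<u32>, deadline: Duration) -> Option<usize> {
    let cancel = Arc::new(AtomicBool::new(false));
    let (result_tx, result_rx) = mpsc::channel();
    let (done_tx, done_rx) = mpsc::channel();

    let worker_cancel = Arc::clone(&cancel);
    let worker = thread::spawn(move || {
        let flushed = run_batcher(worker_cancel, result_rx);
        let _ = done_tx.send(flushed);
    });

    for result in pending {
        let _ = result_tx.send(result);
    }
    // In the real binary this point is reached on SIGINT or SIGTERM.
    cancel.store(true, Ordering::Release);

    match done_rx.recv_timeout(deadline) {
        Ok(flushed) => {
            let _ = worker.join();
            Some(flushed)
        }
        Err(_) => {
            eprintln!("warning: shutdown deadline exceeded");
            None
        }
    }
}
```

Calling `shut_down(vec![1, 2, 3], Duration::from_secs(10))` returns once every queued result has been counted, because the drain loop runs after cancellation is observed. Bounding the wait means a stuck task cannot keep the process alive indefinitely.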