This is a mono repository for my home infrastructure and Kubernetes cluster. I try to adhere to Infrastructure as Code (IaC) and GitOps practices using tools like Komodo, OpenTofu, Kubernetes, Flux, Renovate, and GitHub Actions.
Note
The old ArgoCD setup is no longer maintained, but it is still available here for reference.
My Kubernetes cluster is deployed with Talos. It is a semi-hyper-converged cluster: workloads and block storage share the same resources on my nodes, while a separate server with ZFS handles NFS/SMB shares, bulk file storage and backups.
There is a great template made by onedr0p if you want to try and follow along with some of the practices I use here.
- Networking & Service Mesh: cilium provides eBPF-based networking, cloudflared secures ingress traffic via Cloudflare, and external-dns keeps DNS records in sync automatically. All egress traffic is carefully filtered using network policies.
- Security & Secrets: cert-manager automates SSL/TLS certificate management. For secrets, I use external-secrets with self-hosted HashiCorp Vault to inject secrets into Kubernetes.
- Storage & Data Protection: rook provides distributed storage for persistent volumes, with volsync handling backups and restores. spegel improves reliability by running a stateless, cluster-local OCI image mirror.
- Automation & CI/CD: actions-runner-controller runs self-hosted GitHub Actions runners directly in the cluster for continuous integration workflows.
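
To illustrate the secrets flow mentioned in the list above, here is a minimal sketch of an ExternalSecret that pulls one value from Vault into a Kubernetes Secret. The store name `vault`, the app name `echo`, and the Vault key and property are all hypothetical and not taken from this repository; the `apiVersion` also depends on which external-secrets release is installed.

```yaml
# Hypothetical example: names and paths are placeholders, not this repo's values.
apiVersion: external-secrets.io/v1beta1   # newer releases may serve external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: echo
  namespace: default
spec:
  refreshInterval: 1h            # how often the value is re-read from Vault
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault                  # assumed name of a store pointing at the self-hosted Vault
  target:
    name: echo-secret            # Kubernetes Secret that the operator creates and owns
  data:
    - secretKey: API_TOKEN       # key inside the generated Secret
      remoteRef:
        key: echo                # assumed path in a Vault KV engine
        property: api_token      # assumed field at that path
```

The workload then consumes `echo-secret` like any other Secret, so no Vault credentials need to live in Git.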
Flux watches the clusters in my kubernetes folder (see Directories below) and reconciles them against the state of my Git repository.
Flux recursively searches the kubernetes/clusters/<cluster name> folder
until it finds the top-most kustomization.yaml in each directory, then applies all the resources listed in it.
That kustomization.yaml generally contains only a namespace resource and the per-cluster Flux Kustomizations
for the subset of apps used in that cluster. Each of those Flux Kustomizations in turn applies a HelmRelease
or other resources related to the application.
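
The layout described above can be sketched roughly as follows. This is an illustration under assumptions, not a copy of the files in this repository: the app name `echo`, the file names `namespace.yaml` and `ks.yaml`, the `path` under kubernetes/apps, and the `flux-system` source name are all hypothetical.

```yaml
# File 1 (hypothetical): kubernetes/clusters/<cluster name>/default/kustomization.yaml
# A plain kustomize file listing the namespace and one Flux Kustomization per app.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ./namespace.yaml
  - ./echo/ks.yaml
---
# File 2 (hypothetical): kubernetes/clusters/<cluster name>/default/echo/ks.yaml
# The per-cluster Flux Kustomization that points at the shared app template.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: echo
  namespace: flux-system
spec:
  interval: 1h
  path: ./kubernetes/apps/default/echo   # assumed location of the HelmRelease and friends
  prune: true                            # remove resources that disappear from Git
  sourceRef:
    kind: GitRepository
    name: flux-system                    # assumed name of the Git source
  targetNamespace: default
```

Keeping the shared template under apps and only a thin pointer under clusters means an app can be enabled for a cluster by adding a single `ks.yaml` entry.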
Renovate watches my entire repository for dependency updates; when one is found, a PR is created automatically. Once a PR is merged, Flux applies the changes to my cluster.
Machines that are not feasible to manage with Kubernetes (such as the NAS) are managed by Komodo
and Docker Compose files. The directories are organized similarly to the Flux layout: there are global
stacks with application configuration meant to be shared among machines, and hosts configurations with fine-tuned
per-machine options.
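
One way to picture the split between shared stacks and per-host options is Docker Compose's ability to merge several files, with later files overriding earlier ones. The sketch below assumes that approach; the file paths, the `web` service, and the host name `nas` are hypothetical, and the way Komodo is actually wired to these files in this repository may differ.

```yaml
# File 1 (hypothetical): docker/stacks/web/compose.yaml
# Shared definition, identical on every machine that runs this stack.
services:
  web:
    image: nginx:1.27
    restart: unless-stopped
---
# File 2 (hypothetical): docker/hosts/nas/web/compose.override.yaml
# Per-host tuning, merged on top of the shared file.
services:
  web:
    ports:
      - "8080:80"
    volumes:
      - /mnt/tank/web:/usr/share/nginx/html:ro   # assumed host path on the NAS
```

With plain Docker Compose, the two files would be combined by passing both with `-f`, shared file first.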
This Git repository contains the following directories.
📁 bootstrap # initial set of files necessary to kickstart the cluster
📁 docker
├── 📁 hosts # per-host docker compose komodo configurations
└── 📁 stacks # application templates with base rules shared among machines
📁 kubernetes
├── 📁 apps # application templates with base rules shared among clusters
├── 📁 clusters # per-cluster configurations of said apps
└── 📁 components # re-useable kustomize components
📁 opentofu # opentofu plans for external services like cloudflare
📁 talos # per-cluster talos configurations

While most of my infrastructure and workloads are self-hosted, I do rely on the cloud for certain key parts of my setup. This saves me from having to worry about three things: (1) dealing with chicken/egg scenarios, (2) services I critically need whether my cluster is online or not, and (3) the "hit by a bus" factor: what happens to critical apps (e.g. email, password manager, photos) that my family relies on when I am no longer around.
| Service | Use | Cost |
|---|---|---|
| BorgBase | Borg Backups | $80/yr |
| Cloudflare | Services exposed externally | Free |
| GitHub | Hosting this repository and continuous integration/deployments | Free |
| healthchecks.io | Heartbeats monitoring | Free |
| Migadu | Email hosting | $19/yr |
| NextDNS | Ad filtering | ~$20/yr |
| Pushover | Kubernetes Alerts and application notifications | $5 OTP |
| | Total | ~$10/mo |
Click to see a high-level network diagram
graph TD
%% Class Definitions
classDef wan fill:#f87171,stroke:#fff,stroke-width:2px,color:#fff,font-weight:bold;
classDef core fill:#60a5fa,stroke:#fff,stroke-width:2px,color:#fff,font-weight:bold;
classDef agg fill:#34d399,stroke:#fff,stroke-width:2px,color:#fff,font-weight:bold;
classDef switch fill:#a78bfa,stroke:#fff,stroke-width:2px,color:#fff,font-weight:bold;
classDef device fill:#facc15,stroke:#fff,stroke-width:2px,color:#000,font-weight:bold;
classDef vlan fill:#1f2937,stroke:#fff,stroke-width:1px,color:#fff,font-size:12px;
%% Nodes
WAN[🛜 netia<br/>1Gbps/300Mbps WAN]:::wan
UCG[📦 UCG Ultra]:::core
AGG[🔗 USW Pro Max 16 PoE]:::agg
NAS[💾 NAS<br/>1 Server]:::device
KUBE[☸️ Kubernetes<br/>3 Nodes]:::device
SW[🔌 USW Flex 2.5G]:::switch
DEV[💻 Devices]:::device
WIFI[📶 WiFi Clients]:::device
%% Subgraph for VLANs
subgraph VLANs [VLANs]
direction TB
HOME[Home Network<br/>192.168.2.0/24]:::vlan
IOTNOWAN["IoT Network (No WAN)<br/>192.168.3.0/24"]:::vlan
IOTWAN["IoT Network (WAN)<br/>192.168.4.0/24"]:::vlan
KUBERNETES[Kubernetes Network<br/>192.168.42.0/24]:::vlan
VPN[VPN Network<br/>192.168.69.0/24]:::vlan
GUEST[Guest Network<br/>192.168.99.0/24]:::vlan
MGMT[Management Network<br/>192.168.254.0/24]:::vlan
end
style VLANs fill:#111,stroke:#fff,stroke-width:2px,rx:0,ry:0,padding:20px;
%% Links
WAN -.->|WAN| UCG
UCG --> AGG
AGG -- 2x10G LACP --- NAS
AGG --> DEV
AGG --> WIFI
AGG -- 2.5G --- SW
SW --> KUBE
%% Style the bonded links thicker
linkStyle 2 stroke-width:4px,stroke:#34d399;
%% Move VLANs below the graph, to the middle
KUBE -.-> KUBERNETES
linkStyle 7 stroke-width:0,opacity:0;
In my cluster there are two instances of ExternalDNS running.
One syncs private DNS records to my UCG Ultra using the ExternalDNS webhook provider for UniFi,
while the other syncs public DNS records to Cloudflare. This setup is managed by creating ingresses with one of two
classes: internal for private DNS and external for public DNS. Each external-dns instance then syncs
the DNS records to its respective platform.
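
As a rough sketch of how the class decides where a record goes, an ingress for a publicly reachable app might look like this. The app name `echo`, the hostname, and the service port are hypothetical; only the class names `internal` and `external` come from the description above.

```yaml
# Hypothetical example: host, service name and port are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: echo
  namespace: default
spec:
  ingressClassName: external     # use "internal" to publish the record to the UCG Ultra instead
  rules:
    - host: echo.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: echo
                port:
                  number: 8080
```

Each ExternalDNS instance would then be configured to pick up only the ingresses of its own class, so one manifest field selects between private and public DNS.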
| Device | Count | OS Disk Size | Data Disk Size | RAM | OS | Function |
|---|---|---|---|---|---|---|
| Intel NUC12WSHi5 | 3 | 512GB NVMe | 1TB (rook-ceph) | 64GB | Talos | Kubernetes |
| AMD Ryzen + GB B550I Aorus Pro AX | 1 | 1TB SSD | 2x26TB ZFS (mirrored) + 4TB SSD | 64GB | TrueNAS SCALE | NFS + Backup Server |
| JetKVM + AIMOS HDMI KVM Switch | 1 | - | - | - | - | KVM for Kubernetes |
| UniFi UCG Ultra | 1 | - | - | - | - | Router |
| UniFi USW-Pro-Max-16-PoE | 1 | - | - | - | - | 1Gb+2.5Gb PoE Switch |
| UniFi Flex Mini 2.5G | 1 | - | - | - | - | 2.5Gb k8s Switch |
Thanks to all the people who donate their time to the Home Operations Discord community. Be sure to check out kubesearch.dev for ideas on what to deploy and how to deploy it.

