Kubernetes on Backend Engineering Strategy Tools

Image Tooling

Thu, 18 Jun 2026 00:00:00 +0000

Versioned, multi-arch Docker images for Kubernetes workflows — built with Dagger, published to Docker Hub, triggered by a version tag.

The motivation is in Shared Tooling Images: one image, consistent versions, three contexts — CI, local, colleagues.

Images

GitHub repo	Docker Hub	Contents
`image-tooling`	`best-tools/tooling-k8s`	kubectl, helm, kustomize, argocd CLI, k9s, jq, yq
`image-tooling`	`best-tools/tooling-k8s-aws`	`tooling-k8s` + AWS CLI
`image-tooling`	`best-tools/tooling-k8s-openstack`	`tooling-k8s` + OpenStack CLI
`image-buildx`	`best-tools/buildx`	CI builder — Docker buildx, AWS CLI, Dagger CLI
`image-pandoc`	`best-tools/pandoc`	PDF generation — pandoc + TeX Live

All images publish as multi-arch manifests: linux/amd64 + linux/arm64.

Quick start

Interactive shell with kubeconfig mounted:

docker run -it --rm \
 -v ~/.kube:/mnt/kube:ro \
 -v "$(pwd)":/work \
 -w /work \
 docker.io/best-tools/tooling-k8s:latest

The image entry point symlinks /mnt/kube → /root/.kube on startup, so kubectl picks it up immediately.
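The image's actual entry point script is not shown here. As a minimal sketch of the behaviour described above, assuming a hypothetical helper named link_kubeconfig and the two paths from the text as defaults, it could look like this:

```shell
# Hypothetical entry point helper: link the read-only kubeconfig mount
# to the location kubectl reads by default.
link_kubeconfig() {
  src=${1:-/mnt/kube}
  dst=${2:-/root/.kube}
  # Only link when the mount exists and nothing is already at the destination.
  if [ -d "$src" ] && [ ! -e "$dst" ]; then
    ln -s "$src" "$dst"
  fi
}

# In a real entry point this would be followed by handing over to the
# requested command, for example:
#   link_kubeconfig
#   exec "$@"
```

Guarding on the destination keeps the script safe to run when a kubeconfig has been baked into a derived image or the container is started without the mount.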

Shell alias for daily use:

alias k8s='docker run -it --rm \
 -v ~/.kube:/mnt/kube:ro \
 -v "$(pwd)":/work -w /work \
 docker.io/best-tools/tooling-k8s:latest'

k8s helm lint .
k8s kubectl get pods -n argocd

In CI (GitHub Actions):

- name: Lint chart
  run: docker run --rm -v ${{ github.workspace }}:/work -w /work docker.io/best-tools/tooling-k8s:latest helm lint .

Or reference the image directly as the job container — no install step needed.

Setup (contributors / maintainers)

Credentials are set once as GitHub org-level secrets and inherited by all image-* repos automatically.

Secret	Where to get it
`DOCKERHUB_TOKEN`	hub.docker.com → Account → Security → Access Tokens (Read, Write, Delete)
`DAGGER_CLOUD_TOKEN`	cloud.dagger.io → Organisation → Tokens

Path: github.com/Backend-Engineering-Strategy-Tools → Settings → Secrets and variables → Actions → New organisation secret.

Releasing

git tag -a v1.0.0 -m "Release v1.0.0"
git push origin v1.0.0

The GitHub Actions workflow triggers on tags matching v*.*.*, runs dagger call publish-multi-arch, and pushes both best-tools/<image>:v1.0.0 and best-tools/<image>:latest to Docker Hub. The pipeline trace is available at cloud.dagger.io.
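Because the tag name is what triggers publishing, a local guard can catch a mistyped tag before it is pushed. The sketch below is an assumption, not part of the repos: is_release_tag and release are hypothetical helper names, and the check is deliberately stricter than the v*.*.* glob (it accepts only numeric vMAJOR.MINOR.PATCH).

```shell
# Returns 0 when the tag looks like vMAJOR.MINOR.PATCH, 1 otherwise.
is_release_tag() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}

# Hypothetical wrapper: tag and push only if the name has the expected shape.
release() {
  tag=$1
  if ! is_release_tag "$tag"; then
    echo "refusing to tag: '$tag' is not of the form vX.Y.Z" >&2
    return 1
  fi
  git tag -a "$tag" -m "Release $tag" && git push origin "$tag"
}
```

Usage would be release v1.0.0 from the repository root; a name such as 1.0.0 or v1.0 is rejected before any tag is created.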

Gardener on Cleura

Tue, 16 Jun 2026 00:00:00 +0000

Getting hands-on with Gardener on Cleura — a European OpenStack cloud — ahead of using it professionally. The focus is on the networking and traffic ingress side: how does a Gardener shoot cluster on OpenStack expose services, what does the LoadBalancer path actually look like, and when does ingress apply versus when it does not.

The test application is a Minecraft server with Velocity proxy — useful precisely because it is raw TCP rather than HTTP, which forces the full LoadBalancer path rather than an ingress shortcut.

→ Gardener on Cleura — technical notes

Steps

1 — Shoot cluster

Provision a Gardener shoot cluster on Cleura. Cleura wraps Gardener behind their own REST API — gardenctl and the Gardener Terraform provider require the garden cluster kubeconfig, which Cleura does not expose. Cluster lifecycle goes through their REST API instead.

→ Provisioning via Cleura REST API
→ Cleura docs issue #533 — IaC and gardenctl access

2 — Minecraft via standard LoadBalancer

Deploy itzg/minecraft-server as a StatefulSet with a plain LoadBalancer service for TCP 25565 — the direct Octavia path, no Gateway involved. Gets the server running quickly and confirms TCP exposure works on Cleura independently.

Internet
 |
TCP 25565
 |
Octavia LB (direct LoadBalancer service)
 |
Minecraft Pod (itzg/minecraft-server)
 |
PVC (Cinder)

Manifests: stateful_set.yaml · service.yaml

Apply directly from this repo:

kubectl apply -k "https://github.com/Backend-Engineering-Strategy-Tools/site//static/scripts/mc?ref=main"

Check the rollout and grab the external IP:

kubectl get all -l app=mc-example

kubectl get svc mc-example \
 -o jsonpath='{.status.loadBalancer.ingress[0].ip}:{.spec.ports[0].port}'

2.5 — Migrate to Helm chart

Swap the raw manifests for the itzg/minecraft-server-charts Helm chart — actively maintained, covers server type, persistence, RCON, backups, and extra ports (BlueMap, Dynmap). The raw YAML stays useful as a reference for the underlying shape.

Manifests: values.yaml

helm repo add minecraft-server-charts https://itzg.github.io/minecraft-server-charts/
helm upgrade --install mc minecraft-server-charts/minecraft \
 -f https://backend-engineering-strategy-tools.github.io/site/scripts/mc-helm/values.yaml

Check the rollout and grab the external IP:

kubectl get all -l app.kubernetes.io/instance=mc

kubectl get svc mc-example -o jsonpath='{.status.loadBalancer.ingress[0].ip}:{.spec.ports[0].port}'

3 — Envoy Gateway

Deploy Envoy Gateway into the shoot cluster. It is the Envoy project's implementation of the Kubernetes Gateway API, with Envoy as the data plane. The community ingress-nginx controller has been retired; Gateway API is the forward path, with a standardised spec for both HTTP and TCP.

Envoy Gateway exposes a single LoadBalancer service via Octavia. Everything routes through it.

4 — HTTPRoute, certificates, and BlueMap

Deploy BlueMap, a Minecraft mod that renders the world as a live 3D web map served over HTTP. Route it through the Gateway with an HTTPRoute and wire cert-manager to provision a Let’s Encrypt certificate.

A real HTTP service with a real use, not a throwaway test page. Validates the full HTTP + TLS path before touching the game server.

5 — Migrate to TCPRoute

Migrate the TCP service to a TCPRoute through Envoy Gateway. TCPRoute is in the Gateway API experimental channel — this step validates that a single Gateway handles both HTTP and raw TCP.

Internet
 |
Octavia LB (one Gateway LoadBalancer)
 |
Envoy Gateway
 |
 +------------------------------+
 |                              |
HTTPRoute → BlueMap      TCPRoute → Minecraft

6 — Velocity (if needed)

Add Velocity as a TCP proxy in front of the Minecraft server if multi-server routing becomes relevant — lobby, modded, survival as separate backends. Skip if a single server is enough.

→ Minecraft project

7 — Plugin pipeline

A colleague is building a Minecraft plugin. The goal is a Dagger pipeline with GitHub Actions — the same build running locally and in CI, covering the JVM toolchain and packaging steps.

8 — AI

Something with NPC behaviour, a bot, or plugin-side automation. Low priority, high fun.

IaC gap

Cleura does not expose the garden cluster kubeconfig. That one limitation closes off the entire Gardener tooling ecosystem: gardenctl requires it, the Gardener Terraform provider requires it, and any Crossplane provider built on the Gardener API would require it too. There is no HCL path here.

What remains is Cleura’s own REST API — which is fine for interactive use but falls short the moment you want to drive cluster lifecycle from a pipeline. A bash script wrapping curl and jq works, and that is what cleura-shoot.sh does, but it is a workaround rather than a solution. No state, no plan, no diff — just imperative API calls.

Options if this needs to graduate beyond a script:

Crossplane provider-http — can wrap the REST API declaratively, but has no native polling or deletion hooks, so the reconciliation story is awkward
Custom Terraform provider — full plan/apply semantics, but requires writing a Go provider from scratch
Pulumi dynamic provider — similar effort, Python or TypeScript

A feature request for gardenctl access or a native IaC provider has been filed with Cleura (→ cleura/docs#533). Until something changes there, the bash script is as good as it gets.

Status

Step	Status
1 — Shoot cluster on Cleura	done
2 — Minecraft via LoadBalancer (itzg)	planned
2.5 — Migrate to Helm chart	planned
3 — Envoy Gateway	planned
4 — HTTPRoute + cert-manager + BlueMap	planned
5 — Migrate to TCPRoute	planned
6 — Velocity	planned
7 — Plugin pipeline (Dagger)	planned
8 — AI	planned

Building this out — notes will expand as each step lands.

Gardener on Cleura

Tue, 16 Jun 2026 00:00:00 +0000

Gardener is a Kubernetes-as-a-Service framework that runs on Kubernetes and manages the lifecycle of other clusters declaratively. Rather than managing control planes by hand, Gardener treats clusters as a resource — defined, created, upgraded, and deleted via the Gardener API.

Concepts

Gardener uses three layers:

Layer	What it is
Garden cluster	Runs Gardener itself — the management control plane
Seed cluster	Hosts the control planes of shoot clusters (as pods)
Shoot cluster	The cluster you actually use — nodes run on the target cloud

The shoot cluster’s API server does not run on the shoot nodes. It runs as a pod inside the seed cluster. From the outside it behaves like any other Kubernetes cluster; internally the control plane is isolated from the data plane.

Shoot clusters are defined as Shoot resources applied to the garden cluster:

apiVersion: core.gardener.cloud/v1beta1
kind: Shoot
metadata:
  name: my-cluster
  namespace: garden-my-project
spec:
  cloudProfileName: openstack
  region: sto2
  provider:
    type: openstack
    workers:
    - name: worker-pool
      machine:
        type: l2.c2r4
      minimum: 1
      maximum: 3
  kubernetes:
    version: "1.30"
  networking:
    type: calico
    pods: 100.128.0.0/11
    nodes: 10.250.0.0/16
    services: 100.112.0.0/13

Shoot cluster on Cleura

Cleura is a European OpenStack provider. Gardener provisions shoot nodes as OpenStack VMs via the OpenStack machine controller.

Key integrations:

Component	Implementation
Node provisioning	OpenStack VMs via Gardener machine controller
Load balancers	Octavia via cloud-controller-manager
Block storage	Cinder via CSI driver
DNS	Manual or external-dns
CNI	Calico (default) or configurable

Gardener on Cleura does not provide an ingress controller or API gateway — these are brought in separately.

Networking

Gardener manages the cluster network configuration as part of the shoot spec. Pod, node, and service CIDRs are defined at cluster creation and must not overlap with the OpenStack network.
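The no-overlap rule can be checked before a shoot is created. This is a minimal sketch in POSIX shell arithmetic: the helper names are hypothetical, the three CIDRs are the ones from the Shoot example, and the OpenStack subnet 10.0.0.0/24 is an assumed value for illustration only.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print "first last" (as integers) for a CIDR such as 10.250.0.0/16.
cidr_range() {
  base=$(ip_to_int "${1%/*}")
  size=$(( 1 << (32 - ${1#*/}) ))
  first=$(( base - base % size ))
  echo "$first $(( first + size - 1 ))"
}

# Succeeds (exit 0) when the two CIDRs share at least one address.
cidrs_overlap() {
  set -- $(cidr_range "$1") $(cidr_range "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# Shoot CIDRs from the example spec, checked against an assumed OpenStack subnet.
openstack_net=10.0.0.0/24
for cidr in 100.128.0.0/11 10.250.0.0/16 100.112.0.0/13; do
  if cidrs_overlap "$cidr" "$openstack_net"; then
    echo "overlap: $cidr vs $openstack_net"
  fi
done
```

Two ranges overlap exactly when each one starts no later than the other ends, which is the comparison cidrs_overlap performs.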

On Cleura, nodes get OpenStack floating IPs for egress. Pod-to-pod traffic stays within the cluster overlay network (Calico by default). Traffic entering from outside the cluster goes through a LoadBalancer service — either directly for raw TCP, or via a gateway controller for HTTP.

Ingress — classic vs Gateway API

The classic Kubernetes Ingress resource is HTTP-only, has no TCP support, and its feature set varies across implementations via non-standard annotations. The community ingress-nginx controller, long the most widely used implementation, has been retired: best-effort maintenance ended in March 2026 and the Kubernetes project recommends migrating to Gateway API.

The Kubernetes Gateway API is the forward path — a set of CRDs (Gateway, HTTPRoute, TCPRoute, TLSRoute) with a standardized spec and first-class support for both HTTP and TCP.

Resource	Protocol	API	Status
`Ingress`	HTTP only	Kubernetes	Stable, legacy
`HTTPRoute`	HTTP/HTTPS	Gateway API	Stable
`TCPRoute`	Raw TCP	Gateway API	Experimental
`TLSRoute`	TLS passthrough	Gateway API	Experimental

Envoy Gateway

Envoy Gateway is an implementation of the Kubernetes Gateway API from the CNCF Envoy project, using Envoy as the data plane. It supports HTTPRoute, TCPRoute, and TLSRoute through a single Gateway resource: one entry point for both HTTP and raw TCP.

Octavia LB   ← one LoadBalancer service per Gateway
 |
Envoy Gateway pod
 |
 +------------------------------+
 |                              |
HTTPRoute → ClusterIP pods   TCPRoute → ClusterIP pods

Envoy Gateway is deployed into the shoot cluster and exposes a LoadBalancer service via Octavia, the same as any other service. The Gateway API resources then declare what routes through it.

TCPRoute — declaring TCP services

TCPRoute attaches to a Gateway listener and routes raw TCP traffic to a backend service. This is how a non-HTTP workload (e.g. a game server, a database proxy, a custom protocol service) gets exposed through the Gateway API rather than a standalone LoadBalancer service.

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TCPRoute
metadata:
  name: my-tcp-service
  namespace: my-app
spec:
  parentRefs:
  - name: my-gateway
    sectionName: tcp-listener
  rules:
  - backendRefs:
    - name: my-service
      port: 1234

The corresponding Gateway listener:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: my-gateway
  namespace: my-app
spec:
  gatewayClassName: envoy-gateway
  listeners:
  - name: tcp-listener
    protocol: TCP
    port: 1234
  - name: http-listener
    protocol: HTTP
    port: 80

One Gateway, both protocols declared explicitly. The TCPRoute API is in the experimental channel and requires opting in when installing Envoy Gateway.

HTTPRoute — HTTP services

HTTPRoute handles HTTP and HTTPS traffic with routing by hostname, path, header, or method — without annotations.

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: my-http-service
  namespace: my-app
spec:
  parentRefs:
  - name: my-gateway
    sectionName: http-listener
  hostnames:
  - my-app.example.com
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: my-service
      port: 8080

LoadBalancer — direct TCP via Octavia

For cases where a TCPRoute is not appropriate (or the Gateway API experimental channel is not enabled), a LoadBalancer service provisions an Octavia LB directly:

apiVersion: v1
kind: Service
metadata:
  name: my-tcp-service
  namespace: my-app
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - port: 1234
    targetPort: 1234
    protocol: TCP

Annotations control Octavia behaviour — timeouts, health check parameters, internal vs external. These are provider-specific and not standardised across OpenStack deployments.

Storage

Cinder block volumes are available via the CSI driver. A PersistentVolumeClaim provisions a Cinder volume automatically using the cluster’s default storage class.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-data
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi

Cinder volumes are ReadWriteOnce — they attach to a single node. For stateful workloads, use StatefulSet rather than Deployment to get stable volume binding across pod restarts.

Provisioning a shoot cluster on Cleura

Cleura wraps Gardener behind their own REST API at rest.cleura.cloud. The garden cluster kubeconfig is not exposed — gardenctl does not work directly. Cluster lifecycle is managed through HTTP calls.

Authentication

Every call requires a token obtained once per session:

curl -s -X POST https://rest.cleura.cloud/auth/v1/tokens \
 -H "Content-Type: application/json" \
 -d '{"auth": {"login": "you@example.com", "password": "yourpass"}}' \
 | jq '{token: .token}'

Pass X-AUTH-LOGIN and X-AUTH-TOKEN headers on all subsequent calls.
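To avoid repeating both headers on every call, they can be wrapped once. This is a sketch under assumptions: cleura_api is a hypothetical helper name, CLEURA_LOGIN matches the variable used by the script further down, and CLEURA_TOKEN is an assumed variable holding the token returned by the auth call.

```shell
# cleura_api METHOD PATH [extra curl args...]
# Assumes CLEURA_LOGIN and CLEURA_TOKEN are already set in the environment.
cleura_api() {
  method=$1
  api_path=$2
  shift 2
  curl -s -X "$method" "https://rest.cleura.cloud${api_path}" \
    -H "X-AUTH-LOGIN: ${CLEURA_LOGIN}" \
    -H "X-AUTH-TOKEN: ${CLEURA_TOKEN}" \
    "$@"
}
```

Any extra arguments are passed straight through to curl, so a POST with a JSON body becomes cleura_api POST /some/path -H "Content-Type: application/json" -d '{...}'.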

Bootstrap (once per project/region)

Before creating any clusters, the project must be bootstrapped — this wires up the OpenStack credentials that Gardener uses to provision nodes:

curl -X POST \
 https://rest.cleura.cloud/gardener/v1/public/secret/kna1/{projectId}/bootstrap \
 -H "X-AUTH-LOGIN: ..." -H "X-AUTH-TOKEN: ..."

Safe to call repeatedly; idempotent.

Create a shoot cluster

curl -X POST \
 https://rest.cleura.cloud/gardener/v1/public/shoot/kna1/{projectId} \
 -H "X-AUTH-LOGIN: ..." -H "X-AUTH-TOKEN: ..." \
 -H "Content-Type: application/json" \
 -d '{
   "shoot": {
     "name": "my-cluster",
     "kubernetes": {"version": "1.31.0"},
     "provider": {
       "infrastructureConfig": {"floatingPoolName": "ext-net"},
       "workers": [{
         "name": "default",
         "machine": {
           "type": "4C-8GB-50GB",
           "image": {"name": "ubuntu", "version": "22.4.20230301"}
         },
         "minimum": 1,
         "maximum": 3,
         "volume": {"size": "50Gi"}
       }]
     }
   }
 }'

Poll until ready

curl https://rest.cleura.cloud/gardener/v1/public/shoot/kna1/{projectId}/my-cluster \
 -H "X-AUTH-LOGIN: ..." -H "X-AUTH-TOKEN: ..." \
 | jq '.lastOperation | {state, description, progress}'

Poll until lastOperation.state == "Succeeded". Takes roughly 10–15 minutes on first provision.
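The polling step can be sketched as a small loop with a timeout. The function names, the PROJECT_ID and CLEURA_TOKEN variables, and the default of 60 attempts at 30-second intervals are all assumptions; treating Failed as terminal follows Gardener's own lastOperation states and is assumed to carry over to Cleura's API.

```shell
# Fetch the current lastOperation.state for a shoot (endpoint as in the command above).
# PROJECT_ID, CLEURA_LOGIN and CLEURA_TOKEN are assumed to be set in the environment.
fetch_state() {
  curl -s "https://rest.cleura.cloud/gardener/v1/public/shoot/kna1/${PROJECT_ID}/$1" \
    -H "X-AUTH-LOGIN: ${CLEURA_LOGIN}" -H "X-AUTH-TOKEN: ${CLEURA_TOKEN}" \
    | jq -r '.lastOperation.state'
}

# wait_for_shoot NAME [MAX_ATTEMPTS] [INTERVAL_SECONDS]
# Exit codes: 0 = Succeeded, 1 = Failed, 2 = gave up after MAX_ATTEMPTS.
wait_for_shoot() {
  name=$1
  max_attempts=${2:-60}
  interval=${3:-30}
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    state=$(fetch_state "$name")
    echo "attempt $attempt: $state" >&2
    case $state in
      Succeeded) return 0 ;;
      Failed) return 1 ;;
    esac
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  return 2
}
```

Distinct exit codes let a pipeline tell a failed provision from one that is merely slow; wait_for_shoot my-cluster && echo ready is the simplest use.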

Fetch kubeconfig

The Cleura docs reference two kubeconfig paths: GET /kubeconfig and POST /Kubeconfig (note the capital K). Neither worked reliably in practice. The endpoint that actually returns a kubeconfig is:

curl -s -X POST \
 https://rest.cleura.cloud/gardener/v1/public/shoot/kna1/{projectId}/my-cluster/adminkubeconfig \
 -H "X-AUTH-LOGIN: ..." -H "X-AUTH-TOKEN: ..." \
 -H "Content-Type: application/json" \
 -d '{"config": {"expirationSeconds": 3600}}' \
 | jq -r > my-cluster-kubeconfig.yaml

The expirationSeconds field controls credential lifetime. A bug report has been filed with Cleura about the endpoint inconsistency — the adminkubeconfig path is not documented.

Path	Method	Documented	Works
`/kubeconfig`	GET	yes	unclear
`/Kubeconfig`	POST	yes	unclear
`/adminkubeconfig`	POST	no	yes

→ Cleura docs issue #534 — kubeconfig endpoint inconsistencies in Gardener REST API

Script

A bash script wrapping the full workflow (list, create, wait, kubeconfig, delete) is available: cleura-shoot.sh

export CLEURA_LOGIN="you@example.com"
export CLEURA_PASSWORD="yourpass"

./cleura-shoot.sh list
./cleura-shoot.sh create my-cluster
./cleura-shoot.sh wait my-cluster
./cleura-shoot.sh kubeconfig my-cluster
./cleura-shoot.sh delete my-cluster

IaC options

No native Terraform provider exists for Cleura’s Gardener REST API. The Gardener Terraform provider (registry.terraform.io/providers/gardener/gardener) requires the garden cluster kubeconfig, which Cleura does not expose. Options:

Approach	Notes
Bash + curl	Minimal deps — just `curl` and `jq`
Crossplane `provider-http`	Declarative, Kubernetes-native, reconciliation loop
Custom Terraform provider	Full `plan`/`apply` semantics — requires Go provider development
Pulumi custom dynamic provider	Python/TypeScript, similar effort to custom Terraform provider

Kubernetes Policy

Mon, 08 Jun 2026 00:00:00 +0000

There are three common ways to enforce policy in Kubernetes. All three act at admission time, when the API server processes a request, but they differ in language, capability, and operational complexity.

	Kyverno	Gatekeeper (OPA)	ValidatingAdmissionPolicy
Language	YAML/JMESPath	Rego	CEL
Native to K8s	No (CRD)	No (CRD)	Yes (built-in)
Validate	Yes	Yes	Yes
Mutate	Yes	Limited	No (separate MutatingAdmissionPolicy, alpha in 1.32)
Generate	Yes	No	No
Image verify	Yes	No	No
GA since	—	—	K8s 1.30
Good for	Full-featured, K8s-native feel	Rego-first teams, policy-as-data	Simple rules, no extra install

ValidatingAdmissionPolicy (VAP)

Added in Kubernetes 1.26, GA in 1.30. Policies are built into the API server — no admission controller to deploy or maintain. Policy is written in CEL (Common Expression Language), a simple expression language also used in Kubernetes’ x-kubernetes-validations CRD validation.

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "require-run-as-non-root"
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["deployments"]
  validations:
  - expression: >
      object.spec.template.spec.securityContext.runAsNonRoot == true
    message: "Pods must run as non-root"
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "require-run-as-non-root-binding"
spec:
  policyName: "require-run-as-non-root"
  validationActions: [Deny]
  matchResources:
    namespaceSelector:
      matchLabels:
        enforce-policy: "true"

The ValidatingAdmissionPolicyBinding scopes where a policy applies — cluster-wide, specific namespaces, or by label selector.

CEL basics

CEL expressions have access to object (the incoming resource), oldObject (for updates), request (metadata, user, etc.), and params (a referenced ConfigMap or CRD for parameterisation).

# Simple field check
object.spec.replicas <= 10

# Nested optional field (use .? for optional traversal)
object.spec.template.spec.?securityContext.?runAsNonRoot == optional.of(true)

# List comprehension — all containers must have limits
object.spec.template.spec.containers.all(c,
 has(c.resources) && has(c.resources.limits)
)

# String operations
object.metadata.name.startsWith("prod-")

MutatingAdmissionPolicy

Added in Kubernetes 1.32 (alpha). Brings CEL-based mutation — set defaults, inject labels, patch fields — without Kyverno or a webhook. Still early; not production-ready yet.

Gatekeeper

OPA running as a Kubernetes admission controller. Policies are written in Rego and stored as ConstraintTemplate CRDs. The separation between template (the Rego logic) and constraint (the enforcement configuration + parameters) is the key design pattern.

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          type: object  # v1 templates require a structural schema
          properties:
            labels:
              type: array
              items: {type: string}
  targets:
  - target: admission.k8s.gatekeeper.sh
    rego: |
      package k8srequiredlabels

      violation[{"msg": msg}] {
        provided := {label | input.review.object.metadata.labels[label]}
        required := {label | label := input.parameters.labels[_]}
        missing := required - provided
        count(missing) > 0
        msg := sprintf("missing required labels: %v", [missing])
      }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-team-label
spec:
  match:
    kinds:
    - apiGroups: [""]
      kinds: ["Namespace"]
  parameters:
    labels: ["team", "environment"]

Gatekeeper also supports audit mode — it continuously evaluates existing resources against policies and surfaces violations without blocking. Useful for measuring compliance against policies you’re not yet ready to enforce.

Gatekeeper vs Kyverno

Kyverno is better if your team does not know Rego and wants policies that look like Kubernetes manifests. Gatekeeper is better if you are already invested in OPA/Rego and want a single policy language across K8s and non-K8s surfaces (via Conftest).

The K8s-native VAP is the right default for simple validation rules on new clusters: no extra install, but it does not cover mutation (that is the separate MutatingAdmissionPolicy, alpha in 1.32), generation, or image verification.

Resources

Gatekeeper documentation
Gatekeeper library — ready-made ConstraintTemplates
ValidatingAdmissionPolicy docs
CEL in Kubernetes
Kyverno — separate note

Crossplane

Wed, 03 Jun 2026 00:00:00 +0000

Crossplane is Kubernetes-native infrastructure management. Where Terraform runs as a CLI tool that applies changes and exits, Crossplane runs as a controller inside a Kubernetes cluster and continuously reconciles infrastructure — the same control loop model as Kubernetes itself.

Cloud resources become Kubernetes objects. You kubectl apply an RDS instance the same way you apply a Deployment. Crossplane’s controllers watch those objects and make the API calls to converge actual infrastructure to the desired state.

Core concepts

Providers extend Crossplane with CRDs for a specific cloud. provider-aws adds Kubernetes resource types for a large share of AWS services, including S3 buckets, RDS instances, and VPCs. Apply a provider, get hundreds of new resource types.

Managed Resources (MRs) are the individual cloud resources:

apiVersion: s3.aws.upbound.io/v1beta1
kind: Bucket
metadata:
  name: my-assets
spec:
  forProvider:
    region: eu-central-1
    tags:
      Environment: prod

Crossplane creates this bucket and keeps it in sync. If someone deletes it outside of Crossplane, the controller recreates it.

Composite Resources (XRs) are the powerful part. You define your own CRDs — a Platform or DatabaseCluster — that compose multiple managed resources. A developer applies a DatabaseCluster and gets an RDS instance, a subnet group, a parameter group, and security groups, all wired together, without needing to know any of the details.

XRDs (Composite Resource Definitions) define the schema for composite resources — what fields the developer sees, what defaults apply.

Compositions define how a composite resource maps to managed resources — the implementation behind the abstraction.

The platform engineering model

Crossplane’s real value is as a platform layer. A platform team owns the Compositions — they define what a “compliant database” or “standard app environment” looks like. Dev teams consume the simplified abstractions without touching the underlying cloud resources.

Self-service infrastructure with guardrails baked in.

vs Terraform

Crossplane and Terraform are not direct alternatives — they solve the problem differently.

Terraform is a CLI tool: run plan, review, apply, exit. State is a file. Good for human-in-the-loop workflows and one-off provisioning.

Crossplane is a control plane: always running, always reconciling. Better for continuous enforcement and self-service platforms. More complex to set up and operate.

In practice: Terraform for provisioning foundational infrastructure (clusters, networks, accounts). Crossplane for what runs on top of the cluster — letting application teams provision their own databases, queues, and object storage through Kubernetes-native APIs.

Upbound

The commercial platform behind Crossplane. Managed control plane hosting, a marketplace of providers and compositions, and tooling for building and publishing your own platform APIs. Worth evaluating if you are building a serious internal platform.

Learning curve

Steep. You need to understand Kubernetes controllers, CRDs, and the Crossplane composition model before you can be productive. The payoff is a genuinely powerful platform abstraction — but it is not a beginner tool.

A good framing: Crossplane is a digital twin of your infrastructure. The cluster holds the desired state of everything — cloud resources, application configuration, other tools — and continuously reconciles reality to match it.

Genuinely cool and worth learning if you have a cluster. The provider model has expanded well beyond cloud infrastructure — from v2 onwards Crossplane can manage applications, not just infra. There are also providers for Ansible and Terraform/OpenTofu, which means Crossplane can be the orchestration layer that drives other IaC tools. One control plane to rule them all.

The prerequisite is the cluster itself. If you already run Kubernetes, Crossplane is a natural extension of the same model you already operate. If you do not, it is not the tool to start with.

ASGARD — the blade cluster

Fri, 15 May 2026 00:00:00 +0000

ASGARD (SYS-007) is the HP BladeSystem C7000 with 16× BL460c Gen8 blades. The reason to use it is profile switching: boot a blade as a Slurm compute node, run the experiment, reimage it as a Talos worker, run the next one. The same iPXE boot menu already set up for ODEN works here — the C7000 Onboard Administrator lets you configure boot order per blade slot, so switching roles is a BIOS setting and a PXE entry, not a reinstall.

Power reality

Before committing to blades as the permanent always-on platform, it’s worth being honest about the enclosure overhead. The C7000 has fixed costs regardless of how many blades are populated: 10 fans, dual OA modules, 2 interconnect switches, backplane management. It doesn’t scale down gracefully.

Setup	Approx power
C7000 enclosure alone (no blades)	200–400W
C7000 + 1 blade	350–550W
C7000 + 3 blades	500–800W
ODEN alone (1U M3, Talos)	100–150W
HEIMDAL alone (Sun X4150, router)	150–200W
ODEN + HEIMDAL	250–350W

Two pizza boxes beat three blades in the enclosure on power. The overhead only amortises at 8+ populated slots. For a permanent minimal setup, the 1U rack servers win. For experiments where you want to run 8–16 nodes at once, ASGARD earns its place.
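The comparison is easier to see per node. A quick worked calculation from the table's own ranges (per_node_watts is a hypothetical helper; integer division rounds down):

```shell
# Integer watts per node: total draw divided by node count.
per_node_watts() {
  echo $(( $1 / $2 ))
}

echo "C7000 + 3 blades: $(per_node_watts 500 3)-$(per_node_watts 800 3) W per node"
echo "ODEN + HEIMDAL:   $(per_node_watts 250 2)-$(per_node_watts 350 2) W per node"
```

That works out to roughly 166-266 W per blade against 125-175 W per 1U server, so at three blades the enclosure overhead still dominates.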

What each role actually needs

Role	RAM	Disk	Network	Limiting factor
Talos / K8s worker	32–64GB	1× OSD disk	1GbE fine	RAM — current blades too thin
OpenStack compute	32–64GB	local ephemeral	1GbE fine	RAM
OpenStack control	32GB+	small	1GbE fine	RAM
Slurm compute	as much as possible	fast scratch	1GbE mediocre	network
Ceph OSD	16–32GB	more / bigger disks	1GbE	disk count

The network note matters for Slurm: blade LOM connects to the enclosure switch backplane at 1GbE, not 10GbE. The switch has 10GbE uplinks going out, but blade-to-blade traffic inside the enclosure goes through the switch at 1GbE. For Talos and OpenStack this is fine. For MPI jobs exchanging large datasets between Slurm nodes it’s a real bottleneck — HPC wants InfiniBand, which the empty interconnect bays 5–8 could take (plus matching mezzanine cards in each blade), but that’s a separate cost. For learning Slurm, 1GbE is workable.

Current blade state

Most blades are underpowered for any of the roles above. CPUs are also unknown across all 16 slots — the OA web GUI reports CPU model and core count per blade and should be checked first. The E5-2600 v1 range runs from E5-2603 (4c, 80W) to E5-2690 (8c/16t, 135W), which matters significantly for role assignment.

Slot	RAM	Disk
BLD-001	4GB	2× 146GB SAS
BLD-002	14GB (mixed, odd count)	—
BLD-003	32GB	2× 300GB SAS
BLD-004	8GB	—
BLD-005	8GB	1× 146GB + 1× 300GB SAS
BLD-006	8GB	2× 300GB SAS
BLD-007	8GB	2× 900GB SAS
BLD-008	16GB	2× 300GB SAS
BLD-009	8GB	—
BLD-010	8GB	2× 300GB SAS
BLD-011	8GB	2× 300GB SAS
BLD-012	8GB	2× 300GB SAS
BLD-013	32GB	—
BLD-014	8GB	—
BLD-015	8GB	2× 300GB SAS
BLD-016	8GB	—

BLD-003 and BLD-013 are already at 32GB and are natural candidates for control-plane or master roles once CPUs are confirmed.

Suggested configuration from existing stock

Available spare hardware:

14× RAM-007 (8GB DDR3 1600MHz ECC Reg) — unassigned
2× HDD-004 (120GB SATA SSD) — spare
6× HDD-002 (146GB 10K SAS) — spare
Embedded P220i on each blade (can be set to JBOD/passthrough for Ceph)

“Fat” nodes × 2 (Talos control plane, OpenStack control, Slurm master): Add 4× RAM-007 to each blade. The candidates, BLD-006 and BLD-010, both start at 8GB, so that gives 40GB each; both also have 2× 300GB SAS for local storage. Costs 8 of 14 spare sticks. Install a spare 120GB SSD as boot disk in each.

“Medium” nodes × 3 — Talos workers, OpenStack compute, Slurm compute: Add 2× RAM-007 to each → 24GB from the 8GB base. Candidates: BLD-008 (already 16GB, gets to 32GB), BLD-011, BLD-012. All three have 300GB SAS for scratch or Ceph OSDs. Costs the remaining 6 spare sticks.

Rest — thin compute, storage expansion, or powered off: Leave at current RAM. BLD-007’s 900GB SAS pair is better used elsewhere (see below). BLD-003 and BLD-013 at 32GB can step up to fat-node role once CPUs are confirmed.

That leaves 5 blades properly kitted and 11 available for experiments or idle.
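As a sanity check on the stick budget, the allocation above can be recomputed from the numbers in the text and the slot table. The variable names are illustrative only.

```shell
STICK_GB=8        # RAM-007 is an 8GB module
SPARE_STICKS=14

fat_nodes=2
fat_sticks=4      # sticks added per fat node
medium_nodes=3
medium_sticks=2   # sticks added per medium node

used=$(( fat_nodes * fat_sticks + medium_nodes * medium_sticks ))
echo "sticks used: $used of $SPARE_STICKS"
echo "fat node (8GB base):    $(( 8 + fat_sticks * STICK_GB ))GB"
echo "medium node (8GB base): $(( 8 + medium_sticks * STICK_GB ))GB"
echo "BLD-008 (16GB base):    $(( 16 + medium_sticks * STICK_GB ))GB"
```

The plan uses all 14 spare sticks, which leaves nothing in reserve if a module turns out to be faulty.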

BL460c Gen8 DIMM rule: populate per-CPU symmetrically — pairs or quads per memory channel — for best throughput. Don’t mix odd counts.

Storage — what moves where

Pull the 900GB SAS drives from BLD-007 now. HDD-013 (HGST 900GB) and HDD-014 (Toshiba 900GB) are the two largest drives in the blade pool and they’re sitting in a blade that may end up as a thin compute worker. Move them into ODEN or LOKE as permanent Ceph OSDs. This immediately gives the always-on cluster substantially more storage than the current 120GB SSDs.

MIMIR (SYS-004, 15× 1TB SAS) is the Ceph expansion story for later. To connect it: install CTRL-006 (ServeRAID-8e, have 2 unplaced) into a server with a free PCIe slot, then cable it with an SFF-8470 → SFF-8088 cable (not currently owned, inexpensive). TOR is the natural host: it already has CTRL-003 in HBA mode and free PCIe slots. Not urgent, but the hardware is almost all there.

What	Goes to	When
900GB SAS ×2 from BLD-007	ODEN or LOKE, permanent Ceph OSDs	Now
120GB SSD ×2 spare	BLD fat node boot disks	Before Talos on blades
300GB SAS in blades	Local scratch or blade Ceph OSDs	During ASGARD experiments
MIMIR 15× 1TB SAS	TOR via CTRL-006, Ceph expansion	Later (needs cable)

Three things to do before blades can boot anything

Identify CPUs. Connect to the OA management port, open the web GUI, check CPU model per slot. Ten minutes. Everything else depends on this.
Network uplink. The blade switches in bays 1 and 2 have 4× RJ45 1GbE uplinks (ports 22–25). Run a patch cable from one to any available switch — MODI, MAGNI, whatever’s reachable from the cable box. That’s enough for blades to reach DHCP and iPXE.
RAM redistribution. Pull the 14 spare RAM-007 sticks and install into the chosen fat and medium nodes per the profile above.

The permanent vs experiment split

Always on (~300–400W total):
 HEIMDAL → OPNsense router, Sun X4150, ~150–200W
 ODEN → Talos, Minecraft + small services, ~100–150W
 LOKE → 2nd Talos node (needs RAM-007 × 8 + SSD boot), ~100–150W

Experiments (fire up, learn, power off):
 ASGARD → 3–16 blades for Slurm / OpenStack / larger Talos cluster
 TYR+TOR+FREJA → Proxmox cluster (M1 DDR2, temporary)

Once the Proxmox experiment wraps, TYR, TOR, and FREJA can be powered down permanently. If ASGARD blades eventually become the long-term compute platform, OPNsense can move to a VM on a blade at that point — but not before the blades are stable and trusted. Don’t consolidate the router onto experimental infrastructure.

BGP

Thu, 14 May 2026 00:00:00 +0000

BGP (Border Gateway Protocol) is the routing protocol that holds the internet together. Every major network operator uses it to advertise which IP prefixes they own and to exchange that information with peers. In a homelab context the scale is different but the mechanics are the same.

BGP is a path-vector protocol: each router advertises routes along with the path (sequence of ASNs) taken to reach them. Routers choose the best path based on a set of attributes and policy rules, then advertise that path to their peers.
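The path-vector idea can be sketched in a few lines. This is a deliberately simplified model, not real BGP: it ranks routes by AS path length only, while actual implementations compare several attributes (local preference, origin, MED and others) before or after that. The function names are made up for illustration.

```python
def accept_route(local_asn, as_path):
    """Loop prevention: reject a route whose AS path already contains our ASN."""
    return local_asn not in as_path


def best_path(local_asn, candidates):
    """Pick the loop-free candidate with the shortest AS path (first one wins ties)."""
    usable = [path for path in candidates if accept_route(local_asn, path)]
    return min(usable, key=len) if usable else None


def advertise(local_asn, as_path):
    """Prepend our own ASN before passing the route on to an eBGP peer."""
    return [local_asn, *as_path]


# Two paths to the same prefix as seen by AS 64513 (ASNs from the private range).
candidates = [[64512, 64514], [64514]]
chosen = best_path(64513, candidates)
print(chosen, advertise(64513, chosen))
```

The AS path doubles as the loop detector: a router that sees its own ASN in a received path knows the route has already passed through it and drops it.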

eBGP vs iBGP

eBGP (external BGP) — sessions between routers in different autonomous systems. Each party has a different ASN. This is what you configure between VyOS and OPNsense, and between VyOS and MetalLB.

iBGP (internal BGP) — sessions between routers in the same autonomous system. Used inside large networks to distribute external routes internally. Not relevant for a basic homelab setup.

ASNs for private use

Autonomous System Numbers in the range 64512–65534 are reserved for private use (RFC 6996) — the same concept as RFC 1918 private IP addresses. Assign one to each participant in your BGP topology:

Participant	Example ASN
OPNsense	64512
VyOS	64513
MetalLB (Talos cluster)	64514
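A small sanity check for an ASN plan, as a sketch. RFC 6996 also reserves a 32-bit private range (4200000000–4294967294) alongside the 16-bit one mentioned above; the helper name and the `topology` dict are illustrative only.

```python
# Private-use ASN ranges reserved by RFC 6996.
PRIVATE_ASN_RANGES = (
    (64512, 65534),            # 16-bit private range
    (4200000000, 4294967294),  # 32-bit private range
)


def is_private_asn(asn: int) -> bool:
    """Return True if the ASN falls inside an RFC 6996 private-use range."""
    return any(low <= asn <= high for low, high in PRIVATE_ASN_RANGES)


# Example assignments taken from the table above.
topology = {"OPNsense": 64512, "VyOS": 64513, "MetalLB": 64514}
for name, asn in topology.items():
    print(f"{name}: AS{asn} private={is_private_asn(asn)}")
```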

Why BGP for Kubernetes LoadBalancer IPs

Kubernetes LoadBalancer services need something external to the cluster to route traffic to them. In a cloud environment the cloud provider handles this automatically. On bare metal you need to do it yourself.

Two common approaches with MetalLB:

L2 mode — MetalLB uses ARP (IPv4) or NDP (IPv6) to announce service IPs directly on the LAN. Simple to set up. Limitations: only one node handles traffic for each IP at a time (no real load balancing at the network layer), and the service IP must be in the same subnet as the nodes.

BGP mode — MetalLB establishes a BGP session with an upstream router (VyOS, for example) and announces service IPs as /32 prefixes. The router learns the route and can use ECMP (equal-cost multi-path) to spread traffic across all nodes that advertise it. This is the more robust option: actual load balancing at the network layer, no subnet constraint, and a clean separation between cluster and network.

The tradeoff is that BGP mode requires a BGP-capable router in the path, which is why VyOS exists in this topology.
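To make the ECMP part concrete, here is a rough sketch of how a router might pick a next hop per flow. Real routers use their own hash functions and field selections; SHA-256 over the 5-tuple is only a stand-in, and every address except 192.168.1.171 is made up.

```python
import hashlib


def pick_next_hop(flow, next_hops):
    """Hash the flow 5-tuple and map it onto one of the equal-cost next hops."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


# Nodes advertising the same /32 service prefix (hypothetical addresses).
nodes = ["192.168.1.171", "192.168.1.172", "192.168.1.173"]

# (source IP, source port, destination IP, destination port, protocol)
flow = ("203.0.113.10", 51000, "192.168.10.5", 443, "tcp")
print(pick_next_hop(flow, nodes))
```

Hashing per flow rather than per packet keeps every packet of one connection on the same node, which is what makes ECMP safe for TCP.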

Testing with a real BGP network

DN42 is a community-run experimental network that simulates the real internet using actual BGP, DNS, and whois infrastructure. Participants connect via WireGuard or other tunnels and peer with each other using real BGP sessions and real (private-range) ASNs. A good way to practice BGP outside the homelab without needing a production ASN.

VyOS — the BGP peer router
VyOS + BGP experiment — the actual setup in this homelab

Ceph

Thu, 14 May 2026 00:00:00 +0000

Ceph is an open-source distributed storage platform providing object, block, and file storage in a single unified system. It runs across multiple nodes and has no single point of failure.

The core idea: data is not stored on specific disks on specific nodes. Instead, the CRUSH algorithm distributes data across all available OSDs (Object Storage Daemons) based on a placement map. Add nodes and the cluster rebalances automatically. Lose a node and Ceph re-replicates from surviving copies without operator intervention.
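The placement idea can be illustrated with a much simpler technique. The sketch below uses rendezvous (highest-random-weight) hashing, which is not CRUSH: CRUSH additionally understands hierarchy, weights, and failure domains. What it shares with CRUSH is the key property that placement is computed from the object name and the list of OSDs, with no lookup table. OSD and object names are made up.

```python
import hashlib


def _score(obj, osd):
    """Deterministic pseudo-random score for an (object, OSD) pair."""
    digest = hashlib.sha256(f"{obj}:{osd}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(obj, osds, replicas=3):
    """Return the `replicas` OSDs with the highest score for this object."""
    return sorted(osds, key=lambda osd: _score(obj, osd), reverse=True)[:replicas]


osds = [f"osd.{i}" for i in range(6)]
before = {f"obj-{i}": place(f"obj-{i}", osds) for i in range(100)}

# Simulate losing one OSD and recomputing placement on the survivors.
survivors = [osd for osd in osds if osd != "osd.3"]
after = {name: place(name, survivors) for name in before}

moved = [name for name in before if before[name] != after[name]]
print(f"{len(moved)} of {len(before)} objects need a new replica")
```

Only objects that had a copy on the lost OSD get a new placement; everything else stays put, which is the behaviour you want from a rebalance.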

Storage types

Type	Interface	Typical use
Block (RBD)	Kernel block device / iSCSI	Kubernetes PVCs, VM disks
Object (RGW)	S3-compatible API	Backups, artifacts, media
File (CephFS)	POSIX filesystem / NFS	Shared filesystems, home dirs

For Kubernetes workloads, RBD block storage via a StorageClass is the common path.

Components

MON (Monitor) — maintains the cluster map. Monitors form a quorum by majority vote, so run an odd number (typically 3 or 5). They are not in the data path.

OSD (Object Storage Daemon) — one per disk; handles actual data reads/writes and replication.

MGR (Manager) — collects metrics, hosts the dashboard, and runs modules (balancer, prometheus, etc.).

MDS (Metadata Server) — only required for CephFS; manages the filesystem namespace.
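The odd-number advice for monitors follows from majority arithmetic. A quick sketch (the function names are mine, not Ceph's):

```python
def quorum_size(monitors: int) -> int:
    """Smallest number of monitors that forms a strict majority."""
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many monitors can fail while a majority is still reachable."""
    return monitors - quorum_size(monitors)


for count in range(1, 6):
    print(f"{count} MONs: quorum={quorum_size(count)}, "
          f"can lose {tolerated_failures(count)}")
```

Going from 3 to 4 monitors raises the quorum size without raising the number of failures you can survive, so the fourth monitor adds cost but no resilience.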

Single-node constraint

A single-node Ceph cluster can be made to run (allowMultiplePerNode: true in Rook, replication size: 1), but it provides no actual redundancy. There is nothing to replicate to. This is fine for testing concepts; it is not a valid storage setup for anything you care about.

Ceph documentation
Rook — Kubernetes operator that manages Ceph clusters inside K8s
Proxmox — Ceph is a native storage backend in Proxmox clusters
Rook + Ceph in the homelab

Rook

Thu, 14 May 2026 00:00:00 +0000

Rook is a Kubernetes operator that deploys and manages storage systems — primarily Ceph — as native Kubernetes resources. The distinction: Ceph is the storage system; Rook is the Kubernetes wiring around it.

Without Rook you would run Ceph manually (or via cephadm) and then configure the Kubernetes CSI driver separately. Rook collapses that into CRDs and handles the full lifecycle: deployment, configuration, expansion, upgrades, and failure recovery.

How it works

Rook introduces several CRDs:

CephCluster — declares the cluster: which nodes, which disks to use as OSDs, replication settings.

CephBlockPool — defines a Ceph pool (replication factor, failure domain). Maps to an RBD pool.

StorageClass — a standard Kubernetes resource rather than a Rook CRD. It references a CephBlockPool and enables dynamic PVC provisioning: Kubernetes workloads request storage and Rook/Ceph fulfils it.

CephFilesystem — deploys CephFS + MDS for POSIX shared filesystem access.

CephObjectStore — deploys the Ceph RGW S3-compatible object storage gateway.

Typical install sequence

kubectl apply -f https://raw.githubusercontent.com/rook/rook/refs/tags/v1.17.9/deploy/examples/crds.yaml
kubectl apply -f https://raw.githubusercontent.com/rook/rook/refs/tags/v1.17.9/deploy/examples/common.yaml
kubectl apply -f https://raw.githubusercontent.com/rook/rook/refs/tags/v1.17.9/deploy/examples/operator.yaml

Then apply a CephCluster manifest declaring your storage topology, followed by CephBlockPool and StorageClass for PVC support.

Single-node considerations

A single-node setup requires allowMultiplePerNode: true in the CephCluster spec (MONs, MGR, and OSDs all land on the same node). Replication size must be set to 1 — there is nowhere else to replicate. This works for experimentation; it is not a production configuration. See Ceph for details on the replication model.

Rook documentation
Ceph — the underlying storage system
Rook + Ceph in the homelab

Rook + Ceph on ODEN

Thu, 14 May 2026 00:00:00 +0000

Attempting to add persistent block storage to the ODEN single-node Talos cluster using Rook and Ceph. This did not fully succeed — the setup reached the point of a bound PVC and a working write test, but the cluster was not left in a clean stable state. Notes are here for completeness.

This builds on the Talos cluster setup on ODEN.

Hardware

ODEN has five storage devices:

Device	Type	Size	Role
`/dev/sdb`	Kingston SA400S3 SSD (SATA)	120 GB	Boot disk — leave alone
`/dev/nvme0n1`	Samsung 970 EVO NVMe	500 GB	OSD
`/dev/sdc`	Kingston SA400S3 SSD (SATA)	120 GB	OSD
`/dev/sdd`	Kingston SA400S3 SSD (SATA)	120 GB	OSD
`/dev/sde`	Kingston SA400S3 SSD (SATA)	120 GB	OSD

Do not add /dev/sdb to Ceph. It is the boot disk.

Step 1 — Install the Rook operator

kubectl apply -f https://raw.githubusercontent.com/rook/rook/refs/tags/v1.17.9/deploy/examples/crds.yaml
kubectl apply -f https://raw.githubusercontent.com/rook/rook/refs/tags/v1.17.9/deploy/examples/common.yaml
kubectl apply -f https://raw.githubusercontent.com/rook/rook/refs/tags/v1.17.9/deploy/examples/operator.yaml

Wait for the operator pod to be running in the rook-ceph namespace before continuing.

Step 2 — CephCluster (single-node)

Single-node requires allowMultiplePerNode: true and explicit disk selection. The cluster-test example from the Rook repo is a reasonable starting point:

storage:
  useAllNodes: false
  nodes:
    - name: "192.168.1.171"
      devices:
        - name: "nvme0n1"
        - name: "sdc"
        - name: "sdd"
        - name: "sde"

Reference: cluster-test.yaml

Step 3 — CephBlockPool and StorageClass

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  replicated:
    size: 1
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
reclaimPolicy: Delete

Step 4 — PVC test

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: rook-ceph-block
  resources:
    requests:
      storage: 10Gi

PVC reached Bound. A BusyBox pod mounting it could write to /mnt. The Ceph dashboard (kubectl -n rook-ceph port-forward svc/rook-ceph-mgr-dashboard 7000:7000) showed OSDs active and the pool present.

What did not work

The cluster ran but was not left stable. Single-node Ceph produces health warnings by design (no redundancy, no failure domain separation). More importantly, the setup was not revisited after initial testing and there are unresolved questions about:

CSI driver behaviour on Talos (Talos has specific requirements for CSI socket paths)
Whether the dashboard warnings were cosmetic or indicated real issues
Long-term stability under actual workloads

This is left as a draft until there is time to run it properly — ideally on more than one node.

Talos Linux + Omni

Thu, 14 May 2026 00:00:00 +0000

Talos Linux is an immutable, minimal operating system designed specifically for running Kubernetes. There is no shell, no SSH, no package manager. The entire OS is read-only and managed via a gRPC API (talosctl). Node configuration is declarative YAML applied over the API; changes that require a reboot take effect on the next boot.

The tradeoff is rigidity for operational simplicity. You cannot log into a Talos node and fix something by hand. In return, nodes are deterministic, reproducible, and there is no configuration drift.

Comparison to other installs

Method	OS	Config	Mutable
kubeadm	Ubuntu / RHEL / etc	Manual + scripts	Yes
k3s	Any Linux	Minimal	Yes
Talos	Talos Linux	Declarative API	No

k3s and kubeadm give you more flexibility and a familiar Linux environment. Talos is the right choice when you want the cluster nodes to behave like appliances — provisioned, never touched.

Omni

Omni is a cluster management platform by Sidero Labs built on top of Talos. It handles:

Node registration (nodes boot and phone home to the Omni API)
Cluster creation and machine assignment
Kubernetes upgrades (one action in the UI)
talosctl and kubeconfig access via the Omni CLI

Nodes register via a join token embedded in the kernel command line at PXE boot time. The cluster runs on your hardware; Omni only manages the control plane.

Hobby tier: 10 nodes, non-commercial use, free. Sidero Labs also offers a self-hosted version.

Image Factory

factory.talos.dev generates custom Talos images with hardware extensions included. Notable extensions:

siderolabs/bnx2 — Broadcom NetXtreme II (BCM5708/BCM5709) NIC firmware, required on some enterprise hardware (IBM x3550 M3, HP Gen 6/7 blades)
siderolabs/intel-ucode — Intel microcode updates
siderolabs/nvidia-* — NVIDIA GPU support

The factory produces both ISO and PXE artifacts (kernel + initramfs). See the OPNsense + iPXE reference for how to serve these over TFTP.

Supporting Sidero Labs

Talos and Omni are built by Sidero Labs — good people doing good work. I sponsor them via GitHub Sponsors at the fanboi tier.

Talos Linux in the homelab via Omni

Thu, 14 May 2026 00:00:00 +0000

Getting Talos Linux running in the homelab via PXE boot and Omni, starting with ODEN (SYS-005), an IBM System x3550 M3. The full OPNsense + iPXE configuration lives in the reference note; this covers what actually happened, in order.

Setup

Hardware: ODEN (SYS-005) — IBM x3550 M3, Broadcom BNX2 NICs (BCM5709)
Network: OPNsense router on LAN; ODEN connected via one NIC (start with one, it removes variables)
Target: Single-node Talos cluster registered in Omni

Step 1 — OPNsense DHCP and TFTP

Enable network booting on the LAN DHCP server and download the iPXE binaries to the TFTP root. Full field values in the iPXE reference note.

One thing to check first: if you previously set DHCP options 66 and 67 as raw additional options, remove them. OPNsense’s built-in network boot fields do the same job and having both causes conflicts.

Step 2 — iPXE boot script

Write default.ipxe to /usr/local/tftp/. Include a boot menu with at minimum a Talos option and a shell fallback — the shell is genuinely useful when something fails and you need to debug from the boot prompt. Full script in the reference note.

The Talos entry in the menu needs the Omni join token from your Omni console. Generate a join link in Omni; it provides the API endpoint, token, and SideroLink addresses.

Step 3 — Talos kernel and initramfs

The standard Talos release binaries do not include BNX2 firmware. Since around Talos 1.6 the firmware ships as a system extension rather than in the mainline image. Without it, the node boots, fails to initialise the NIC, and produces can't load firmware bnx2 errors. Everything else looks fine until you notice the node never gets an IP and never appears in Omni.

Fix: generate a custom image at factory.talos.dev with the siderolabs/bnx2 extension included, then download the PXE kernel and initramfs from the factory URL. Commands in the reference note.

Step 4 — First boot

Go into BIOS and set the boot device to PXE. On the M3, UEFI boot with ipxe.efi fails silently — the image is too large for the NIC’s PXE memory buffer. Switch to legacy/BIOS mode and use undionly.kpxe instead.

The machine takes a while to POST and boot. This is normal for old enterprise hardware. It is also why demos typically use virtual machines.

Step 5 — Static IP

After the BNX2 fix the node boots Talos successfully but still does not appear in Omni. The DHCP assignment for the node is not being picked up during early boot. Workaround: add a static IP via kernel params in the iPXE script:

ip=192.168.1.171::192.168.1.1:255.255.255.0::eth0:off

Add this to the kernel line in the Talos iPXE entry. The format is ip=<client-ip>::<gateway>:<netmask>::<iface>:off.
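The two empty fields in that string are the server IP and the hostname from the Linux kernel's `ip=` syntax. A small helper makes the positions explicit; the function name is made up, and it only assembles the string rather than validating addresses.

```python
def kernel_ip_param(client_ip, gateway, netmask, iface,
                    server_ip="", hostname="", autoconf="off"):
    """Build an ip= kernel argument.

    Field order: client-ip:server-ip:gateway:netmask:hostname:iface:autoconf
    Unused fields stay empty, which is where the double colons come from.
    """
    fields = [client_ip, server_ip, gateway, netmask, hostname, iface, autoconf]
    return "ip=" + ":".join(fields)


print(kernel_ip_param("192.168.1.171", "192.168.1.1", "255.255.255.0", "eth0"))
```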

Step 6 — Omni registration

With a working NIC and an IP, the node contacts the Omni API using the join token. It appears in the Omni console as an unallocated machine. Create a cluster, assign the machine, and let Omni configure it. The initial cluster bootstrap takes a few minutes.

Step 7 — Fix the BIOS boot order

After the cluster is up, change the BIOS boot order so the disk is first. If PXE remains the primary boot device, every reboot drops the machine back to the iPXE menu instead of booting the installed Talos. Discovered on first reboot. Worth noting here so you don’t make the same trip to the garage.

Upgrade

Omni makes single-node upgrades straightforward: open the cluster in the Omni console, select a new Talos version, apply. The node reboots once. Single-node means the cluster has downtime during the reboot; that is expected. Nothing else to do.

Result

Single-node Kubernetes cluster running on ODEN, managed via Omni. kubectl and talosctl access via the Omni CLI. Next experiment: Rook + Ceph for persistent storage.

Kubernetes Across the Stack

Mon, 16 Mar 2026 00:00:00 +0000

A documented comparison of running Kubernetes across every major hosting model: cloud managed, self-managed on cloud, private cloud, and bare metal at home. The goal is an honest, practical reference for each environment: what it costs you in time and money, where the rough edges are, and how the networking story differs between them.

The thread running through all of it is Talos Linux — an immutable, API-driven OS built specifically for Kubernetes. No SSH, no shell, no config drift. The same OS everywhere means the operational model stays consistent regardless of what is running underneath.

Environment	Approach	Status
OpenStack — Cleura	Talos & Terraform	draft exists
OpenStack — Cleura	Talos, with Omni	maybe ?
OpenStack — ElastX	Talos & Terraform	draft exists
OpenStack — ElastX	Talos, with Omni	maybe ?
Homelab — bare metal	Talos + Pixieboot + Omni	draft exists
Homelab — bare metal	Talos + Pixieboot without Omni	maybe ?
Homelab — OpenStack	OpenStack on bare metal, Talos running on top	(stretch)
Homelab — OpenStack	Talos on bare metal, OpenStack inside cluster	(stretch)
AWS	Talos on EC2	(stretch)
Azure	Talos on VMs	(stretch)
GCP	Talos on Compute Engine	(stretch)

Stretch goals

AWS, Azure, GCP — same Talos approach, different underlying infrastructure. Interesting eventually, but not the priority.

Omni

Omni is Sidero’s managed control plane for Talos clusters — worth documenting both with and without it. Without Omni gives you the full picture of what Talos management looks like manually; with Omni shows what the managed layer buys you.

Homelab provisioning

Nodes provisioned via Pixieboot — no USB sticks, no manual installations. A node powers on, boots from the network, and registers. The goal is a fully reproducible cluster from scratch with minimal human steps.

Scope

Cluster provisioning and bootstrap for each environment
Networking — CNI choices, ingress, cross-cluster connectivity
Storage — what you get managed vs what you have to bring yourself
Operational differences — upgrades, node management, observability
Cost and trade-off summary across environments

Making it usable

Getting a cluster running is the easy part. Making it usable is where environments diverge. Each environment needs an answer for ingress, DNS, and storage — and the answer varies significantly depending on what the underlying platform provides.

On managed cloud you can lean on load balancers and block storage from the provider. On OpenStack you have those options if the provider exposes them. On bare metal at home you are on your own — MetalLB or similar for load balancer IPs, a local DNS solution, and either local storage or something like Rook/Ceph. Same Kubernetes, very different operational story underneath.

Notes exist in various states — pulling them together, testing, and documenting properly is the work.

Minecraft Server

Mon, 16 Mar 2026 00:00:00 +0000

Building and running a Minecraft server with the kids — hosted in the homelab on bare metal rather than paying for a managed service. Part infrastructure project, part excuse to learn together.

The longer-term goal is a proper setup: automated backups, world persistence across restarts, maybe some automation around starting and stopping the server on demand.

Notes and repo to follow.

More to come.

Argo

Mon, 01 Jan 2024 00:00:00 +0000

The Argo project is a suite of Kubernetes-native tools for running and managing workloads and deployments. Each tool solves a distinct problem and they compose well together, but they are independent: you can use any one without the others. The four core tools (ArgoCD, Argo Workflows, Argo Rollouts, Argo Events) make up the Argo project, which is a CNCF graduated project. Kargo, covered last, is a separate project from Akuity.

ArgoCD

GitOps continuous delivery. ArgoCD watches a Git repository and continuously reconciles the cluster state to match it, so drift is detected and can be corrected automatically. It is the CD half of a modern Kubernetes delivery pipeline: a CI system builds and pushes an image and commits the new tag to Git, then ArgoCD detects the change and rolls it out. See the ArgoCD note for a full walkthrough including App of Apps, bootstrapping, and self-management.

Argo Workflows

A general-purpose workflow execution engine for Kubernetes. Workflows are CRDs that define DAGs or sequential step graphs — each step runs in a container, with outputs passed as artifacts or parameters to downstream steps. Used for CI pipelines, ML training jobs, data processing, and batch workloads. Where Tekton models CI-specific primitives (Tasks, Pipelines), Argo Workflows is lower-level and more flexible: any containerised workload that has dependencies between steps fits the model.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: build-test-   # a Workflow needs a name or generateName
spec:
  entrypoint: build-test
  templates:
    - name: build-test
      dag:
        tasks:
          - name: build
            template: run-step
            arguments:
              parameters: [{name: cmd, value: "make build"}]
          - name: test
            dependencies: [build]
            template: run-step
            arguments:
              parameters: [{name: cmd, value: "make test"}]
    - name: run-step
      inputs:
        parameters:
          - name: cmd
      container:
        image: golang:1.22
        command: [sh, -c]
        args: ["{{inputs.parameters.cmd}}"]
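How a DAG like this turns into an execution order can be sketched with Python's standard `graphlib`. This is only a model of the scheduling idea, not how the Argo controller is implemented; the `lint` task is a hypothetical addition to show two tasks becoming ready at the same time.

```python
from graphlib import TopologicalSorter


def execution_waves(dependencies):
    """Group tasks into waves; every task in a wave has all its dependencies done."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks that can run in parallel now
        waves.append(ready)
        sorter.done(*ready)                 # mark them finished to unlock successors
    return waves


# Task name -> set of tasks it depends on ("lint" is hypothetical).
tasks = {"build": set(), "test": {"build"}, "lint": set()}
print(execution_waves(tasks))
```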

Argo Rollouts

Progressive delivery for Kubernetes. Where a standard Kubernetes Deployment does a rolling update (replace pods gradually), Argo Rollouts adds canary and blue-green strategies with analysis gates. A canary rollout shifts a percentage of traffic to the new version, runs automated analysis (checking metrics from Prometheus, Datadog, or similar), and either promotes fully or rolls back based on the result. This makes deployments measurably safer — a bad release fails the analysis gate before it reaches 100% of traffic.
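The canary-with-analysis loop reduces to a few lines of control flow. This is a sketch of the idea only: the step weights, the error-rate threshold, and the `error_rate_at` callback are all made up, and a real Rollout gets its metrics from an analysis provider rather than a Python function.

```python
def run_canary(steps, error_rate_at, max_error_rate=0.01):
    """Walk the traffic weights in order; abort at the first failed analysis."""
    for weight in steps:
        observed = error_rate_at(weight)  # stand-in for a metrics query
        if observed > max_error_rate:
            return ("rolled back", weight)
    return ("promoted", 100)


# A healthy release passes every gate.
print(run_canary([10, 50, 100], lambda weight: 0.001))

# A bad release is caught at 50% and never reaches full traffic.
print(run_canary([10, 50, 100], lambda weight: 0.05 if weight >= 50 else 0.0))
```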

Argo Events

Event-driven automation. Argo Events defines EventSources (listeners that receive events such as git pushes, S3 uploads, Kafka messages, webhooks, and cron schedules) and Sensors (which match those events and fire triggers: creating Argo Workflows, sending notifications, or calling other systems). It is the glue that ties the rest of the Argo stack together: a git push reaches an EventSource, a Sensor matches it and creates a Workflow, the Workflow builds and tests, and once the new image tag lands in Git, ArgoCD rolls it out.

Kargo

A newer tool from Akuity (a company founded by the original creators of Argo) that solves multi-stage GitOps promotion. ArgoCD is good at keeping one environment in sync with a Git ref, but promoting a release through dev → staging → production requires updating that ref in each environment and coordinating the sequence. Kargo models this as Stages that request Freight: a release is a piece of freight that must pass through each stage in order, with optional approval gates between them. It sits above ArgoCD in the stack and handles the promotion logic that ArgoCD deliberately leaves out.

ArgoCD

Mon, 01 Jan 2024 00:00:00 +0000

You deploy with kubectl apply from your laptop. It works. Then a colleague edits a deployment directly on the cluster to fix something urgent. Now what is running no longer matches what is in Git. That is drift, and it is silent — until something breaks in production and nobody can explain why the live state differs from the last known good config.

So you use ArgoCD. Git becomes the single source of truth. Every change flows through a pull request, gets reviewed, and syncs to the cluster automatically. If anyone touches a resource directly, ArgoCD detects the divergence and, with self-heal enabled, reverts it. The cluster converges to Git, always.

This is GitOps: the deployment pipeline is driven by Git state, not by humans running commands.

CI vs CD

A useful mental separation: CI and CD are different concerns and should be handled by different tools.

CI (Continuous Integration) is about code — build, test, produce an artifact (a container image). A pipeline in GitHub Actions, Tekton, or Jenkins owns this. It ends with an image pushed to a registry.

CD (Continuous Delivery) is about cluster state — take that artifact and make sure the right version is running in the right environment. ArgoCD owns this. It watches Git, not the CI pipeline.

Keeping them separate means your deployment logic is not buried inside a CI pipeline that developers need to understand and maintain. ArgoCD runs in the cluster and continuously reconciles state. It is always on.

Applications

ArgoCD manages Applications — a CRD that maps a Git source to a cluster destination:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/my-app-config
    targetRevision: main
    path: manifests/
  destination:
    server: https://kubernetes.default.svc
    namespace: my-app
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

prune: true — resources removed from Git are deleted from the cluster. selfHeal: true — any manual change to the cluster is immediately reverted.
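A toy model of what the two flags decide, under heavy simplification: resources are plain dicts keyed by name, and the function names are mine rather than ArgoCD's. The first function computes what a sync would do; the second captures when an automated sync is triggered at all.

```python
def sync_plan(desired, live, prune):
    """Diff Git (desired) against the cluster (live) and list the sync actions."""
    actions = []
    for name, spec in desired.items():
        if name not in live:
            actions.append(("create", name))
        elif live[name] != spec:
            actions.append(("update", name))
    if prune:
        # Only with prune enabled are resources missing from Git deleted.
        actions.extend(("delete", name) for name in live if name not in desired)
    return actions


def should_auto_sync(new_commit, out_of_sync, self_heal):
    """A new commit triggers a sync; drift on its own only does with selfHeal."""
    return out_of_sync and (new_commit or self_heal)


desired = {"deploy/web": {"replicas": 3}, "svc/web": {"port": 80}}
live = {"deploy/web": {"replicas": 5}, "cm/old": {}}

print(sync_plan(desired, live, prune=True))
print(should_auto_sync(new_commit=False, out_of_sync=True, self_heal=False))
```

With `prune=False` the stale `cm/old` would simply be left in place and the Application would keep reporting it as out of sync.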

App of Apps

Managing dozens of Applications individually gets unwieldy. The App of Apps pattern solves this: one root Application whose source is a directory of other Application manifests. ArgoCD applies the root, which creates all the child Applications, which in turn sync their own workloads. One repo, one sync, everything deployed.

Sync strategies

Strategy	Behaviour
Automated	ArgoCD syncs on every Git change automatically
Manual	Changes are detected and shown as OutOfSync — a human triggers the sync

Automated sync with selfHeal is the purest GitOps posture. Manual sync is useful for production environments where you want a human approval step before changes roll out.

Rollback

Because every state the cluster has ever been in corresponds to a Git commit, rollback is a git revert — or clicking “Sync to previous revision” in the ArgoCD UI. No special tooling, no runbooks, just Git history.

Repo structure

A layout that works well in practice separates ArgoCD’s own installation from the workloads it manages:

cluster/<cluster>/
  cfg/argo-cd/      # ArgoCD install only: CRDs and Helm values
  app-of-apps/      # Root Application, Projects, app definitions
  overlay/<app>/    # Per-cluster Kustomize patches, secret/config overrides

external/           # Reusable base manifests shared across clusters
internal/           # Internal app base manifests

The key separations:

ArgoCD install is isolated in cfg/argo-cd to avoid recursive install loops and make upgrades predictable. ArgoCD is not managing its own installation yet at this point — that comes later.
App-of-Apps lives separately from the install. Once ArgoCD is running, applying app-of-apps/ bootstraps the entire cluster in one step.
Base vs overlay — external/ and internal/ define what an app is. The cluster overlay defines how it runs in this environment. Cluster-specific concerns (resource limits, replica counts, secret refs) stay in the cluster directory and never bleed into the base.

Bootstrapping a cluster

There is a chicken-and-egg problem: ArgoCD manages everything, but something has to install ArgoCD first. The two-step bootstrap solves it cleanly.

Step 1 — Install ArgoCD manually (once):

helm repo add argo https://argoproj.github.io/argo-helm
helm repo update
helm install argocd argo/argo-cd \
 -n argocd --create-namespace \
 -f cluster/staging/cfg/argo-cd/values.yaml

Step 2 — Apply the App-of-Apps root:

kubectl apply -k cluster/<cluster>/app-of-apps/

From this point ArgoCD reconciles the entire cluster. Every subsequent change goes through Git — you never run helm install or kubectl apply for workloads again.

Self-management

The final step is making ArgoCD manage its own upgrades. Create an Application that points at cluster/<cluster>/cfg/argo-cd:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: argocd
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/cluster-config
    targetRevision: main
    path: cluster/staging/cfg/argo-cd
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: false # be cautious pruning ArgoCD's own resources
      selfHeal: true

Now ArgoCD upgrades itself when you update the Helm values in Git. No more manual helm upgrade — the cluster is fully self-managing. Changes to ArgoCD config go through the same PR review process as everything else.

Docker & OCI

Mon, 01 Jan 2024 00:00:00 +0000

Docker packages applications and their dependencies into portable, reproducible units called containers. Unlike virtual machines, containers share the host kernel — they’re isolated processes, not emulated hardware. This makes them fast to start, light on resources, and consistent across environments: the same image runs on a developer’s laptop, in CI, and in production.

Docker popularised containers, but the underlying standard is now open. The OCI (Open Container Initiative) defines three specifications:

Image spec — the format of a container image: layers, config, manifest
Runtime spec — how a container is run: namespaces, cgroups, lifecycle
Distribution spec — how images are pushed and pulled from registries

Any tool that produces an OCI image can run on any OCI-compliant runtime. Docker is one implementation. It is still the most natural entry point and the docker CLI remains the most familiar interface, but it is worth knowing that the ecosystem is broader than Docker Inc.

OCI images and containers

An image is a read-only, layered filesystem snapshot built from a Dockerfile — each layer is a diff on top of the previous one. A container is a running instance of an image — an isolated process with its own filesystem, network interface, and process space, sharing the host kernel.

docker build -t myapp:1.0 . # build OCI image from Dockerfile
docker run -p 8080:8080 myapp:1.0 # start container
docker ps # list running containers
docker exec -it <id> bash # shell into a running container

Images are stored in registries — Docker Hub, GitHub Container Registry, ECR, Nexus. All speak the OCI distribution spec, so images built with any tool push and pull the same way.

Dockerfile

The Dockerfile defines how an image is built — each instruction adds a layer:

FROM golang:1.22-alpine AS build
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN go build -o /app/server .

FROM alpine:3.19
COPY --from=build /app/server /server
EXPOSE 8080
ENTRYPOINT ["/server"]

Multi-stage builds keep the final image lean: the first stage compiles using the full toolchain, the second copies only the binary. No compiler, no source, no build cache in the image you ship.

Order matters for layer caching — put things that change rarely (dependency downloads) before things that change often (source code). A cache miss invalidates all subsequent layers.
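The cascade is easy to see in a toy model where each layer's cache key depends on its parent's key. Real builders also hash the contents of copied files; here a hypothetical `(src=v1)` marker in the instruction string stands in for that, and the function names are mine.

```python
import hashlib


def layer_keys(instructions):
    """Chain cache keys: each key covers the instruction plus everything before it."""
    keys, parent = [], ""
    for instruction in instructions:
        parent = hashlib.sha256((parent + instruction).encode()).hexdigest()
        keys.append(parent)
    return keys


def cache_hits(old, new):
    """Per-layer True/False: would the new build reuse the old build's layer?"""
    return [a == b for a, b in zip(layer_keys(old), layer_keys(new))]


old_build = [
    "FROM golang:1.22-alpine",
    "COPY go.mod go.sum ./",
    "RUN go mod download",
    "COPY . . (src=v1)",
    "RUN go build -o /app/server .",
]
# Only the source changed; dependencies did not.
new_build = old_build[:3] + ["COPY . . (src=v2)", "RUN go build -o /app/server ."]

print(cache_hits(old_build, new_build))
```

The dependency download stays cached because it sits above the source copy. Swap the order and every source edit would re-download the modules.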

Volumes and bind mounts

Containers have ephemeral filesystems: anything written to the container's writable layer is lost when the container is removed. Persist data with volumes:

docker volume create pgdata
docker run -e POSTGRES_PASSWORD=secret -v pgdata:/var/lib/postgresql/data postgres:16

For local development, bind mounts map a host directory into the container:

docker run -v $(pwd):/app -w /app node:20 npm test

Networking

Containers on the same Docker network can reach each other by name. Docker Compose creates a default network automatically; named networks can be created explicitly:

docker network create backend
docker run -d --network backend --name db -e POSTGRES_PASSWORD=secret postgres:16
docker run --network backend myapp # can reach 'db' by hostname

Podman

Podman is a drop-in Docker replacement that runs without a daemon and without root. The CLI is intentionally compatible:

alias docker=podman # usually just works

Rootless containers mean a process that breaks out of a container lands as an unprivileged user on the host rather than as root. Daemonless means no long-running background service with broad system access. On RHEL and Fedora, Podman is the default. For CI environments and security-conscious setups it is often the better choice.

Podman also supports pods — groups of containers sharing a network namespace, mirroring the Kubernetes pod model. Useful for local development that needs to mirror how things will run in the cluster.

Buildah

Buildah builds OCI images without a Docker daemon. It can build from a Dockerfile or construct images programmatically using shell commands — useful in CI pipelines where running a privileged Docker daemon is undesirable:

buildah bud -t myapp:1.0 . # build from Dockerfile
buildah push myapp:1.0 registry/myapp:1.0

Buildah and Podman share the same underlying storage, so images built with Buildah are immediately available to Podman.

Docker Compose

Compose manages multi-container applications defined in compose.yml:

services:
  app:
    build: .
    ports:
      - "8080:8080"
    environment:
      DATABASE_URL: postgres://app:secret@db/appdb
    depends_on:
      - db

  db:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      # User and database must match DATABASE_URL above
      POSTGRES_USER: app
      POSTGRES_DB: appdb
      POSTGRES_PASSWORD: secret

volumes:
  pgdata:

docker compose up -d # start in background
docker compose logs -f # stream logs
docker compose down # stop and remove containers

Compose is useful for local development environments. It is a shame it exists as a separate abstraction — it taught people to think in multi-container terms without teaching them Kubernetes, and then left them with a gap to cross when they needed to go to production. That said, it is practical for what it does and is not going away.

For production orchestration, see Kubernetes.

Skopeo

Skopeo works with OCI images directly — copy, inspect, and convert — without pulling them to local storage. Useful in pipelines and for auditing registries:

# Inspect an image without pulling it
skopeo inspect docker://registry.example.com/myapp:1.0

# Copy between registries without touching local disk
skopeo copy docker://source-registry/myapp:1.0 docker://dest-registry/myapp:1.0

# Copy to a local OCI layout
skopeo copy docker://myapp:1.0 oci:myapp-local:1.0

skopeo inspect is particularly useful for checking image metadata, digest, and labels in CI before deciding whether to promote an image.

ORAS

ORAS (OCI Registry As Storage) pushes and pulls arbitrary artifacts to OCI registries, not just container images. Helm charts, SBOMs, attestations, Terraform modules, and binary releases can all be stored in any registry that speaks the OCI distribution spec:

# Push a file as an OCI artifact
oras push registry.example.com/myapp-sbom:1.0 sbom.json:application/spdx+json

# Pull it back
oras pull registry.example.com/myapp-sbom:1.0

This matters because it means a single registry can become the distribution mechanism for the entire software supply chain — image, SBOM, signature, attestation — all with the same access controls and audit trail.

Useful practices

Use specific image tags (postgres:16.2, not postgres:latest) — latest changes under you
Reference images by digest in production (myapp@sha256:abc123) — tags are mutable, digests are not
Run as a non-root user: USER appuser in the Dockerfile
Add a .dockerignore to exclude .git, node_modules, build artefacts from the build context
Keep images small — large images are slow to push, pull, and scan
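The tag-versus-digest point can be shown with a toy model. This sketch assumes a registry is nothing more than a mutable tag table pointing at content hashes; the manifest bytes and the `registry_tags` dict are illustrative stand-ins, not a real registry API.

```python
import hashlib


def digest(content: bytes) -> str:
    """Content-addressed identifier: same bytes always give the same digest."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


registry_tags = {}  # tag -> digest, standing in for a registry's tag table

build_1 = b"manifest for build 1"
build_2 = b"manifest for build 2"

registry_tags["myapp:1.0"] = digest(build_1)
pinned = registry_tags["myapp:1.0"]       # what a deployment pinned by digest records

registry_tags["myapp:1.0"] = digest(build_2)  # someone re-pushes the same tag

print(registry_tags["myapp:1.0"] == pinned)  # False: the tag now points elsewhere
print(digest(build_1) == pinned)             # True: the digest still names build 1
```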

etcd

Mon, 01 Jan 2024 00:00:00 +0000

etcd is the distributed key-value store that backs Kubernetes. Every Kubernetes object — pods, services, deployments, configmaps, secrets — is stored in etcd. The API server is the only component that reads and writes it directly; everything else in the cluster reads from the API server’s cache. etcd’s reliability is the cluster’s reliability: if etcd loses quorum, the Kubernetes control plane stops functioning.

Raft consensus

etcd uses the Raft consensus algorithm. The cluster elects a leader; all writes go through the leader, which replicates them to a majority of members before acknowledging the write. A cluster of n members needs a quorum of floor(n/2) + 1 and therefore tolerates floor((n-1)/2) failures: a three-node cluster survives one failure, a five-node cluster survives two. An even member count adds no extra tolerance (four nodes still survive only one failure), which is why control plane node counts are odd. Three nodes is standard for production; five for clusters where control plane availability is critical.
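The quorum arithmetic is easy to check directly. A minimal sketch using integer division; the function names are illustrative:

```python
def quorum(members: int) -> int:
    """Smallest majority of the cluster."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """Members that can fail while a majority is still reachable."""
    return (members - 1) // 2


for n in range(1, 8):
    print(f"{n} members: quorum {quorum(n)}, tolerates {tolerated_failures(n)}")
```

Going from three to four members raises the quorum from two to three without raising the number of tolerated failures, so the extra node only adds replication cost.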

Watches and revisions

Every write increments a global revision counter. Clients can watch a key or key prefix and receive every change since a given revision, provided that revision has not yet been compacted. This is how the Kubernetes controller manager and scheduler work: they hold long-lived watch connections (through the API server) and react to changes in specific resource types without polling.
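A toy model makes the revision and watch semantics concrete. This is a sketch only: `RevisionedStore` is a hypothetical in-memory class, not the etcd client API, and the keys merely imitate the `/registry/...` layout Kubernetes uses.

```python
class RevisionedStore:
    """Toy key-value store with a global revision counter, loosely modelled on etcd."""

    def __init__(self):
        self.revision = 0
        self.events = []  # append-only list of (revision, key, value)

    def put(self, key, value):
        self.revision += 1
        self.events.append((self.revision, key, value))
        return self.revision

    def watch(self, prefix, start_revision):
        """Return events at or after start_revision whose key starts with prefix."""
        return [e for e in self.events
                if e[0] >= start_revision and e[1].startswith(prefix)]


store = RevisionedStore()
store.put("/registry/pods/default/web", "Pending")        # revision 1
store.put("/registry/services/default/web", "ClusterIP")  # revision 2
store.put("/registry/pods/default/web", "Running")        # revision 3

# A controller that has already processed revision 1 resumes from revision 2
# and sees only the pod change it missed.
print(store.watch("/registry/pods/", start_revision=2))
```

Resuming from a known revision is what lets a watcher reconnect after a network blip without re-listing everything.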

Operations

# Snapshot backup
etcdctl snapshot save /backup/etcd-snapshot.db \
 --endpoints=https://127.0.0.1:2379 \
 --cacert=/etc/kubernetes/pki/etcd/ca.crt \
 --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
 --key=/etc/kubernetes/pki/etcd/healthcheck-client.key

# Restore from snapshot
etcdctl snapshot restore /backup/etcd-snapshot.db --data-dir=/var/lib/etcd-restore

# Check cluster health
etcdctl endpoint health --cluster

Backing up etcd regularly is the most critical operational task for a Kubernetes cluster. The snapshot is the only path to full recovery if cluster state is lost.

Istio

Mon, 01 Jan 2024 00:00:00 +0000

Istio is a service mesh for Kubernetes. In its sidecar mode it injects an Envoy proxy into every pod in the mesh, and all traffic between pods flows through these proxies rather than directly between containers. This gives the mesh control over traffic routing, security, and observability without any changes to application code.

What it solves

In a large microservice deployment, every service needs to handle retries, timeouts, circuit breaking, mutual TLS, and metrics collection — or skip them and accept the risk. Without a mesh, each team implements this differently, or not at all. Istio moves these concerns out of the application and into the infrastructure layer, where they are configured once and applied uniformly.

Traffic management

Istio’s VirtualService and DestinationRule CRDs give fine-grained control over how traffic is routed:

apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - match:
    - headers:
        end-user:
          exact: test-user
    route:
    - destination:
        host: reviews
        subset: v2
  - route:
    - destination:
        host: reviews
        subset: v1

This routes a specific user to v2 of a service while everyone else gets v1 — canary testing without a load balancer rule or code change.
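The routing decision itself is ordered first-match: the rule with the header match is tried first, and the rule without a match acts as the catch-all. A minimal sketch of that logic, assuming exact header matching only; `pick_subset` and the `rules` structure are illustrative, not Istio code.

```python
def pick_subset(headers, rules):
    """Return the subset of the first rule whose header matches all hold."""
    for rule in rules:
        wanted = rule.get("match", {})  # a rule without a match accepts everything
        if all(headers.get(name) == value for name, value in wanted.items()):
            return rule["subset"]
    return None


# Mirrors the VirtualService above: header match first, catch-all second
rules = [
    {"match": {"end-user": "test-user"}, "subset": "v2"},
    {"subset": "v1"},
]

print(pick_subset({"end-user": "test-user"}, rules))  # v2
print(pick_subset({"end-user": "alice"}, rules))      # v1
print(pick_subset({}, rules))                         # v1
```

Order matters: with the catch-all listed first, no request would ever reach v2.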

mTLS

Istio issues and rotates certificates for every workload and sets up mutual TLS between meshed services automatically. Services authenticate each other’s identity, not just encrypt the connection. By default the mesh is permissive and still accepts plaintext; a PeerAuthentication policy in STRICT mode enforces mTLS across a namespace, ensuring no plaintext traffic is accepted.

Observability

Because all traffic flows through Envoy sidecars, Istio generates L7 metrics (request rate, error rate, latency percentiles), distributed traces, and access logs for every service-to-service call — without instrumentation in the services themselves. This integrates with Prometheus, Grafana, and Jaeger.

Cost

Istio adds latency (two extra proxy hops per call) and resource overhead (a sidecar per pod). For clusters with tens of services, the operational benefit is clear. For small clusters or teams early in a microservices journey, the complexity may outweigh the gains.

K9s & Lens

Mon, 01 Jan 2024 00:00:00 +0000

You run everything with kubectl. Get pods, describe, logs, exec, delete, apply — fifty times a day across five namespaces. It works, but every command is a context switch: type, wait, read, type again. -n namespace on every single invocation.

So you use K9s. A terminal UI that shows your entire cluster in one view. Switch namespaces and clusters in a keystroke, tail logs in real time, exec into a pod without constructing the command — everything you reach for in kubectl, but without the friction.

K9s

K9s is a TUI (terminal UI) for Kubernetes. It stays in your terminal, updates live, and is keyboard-driven throughout.

brew install derailed/k9s/k9s
k9s # connect to current context
k9s --context prod # specific context
k9s -n monitoring # start in a specific namespace

Key	Action
`:pod`	Jump to pods view
`:deploy`	Deployments
`:svc`	Services
`:ns`	Switch namespace
`/`	Filter/search
`l`	Logs
`e`	Edit resource YAML
`d`	Describe
`s`	Shell into pod
`ctrl-d`	Delete
`?`	Help / full keybinding list

Most resource types are reachable by typing : followed by the resource name — :configmap, :secret, :ingress, :pvc, and so on.

Why TUI over GUI

K9s lives in the terminal alongside your other tools. No window switching, works over SSH, starts instantly, and the keyboard-driven workflow is faster once it is in muscle memory. For day-to-day cluster work it is the right default.

Lens

Lens is a desktop GUI for Kubernetes — a full IDE-style interface with a visual cluster overview, resource browsing, metrics charts, log streaming, and terminal access built in.

It is the better choice when you need to onboard someone who is not yet comfortable with the terminal, or when you want a visual overview to share with a non-technical stakeholder. For engineers doing operational work all day, K9s is faster.

Worth noting: Lens has moved toward a commercial model (Lens Desktop Pro). OpenLens is the open-source build of the same codebase, without the account requirement.

kubectx / kubens

If K9s is more than you need and you just want to stop typing --context and -n on every command, kubectx and kubens solve exactly that:

kubectx # list contexts
kubectx prod # switch to prod context
kubectx - # switch back to previous context

kubens # list namespaces
kubens monitoring # switch default namespace

No TUI, no GUI: just fast context and namespace switching. Both tools write the change to your kubeconfig, so it applies to every terminal and persists until you switch again. Install alongside K9s; they complement each other.

brew install kubectx

Resources

K9s documentation
K9s GitHub
Lens
OpenLens — open-source Lens build
kubectx/kubens — fast context and namespace switching

Kubernetes

Mon, 01 Jan 2024 00:00:00 +0000

Kubernetes (K8s) is the de facto standard for container orchestration and is often cited as the second largest open source project after the Linux kernel. It has well and truly reached the plateau of productivity: the ecosystem is mature and it genuinely delivers.

That said, the honest take: K8s is ridiculously hard to deploy and manage (day 2 operations especially). Docker Swarm is equally ridiculously easy to get started with. For raw scale, Mesos/DC/OS wins: clusters of 80k+ nodes have been documented in the wild, versus the upstream Kubernetes guidance of no more than about 5,000 nodes per cluster.

So the real question is whether the ecosystem justifies the complexity for your situation. For most teams doing cloud-native work, it does.

Core concepts

The main building blocks:

Pods — smallest deployable unit, wrapping one or more containers that share network and storage.

Deployments — declare desired state; K8s handles rolling updates and self-healing.

Secrets — store sensitive data (passwords, tokens, keys) separately from application config.

DaemonSets — run a pod on every node. Typical use: log collectors, monitoring agents.

ReplicaSets — ensure N copies of a pod are running at any given time.

Ingress — HTTP/S routing rules at layer 7. Your load balancer config, declarative.

CronJobs — scheduled jobs, K8s-native.

Custom Resource Definitions (CRDs) — extend the K8s API with your own resource types. The foundation of most K8s operators.

Architecture

How the pieces fit together internally:

Containers vs virtual machines

Not an either/or — they solve different problems and are frequently combined.

Local clusters for development

When you need K8s without a full cluster:

Tool	Best for
MicroK8s	Ubuntu, snap-based, batteries included
Minikube	The classic, broad driver support
Kind	K8s in Docker, great for CI pipelines
K3D	K3s in Docker, fast startup
K3S	Lightweight K8s, edge and IoT use cases

Resources

kubernetes.io
CNCF Landscape — map of the cloud-native ecosystem
TGI Kubernetes intro (YouTube)
Setting up MicroK8s with RBAC and Storage

Kubernetes Autoscaling

Mon, 01 Jan 2024 00:00:00 +0000

Kubernetes has built-in autoscaling at two levels: the Horizontal Pod Autoscaler scales the number of pod replicas based on CPU or memory, and the Cluster Autoscaler adds or removes nodes when pods can’t be scheduled. KEDA and Karpenter extend these primitives — KEDA pushing workload scaling further, Karpenter replacing the node provisioner entirely.

KEDA

Kubernetes Event-Driven Autoscaling. KEDA extends the HPA to scale workloads based on external event sources: Kafka consumer lag, queue depth in SQS or RabbitMQ, HTTP request rate, database query results, cron schedules. Out of the box the HPA only has CPU and memory to work with; anything else needs a custom or external metrics adapter. KEDA is that adapter, with a long list of ready-made scalers for external systems. The important capability it adds is scale-to-zero: a consumer that has no messages to process can scale down to zero pods and scale back up when work arrives. This makes it well-suited for event-driven workloads and batch processing where idle replicas waste resources.
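The scaling arithmetic for a queue-style trigger can be sketched as follows. This is a simplification under stated assumptions: the target is an average value per replica, so the wanted count is the backlog divided by the target, rounded up. In a real cluster KEDA handles the step between zero and one replica and hands the rest to the HPA, which also applies tolerances and stabilisation windows not modelled here. The function name and defaults are illustrative.

```python
import math


def desired_replicas(backlog, target_per_replica, min_replicas=0, max_replicas=10):
    """Replicas needed so each one handles at most target_per_replica items."""
    if backlog <= 0:
        return min_replicas  # nothing to do: fall back to the floor (zero by default)
    wanted = math.ceil(backlog / target_per_replica)
    return max(min_replicas, min(wanted, max_replicas))


print(desired_replicas(0, 100))     # 0: scale to zero
print(desired_replicas(1, 100))     # 1: first message wakes one consumer
print(desired_replicas(250, 100))   # 3
print(desired_replicas(5000, 100))  # 10: capped at max_replicas
```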

Karpenter

A node provisioner that replaces the Cluster Autoscaler, originally from AWS and now maintained under Kubernetes SIG Autoscaling, with providers for other clouds. Where the Cluster Autoscaler works by adjusting existing Auto Scaling Groups, Karpenter provisions EC2 instances (or equivalent) directly based on the actual resource requirements of pending pods, choosing the right instance type, size, and purchase option (on-demand vs spot) in real time. This makes provisioning significantly faster and more cost-efficient: the cluster gets nodes sized for the pending workload, not the nearest pre-configured node group. Karpenter also handles consolidation: it continuously evaluates whether running workloads could be packed onto fewer or cheaper nodes and replaces over-provisioned nodes accordingly.
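The core idea, picking a node shape from what pending pods actually request, can be sketched in a few lines. This is a deliberately naive model: it places all pending pods on a single node and ignores daemonset overhead, scheduling constraints, and spot pricing. The catalog names, sizes, and prices are invented for illustration and are not real instance types.

```python
def cheapest_fit(pending_pods, catalog):
    """Return the name of the cheapest node type that fits all pending pods."""
    cpu = sum(p["cpu"] for p in pending_pods)
    memory = sum(p["memory_gib"] for p in pending_pods)
    fits = [t for t in catalog if t["cpu"] >= cpu and t["memory_gib"] >= memory]
    if not fits:
        return None  # a real provisioner would split the pods across several nodes
    return min(fits, key=lambda t: t["hourly_price"])["name"]


# Hypothetical catalog; the numbers are made up
catalog = [
    {"name": "small", "cpu": 2, "memory_gib": 4, "hourly_price": 0.05},
    {"name": "medium", "cpu": 4, "memory_gib": 8, "hourly_price": 0.10},
    {"name": "large", "cpu": 8, "memory_gib": 16, "hourly_price": 0.20},
]

pending = [{"cpu": 1, "memory_gib": 2}] * 3  # three pods: 3 CPU, 6 GiB in total

print(cheapest_fit(pending, catalog))  # medium
```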

KubeVirt

Mon, 01 Jan 2024 00:00:00 +0000

See Virtualization — KVM and KubeVirt for full coverage of both KVM and KubeVirt.

Kyverno

Mon, 01 Jan 2024 00:00:00 +0000

Kyverno is a policy engine for Kubernetes. It runs as an admission controller and intercepts every resource creation or update, applying rules that validate, mutate, or generate resources. Policies are written as Kubernetes CRDs in YAML — no Rego, no separate language to learn. If you can write a Kubernetes manifest, you can write a Kyverno policy.

Three rule types

Validate — reject resources that don’t meet requirements:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-labels
spec:
  rules:
  - name: check-team-label
    match:
      any:
      - resources:
          kinds: [Deployment]
    validate:
      message: "Deployments must have a 'team' label."
      pattern:
        metadata:
          labels:
            team: "?*"
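The pattern "?*" reads as "one character, then anything", which means the label must exist and be non-empty. A rough sketch of that check, using Python's shell-style matcher as a stand-in for Kyverno's wildcard handling; `validate_team_label` is a hypothetical helper, not part of Kyverno.

```python
from fnmatch import fnmatchcase

MESSAGE = "Deployments must have a 'team' label."


def validate_team_label(resource):
    """Return None if the resource passes, otherwise the policy message."""
    team = resource.get("metadata", {}).get("labels", {}).get("team")
    # "?" needs exactly one character, "*" allows any number more
    if team is None or not fnmatchcase(str(team), "?*"):
        return MESSAGE
    return None


print(validate_team_label({"metadata": {"labels": {"team": "payments"}}}))  # None
print(validate_team_label({"metadata": {"labels": {"team": ""}}}))          # message
print(validate_team_label({"metadata": {"labels": {}}}))                    # message
```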

Mutate — automatically add or modify fields on admission:

- name: add-default-resources
  match:
    any:
    - resources:
        kinds: [Pod]
  mutate:
    patchStrategicMerge:
      spec:
        containers:
        - (name): "*"
          resources:
            requests:
              +(memory): "64Mi"
              +(cpu): "250m"
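The +(key) anchors mean "add this field only if it is not already set", so explicit requests are never overwritten. A sketch of that behaviour on plain dictionaries; `apply_default_requests` is an illustrative helper, not Kyverno code.

```python
def apply_default_requests(pod, defaults):
    """Fill in missing resource requests on every container, keeping existing values."""
    for container in pod["spec"]["containers"]:
        requests = container.setdefault("resources", {}).setdefault("requests", {})
        for key, value in defaults.items():
            requests.setdefault(key, value)  # add-if-absent, like +(key)
    return pod


pod = {"spec": {"containers": [
    {"name": "api", "resources": {"requests": {"memory": "128Mi"}}},
    {"name": "sidecar"},
]}}

apply_default_requests(pod, {"memory": "64Mi", "cpu": "250m"})

for c in pod["spec"]["containers"]:
    print(c["name"], c["resources"]["requests"])
```

The first container keeps its 128Mi memory request and only gains the CPU default; the second receives both.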

Generate — create related resources automatically. A common use: generate a NetworkPolicy every time a new namespace is created.

Enforcement vs audit

Policies run in enforce mode (block non-compliant resources) or audit mode (allow but report violations). Audit mode is the right starting point — understand your existing state before enforcing.

Common policies

The Kyverno policy library has ready-made policies for common requirements: disallow privileged containers, require image tags to not be latest, enforce resource limits, restrict hostPath mounts. Most teams start from the library and customise.

Local Kubernetes

Mon, 01 Jan 2024 00:00:00 +0000

Running Kubernetes locally is useful for development, testing, and CI — a real cluster without the cloud bill. The options differ mainly in weight, startup speed, and whether they target local dev, CI pipelines, or lightweight production use.

Minikube

The original local Kubernetes, maintained by the Kubernetes project itself. Runs a single-node cluster by default, inside a VM (VirtualBox, HyperKit, and other drivers) or a Docker container. It tracks upstream Kubernetes closely, so what works on a real cluster generally works in Minikube. Slower to start than the container-based options, heavier on resources, but a faithful representation of a real cluster. Good for getting started and for testing things that need VM-level isolation.

Kind

Kubernetes IN Docker — each cluster node runs as a Docker container, no VM required. Fast startup (seconds), low overhead, and multi-node clusters are easy to spin up. The standard choice for running Kubernetes in CI pipelines: create a cluster, run tests, tear it down. The Kubernetes project itself uses Kind for conformance testing. Not designed for running workloads long-term, but excellent for ephemeral test environments.

K3S

Lightweight Kubernetes from Rancher (now SUSE), packaged as a single binary under 100MB. It strips out cloud-provider integrations, in-tree storage drivers, and alpha features; the result is a fully conformant Kubernetes that runs on hardware where full K8s won’t. Used in production for edge deployments, IoT, and resource-constrained environments. Also a good choice when you want a real persistent cluster locally without the overhead of Minikube.

K3D

K3S running inside Docker containers — the same relationship Kind has to standard Kubernetes. Fast, lightweight, multi-node clusters in Docker. The advantage over Kind is that K3S starts faster and uses less memory per node. Good choice for local dev and CI when you want the lightweight K3S runtime rather than full upstream Kubernetes.

MicroK8s

Canonical’s take on local Kubernetes, distributed as a snap package on Ubuntu. Single-command install, add-ons (DNS, storage, ingress, observability) enabled with microk8s enable <addon>. Opinionated and tightly integrated with the Ubuntu/Canonical ecosystem. The right choice if you’re on Ubuntu and want a low-friction local cluster with batteries included — less so outside that ecosystem.

Which to use

Tool	Best for
Minikube	Getting started, testing with VM isolation
Kind	CI pipelines, ephemeral test clusters
K3S	Persistent local cluster, edge/IoT production
K3D	Fast local dev and CI with K3S runtime
MicroK8s	Ubuntu users wanting a managed local cluster

Loki

Mon, 01 Jan 2024 00:00:00 +0000

Prometheus tells you that something is wrong and when it started. Loki tells you what happened — it is the log aggregation layer of the observability stack. Logs from every pod across every node are collected, indexed, and made searchable in one place. Grafana is the front end for both.

How it works

Loki stores logs as compressed chunks, indexed only by labels (not by content). This makes it cheap to store and fast to query by label — namespace, pod name, app — but slower for full-text search than something like Elasticsearch. The trade-off is intentional: label-scoped queries cover the vast majority of real operational use, and the storage cost is dramatically lower.
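The trade-off can be illustrated with a toy store that indexes streams by label set and scans line content only inside the streams a selector picks. This is a sketch of the idea, not Loki's storage format; `TinyLogStore` and the sample lines are made up.

```python
class TinyLogStore:
    """Toy log store: streams are keyed by their label set, lines are not indexed."""

    def __init__(self):
        self.streams = {}  # frozenset of (label, value) pairs -> list of lines

    def push(self, labels, line):
        self.streams.setdefault(frozenset(labels.items()), []).append(line)

    def query(self, selector, contains=""):
        """Select streams by label, then brute-force scan their lines."""
        wanted = set(selector.items())
        return [line
                for labels, lines in self.streams.items() if wanted <= labels
                for line in lines if contains in line]


store = TinyLogStore()
store.push({"namespace": "production", "app": "my-api"}, "GET /health 200")
store.push({"namespace": "production", "app": "my-api"}, "error: upstream timeout")
store.push({"namespace": "staging", "app": "my-api"}, "error: bad config")

# Roughly what {namespace="production"} |= "error" does
print(store.query({"namespace": "production"}, contains="error"))
```

The label lookup is cheap because it touches only the small index; the text filter is a scan, which is why narrow label selectors keep queries fast.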

Promtail runs as a DaemonSet on every node, tails log files from /var/log/pods/, attaches Kubernetes labels, and ships to Loki. Grafana queries Loki directly.

Deployment modes

SingleBinary: ingestion, querying, and management all run in a single instance. Simple to deploy, minimal operational overhead. It is a single point of failure: if it goes down, ingestion stops, and logs are lost once the outage outlasts what the collectors can buffer and retry. The right starting point for most clusters.

SimpleScalable: responsibilities split into separate pods, each running a minimum of two instances for HA. Ingestion, querying, and the compactor can be scaled independently. Significantly more operational overhead, but fault-tolerant and tunable under load. The right move for production once you have volume and reliability requirements.

Getting started

The fastest path to a working stack is deploying Loki alongside kube-prometheus-stack, which brings up Prometheus, Grafana, and Alertmanager together. See the Prometheus note for the kube-prometheus-stack setup and the ArgoCD CRD workaround.

Loki and Promtail are installed as a separate ArgoCD Application, using multiple Helm sources with values pulled from the cluster config repo:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: log-ingestion
  namespace: argo-cd
spec:
  project: default
  sources:

  # Loki
  - repoURL: "https://grafana.github.io/helm-charts"
    chart: loki
    targetRevision: 6.55.0
    helm:
      releaseName: loki
      valueFiles:
      - $values/cluster/testing/overlay/monitoring/helm/loki-values.yaml

  # Promtail
  - repoURL: "https://grafana.github.io/helm-charts"
    chart: promtail
    targetRevision: 6.17.1
    helm:
      releaseName: promtail
      valueFiles:
      - $values/cluster/testing/overlay/monitoring/helm/promtail-values.yaml

  # Values source (cluster config repo)
  - repoURL: 'git@github.com:example-org/cluster-config.git'
    targetRevision: HEAD
    ref: values

  destination:
    server: https://kubernetes.default.svc
    namespace: monitoring
  syncPolicy:
    automated:
      selfHeal: true
      prune: true
    syncOptions:
    - CreateNamespace=true
    - ServerSideApply=true

Note: targetRevision: HEAD is fine for testing environments. Pin to a tag for staging and production.

Promtail deprecation

Promtail is deprecated: it entered long-term support (LTS) in February 2025 and receives security and critical bug fixes only, no new features. Grafana’s announced end-of-life date is 2 March 2026.

The Grafana-recommended replacement is Grafana Alloy, a more capable collector that handles metrics, logs, and traces in a single agent. The migration path is not yet settled enough for a confident recommendation — worth waiting for clear community consensus before moving. Until then, Promtail continues to work and the LTS window gives time to plan.

Grafana integration

Add Loki as a data source in Grafana and logs become queryable alongside metrics. A useful starting point is a simple app-oriented logs dashboard — filter by namespace and pod, tail in near-real-time, correlate timestamps with Prometheus spikes.

LogQL, Loki’s query language, mirrors PromQL in style:

# All error logs from a namespace
{namespace="production"} |= "error"

# Parse and filter structured logs
{app="my-api"} | json | status >= 500

# Rate of error log lines over time
rate({namespace="production"} |= "error" [5m])

Resources

Loki documentation
Grafana Alloy documentation — future Promtail replacement
loki-stack Helm chart
kube-prometheus-stack

Managing Secrets in Kubernetes

Mon, 01 Jan 2024 00:00:00 +0000

Kubernetes has a built-in Secret resource, but it is not a secrets management solution — it is base64-encoded storage with no encryption at rest by default and no access audit trail. How you actually manage secrets in a Kubernetes cluster depends on how far you need to go beyond the default.

Native Kubernetes Secrets

The baseline. A Secret is a key-value store mounted into pods as environment variables or files:

apiVersion: v1
kind: Secret
metadata:
 name: db-credentials
type: Opaque
data:
 username: YWRtaW4=  # base64("admin")
 password: cGFzc3dvcmQ=

The problems: base64 is encoding, not encryption. Secrets are stored in etcd, and enabling etcd encryption at rest is a cluster configuration step that is easy to skip. Secrets are readable by anyone whose RBAC role allows get on secrets in that namespace. For anything beyond a local dev cluster or a low-sensitivity workload, you need something more.
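To see how little protection base64 offers, the values from the manifest above can be decoded with the standard library alone. No key is involved:

```python
import base64

# The data section of the Secret shown above
secret_data = {"username": "YWRtaW4=", "password": "cGFzc3dvcmQ="}

decoded = {key: base64.b64decode(value).decode() for key, value in secret_data.items()}

print(decoded)  # {'username': 'admin', 'password': 'password'}
```

Anyone who can read the Secret object, or the etcd data behind it when encryption at rest is off, can do the same.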

Sealed Secrets

A Kubernetes controller from Bitnami. SealedSecret resources contain secrets encrypted with the cluster’s public key — only the controller running in that cluster can decrypt them. The encrypted form is safe to commit to Git, which makes GitOps workflows possible without a separate secrets store. Simple to operate, no external dependency. The tradeoff: secrets are tied to a specific cluster’s key, cross-cluster sharing requires re-encryption, and there is no centralised audit trail.

External Secrets Operator

ESO reads secrets from an external store (AWS Secrets Manager, HashiCorp Vault, GCP Secret Manager, Azure Key Vault, 1Password) and syncs them into native Kubernetes Secrets. Your source of truth stays in the external system; the K8s Secret is a read-only projection of it, refreshed on a configurable interval:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: db-credentials
  data:
  - secretKey: password
    remoteRef:
      key: prod/db/password

ESO is the right choice when you already have a secrets store and want Kubernetes workloads to consume from it without changing how secrets are managed elsewhere.

Secrets Store CSI Driver

An alternative to ESO for the same problem: mount secrets from an external store directly as files in a pod, without creating a Kubernetes Secret at all. The secret materialises only in the pod’s filesystem, is not stored in etcd, and disappears when the pod terminates. Supported by AWS, Azure, GCP, and Vault providers. Used in combination with a SecretProviderClass to define what to fetch and where to mount it.

HashiCorp Vault

A dedicated secrets management platform. Vault stores arbitrary secrets, issues dynamic credentials (database passwords that expire, AWS IAM credentials valid for an hour), manages PKI, and provides a full audit log of every read and write. Kubernetes workloads authenticate to Vault via the Kubernetes auth method (using the pod’s service account token) and receive a Vault token scoped to the secrets their service account is allowed to read. More to operate than the other options, but the right answer for organisations that need dynamic credentials, fine-grained access control, and audit logs.

Summary

Approach	Good for
Native Secrets	Local dev, low-sensitivity workloads
Sealed Secrets	GitOps, single-cluster, no external dependency
External Secrets Operator	Syncing from existing external stores
Secrets Store CSI	Avoiding etcd entirely, file-based secret injection
HashiCorp Vault	Dynamic credentials, audit logs, enterprise requirements

OpenShift Data Foundation

Mon, 01 Jan 2024 00:00:00 +0000

OpenShift Data Foundation (ODF) is Red Hat’s enterprise Kubernetes storage platform, built on Ceph orchestrated by Rook. Where Rook-Ceph is the open source upstream, ODF packages it with an operator, a validated configuration, enterprise support, and integration with the OpenShift console. It provides block (RBD), file (CephFS), and object (S3-compatible via Ceph RGW) storage as Kubernetes StorageClasses on the same hardware.

What it provides

Three storage modes from one cluster:

Mode	StorageClass	Use case
Block (RBD)	`ocs-storagecluster-ceph-rbd`	Databases, stateful apps needing a single-writer disk
File (CephFS)	`ocs-storagecluster-cephfs`	Shared filesystems, multiple pods reading/writing the same volume
Object	S3-compatible endpoint	Buckets via `ObjectBucketClaim`, backup targets, artifact storage

Installation

ODF installs via the ODF operator from OperatorHub. Once the operator is running, a StorageCluster CR describes the desired storage and the operator reconciles it into a Ceph deployment:

apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  storageDeviceSets:
  - name: ocs-deviceset
    count: 1
    replica: 3
    dataPVCTemplate:
      spec:
        storageClassName: local-storage
        # A PVC template needs an access mode to be valid
        accessModes:
        - ReadWriteOnce
        volumeMode: Block
        resources:
          requests:
            storage: 1Ti

Requires at minimum three nodes with dedicated block devices. The operator handles Ceph cluster formation, monitors, MGRs, and OSDs.

vs Rook-Ceph

ODF IS Rook-Ceph under the hood. The difference is packaging and support: ODF is tested and supported on OpenShift, includes the NooBaa multi-cloud gateway for object storage federation, and integrates with the OpenShift UI. For self-managed Kubernetes outside OpenShift, raw Rook-Ceph is the equivalent path.

Reloader

Mon, 01 Jan 2024 00:00:00 +0000

Reloader is a Kubernetes controller from Stakater that watches ConfigMaps and Secrets and automatically triggers rolling restarts of Deployments, StatefulSets, and DaemonSets when the watched resources change. Kubernetes does not do this natively: updating a ConfigMap does not restart the pods that consume it. Values injected as environment variables stay stale until the pod is recreated, and mounted files are refreshed only after a delay and only help if the application re-reads them.

Usage

Annotate a Deployment to watch a specific ConfigMap or Secret:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  annotations:
    # Option 1: watch every ConfigMap/Secret the pod template references
    reloader.stakater.com/auto: "true"
    # Option 2 (instead of auto): name the resources to watch
    # configmap.reloader.stakater.com/reload: "my-config"
    # secret.reloader.stakater.com/reload: "my-secret"

When the ConfigMap or Secret changes, Reloader detects it and triggers a rolling restart by updating a pod template annotation. The deployment rolls out new pods that pick up the updated configuration.

Installation

helm repo add stakater https://stakater.github.io/stakater-charts
helm install reloader stakater/reloader -n reloader --create-namespace

Why not use a hash annotation manually

The common alternative is to inject a hash of the ConfigMap into the pod template annotations via Helm or Kustomize — when the hash changes, Kubernetes rolls the deployment. This works but requires build-time tooling. Reloader handles it at runtime without any changes to the deployment pipeline.
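The hash approach boils down to a deterministic checksum over the config data, written into a pod template annotation (Helm charts commonly name it `checksum/config`). A minimal sketch of the checksum step, assuming the data is serialised as sorted JSON so key order does not matter; `config_checksum` is an illustrative helper.

```python
import hashlib
import json


def config_checksum(data: dict) -> str:
    """Stable SHA-256 over the config content, independent of key order."""
    canonical = json.dumps(data, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()


before = {"LOG_LEVEL": "info", "TIMEOUT": "30"}
reordered = {"TIMEOUT": "30", "LOG_LEVEL": "info"}
after = {"LOG_LEVEL": "debug", "TIMEOUT": "30"}

print(config_checksum(before) == config_checksum(reordered))  # True: no rollout
print(config_checksum(before) == config_checksum(after))      # False: rollout
```

Any change to the annotation changes the pod template, which is what makes Kubernetes start a rolling update.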

Velero

Mon, 01 Jan 2024 00:00:00 +0000

Velero backs up and restores Kubernetes clusters. It captures both Kubernetes resource definitions (deployments, services, configmaps, secrets, CRDs) and persistent volume data, stores them in object storage (S3, GCS, Azure Blob), and can restore them to the same cluster or a different one. The primary use cases are disaster recovery, cluster migration, and namespace cloning.

How it works

Velero runs as a controller in the cluster. A Backup CR triggers a snapshot of selected resources:

apiVersion: velero.io/v1
kind: Backup
metadata:
 name: daily-backup
 namespace: velero
spec:
 includedNamespaces:
 - production
 storageLocation: default
 ttl: 720h  # 30 days

Persistent volume data is handled via storage provider snapshots (CSI snapshots, AWS EBS snapshots) or a file-system-level backup using the node-agent daemonset (formerly Restic). CSI snapshot integration is the preferred modern approach.

Scheduled backups run via a Schedule CR:

apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: daily
  namespace: velero
spec:
  schedule: "0 2 * * *"
  template:
    includedNamespaces:
    - production
    ttl: 720h
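The retention arithmetic behind these values is worth a quick check: a 720h TTL is 30 days, so a daily schedule keeps roughly 30 backups at any time. A small sketch with an arbitrary example date:

```python
from datetime import datetime, timedelta, timezone

TTL_HOURS = 720  # matches ttl: 720h above


def expires_at(created: datetime, ttl_hours: int = TTL_HOURS) -> datetime:
    """Time at which a backup created at `created` becomes eligible for deletion."""
    return created + timedelta(hours=ttl_hours)


# Example backup taken by the 02:00 UTC schedule (the date is arbitrary)
created = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)

print(timedelta(hours=TTL_HOURS).days)  # 30
print(expires_at(created).isoformat())  # 2024-01-31T02:00:00+00:00
```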

Restore

Restoring means creating a Restore CR that points at a backup; the CLI creates it for you:

velero restore create --from-backup daily-backup

Velero recreates the Kubernetes objects and restores volume data. Namespaces can be remapped on restore — useful for cloning production to staging.

Cluster migration

The standard migration pattern: back up from the source cluster, configure the destination cluster to point at the same object storage bucket, restore. Velero handles the resource recreation; DNS cutover is a separate step.

Virtualization — KVM and KubeVirt

Mon, 01 Jan 2024 00:00:00 +0000

KVM is the Linux kernel’s native hypervisor. KubeVirt extends Kubernetes to run virtual machines using KVM under the hood. They are the same virtualization layer at different levels of abstraction — KVM on bare metal, KubeVirt in a Kubernetes cluster.

KVM

Kernel-based Virtual Machine. KVM turns the Linux kernel into a hypervisor using hardware virtualization extensions (Intel VT-x, AMD-V). Virtual machines run as regular Linux processes backed by QEMU for device emulation. Managed via libvirt and its CLI tools (virsh, virt-install) or the virt-manager GUI.

# Create a VM from an ISO
virt-install \
 --name ubuntu-vm \
 --memory 4096 \
 --vcpus 2 \
 --disk path=/var/lib/libvirt/images/ubuntu.qcow2,size=40 \
 --cdrom /tmp/ubuntu.iso \
 --os-variant ubuntu22.04

# List running VMs
virsh list

# Start/stop
virsh start ubuntu-vm
virsh shutdown ubuntu-vm

# Connect to console
virsh console ubuntu-vm

KVM gives near-native performance for CPU-bound workloads. Network and disk I/O use virtio drivers for efficient paravirtualised I/O. Live migration moves a running VM between hosts without downtime if shared storage is available.

KubeVirt

KubeVirt adds VirtualMachine and VirtualMachineInstance CRDs to Kubernetes. VMs are defined as Kubernetes resources, scheduled by the Kubernetes scheduler, and managed alongside containers. Under the hood, each VM runs as a pod containing a QEMU-KVM process.

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: ubuntu-vm
spec:
  running: true
  template:
    spec:
      domain:
        devices:
          disks:
          - name: rootdisk
            disk:
              bus: virtio
        resources:
          requests:
            memory: 4Gi
            cpu: "2"
      volumes:
      - name: rootdisk
        containerDisk:
          image: kubevirt/fedora-cloud-container-disk-demo

The virtctl CLI complements kubectl for VM-specific operations:

virtctl start ubuntu-vm
virtctl stop ubuntu-vm
virtctl console ubuntu-vm # serial console
virtctl ssh ubuntu-vm # SSH via the Kubernetes API
virtctl migrate ubuntu-vm # live migrate to another node
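A live migration can also be requested declaratively, which fits GitOps-style workflows better than an imperative command. The manifest below is a minimal sketch: the migration object's name is made up, and whether a given VM is actually migratable depends on its storage and network configuration.

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstanceMigration
metadata:
  name: ubuntu-vm-migration      # hypothetical name
spec:
  # Name of the running VirtualMachineInstance to move to another node.
  vmiName: ubuntu-vm
```

Progress can then be followed with kubectl get on the migration object.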

CDI — Containerized Data Importer

KubeVirt is typically paired with CDI, which imports VM disk images from URLs, container registries, or PVCs into DataVolume resources that VMs can boot from. CDI handles the data flow; the VM definition just references the DataVolume.
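A DataVolume for an HTTP import might look like the following sketch. The resource name, the image URL, and the 40Gi size are placeholders chosen for illustration; the storage class is left to the cluster default.

```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: ubuntu-rootdisk          # hypothetical name
spec:
  source:
    http:
      url: https://example.com/images/ubuntu.qcow2   # placeholder URL
  storage:
    resources:
      requests:
        storage: 40Gi            # placeholder size
```

The VM then references it by name in place of the containerDisk volume:

```yaml
volumes:
- name: rootdisk
  dataVolume:
    name: ubuntu-rootdisk        # must match the DataVolume above
```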

Why VMs in Kubernetes

Some workloads can’t be containerised — legacy applications expecting a full OS, Windows workloads, software with kernel module requirements. KubeVirt lets those workloads live in the same cluster as containers, managed with the same tooling, subject to the same scheduling and networking policies.
