Deploy HA Kubernetes on Talos Linux Bare Metal

Architectural Blueprint

What is Talos Linux Based On?
HA Architecture & etcd Quorum
Step 1: OS Installation (IPMI)
Step 2: Generating Talos Config
Step 3: L2 VIP & VLAN Patching
Step 4: Bootstrapping & Backups
Step 5: Cilium CNI (Native L2)
Step 6: The Production Stack

Running Kubernetes in the cloud provides flexibility, but for I/O and network-heavy workloads, hypervisor overhead can impact performance. Transitioning to Bare Metal Kubernetes offers direct access to PCIe lanes and raw compute.

However, installing Kubernetes on general-purpose Linux distributions requires strict CIS compliance hardening to reduce the attack surface. Enter Talos Linux.

What is Talos Linux Based On?

A common question among engineers is, "What is Talos Linux based on?" It runs the Linux kernel, but it is not derived from a general-purpose distribution: it is an immutable, API-driven operating system designed explicitly for Kubernetes. It drastically reduces the OS-level attack surface by eliminating SSH, the shell, and package managers. Every interaction happens through a mutually authenticated (mTLS) gRPC API, normally driven with the talosctl CLI.

The API Security Reality

While Talos hardens the underlying node, it does not make your cluster invincible. The Kubernetes API remains a major attack surface: an attacker who obtains privileged API credentials controls every workload in the cluster. Real security still requires strict RBAC, Pod Security Standards, and intra-cluster mTLS.

HA Architecture & etcd Quorum

Running a single Control Plane is a lab experiment, not a production setup. The Kubernetes database (etcd) relies on a strict quorum (majority) to function. A production-grade cluster requires a minimum of 3 Control Plane nodes.

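The Quorum Risk: In a 3-node cluster, the quorum is 2. If one node fails, the cluster survives. If two nodes fail, etcd loses quorum and stops accepting writes, so the Kubernetes API can no longer change cluster state. Workloads that are already running keep running, but nothing can be scheduled, scaled, or repaired until quorum is restored.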
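
The quorum arithmetic can be sketched in a few lines of shell. This is only an illustration: the function names quorum and fault_tolerance are made up for this example, and the formula assumed is the usual majority rule, floor(N/2) + 1.

```shell
#!/bin/sh
# Majority quorum for a cluster of N voting etcd members: floor(N/2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a majority is still available.
fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

# Compare a few common Control Plane sizes.
for members in 1 3 4 5; do
  echo "members=$members quorum=$(quorum "$members") tolerated_failures=$(fault_tolerance "$members")"
done
```

For 3 members this reports a quorum of 2 and 1 tolerated failure. A 4-member cluster still tolerates only 1 failure, which is why odd Control Plane counts are the norm.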

Infrastructure & The Layer 2 VIP

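To give the API server a single, highly available endpoint, Talos can manage a shared Virtual IP (VIP): one Control Plane node holds the address at a time and announces it with gratuitous ARP. Limitation: This requires all Control Plane nodes to reside in the same Layer 2 network segment.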

When data sovereignty, compliance, or I/O latency requirements mandate North American data residency, deploying this architecture on USA Dedicated Servers provides the necessary physical Layer 2 networking capabilities without cloud routing restrictions.

3x Control Plane Nodes: (e.g., IPs: 10.10.10.11, .12, .13)
1x Private L2 VIP for API Server: (e.g., 10.10.10.100)
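
Before provisioning, it can help to sanity-check the address plan. The sketch below tests whether the example node IPs and the VIP fall inside one IP subnet. The /24 prefix length is an assumption (the plan above does not state a netmask), and the helper names ip_to_int, network_of, and same_subnet are hypothetical. Sharing an IP subnet does not prove that the hosts share a Layer 2 broadcast domain, so treat this as a check on the plan, not on the wiring.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  IFS=. read -r o1 o2 o3 o4 <<EOF
$1
EOF
  echo $(( (o1 << 24) + (o2 << 16) + (o3 << 8) + o4 ))
}

# Network address (as an integer) of IP $1 under prefix length $2.
network_of() {
  mask=$(( (0xFFFFFFFF << (32 - $2)) & 0xFFFFFFFF ))
  echo $(( $(ip_to_int "$1") & mask ))
}

# Succeeds only if every address after the prefix length shares one network.
same_subnet() {
  prefix=$1
  shift
  first=$(network_of "$1" "$prefix")
  for ip in "$@"; do
    [ "$(network_of "$ip" "$prefix")" = "$first" ] || return 1
  done
  return 0
}

# Example plan from above, with an assumed /24 prefix.
if same_subnet 24 10.10.10.11 10.10.10.12 10.10.10.13 10.10.10.100; then
  echo "all addresses share one /24"
else
  echo "addresses span more than one subnet"
fi
```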

Step 1: OS Installation via IPMI

In a true datacenter environment, writing ISOs to physical USB drives is impractical. Bare metal provisioning relies on PXE booting or remote Out-of-Band (OOB) management.

Download the Talos Linux Metal ISO from the official GitHub releases.
Log into your server's IPMI / iKVM Console.
Navigate to Virtual Media, mount the ISO, and power cycle the server.
The system will boot into Talos Maintenance Mode and await configuration over the network.

Step 2: Generating the HA Configuration

Generate the foundational machine configuration. Notice that we bind the cluster endpoint to our Private VIP (10.10.10.100).

talosctl gen config my-ha-cluster https://10.10.10.100:6443

# Generated files: controlplane.yaml, worker.yaml, talosconfig

Step 3: L2 VIP & VLAN Patching

We must configure Talos to announce the Layer 2 VIP from the Control Plane nodes. If the node currently holding the VIP dies, another Control Plane node claims the address and sends a gratuitous ARP, so neighbors update their ARP tables and API traffic fails over.

Create patch-cp.yaml. (Note: We also set the CNI to none and disable the default kube-proxy, because we will install Cilium manually as a full replacement.)

machine:
  network:
    interfaces:
      - interface: eth1 # Private-network NIC; the name may differ on your hardware
        vip:
          ip: 10.10.10.100 # The L2 Shared API Endpoint
cluster:
  network:
    cni:
      name: none # We will install Cilium manually
  proxy:
    disabled: true # Cilium will replace kube-proxy

Merge this patch with the base configuration:

talosctl machineconfig patch controlplane.yaml --patch @patch-cp.yaml -o cp-patched.yaml

Step 4: Bootstrapping & Backups

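Apply the patched configuration to all three Control Plane nodes. The --insecure flag is needed here because nodes in Maintenance Mode do not yet hold the cluster's certificates.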

talosctl apply-config --insecure --nodes 10.10.10.11 --file cp-patched.yaml
talosctl apply-config --insecure --nodes 10.10.10.12 --file cp-patched.yaml
talosctl apply-config --insecure --nodes 10.10.10.13 --file cp-patched.yaml
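
The same three commands can be wrapped in a small loop. This is a convenience sketch, not part of Talos: the function name apply_control_plane_config and the DRY_RUN switch are hypothetical, and only the talosctl apply-config invocation shown above is used.

```shell
#!/bin/sh
# Apply one machine config file to several nodes that are in Maintenance Mode.
# Set DRY_RUN=1 to print the commands instead of running them.
apply_control_plane_config() {
  config_file=$1
  shift
  for node in "$@"; do
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set and non-empty.
    ${DRY_RUN:+echo} talosctl apply-config --insecure --nodes "$node" --file "$config_file" || return 1
  done
}

# Preview the commands for the three example Control Plane nodes.
DRY_RUN=1 apply_control_plane_config cp-patched.yaml 10.10.10.11 10.10.10.12 10.10.10.13
```

Dropping DRY_RUN=1 runs the commands for real and stops at the first node that fails.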

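Once the nodes have installed and rebooted, run bootstrap exactly once, against a single node, to initialize etcd. The other Control Plane nodes join automatically. Point talosctl at the node IPs rather than the VIP: the VIP is only assigned once etcd is running, so it cannot be reached before bootstrap.

talosctl config endpoint 10.10.10.11 10.10.10.12 10.10.10.13 --talosconfig ./talosconfig
talosctl config node 10.10.10.11 --talosconfig ./talosconfig

talosctl bootstrap --talosconfig ./talosconfig
talosctl kubeconfig ./kubeconfig --talosconfig ./talosconfig
export KUBECONFIG="$(pwd)/kubeconfig"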

Day-2 Operations: etcd Disaster Recovery

Do not wait for a failure. Set up a scheduled job (for example, cron) right away that backs up your cluster state with:

talosctl etcd snapshot db.snapshot

Store these snapshots externally (e.g., S3 storage).
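
A scheduled backup usually needs timestamped file names and some retention. The following is a minimal sketch under stated assumptions: the directory /var/backups/etcd, the retention count of 14, and the function names are all hypothetical, and the upload to external storage is left to whatever tooling you already use. Only talosctl etcd snapshot comes from the steps above.

```shell
#!/bin/sh
# Hypothetical defaults; override through the environment.
BACKUP_DIR="${BACKUP_DIR:-/var/backups/etcd}"
KEEP="${KEEP:-14}"

# Build a UTC-timestamped file name so names sort chronologically.
snapshot_name() {
  echo "etcd-$(date -u +%Y%m%dT%H%M%SZ).snapshot"
}

# Delete all but the newest $2 snapshots in directory $1.
prune_snapshots() {
  ls -1 "$1"/etcd-*.snapshot 2>/dev/null | sort -r | tail -n +"$(( $2 + 1 ))" |
    while IFS= read -r old; do
      rm -f -- "$old"
    done
}

# Take one snapshot, then rotate old ones. Not called automatically here.
backup_etcd() {
  mkdir -p "$BACKUP_DIR" &&
    talosctl --talosconfig ./talosconfig etcd snapshot "$BACKUP_DIR/$(snapshot_name)" &&
    prune_snapshots "$BACKUP_DIR" "$KEEP"
}
```

To use it, call backup_etcd at the end of the script and run the script from cron, then copy the newest file to external storage in the same job.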

Step 5: Cilium CNI (Native L2 Announcements)

A common legacy practice was deploying MetalLB alongside your CNI. Modern eBPF-based CNIs like Cilium now support L2 announcements and BGP natively, so a separate load-balancer implementation is often unnecessary.

1. Install Cilium (Replacing Kube-Proxy)

helm repo add cilium https://helm.cilium.io/
helm repo update

helm install cilium cilium/cilium \
  --namespace kube-system \
  --set ipam.mode=kubernetes \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=10.10.10.100 \
  --set k8sServicePort=6443 \
  --set l2announcements.enabled=true \
  --set securityContext.capabilities.ciliumAgent="{CHOWN,KILL,NET_ADMIN,NET_RAW,IPC_LOCK,SYS_ADMIN,SYS_RESOURCE,DAC_OVERRIDE,FOWNER,SETGID,SETUID}" \
  --set securityContext.capabilities.cleanCiliumState="{NET_ADMIN,SYS_ADMIN,SYS_RESOURCE}" \
  --set cgroup.autoMount.enabled=false \
  --set cgroup.hostRoot=/sys/fs/cgroup

2. Define the IP Pool

Apply a CiliumLoadBalancerIPPool and a CiliumL2AnnouncementPolicy so that Services of type LoadBalancer receive an address from the pool and Cilium announces it on the local network.
[WARNING: The IPs below are RFC 5737 documentation examples. Replace them with your actual assigned public IP block.]

apiVersion: "cilium.io/v2alpha1"
kind: CiliumLoadBalancerIPPool
metadata:
  name: public-ip-pool
spec:
  blocks:
  - cidr: "198.51.100.8/29" # REPLACE WITH YOUR REAL IPs (a /29 is aligned on multiples of 8)
---
apiVersion: "cilium.io/v2alpha1"
kind: CiliumL2AnnouncementPolicy
metadata:
  name: default-l2-policy
spec:
  interfaces:
  - eth0
  externalIPs: true
  loadBalancerIPs: true
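
To check that the pool and policy work, you can expose a test workload through a Service of type LoadBalancer. The sketch below only prints a minimal manifest. The helper name lb_service_manifest, the Service name demo-web, and port 8080 are hypothetical, and it assumes pods labelled app=demo-web already exist.

```shell
#!/bin/sh
# Print a minimal LoadBalancer Service for pods labelled app=<name>.
# $1 = Service name and app label, $2 = container port to forward to.
lb_service_manifest() {
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: $1
spec:
  type: LoadBalancer
  selector:
    app: $1
  ports:
  - port: 80
    targetPort: $2
EOF
}

# Show the manifest for a hypothetical demo-web app listening on 8080.
lb_service_manifest demo-web 8080
```

Piping the output into kubectl apply -f - creates the Service. If the pool and policy are in effect, kubectl get svc demo-web should then list an external IP from your block.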

Step 6: The Production Readiness Stack

Your bare metal cluster is now online, highly available, and networking natively via eBPF. However, a true production environment is not complete until you deploy the Day-2 operations stack:

Ingress Routing: Deploy a Kubernetes Gateway API implementation (for example, one based on Envoy) or the NGINX Ingress Controller for proper HTTP/S traffic routing.
Certificate Management: Install cert-manager integrated with Let's Encrypt for automated TLS renewals.
Observability: You are flying blind without metrics. Deploy the Prometheus Operator, Grafana, and Cilium Hubble to monitor cluster health and network flows.

Talos Kubernetes & Bare Metal FAQ

What is Talos Linux based on?

While it utilizes the Linux kernel, Talos is not a fork of Ubuntu, Alpine, or Debian. It is an immutable, API-driven operating system designed explicitly for Kubernetes from the ground up, stripping away SSH, the shell, and package managers to ensure a minimal attack surface.

Why use Talos Linux instead of Ubuntu for Kubernetes?

General-purpose distributions like Ubuntu require extensive CIS hardening, frequent OS-level patching, and SSH key management. Talos Linux avoids configuration drift and shrinks the OS-level attack surface by being immutable and strictly API-managed, which cuts down on routine OS maintenance.

Is Talos Linux free and open source?

Yes, Talos Linux is 100% free and open-source software (FOSS). It is developed and maintained by Sidero Labs, and its source code is openly available on GitHub. You can deploy it on your own bare metal servers without any enterprise licensing fees.

Do I need a USB drive to install Talos on Bare Metal?

No. In an enterprise datacenter environment, you can mount the Talos ISO remotely using your dedicated server's IPMI / iKVM console or utilize PXE booting for automated, remote deployments without requiring physical access to the hardware.

What is the difference between talosctl and kubectl?

talosctl is the CLI tool used to manage the underlying Talos operating system (e.g., configuring networks, upgrading the OS, reading service and kernel logs). kubectl is the standard Kubernetes CLI used to manage containerized applications and cluster resources (e.g., deploying pods, managing services).

How do bare metal Kubernetes nodes communicate securely?

Kubernetes nodes should never route internal traffic over the public internet. Secure bare metal clusters route Control Plane and Worker node traffic exclusively over an isolated Private VLAN (Layer 2), which reduces exposure to external network sniffing and DDoS attacks on internal components.
