Due to code formatting issues on this blog, you can get this entire test-case from the same repo on my GitHub Profile.
Related articles
The links below are to be used in conjunction with this guide:
- Test-Case - MetalLB
- Test-Case - Longhorn Storage
- Test-Case - cert-manager and ingress-nginx
- K3s cluster recovery
Preface
The following guide is a simple yet effective approach to building an on-premises Kubernetes cluster using SUSE Rancher's K3s Kubernetes engine. It covers many scenarios and will work well even for on-premises production.
This guide assumes the following scenario:
- Only one master node. Considering that most on-premises datacenters use a virtualization solution like VMware vSphere or KVM with Proxmox, a VM backup solution is usually already in place, so the master VM will already be backed up.
- At least three worker nodes. Three is the recommended minimum for the Kubernetes storage module - Longhorn.
Finally, don't forget to visit the Official Documentation for more configuration or architectural design options.
Minimum Recommended Requirements
The requirements greatly depend on the project in question and the application performance expectations. This cluster has even been tested and works in a home lab, running in VirtualBox on a beefier desktop computer.
However, for small to medium teams working on a small start-up project, I would assume the following to be the bare minimum:
VM Requirements
Operating System: This has been tested on Ubuntu 22.04
For a more serious setup, each VM in use should meet the bare minimums below:
- CPU - 4x Virtual
- RAM - 8GB per VM (probably 16GB for the master node)
Storage
- Disk 40GB for the system OS volume
- Disk 100GB or more, provisioned as an LVM volume and mounted at /longhorn
Network
- Network speed - Project specific
- A reserved pool of routable IP addresses for MetalLB, from which the DHCP server will not assign any address. In this example the VMs of the cluster are in the 172.16.0.0/24 network. I decided to take a chunk of IPs by dividing it into four subnets and use the last /26 subnet for MetalLB, i.e. 172.16.0.192/26. This gives me 62 usable IP addresses for MetalLB to use for any future deployments.
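If you want to double-check the subnet math before reserving the range, the ipcalc utility (an assumption here that it is available or installable on your workstation, e.g. via apt) prints the usable host range:
$ sudo apt install -y ipcalc
$ ipcalc 172.16.0.192/26
## The HostMin/HostMax and Hosts/Net lines should show 172.16.0.193 - 172.16.0.254 and 62 hosts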
Underlying hardware/fabric requirements
As this is planned for on-premises scenarios, it is highly recommended that the underlying hypervisor has snapshot capabilities and an external backup for each VM.
VERY IMPORTANT
- It is best that all VMs in the cluster are backed up externally using a hypervisor VM backup solution. Examples would be Veeam Backup or VMware vSphere clusters.
- The master node VM must always have daily backups. This is where the ETCD database, which contains the cluster configuration, is located. The ETCD service on the master node takes its own daily snapshots at /var/lib/rancher/k3s/server/db/snapshots/
- The worker nodes do not have to be backed up; however, if the application in question uses the Longhorn storage, it should have its own backup (application specific). The Longhorn storage module itself has backup options which are out of the scope of this guide.
- Finally, once you provision the K3s master node, the node token (K3S_TOKEN), the etcd db snapshots mentioned above, and the kube config k3s.yaml file are to be considered the holy trinity of the Kubernetes cluster and should be securely backed up and kept in a safe place until you no longer need the cluster.
Lose these and you lose the cluster. You've been warned!
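Once the master node is provisioned (see further below), a minimal sketch of such a backup could be a simple copy of those three items to a machine outside the cluster. The destination backup-host:/backups/k3s/ is only a placeholder assumption - adjust it to your environment:
## ASSUMPTION: backup-host:/backups/k3s/ is a placeholder destination
$ sudo rsync -av /var/lib/rancher/k3s/server/node-token \
    /etc/rancher/k3s/k3s.yaml \
    /var/lib/rancher/k3s/server/db/snapshots \
    backup-host:/backups/k3s/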
Node details
All nodes should have their IP and DNS hostnames configured in the local DNS for proper name resolution:
k3s-master
k3s-master.tomspirit.me IN A 172.16.0.50
k3s.tomspirit.me IN CNAME k3s-master.tomspirit.me ## This is the k3s cluster URL
k3s-prometheus.tomspirit.me IN CNAME k3s-master.tomspirit.me
k3s-alertmanager.tomspirit.me IN CNAME k3s-master.tomspirit.me
k3s-grafana.tomspirit.me IN CNAME k3s-master.tomspirit.me
k3s-worker01
k3s-worker01.tomspirit.me IN A 172.16.0.51
k3s-worker02
k3s-worker02.tomspirit.me IN A 172.16.0.52
k3s-worker03
k3s-worker03.tomspirit.me IN A 172.16.0.53
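Before moving on, it is worth confirming from each node that these records actually resolve (assuming the host utility is available; getent hosts works as an alternative):
$ for h in k3s-master k3s-worker01 k3s-worker02 k3s-worker03; do host $h.tomspirit.me; done
$ host k3s.tomspirit.me ## should resolve to the master node via the CNAME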
General node preparation
This should be done on all nodes:
$ sudo ufw disable
# Used by the local-path-provisioner that comes with the cluster by default
$ sudo mkdir /local-path-provisioner
# Set the hostnames on all servers respectively
$ sudo hostnamectl set-hostname SERVER-FQDN
$ sudo apt update && sudo apt upgrade -y
$ sudo apt install -y open-iscsi nfs-common jq vim htop # Longhorn requirements and misc things
Provision the k3s master
By default k3s deploys traefik and metrics-server as part of the deployment. These will be disabled because ingress-nginx and metallb will be used instead.
$ curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.24.12+k3s1 sh -s - server \
--cluster-init \
--default-local-storage-path /local-path-provisioner \
--node-taint CriticalAddonsOnly=true:NoExecute \
--node-taint CriticalAddonsOnly=true:NoSchedule \
--tls-san 172.16.0.50 \
--tls-san k3s.tomspirit.me \
--tls-san k3s-master.tomspirit.me \
--disable traefik,metrics-server
Get the node-token, which you will need for adding the worker nodes in the Add worker nodes section below:
# Get the node-token and set the kubeconfig to be readable for everyone
$ sudo cat /var/lib/rancher/k3s/server/node-token ## Take note of the node token and use it for the $K3S_TOKEN var below
$ sudo chmod 644 /etc/rancher/k3s/k3s.yaml
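To run kubectl from the master node itself, point KUBECONFIG at the generated file. If you prefer working from your own workstation, a common approach (sketched here under the assumption that you have SSH access to the master and kubectl installed locally) is to copy the file over and swap the loopback address for the cluster URL, which is already covered by the --tls-san entries above:
## On the master node
$ export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
$ kubectl get nodes
## Or from a workstation
$ scp k3s-master.tomspirit.me:/etc/rancher/k3s/k3s.yaml ~/.kube/config
$ sed -i 's/127.0.0.1/k3s.tomspirit.me/' ~/.kube/config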
Provision the k3s worker nodes
The worker nodes will need some additional preparation, mostly for the storage part.
K3s storage preparation
The cluster by default uses the so-called local-path-provisioner with its location at /local-path-provisioner. This is good enough storage for testing. For more serious use-cases the Longhorn Kubernetes storage module will be installed below.
For this reason an additional storage partition should be prepared as an lvm ext4 volume and mounted on /longhorn. The VMs used for this had an additional virtual disk shown as /dev/sdb on each VM.
In short, the additional storage partition has been provisioned using the following commands:
## Create the PV (Physical Volume)
$ pvcreate /dev/sdb && pvdisplay /dev/sdb
## Create the VG (Volume Group)
$ vgcreate longhorn-vg /dev/sdb && vgdisplay longhorn-vg
## Create the LV (Logical Volume)
## The `--extents 38399` value has been obtained from the "Free PE" value in the `vgdisplay` output
$ lvcreate --extents 38399 -n longhorn-lv longhorn-vg && lvdisplay /dev/longhorn-vg/longhorn-lv
## Format and mount the lvm partition as ext4 file system
$ mkfs.ext4 /dev/longhorn-vg/longhorn-lv
$ mkdir /longhorn && mount /dev/longhorn-vg/longhorn-lv /longhorn && df -Th
## Update the /etc/fstab file
$ cat /etc/fstab >fstab.20230809.bak && echo '/dev/mapper/longhorn--vg-longhorn--lv /longhorn ext4 defaults 0 0' | tee -a /etc/fstab && echo -e "\n" && cat /etc/fstab
## REBOOT the system at the end to make sure everything is working normally and
## confirm the partition is mounted using `df -h` or `lsblk` commands
$ systemctl reboot
$ lsblk
#...
#... output omitted ...
#...
sdb 8:16 0 150G 0 disk
└─longhorn--vg-longhorn--lv 253:0 0 150G 0 lvm /longhorn
Add worker nodes to the cluster
Execute the command below on each of the worker nodes, but make sure to update the K3S_TOKEN variable, which was obtained from the master node in the previous step:
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.24.12+k3s1 \
K3S_URL=https://k3s.tomspirit.me:6443 \
K3S_TOKEN=VERY_LARGE_NODE_TOKEN_HERE \
sh -
Verify the cluster
The commands below should return a positive status and show a cluster with 4 nodes in total:
$ kubectl get nodes
$ kubectl get pods --all-namespaces
$ kubectl cluster-info
Install additional packages
Additional packages are important for the functionality of the whole cluster. This cluster will have the following modules added:
- cert-manager - provides SSL/TLS certificate functionality for other services
- ingress-nginx - provides L7 ingress load balancing
- metallb - provides L4 load balancing
- prometheus-monitoring - adds the Prometheus, Grafana and Alertmanager monitoring components
- Longhorn - adds Kubernetes storage capabilities
cert-manager
For more info visit:
- cert-manager documentation
- cert-manager helm repo
$ kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.12.0/cert-manager.crds.yaml
$ helm repo add jetstack https://charts.jetstack.io
$ helm repo update
$ helm install cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--version v1.12.0
Once cert-manager is installed, create a default self-signed ClusterIssuer for the whole cluster. This gives the cluster the capability to issue self-signed certificates for simple TLS:
$ cat >cert-manager-ClusterIssuer.yml <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: selfsigned-cluster-issuer
spec:
selfSigned: {}
EOF
$ kubectl apply -f cert-manager-ClusterIssuer.yml
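Once applied, you can verify that the issuer has been registered; the READY column should report True:
$ kubectl get clusterissuer selfsigned-cluster-issuer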
ingress-nginx
For more info visit:
- ingress-nginx helm repo
# nginx ingress
# https://kubernetes.github.io/ingress-nginx/deploy/
$ helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
$ helm repo update
$ helm search repo ingress-nginx --versions
$ cat >values-ingress-nginx.yml <<EOF
controller:
  ingressClass: nginx
  ingressClassResource:
    default: 'true'
  service:
    type: LoadBalancer
  admissionWebhooks:
    certManager:
      enabled: 'true'
  metrics:
    enabled: true
    prometheusRule:
      enabled: 'true'
  config:
    allow-snippet-annotations: 'true'
    ssl-redirect: 'false'
    hsts: 'false'
EOF
$ helm install ingress-nginx ingress-nginx/ingress-nginx \
--namespace ingress-nginx --create-namespace \
--version 4.7.1 \
--values values-ingress-nginx.yml
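After the install, check that the controller pod is running and that a LoadBalancer service has been created. Its EXTERNAL-IP will stay in <pending> until MetalLB is installed and configured further below:
$ kubectl -n ingress-nginx get pods
$ kubectl -n ingress-nginx get svc ingress-nginx-controller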
NOTE:
The cert-manager and ingress-nginx test in the additional section below is actually done automatically during the prometheus-monitoring section, as it creates self-signed certificates for the prometheus stack and implements them in ingress-nginx. It can still be used, however, to test an internal PKI implementation for a separate namespace.
Additional section:
Test ingress-nginx and cert-manager self signed certificates
Prometheus monitoring
For more info visit:
- prometheus-community helm repo
- prometheus-community github repo
In my experience this is the most tedious and complex part of the setup, mainly because the prometheus stack is actually three or more products (depending on your settings) bundled into one helm chart.
First create the monitoring namespace and the TLS Certificate which will be used by the prometheus stack:
$ cat >monitoring-namespace.yml <<EOF
apiVersion: v1
kind: Namespace
metadata:
name: monitoring
EOF
$ kubectl apply -f monitoring-namespace.yml
$ cat >prometheus-stack-cert.yml <<EOF
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: prometheus-stack-certificate
namespace: monitoring
spec:
secretName: prometheus-stack-certificate-rsa-secret
duration: 87600h #10y
renewBefore: 3600h
subject:
organizations:
- tomspirit.me
isCA: False
privateKey:
#algorithm: ECDSA
#size: 256
algorithm: RSA
encoding: PKCS1
size: 2048
usages:
- server auth
- client auth
dnsNames:
- k3s-prometheus.tomspirit.me
- k3s-alertmanager.tomspirit.me
- k3s-grafana.tomspirit.me
issuerRef:
name: selfsigned-cluster-issuer
kind: ClusterIssuer
group: cert-manager.io
EOF
$ kubectl apply -f prometheus-stack-cert.yml
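Before continuing, it is worth checking that the certificate has actually been issued and that the secret referenced by secretName exists:
$ kubectl -n monitoring get certificate prometheus-stack-certificate
$ kubectl -n monitoring get secret prometheus-stack-certificate-rsa-secret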
To install the prometheus-stack you will need to properly configure the .values file. If you are too lazy to do that yourself, I completely understand, and to counter that I also have this whole project on my GitHub profile, so you can take the values-kube-prometheus-stack.yml from there.
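For orientation, the Grafana-related part of that values file roughly follows the pattern below. This is only an illustrative sketch assuming the standard kube-prometheus-stack value keys, not the complete file; prometheus and alertmanager use the same ingress/TLS pattern with their respective hostnames:
grafana:
  adminPassword: 'CHANGE-ME'   ## assumption: choose your own password
  ingress:
    enabled: true
    ingressClassName: nginx
    hosts:
      - k3s-grafana.tomspirit.me
    tls:
      - secretName: prometheus-stack-certificate-rsa-secret
        hosts:
          - k3s-grafana.tomspirit.me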
Install the prometheus-stack using helm:
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
$ helm install kube-prometheus-stack \
-f values-kube-prometheus-stack.yml \
prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--version 46.8.0
To test the implementation, once all of the pods have been successfully deployed, browse to https://k3s-grafana.tomspirit.me and log in with the admin username and the password set in the adminPassword value in the values-kube-prometheus-stack.yml file.
Longhorn storage install
For more info visit:
$ curl -sSfL https://raw.githubusercontent.com/longhorn/longhorn/v1.4.1/scripts/environment_check.sh | bash
$ helm repo add longhorn https://charts.longhorn.io
$ helm repo update
$ helm install longhorn longhorn/longhorn \
--namespace longhorn-system --create-namespace \
--version 1.4.2
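The deployment takes a few minutes; you can watch the components come up with:
$ kubectl -n longhorn-system get pods --watch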
Longhorn Configuration
The configuration includes adjusting some settings from the UI as well as configuring additional storage classes to accommodate particular usage scenarios.
The default class longhorn
can be used too, but it shouldn't be altered, as per their documentation.
Accessing the Longhorn UI
To access the Longhorn UI it is best if you use Lens and create a port forward on the longhorn frontend service.
- From the Lens UI select Network -> Services.
- From the top right Namespace drop-down menu select longhorn-system.
- From the list of services locate the longhorn-frontend service and click on it.
- On the right panel that opens, at the Connection section there is the Ports sub-section indicating 80:http/TCP. Click on the Forward button to enable a port forward to the interface.
- Once you are finished working, click Stop/Remove on the same button, or delete the port forwarding object from the Network -> Port Forwarding section.
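If you prefer not to use Lens, the same result can be achieved with a plain kubectl port forward (a quick sketch; any free local port will do):
$ kubectl -n longhorn-system port-forward service/longhorn-frontend 8080:80
## Then browse to http://localhost:8080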
Initial configuration
From the Longhorn UI the system has been configured with /longhorn as its main storage device. This location is mounted on a separate LVM volume from within the system.
The default storage device /var/lib/longhorn has been disabled, as it resides on the / (root) location of the system.
For each of the nodes, at the Node section of the UI, the following disk configuration has been set:
- The default disk (usually named default-disk-fd0100000), with path /var/lib/longhorn: scheduling set to DISABLED and the label do-not-use added, for future reference.
- The additional LVM partition prepared earlier, added as Longhorn disk-1 with path /longhorn: scheduling set to ENABLED, the label LVM added and 25G of storage reserved.
Best practices
The following is just a selection from the Longhorn Best Practices in their official documentation.
- Replica Node Level Soft Anti-Affinity: FALSE
- Allow Volume Creation with Degraded Availability: FALSE
Creating additional storage classes
More information on the settings applied can be found at:
$ cat >longhorn-storage-classes.yml <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: longhorn-best-effort-reclaim-delete
annotations:
storageclass.kubernetes.io/is-default-class: 'false'
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
dataLocality: best-effort
fromBackup: ''
fsType: ext4
numberOfReplicas: '3'
staleReplicaTimeout: '2880'
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: longhorn-best-effort-reclaim-retain
annotations:
storageclass.kubernetes.io/is-default-class: 'false'
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Retain
volumeBindingMode: Immediate
parameters:
dataLocality: best-effort
fromBackup: ''
fsType: ext4
numberOfReplicas: '3'
staleReplicaTimeout: '2880'
EOF
$ kubectl apply -f longhorn-storage-classes.yml
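A quick check should now show the two new classes alongside the default longhorn class:
$ kubectl get storageclass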
Additional section:
Test Longhorn volume provisioning
MetalLB
For more info visit:
Label the node and generate the metallb
values file:
$ kubectl label node k3s-master.tomspirit.me metallb-controller=true
$ cat >values-metallb.yml <<EOF
loadBalancerClass: "metallb"
controller:
nodeSelector:
metallb-controller: "true"
tolerations:
- key: CriticalAddonsOnly
operator: Exists
effect: NoExecute
- key: CriticalAddonsOnly
operator: Exists
effect: NoSchedule
speaker:
frr:
enabled: false
EOF
Deploy metallb using their helm chart:
$ helm repo add metallb https://metallb.github.io/metallb
$ helm repo update
$ helm install metallb metallb/metallb --namespace metallb-system --create-namespace -f values-metallb.yml --version 0.13.10
Configure the IP address pool and the L2 advertisement | Reference
$ cat >IPAddressPool.yml <<EOF
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
name: dev-pool
namespace: metallb-system
spec:
addresses:
- 172.16.0.192/26
avoidBuggyIPs: false
autoAssign: true
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
name: default-l2-advertisement
namespace: metallb-system
EOF
$ kubectl apply -f IPAddressPool.yml
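Finally, confirm that MetalLB is running and that the pool and advertisement have been created. Keep in mind that the values file above sets loadBalancerClass: "metallb", so only LoadBalancer Services that request that class will be assigned an address from this pool:
$ kubectl -n metallb-system get pods
$ kubectl -n metallb-system get ipaddresspools,l2advertisements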