A budget yet powerful k8s cluster.
Last year I became interested in Kubernetes. The administration side of the tool is what interests me more, but I also have a lot of fun using a k8s cluster for self-hosted applications.
A few months ago I tried some managed services (mainly from Scaleway and DigitalOcean) to check how my setup would behave in a production environment. My homelab cluster has 3 masters, 8 powerful workers (although they are all VMs on a single machine) and MetalLB. I can run a lot of containers there without any performance issues, and I wanted to reproduce that experience in a (somewhat) production environment.
I decided to rent some cheap dedicated servers and set up a cluster myself: a load balancer in front of the ingress controller, persistent storage on the nodes' drives and at least 4 worker nodes.
The project is in progress. First updates soon.
10th December 2024
I have finally finished the first phase of the project. I have created a set of Ansible roles that define:
- Base server configuration - installation of all the tools needed for manual interventions and configuration changes,
- Basic Kubernetes node - configuration common to masters and workers,
- Kubernetes master node,
- Kubernetes worker node,
and an Ansible playbook that initializes the cluster with locally signed certificates. This was a very important part, and it took me some time to debug issues with certificate common names, kubeconfigs, etc.
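To give a taste of the certificate part, here is a minimal sketch of what such tasks can look like with the community.crypto modules. The paths, common name and SANs below are assumptions for the example, not my exact configuration:

```yaml
# Hypothetical excerpt from the certificate tasks. Paths, the common
# name and the SANs are illustrative; real values depend on the cluster.
- name: Generate a private key for the kube-apiserver certificate
  community.crypto.openssl_privatekey:
    path: /etc/kubernetes/pki/apiserver.key
    type: RSA
    size: 2048

- name: Create a CSR with the names the API server must present
  community.crypto.openssl_csr:
    path: /etc/kubernetes/pki/apiserver.csr
    privatekey_path: /etc/kubernetes/pki/apiserver.key
    common_name: kube-apiserver
    subject_alt_name:
      - "DNS:kubernetes"
      - "DNS:kubernetes.default.svc"
      - "IP:10.96.0.1"

- name: Sign the certificate with the locally managed CA
  community.crypto.x509_certificate:
    path: /etc/kubernetes/pki/apiserver.crt
    csr_path: /etc/kubernetes/pki/apiserver.csr
    provider: ownca
    ownca_path: /etc/kubernetes/pki/ca.crt
    ownca_privatekey_path: /etc/kubernetes/pki/ca.key
```

Getting the common names and SANs right was exactly the kind of thing that required the debugging mentioned above.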
I'm preparing a blog post about this part, but first I have to finish tidying up the task files in the roles. I have separate files for different aspects (e.g. generating private keys and certificate signing requests, installing additional software, etc.), but not everything is well organised yet.
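The layout I'm aiming for is roughly the standard Ansible pattern of a thin tasks/main.yml that includes one file per aspect; the file names below are hypothetical:

```yaml
# Hypothetical tasks/main.yml of a role; each aspect lives in its own
# file, and the file names here are illustrative.
- name: Generate private keys and certificate signing requests
  ansible.builtin.include_tasks: certificates.yml

- name: Install additional software
  ansible.builtin.include_tasks: packages.yml

- name: Write kubeconfigs for the control plane components
  ansible.builtin.include_tasks: kubeconfigs.yml
```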
Next steps for this stage of the project:
- Preparing a kubernetes-storage role to set up Rook Ceph. I have been testing a few storage solutions for Kubernetes and this one looks the most promising to me (see the StorageClass sketch after this list).
- Adapting my kubernetes-master role to a multi-master configuration.
- Roles for load balancers. Currently I'm using HAProxy as the multi-master load balancer, so this role will probably just install the package, fill in the configuration and apply it on the LB server (a sketch also follows this list).
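To make the storage step concrete, below is a minimal sketch of the kind of StorageClass the kubernetes-storage role would eventually create. The names (rook-ceph namespace, replicapool pool, secret names) follow Rook's example manifests and are assumptions about the final setup:

```yaml
# Hypothetical StorageClass for RBD-backed volumes, trimmed from
# Rook's example manifests; namespace and pool names are assumptions.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
```

And a sketch of what the load balancer role could amount to: install HAProxy, render its configuration with the master nodes as TCP backends for the kube-apiserver, and reload the service. The inventory group name and paths are illustrative:

```yaml
# Hypothetical tasks for the load balancer role; the inventory group
# name (kube_masters) and template path are assumptions.
- name: Install HAProxy
  ansible.builtin.apt:
    name: haproxy
    state: present

- name: Render HAProxy configuration with the masters as backends
  ansible.builtin.template:
    src: haproxy.cfg.j2
    dest: /etc/haproxy/haproxy.cfg
  notify: Reload haproxy  # assumes a matching handler in the role

# haproxy.cfg.j2 could contain, roughly:
#   frontend kube-apiserver
#       bind *:6443
#       mode tcp
#       default_backend masters
#   backend masters
#       mode tcp
#       {% for host in groups['kube_masters'] %}
#       server {{ host }} {{ hostvars[host].ansible_host }}:6443 check
#       {% endfor %}
```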
May 2025
I put the project on hold. I didn't have enough time to develop the solution, so to stop paying the monthly hardware fees, I shut the cluster down. I'm going to continue in the future using Scaleway's bare metal servers, or even VPSes with an hourly billing model.
New approach with a different server provider - September 2025
For the last two months I have been deploying my tools on a managed Kubernetes cluster on Scaleway. Although everything works there, it's quite a boring setup.
I think the idea of setting up my own cluster on dedicated hardware was fine, but I had to pay installation fees for the servers, and when I didn't have time to play with the cluster, I still had to pay for it. Another painful thing was that this setup wasn't good enough to run Rook Ceph storage efficiently: some servers were in Poland, some in France, and the French ones were additionally spread between two locations. Without a good network link between them, storage was quite slow.
I knew Hetzner offers a wide range of reasonably priced dedicated servers, so I started to look more closely at this provider's offer. I found that they provide a cloud controller manager and a CSI driver for their cloud-based block storage. Unfortunately, it is not possible to use the cloud storage on dedicated servers.
I was skeptical about their cloud offer (as I mentioned earlier, I think most cloud providers' offers are expensive), so I was surprised to find that I could buy instances similar to those offered by Scaleway at a much lower price. For example, Scaleway's DEV1-L (4 shared vCPUs and 8 GiB of RAM) costs 34.44 EUR per month, while a similar configuration (4 vCPUs and 8 GiB of RAM) costs 6.30 EUR per month on Hetzner. Before you start telling me that a vCPU from one provider may not equal a vCPU from another: I know, but I wasn't going to compare that now. We will see how it handles the workload. Maybe one day I'll run some benchmarks, although that will be hard to do reliably, because shared vCPUs will certainly behave differently at different moments. More likely I'll simply verify whether it's enough for my needs.
So what's the current status? I created 5 instances: three with 2 vCPUs and 4 GiB of RAM (one master and two workers) and two with 4 vCPUs and 8 GiB of RAM (also workers). I updated my Ansible roles to initialize the cluster the way the Hetzner cloud controller manager and CSI driver require, and it seems to work. ArgoCD also bootstraps all the base components, but this still requires a bit more work. Stay tuned.
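For context, the main thing the Hetzner cloud controller manager asks for during initialization is that the kubelet starts with --cloud-provider=external (it also needs an API token stored in a Secret in kube-system). Here is a minimal sketch of how that can look in a kubeadm configuration; treat the fragment as an assumption about my setup, not the exact file I use:

```yaml
# Hypothetical kubeadm fragment: the Hetzner CCM expects the kubelet
# to run with --cloud-provider=external so it can initialize nodes.
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    cloud-provider: external
```

Until the CCM is installed and has labeled the nodes, they stay tainted as uninitialized, which is why this flag has to be baked into the Ansible roles rather than added afterwards.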