I broke my Kubernetes cluster running on Raspberry Pi

Submitted by
Style Pass
2021-05-28 08:30:17

I was playing around with updates last night, a little too much as it turns out, and caused a disaster: all nodes suddenly stopped detecting network interfaces, and recovery was impossible. This article was published on Pi Day (14/03), with all-you-can-eat code attached at the end.

As my pet cluster had grown to six nodes in total (thanks to my wife, who knows that the best thing for my birthday is a pie, a Raspberry Pi), I had a choice: start from the beginning, following my own article on setting up a Kubernetes cluster on Raspberry Pi, or do it the DevOps and SRE way and fully automate any future rebuild and cluster management.

Approach #1: Wake up early and rebuild things manually, following my own article (sic!), and maybe find some room for improvement. As I've proven to myself quite a few times, this method is prone to errors, mostly typos or a skipped step or two. I then spend the next half an hour trying to figure out what on earth happened, only to wipe everything and start over, again and again. I've done this a few times already, but after all, one more time never hurt anyone.

Approach #2: Wake up early and start coding from scratch. The ultimate solution this time is to do it once and use it forever: make the whole cluster rebuild and production as easy to reproduce as possible. This will, of course, mean more downtime. However, the benefits will be long-lasting, allowing me to stop worrying about the cluster itself and finally treat it as a stable platform to which I can migrate my whole home infrastructure.
