Installing Talos on a Hetzner dedicated server
When you rent a bare-metal server such as the EX44 from Hetzner (the dedicated offering, not the Hetzner Cloud service), you get access to a real machine, with all its quirks.
I was trying to install Talos OS v1.7.0 to get an immutable OS for my small bare-metal cluster, and it just didn’t work. The machine simply wouldn’t boot, so I even had a technician write the ISO to a USB stick, plug it in, and enable a KVM switch for me so that I could see what was happening. The only message on the screen was the following:
EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path
The same setup worked in the vKVM rescue system (it loads your whole machine as a VM and gives you screen access), so there wasn’t anything wrong with the image.
After a few days of digging into possible causes of Linux boot problems, I found that in some cases the kernel needs more information about where to print its logs. Some setups therefore require extra kernel cmdline parameters that configure the kernel console, like console=ttyS0,9600, as netboot.xyz instructs for Oracle Cloud.
Talos has an Image Factory where you can generate Talos images with extra options, like additional kernel parameters. Using console=ttyS0,9600 didn’t work, though.
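To see which console arguments a running Linux system was actually booted with, you can look at /proc/cmdline. The following is a minimal sketch, not part of the original procedure; the function name list_console_args is made up for illustration.

```shell
#!/bin/sh
# Print every console= argument found in a kernel command line string,
# one per line. Prints nothing if there is none.
# list_console_args is a hypothetical helper name.
list_console_args() {
    # $1: kernel command line, e.g. the contents of /proc/cmdline
    # $1 is intentionally unquoted so that it is split on whitespace.
    for arg in $1; do
        case "$arg" in
            console=*) printf '%s\n' "$arg" ;;
        esac
    done
}

# Example (on a running Linux system such as the rescue system):
#   list_console_args "$(cat /proc/cmdline)"
```

If this prints nothing, the kernel was started without an explicit console argument and picked its own default.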
What I ended up doing was the following:
-
Log into the rescue system (pick the Linux option so that you get SSH access).
-
Query the ttys:
ls -al /dev | grep tty
For me, this didn’t return any ttyS* devices. There were a lot of plain tty devices, including tty0. That is likely because the machine doesn’t have a hardware serial port; all the ttys are virtual. (The ttyS* devices are serial ports, while the plain tty devices are virtual consoles.) If you do see ttyS0, you likely want to use that one.
-
Go to the Image Factory and add the following as an additional kernel parameter:
console=tty0
-
Download the raw disk image (metal-amd64.raw.xz) in the rescue system:
cd /tmp
wget <disk url>
-
List all the disks and choose the appropriate one.
lsblk -f
-
You will write to the whole disk, since the disk image contains all the partition information as well. This overwrites everything on that disk. In my case, I had two disks, /dev/nvme0n1 and /dev/nvme1n1. I chose /dev/nvme0n1.
xz -d -c /tmp/metal-amd64.raw.xz | dd of=/dev/<your disk> status=progress && sync
-
Now go back to the UI and initiate a hardware reset.
-
After a while, confirm that it booted with the following command from your local machine:
talosctl -n <Machine IP Address> disks --insecure
You should see output like the following:
DEV            MODEL                        SERIAL           TYPE   UUID   WWID                   MODALIAS   NAME   SIZE     BUS_PATH                                                   SUBSYSTEM          READ_ONLY   SYSTEM_DISK
/dev/nvme0n1   SAMSUNG MZVL2512HCJQ-00B00   S675NL0W675607   NVME   -      eui.002538b631a62b48   -          -      512 GB   /pci0000:00/0000:00:01.0/0000:01:00.0/nvme/nvme0/nvme0n1   /sys/class/block               *
/dev/nvme1n1   SAMSUNG MZVL2512HCJQ-00B00   S675NL0W675614   NVME   -      eui.002538b631a62b4f   -          -      512 GB   /pci0000:00/0000:00:06.0/0000:02:00.0/nvme/nvme1/nvme1n1   /sys/class/block
Note that if you see QEMU HARDDISK, then Talos OS has booted inside the vKVM system. You need to restart the machine so that it leaves the vKVM rescue system.
-
Make sure to have the same additional kernel cmdline parameter in your Talos machine config:
machine:
  install:
    extraKernelArgs:
      - console=tty0
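The rescue-system steps above could be wrapped in a couple of small shell helpers. This is only a sketch under assumptions: the function names pick_console_arg and write_image are hypothetical, the ttyS0-else-tty0 rule simply mirrors what worked for me, and the safety checks are additions that were not part of my original commands.

```shell
#!/bin/sh
# Choose a console= kernel argument from the device nodes in a directory.
# Hypothetical helper: prefers ttyS0 when it exists, otherwise falls back
# to tty0. Add a baud rate (e.g. ",9600") yourself if your setup needs one.
pick_console_arg() {
    dev_dir="${1:-/dev}"
    if [ -e "$dev_dir/ttyS0" ]; then
        printf 'console=ttyS0\n'
    else
        printf 'console=tty0\n'
    fi
}

# Decompress an .xz disk image and write it to a whole disk.
# Refuses to run unless the image exists and the target is a block device,
# because dd will happily overwrite whatever it is pointed at.
write_image() {
    image="$1"
    target="$2"
    if [ ! -f "$image" ]; then
        echo "image not found: $image" >&2
        return 1
    fi
    if [ ! -b "$target" ]; then
        echo "not a block device: $target" >&2
        return 1
    fi
    xz -d -c "$image" | dd of="$target" status=progress && sync
}

# Example (in the rescue system; double-check the disk with lsblk -f first):
#   pick_console_arg /dev
#   write_image /tmp/metal-amd64.raw.xz /dev/nvme0n1
```

The block-device check will not stop you from picking the wrong disk, but it does catch typos in the target path before dd creates a regular file under /dev.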
It turns out that some single-board computers, like the Raspberry Pi, need this setting as well, and Talos handles them with an overlay that passes this argument in.
Follow @muvaffakonus on Twitter.