Previously I talked about setting up OpenBGPD on OpenBSD to provide high availability by taking advantage of Vultr's BGP feature and reserved IPs. This post will cover some practical considerations to bear in mind when doing so, and some of the ways I have dealt with various problems posed by this setup. This is not a how-to guide for a full setup of any of these ideas; it's just enough to get you started.
The first issue is that SSH will listen on all IP addresses, because the default setting for "ListenAddress" is "0.0.0.0". This means SSH is also going to listen on the BGP-announced floating IP. That is not necessarily harmful, since it is no less secure than sshd listening on any other address, but you will have a problem when you SSH to your domain while its A record points at the floating IP. The connection is routed via anycast with ECMP, so you could land on any of your instances, and you don't control which one. The next time you connect, you will likely be sent to a different instance, and host key verification will complain that the key has changed.
Instead, have SSH listen only on each instance's assigned public IP, and use your local ~/.ssh/config to make connections to each host easier to manage. On my client that file has something like this:
Host kp1
    Hostname 1.2.3.4
    IdentityFile ~/.ssh/id_ecdsa
    User joe
    Port 4000

Host kp2
    Hostname 5.6.7.8
    IdentityFile ~/.ssh/id_ecdsa
    User joe
    Port 4000
I can just add a new block if I add a new instance to the cluster.
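On the server side, keeping sshd off the floating IP just means pinning it to the instance's own address. On kp1 that's a couple of lines in /etc/ssh/sshd_config along these lines, using the same placeholder address and port as above:

# Listen only on this instance's assigned public IP, not the BGP floating IP
Port 4000
ListenAddress 1.2.3.4

A quick "rcctl restart sshd" afterwards picks up the change.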
The next issue is keeping all of the website files in sync. As mentioned previously, I keep the files for this site in a GitHub repo, and httpd's root directory, /var/www/htdocs/kernelpanic.life, is just a checkout of that repo. If I add a post or make any changes, I push to GitHub, then SSH to the instances and broadcast a "git pull" to all of them to fetch the newest changes. There is also a line in /etc/daily.local on each instance that does a git pull, as a safety net for when I forget to broadcast, or when I don't want changes to go live right away but don't want to forget about them either.
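That /etc/daily.local entry is nothing fancy; a sketch of what it can look like, assuming the repo is checked out directly at the httpd root above:

# Pull the latest site content every night (subshell so the cwd change doesn't leak)
(cd /var/www/htdocs/kernelpanic.life && git pull -q)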
A trickier problem is handling Let's Encrypt TLS certificates and their renewals. Simply running acme-client with HTTP-01 challenges works great for a regular httpd setup. An overly simplified version of HTTP-01 is this: Let's Encrypt hands you a token, acme-client puts a file containing that token and a fingerprint of your account key in the /.well-known/acme-challenge/ location served by your web server, and Let's Encrypt then fetches that file over HTTP. If it's there, validation succeeds.
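On a single machine that is only a few lines of httpd.conf, roughly like this (based on the stock OpenBSD examples, with the paths from this site):

server "kernelpanic.life" {
    listen on * port 80
    root "/htdocs/kernelpanic.life"
    location "/.well-known/acme-challenge/*" {
        root "/acme"
        request strip 2
    }
}

acme-client drops the token file under /var/www/acme, httpd (chrooted to /var/www) serves it from that location, and the validation passes.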
The anycast setup breaks this, because Let's Encrypt's request for that file needs to land on the same instance that initiated the validation and wrote the token, and that is not guaranteed with ECMP routing. If you were using BGP prepends to have a primary and a backup pair of servers, you could get around this by doing the ACME challenge on the primary and then syncing its certs to the backup, but that doesn't really work with ECMP.
Instead, I created another server that handles certificate issuance and renewal with acme.sh using DNS-01 validation. DNS-01 proves you control the domain by creating a DNS TXT record, which acme.sh can automate for a number of DNS providers. In my case I'm using Vultr's DNS service, so acme.sh uses the Vultr API to add the challenge TXT record. I created a sub-user on my Vultr account that only has DNS privileges and use that sub-user's API key, which is further restricted to just the IP address of the ACME instance. A line like this in /etc/monthly.local renews the certs every month:
acme.sh --renew --force --dns dns_vultr -d kernelpanic.life
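For the initial issue, the sub-user's API key is handed to acme.sh through the environment; the dns_vultr hook reads it from VULTR_API_KEY and saves it for later renewals, so the first run is something like:

# One-time issue; the value below stands in for the DNS-only sub-user's API key
export VULTR_API_KEY="xxxxxxxxxxxx"
acme.sh --issue --dns dns_vultr -d kernelpanic.life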
The advantage of using DNS-01 is that the instance adding the DNS records doesn't have to be part of the cluster and doesn't need the domain pointing at its IP. Each instance in the cluster then uses SCP to copy the cert and key from this instance into their proper local locations, which is also automated on them with /etc/monthly.local.
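On the cluster side, that /etc/monthly.local is just a pull-and-reload. A sketch, with a made-up hostname and cert paths standing in for the real ones, and key-based SSH to the ACME instance assumed:

# Pull the renewed cert and key from the acme.sh instance, then reload httpd
scp acme@acme1.internal:/etc/ssl/kernelpanic.life.fullchain.pem /etc/ssl/kernelpanic.life.crt
scp acme@acme1.internal:/etc/ssl/private/kernelpanic.life.key /etc/ssl/private/kernelpanic.life.key
rcctl reload httpd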
Doing it this way makes it easy to add new servers to the cluster: the certs are handled centrally, so getting them onto a new instance and keeping everything in sync is simple. It does create a single point of failure, since if the acme.sh server goes down the certs won't get renewed and the cluster has nowhere to fetch them from. But there is enough lead time before expiration that I could fix whatever goes awry, or just deploy a new acme.sh instance from a snapshot if necessary.
The last hurdle is logging. Visitor logs for the site are spread across the /var/www/logs/access.log files on each of the instances, so if I wanted to see where traffic was coming from, or whether there was a problem with one of the web servers, I would have to SSH into each instance. This doesn't scale well.
One obvious way to handle this is to spin up a new instance to act as a log collector and have a line in /etc/weekly.local on each instance SCP its logs over. Another would be to use some sort of object storage, like Amazon S3 or Vultr Object Storage, and do essentially the same thing, sending the logs to the storage service with s4cmd or similar. I haven't cared enough about checking logs to do either of these at the moment. I am aware that this is shortsighted.
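If I ever do go the collector route, the instance side would just be one more cron-driven copy; a rough sketch, with a made-up collector host:

# /etc/weekly.local: ship this instance's access log to the collector, tagged by hostname
scp /var/www/logs/access.log logs@collector.internal:/var/log/httpd/$(hostname -s).access.log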