
Known issues

Vulnerabilities

In early February 2020 a number of vulnerabilities were found, as documented on the openbsd-tech list. The TLB flush issue remained unaddressed for a long time:

The TLB handling of guest pages is broken, in that the INVEPT
instructions in the host could be issued on the wrong CPUs. This means
that if UVM decides to swap out a guest page, the guest could still
access it via stale TLB entries. On AMD CPUs, there is no TLB handling
at all (??).

This means your VM and its data were potentially at risk during this time.

We worked together with Mike Larkin on a patch, which was deployed on 2020-11-02 during the upgrade from OpenBSD 6.7 to 6.8. Now all VM memory is hardwired.

The patch was committed to -current on 2021-01-23. As of 2021-05-01 the patch is part of -release.

Connectivity

With the release of OpenBSD 6.9 we changed the networking configuration from vether(4)/bridge(4) to vport(4)/veb(4). We are seeing good results with regard to stability, and it looks like the workarounds below are no longer needed.

Your VM may suffer from connectivity loss, latency and/or packet loss. This seems to happen a lot less since the last release, but as a workaround you can run ping from cron(8) like:

*/5       *       *       *       *       -n ping -c3 <IPv4 gateway>

-or-

*/1       *       *       *       *       -n ping -c3 <IPv4 gateway>

When it’s more severe you can run ping continuously inside tmux, started from cron(8), like:

@reboot /usr/bin/tmux new -d 'while true; do /sbin/ping -i5 <IPv4 gateway>; done' \;

Or for IPv6:

@reboot /usr/bin/tmux new -d 'while true; do /sbin/ping6 -i5 <IPv6 gateway>; done' \;

You can find your gateway by using route(8). For example:

vm03$ route -n show | awk '/default/{print $2}'
46.23.92.1
2a03:6000:921::1
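
If you prefer not to copy the addresses by hand, the lookup and the crontab line can be combined. The following is a minimal sketch, not something we ship: the function names default_gateways and cron_lines are hypothetical, and it assumes the gateway is the second column of each "default" line, as in the output above. IPv6 addresses are given ping6, matching the tmux example.

```shell
#!/bin/sh
# Print the gateway column of every default route found on stdin
# (expects the output of `route -n show`).
default_gateways() {
    awk '$1 == "default" { print $2 }'
}

# Turn each gateway address on stdin into a crontab line.
# Addresses containing ':' are treated as IPv6 and get ping6.
cron_lines() {
    while read -r gw; do
        case $gw in
            *:*) cmd=ping6 ;;
            *)   cmd=ping ;;
        esac
        printf '*/5\t*\t*\t*\t*\t-n %s -c3 %s\n' "$cmd" "$gw"
    done
}
```

Running route -n show | default_gateways | cron_lines prints one candidate line per gateway; review the output before adding it with crontab -e.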

Unresponsive VM

Your VM might become unresponsive, in which case only “kill -9” helps. Since all VMs run as root, a normal user cannot do this, so we added pkill to doas(1) so that you can.

The doas.conf(5) entry we are using is:

permit nopass vm-owner as root cmd pkill args -9 -xf "vmd: vm-name"

The command you need to issue is something like:

vm03$ doas pkill -9 -xf "vmd: vm03"

High CPU interrupts

VMs have a constant high intr CPU state in top(1):

CPU states: 0.0% user, 0.0% nice, 0.1% sys, 0.0% spin, 98.0% intr, 1.9% idle

This is an accounting error.

Clock

When you see your clock drifting, you can check whether the MHz assigned to your VM is close to that of the host. On the host:

server10$ dmesg | awk '/^cpu0:.*Hz/'
cpu0: Intel(R) Xeon(R) CPU X5690 @ 3.47GHz, 3591.43 MHz, 06-2c-02

On the VM:

vm03$ dmesg | awk '/^cpu0:.*Hz/'
cpu0: Intel(R) Xeon(R) CPU X5690 @ 3.47GHz, 3466.79 MHz, 06-2c-02
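
To put a number on the difference, the two cpu0 lines can be compared with a little awk. This is a rough sketch under stated assumptions: the function names cpu_mhz and mhz_diff_pct are hypothetical, and the parsing assumes the measured speed is the field directly before "MHz," as in the lines above.

```shell
#!/bin/sh
# Extract the measured MHz value from a cpu0 dmesg line on stdin.
cpu_mhz() {
    awk '/^cpu0:.*MHz/ {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^MHz/) { print $(i - 1); exit }
    }'
}

# Print the absolute difference between host ($1) and VM ($2) MHz
# as a percentage of the host value, rounded to one decimal.
mhz_diff_pct() {
    awk -v h="$1" -v v="$2" 'BEGIN {
        d = h - v
        if (d < 0) d = -d
        printf "%.1f\n", d * 100 / h
    }'
}
```

With the values shown above, mhz_diff_pct 3591.43 3466.79 prints 3.5, meaning the VM was measured about 3.5% slower than the host.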

When the difference is substantial, a reboot of the VM might help. If the clock drift remains severe, run rdate(8) from cron(8):

*/5       *       *       *       *       /usr/sbin/rdate -s pool.ntp.org