Live migration is a process during which a running Virtual Machine Instance moves to another compute node while the guest workload continues to run and remains accessible.
Limitations of Live Migration with KubeVirt VM Networking
When working with KubeVirt virtual machines, live migration presents some networking limitations, particularly when using specific network interface types.
- Bridge Interface Limitation: According to the official KubeVirt documentation, live migration cannot be performed when the virtual machine (VM) uses a bridge network binding. This limits flexibility in scenarios where VMs are connected directly to physical or external networks through a bridge.
- Masquerade Interface Challenge: While the masquerade interface allows live migration, it introduces a separate issue: the VM cannot ping itself or other VMs. This lack of internal connectivity can be a problem in deployments where VMs need direct network reachability for certain applications. A sketch of how the two bindings are declared follows this list.
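For reference, here is a minimal sketch of how the two bindings are declared under a VirtualMachine's spec.template.spec; the interface and network name "default" is illustrative, and only one binding would be used at a time:
domain:
  devices:
    interfaces:
      - name: default
        bridge: {}         # bridge binding: exposes the pod IP to the VM, but upstream KubeVirt blocks live migration
        # masquerade: {}   # masquerade binding: migratable, but the VM sits behind NAT inside the pod
networks:
  - name: default
    pod: {}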
Motivation
KubeVirt integration requires live-migration functionality over the network when Kube-OVN provides the pod's default network.
User-Stories/Use-Cases
As a KubeVirt user, I need to migrate a virtual machine that uses bridge binding with minimal disruption and without affecting other virtual machines.
Solution Using Kube-OVN
To overcome these limitations, you can use Kube-OVN's live migration support, which allows seamless live migration of VMs on the cluster's default network even when they use bridge binding. With the appropriate annotation, a VM can be migrated across nodes without losing network connectivity, minimizing disruption to the workload.
This feature is always enabled; it is triggered when a VM uses bridge binding and is annotated with kubevirt.io/allow-pod-bridge-network-live-migration: "".
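As the full example below shows, the annotation belongs on the VM's pod template (spec.template.metadata.annotations) so that it ends up on the virt-launcher pod, where Kube-OVN evaluates it; the minimal fragment looks like this:
spec:
  template:
    metadata:
      annotations:
        # placed on the pod template so it propagates to the virt-launcher pod
        kubevirt.io/allow-pod-bridge-network-live-migration: ""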
Example: Live Migrating an Ubuntu Image
Install KubeVirt by following the official installation guide.
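For reference, the upstream quickstart installs the operator and a KubeVirt custom resource roughly as follows; the version lookup URL and release assets are taken from that guide and may change over time, so treat this as a sketch:
# Resolve the latest stable KubeVirt release
export VERSION=$(curl -s https://storage.googleapis.com/kubevirt-prow/release/kubevirt/kubevirt/stable.txt)
# Deploy the operator, then the KubeVirt custom resource
kubectl create -f https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/kubevirt-operator.yaml
kubectl create -f https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/kubevirt-cr.yaml
# Wait until the deployment reports Available
kubectl -n kubevirt wait kv/kubevirt --for=condition=Available --timeout=10m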
Create an Ubuntu virtual machine with the annotation kubevirt.io/allow-pod-bridge-network-live-migration:
cat <<'EOF' | kubectl create -f -
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: rj-ubuntu-1
spec:
  runStrategy: Always
  template:
    metadata:
      annotations:
        # Allow KubeVirt VMs with bridge binding to be migratable;
        # also kube-ovn will not configure the network in the pod and delegates it to DHCP
        kubevirt.io/allow-pod-bridge-network-live-migration: ""
      labels:
        kubevirt.io/size: small
        kubevirt.io/domain: rj-ubuntu-1
    spec:
      domain:
        cpu:
          cores: 2
        devices:
          disks:
            - name: disk0
              disk:
                bus: virtio
            - name: cloudinit
              cdrom:
                bus: sata
                readonly: true
          rng: {}
        resources:
          requests:
            memory: 1024Mi
      terminationGracePeriodSeconds: 15
      volumes:
        - name: disk0
          dataVolume:
            name: rj-ubuntu-1
        - name: cloudinit
          cloudInitNoCloud:
            userData: |-
              #cloud-config
              chpasswd:
                list: |
                  ubuntu:ubuntu
                expire: False
              disable_root: false
              ssh_pwauth: True
              ssh_authorized_keys:
                - ssh-rsa AAAA...
  dataVolumeTemplates:
    - metadata:
        name: rj-ubuntu-1
      spec:
        storage:
          storageClassName: storage-nvme-c1
          # volumeMode: Filesystem
          # accessModes:
          #   - ReadWriteMany
          accessModes:
            - ReadWriteMany
          resources:
            requests:
              storage: 10Gi
        source:
          registry:
            url: docker://quay.io/containerdisks/ubuntu:22.04
EOF
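Instead of polling, you can optionally block until the image import finishes and the instance reports Ready; a simple sketch (the timeout value is arbitrary):
kubectl wait dv/rj-ubuntu-1 --for=condition=Ready --timeout=15m
kubectl wait vmi/rj-ubuntu-1 --for=condition=Ready --timeout=15m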
After waiting for the VM to be ready, the VM status should be as shown below
kubectl get dv rj-ubuntu-1
NAME          PHASE       PROGRESS   RESTARTS   AGE
rj-ubuntu-1   Succeeded   100.0%                3m17s
kubectl get vmi rj-ubuntu-1
NAME          AGE    PHASE     IP               NODENAME          READY
rj-ubuntu-1   2m6s   Running   10.240.167.238   raja-k3s-agent3   True
Log in with virtctl console rj-ubuntu-1 and check that the VM has received a proper address:
ubuntu@rj-ubuntu-1:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8900 qdisc fq_codel state UP group default qlen 1000
    link/ether 66:c6:21:9d:04:87 brd ff:ff:ff:ff:ff:ff
    inet 10.240.167.238/13 metric 100 brd 10.247.255.255 scope global dynamic enp1s0
       valid_lft 86313415sec preferred_lft 86313415sec
    inet6 fe80::64c6:21ff:fe9d:487/64 scope link
       valid_lft forever preferred_lft forever
We can also inspect the neighbour cache now, so we can verify it again after the migration:
ubuntu@rj-ubuntu-1:~$ arp -a
_gateway (10.240.0.1) at 6a:0e:95:4b:c8:f0 [ether] on enp1s0
Keep in mind that the default gateway behaves like a link-local address; that is because the live migration feature is implemented using an ARP proxy.
The scope link route to the gateway is needed since the link-local address subnet is not bound to any interface; that route is created automatically by the DHCP client.
ubuntu@rj-ubuntu-1:~$ ip route
default via 10.240.0.1 dev enp1s0 proto dhcp src 10.240.167.238 metric 100
10.240.0.0/13 dev enp1s0 proto kernel scope link src 10.240.167.238 metric 100
10.240.0.1 dev enp1s0 proto dhcp scope link src 10.240.167.238 metric 100
10.250.0.10 via 10.240.0.1 dev enp1s0 proto dhcp src 10.240.167.238 metric 100
A live migration can then be initiated with virtctl migrate rj-ubuntu-1:
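virtctl creates a VirtualMachineInstanceMigration object for this; you can follow its phase, or simply watch the VMI until NODENAME switches to the target node:
# Migration objects have generated names; list them to follow the phase
kubectl get vmim
# Or watch the VMI directly
kubectl get vmi rj-ubuntu-1 -w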
kubectl get vmi rj-ubuntu-1
NAME          AGE   PHASE     IP               NODENAME          READY
rj-ubuntu-1   14m   Running   10.240.167.238   raja-k3s-agent2   True
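To gauge how small the disruption window is, a continuous ping against the VM address can be left running from another machine on the network while the migration takes place (ping -O reports any missed replies):
ping -O 10.240.167.238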
After migration, the network configuration is the same, including the gateway neighbour cache.
virtctl console rj-ubuntu-1
ubuntu@rj-ubuntu-1:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8900 qdisc fq_codel state UP group default qlen 1000
    link/ether 66:c6:21:9d:04:87 brd ff:ff:ff:ff:ff:ff
    inet 10.240.167.238/13 metric 100 brd 10.247.255.255 scope global dynamic enp1s0
       valid_lft 86312706sec preferred_lft 86312706sec
    inet6 fe80::64c6:21ff:fe9d:487/64 scope link
       valid_lft forever preferred_lft forever
ubuntu@rj-ubuntu-1:~$ arp -a
_gateway (10.240.0.1) at 6a:0e:95:4b:c8:f0 [ether] on enp1s0
Summary
In KubeVirt, every VM runs inside a "virt-launcher" pod; how the "virt-launcher" pod's network interface is "bound" to the VM is specified by the network binding in the VM spec. Kube-OVN supports live migration of the KubeVirt bridge network binding.
The KubeVirt bridge binding does not replace the pod interface: it adds a bridge and a tap device, and connects the existing pod interface as a port of that bridge. When IPAM is configured on the pod interface, a DHCP server is started to advertise the pod's IP to DHCP-aware VMs.
The benefit of bridge binding is that it exposes the pod IP to the VM, which is what most users expect, and live migration for the pod network bridge binding is implemented by Kube-OVN.