QubesOS backups using ZFS


Rather than use the built-in QubesOS backup tool, I put together my own solution on top of ZFS. Since my disk is already encrypted with ZFS, this lets me take incremental backups and send them, still encrypted, to another ZFS pool.

My implementation uses sanoid / syncoid to manage the incremental transfer of subvolumes (datasets, in ZFS terms). Fortunately, sanoid is packaged in Fedora, so installing it in dom0 simply requires:

qubes-dom0-update sanoid

The rest of the workflow is handled by very simple shell scripts.

SSH tunnel

Since dom0 has no network access, syncoid has to send the encrypted dataset through an SSH tunnel that goes through qvm-run. To make this possible, I have a very simple wrapper script at /usr/local/bin/qubes-ssh-proxy:

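#!/bin/bash
# Build the ssh command line from the arguments syncoid passes in.
CMD="ssh $*"
# Escape ';' and '|' so the shell inside the qube does not interpret them.
CMD=$(sed -e 's/;/\\;/g' -e 's/|/\\|/g' <<< "${CMD}")
# Run the command in the proxy qube, passing stdin/stdout through.
qvm-run -p "${QUBES_SSH_PROXY}" "${CMD}"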

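The script reads the target qube from the QUBES_SSH_PROXY environment variable, prepends ssh to the arguments it receives, and hands the resulting command to qvm-run. The characters ; and | are escaped so that the shell in the qube passes them on to ssh instead of interpreting them.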
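To make the escaping step concrete, here is a small standalone sketch that applies the same two substitutions to a sample command. The function name escape_for_qube and the sample command are made up for illustration; nothing here touches qvm-run.

```shell
#!/bin/bash
# Escape ';' and '|' the same way the wrapper does, so that they
# survive the shell that qvm-run starts in the qube.
escape_for_qube() {
    sed -e 's/;/\\;/g' -e 's/|/\\|/g' <<< "$*"
}

# Hypothetical command line containing both special characters.
escaped=$(escape_for_qube "ssh -p 22 user@host echo one; echo two | cat")
echo "${escaped}"
```

The printed line has a backslash in front of the ; and the |, which is the form the wrapper hands to qvm-run.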

So that syncoid uses the SSH proxy, we have to modify /usr/sbin/syncoid:

diff --git a/usr/sbin/syncoid.orig b/usr/sbin/syncoid
index 956f3e7..484aae4 100755
--- a/usr/sbin/syncoid.orig
+++ b/usr/sbin/syncoid
@@ -103,7 +103,7 @@ $ENV{'PATH'} = $ENV{'PATH'} . ":/bin:/usr/bin:/sbin";
 
 my $zfscmd = 'zfs';
 my $zpoolcmd = 'zpool';
-my $sshcmd = 'ssh';
+my $sshcmd = 'qubes-ssh-proxy';
 my $pscmd = 'ps';
 
 my $pvcmd = 'pv';

Eventually, it might be worthwhile to have syncoid read $sshcmd from an environment variable, which would be more elegant than patching the script.
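Another way to avoid the patch, which is not what I actually use and which I have not tested with syncoid, would be to shadow ssh on the PATH. The diff context above shows that syncoid only appends directories to PATH, so a directory placed in front should win the lookup for a plain ssh. In this sketch SHIM_DIR is a hypothetical location, and the shim simply forwards its arguments to qubes-ssh-proxy:

```shell
#!/bin/bash
# Sketch only: make a plain `ssh` lookup resolve to a shim that calls the proxy.
# SHIM_DIR is a hypothetical directory; a temporary one is used if it is unset.
SHIM_DIR="${SHIM_DIR:-$(mktemp -d)}"

# The shim hands all of its arguments to the wrapper script.
cat > "${SHIM_DIR}/ssh" <<'EOF'
#!/bin/bash
exec qubes-ssh-proxy "$@"
EOF
chmod +x "${SHIM_DIR}/ssh"

# Put the shim directory first so it shadows the real ssh binary.
export PATH="${SHIM_DIR}:${PATH}"
resolved=$(command -v ssh)
echo "ssh now resolves to: ${resolved}"
```

The PATH change would have to live in the script that calls syncoid, so that the rest of dom0 is left alone.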

Syncoid wrapper

Finally, a very small script wraps syncoid and transfers the subvolumes I want to back up. I call it bacoid, and it lives under /usr/local/bin:

#!/bin/bash

USER=(target user)
MACHINE=(machine name)
HOST=(target host)
PORT=(target port)
POOL=(target pool)
KEY=(key on dom0 to use)
KNOWN_HOSTS=(known hosts on dom0 to use)

# Qube that relays the SSH connection
export QUBES_SSH_PROXY=sys-firewall

# Copy the key to the proxy qube unless it is already there
if ! qvm-run -p "${QUBES_SSH_PROXY}" "ls /home/user/QubesIncoming/dom0/${KEY##*/}"; then
    echo "copying dom0 key"
    qvm-copy-to-vm "${QUBES_SSH_PROXY}" "${KEY}"
fi

# Same for the known_hosts file
if ! qvm-run -p "${QUBES_SSH_PROXY}" "ls /home/user/QubesIncoming/dom0/${KNOWN_HOSTS##*/}"; then
    echo "copying dom0 known_hosts"
    qvm-copy-to-vm "${QUBES_SSH_PROXY}" "${KNOWN_HOSTS}"
fi

subvolArray=(
    (array of subvolumes to transfer)
)


for subvol in "${subvolArray[@]}"; do
    syncoid --sendoptions="w" --compress=none --recvoptions="u" -r \
        --sshport="${PORT}" --no-privilege-elevation \
        "${subvol}" ${USER}@${HOST}:${POOL}/${USER/-/\/}/${MACHINE}/${subvol} \
        --sshkey="/home/user/QubesIncoming/dom0/${KEY##*/}"
done

This uses qubes-ssh-proxy to tunnel a ZFS send / receive operation to a remote server. --sendoptions="w" makes ZFS send the subvolume raw, that is, exactly as it is stored on disk and therefore still encrypted, and --recvoptions="u" leaves it unmounted on the receiving side. Without the key, the target machine cannot read the subvolume.
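The destination argument in the loop relies on two bash parameter expansions that are easy to misread. Here is a standalone sketch with made-up values (the user, host, pool and key path below are all hypothetical) showing what they expand to:

```shell
#!/bin/bash
# Hypothetical values, for illustration only.
USER="backup-alice"
MACHINE="laptop"
HOST="backup.example.org"
POOL="tank"
KEY="/root/.ssh/id_ed25519"
subvol="rpool/home"

# ${USER/-/\/} replaces the first '-' in USER with '/'.
userpath=${USER/-/\/}
target="${USER}@${HOST}:${POOL}/${userpath}/${MACHINE}/${subvol}"

# ${KEY##*/} strips everything up to the last '/', leaving the file name.
keypath="/home/user/QubesIncoming/dom0/${KEY##*/}"

echo "${target}"
echo "${keypath}"
```

With these values the destination is backup-alice@backup.example.org:tank/backup/alice/laptop/rpool/home, and the key is looked up as /home/user/QubesIncoming/dom0/id_ed25519 inside the proxy qube.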

This wrapper assumes you have generated a key on dom0 and have a known_hosts file containing the target machine’s host key. So you have to send the public key to the remote server and import the server’s fingerprint into dom0. Because I use a disposable VM for sys-firewall, the script copies the dom0 key and known_hosts file over whenever they are missing.

Security

It’s worth stressing that if your dataset is not encrypted, it will be readable by the target machine. That said, the transfer itself is always encrypted, since it goes over SSH. In any case, I send this to a machine that I control, and the host’s fingerprint would be different in the event of a man-in-the-middle attack.
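If you want the script to refuse unencrypted datasets outright, a guard along these lines could run before syncoid. This is a sketch and not part of my bacoid: the function names are made up, and it assumes that zfs get reports "off" in the encryption property for unencrypted datasets and the cipher suite otherwise.

```shell
#!/bin/bash
# Succeeds when a ZFS "encryption" property value denotes an encrypted dataset.
# Assumption: unencrypted datasets report "off"; "-" or an empty value is
# treated as not encrypted to stay on the safe side.
is_encrypted_value() {
    case "$1" in
        ""|off|-) return 1 ;;
        *)        return 0 ;;
    esac
}

# Hypothetical guard: returns non-zero if the dataset should not be sent.
check_dataset() {
    local dataset="$1" value
    value=$(zfs get -H -o value encryption "${dataset}") || return 2
    if ! is_encrypted_value "${value}"; then
        echo "refusing to send unencrypted dataset: ${dataset}" >&2
        return 1
    fi
}
```

In the loop, this would be called as check_dataset "${subvol}" || continue before the syncoid line.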

There are also more sophisticated implementations of this idea. For example, another version of bacoid that I use on my servers leverages ZFS config parameters to set the target machine, user, etc. Since my needs on my Qubes workstation are limited, I kept this one simple (KISS).
