We want to move the Open Food Facts images from normal storage on off1 to a ZFS dataset on off2, replicated on ovh3.
We want to move to ZFS because its snapshot and syncing capabilities make backups really easy. It was already done for products.
The strategy is the following:
- first reverse the current ZFS syncing so that ovh3 is synced from off2
- mount the images dataset on off1 through NFS (NB: use the nolock option)
- for each top-level folder:
  - rsync the folder to the ZFS dataset
  - move the folder to a new name (xxx.old)
  - replace it with a symlink that points to the synced folder on the ZFS dataset (mounted on off1)
  - rsync again to be sure we didn't miss anything
  At the end we will have a product images folder full of symlinks.
- at this point we can move the images folder to images.old and NFS mount the ZFS dataset as images
We have two scripts implementing this for products that we can adapt for images. They are in scripts/zfs/migration.sh
and scripts/zfs/migration-ean8.sh.
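As a rough sketch (not the actual scripts; the source and destination paths are illustrative), the per-folder loop looks like this:

SRC=/srv/off/html/images/products       # current storage on off1
DST=/mnt/off2/off-images/products       # ZFS dataset NFS-mounted from off2

for dir in "$SRC"/*/; do
    name=$(basename "$dir")
    [[ "$name" == *.old ]] && continue        # skip folders already migrated
    rsync -a "$SRC/$name/" "$DST/$name/"      # first pass: bulk copy
    mv "$SRC/$name" "$SRC/$name.old"          # move the original out of the way
    ln -s "$DST/$name" "$SRC/$name"           # replace it with a symlink
    rsync -a "$SRC/$name.old/" "$DST/$name/"  # second pass: catch late writes
done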
I first check whether off2 is in sync with ovh3:
(off2)$ sudo zfs list -t snapshot zfs-hdd/off/images
...
zfs-hdd/off/images@20230220220000 1.75G - 9.70T -
zfs-hdd/off/images@20230221070000 1.61G - 9.70T -
zfs-hdd/off/images@20230221183000 232K - 9.70T -
(ovh3)$ sudo zfs list -t snapshot rpool/off/images
...
rpool/off/images@20230221070000 2.21G - 10.1T -
rpool/off/images@20230221183000 6.76G - 10.1T -
rpool/off/images@20230516083018 22.3G - 10.1T -
It's not: we are missing one snapshot on off2, so we must sync.
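A quick way to compare the two snapshot lists without eyeballing the output, keeping only the timestamps after the @ (run as root on off2; the ssh host is illustrative):

diff <(zfs list -H -t snapshot -o name zfs-hdd/off/images | awk -F@ '{print $2}') \
     <(ssh ovh3.openfoodfacts.org zfs list -H -t snapshot -o name rpool/off/images | awk -F@ '{print $2}')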
Sync from ovh3 (we had to use -F):
zfs send -i rpool/off/images@20230221183000 rpool/off/images@20230516083018 |ssh off2.openfoodfacts.org zfs recv zfs-hdd/off/images -F
I also see that the last snapshot on ovh3 does not capture all modifications (the written property reports the amount of data written since the most recent snapshot):
$ zfs list -po written rpool/off/images
WRITTEN
24130245744
$ zfs get -Hr written rpool/off/images
rpool/off/images written 22.5G -
...
So I create a new snapshot and redo the sync operation:
TIMESTAMP=$(date +%Y%m%d%H%M%S)
zfs snapshot rpool/off/images@$TIMESTAMP
zfs send -i rpool/off/images@20230516083018 rpool/off/images@$TIMESTAMP |ssh off2.openfoodfacts.org zfs recv zfs-hdd/off/images -F
But after that, the dataset on ovh3 keeps being written to! We suppose that it's because of atime (access time), which is written to disk on every file access.
On ovh3, I will use sanoid, but with the synced_data template:
# /etc/sanoid/sanoid.conf
…
[rpool/off/images]
use_template=synced_data
recursive=no
…
On off2, I will use sanoid with the normal prod_data template:
# /etc/sanoid/sanoid.conf
…
[zfs-hdd/off/images]
use_template=prod_data
recursive=no
…
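For reference, the two templates could look like this; the retention values here are assumptions, the important point being that the synced side must never create its own snapshots (they arrive through syncoid), only prune them:

# /etc/sanoid/sanoid.conf (template sketch)
[template_prod_data]
        autosnap = yes
        autoprune = yes
        hourly = 24
        daily = 14
        monthly = 3

[template_synced_data]
        autosnap = no        # snapshots are received from off2, never created locally
        autoprune = yes
        hourly = 24
        daily = 14
        monthly = 3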
And configure syncoid to replicate to ovh3:
# /etc/sanoid/syncoid-args.conf
…
--no-sync-snap zfs-hdd/off/images root@ovh3.openfoodfacts.org:rpool/off/images
…
Let's try a run of syncoid:
syncoid --no-sync-snap zfs-hdd/off/images root@ovh3.openfoodfacts.org:rpool/off/images
NEWEST SNAPSHOT: 20230523165657
INFO: no snapshots on source newer than 20230523165657 on target. Nothing to do, not syncing.
So I wait until the next snapshotting… (10 min later). And it's catastrophic: ovh3's nginx is not capable of serving images anymore! In fact, the syncoid service is blocked (because of a problem on ovh3 that makes it reboot…).
This made me think I should add a TimeoutStartSec to the syncoid service for such cases (e.g. 6h).
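A systemd drop-in of this form would do it (unit and file names hypothetical):

# /etc/systemd/system/syncoid.service.d/timeout.conf
[Service]
TimeoutStartSec=6h

followed by systemctl daemon-reload.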
After stopping the stalled syncoid, I wait for the next run.
It's catastrophic again: ovh3 does not serve images anymore. Looking at the processes, I see a zfs rollback on rpool/off/images. This is expected, since syncoid rolls the dataset back to the last common snapshot before syncing; but in the meantime nginx does not have access to the images… and the rollback seems to take a long time.
I decided to disable atime on ovh3 to see if it makes the writes stop.
On ovh3:
$ zfs set atime=off rpool/off/images
$ # verifying
$ zfs get atime rpool/off/images
NAME PROPERTY VALUE SOURCE
rpool/off/images atime off local
$ mount |grep rpool/off/images
rpool/off/images on /rpool/off/images type zfs (rw,noatime,xattr,noacl)
I then take a snapshot:
TIMESTAMP=$(date +%Y%m%d%H%M%S)
zfs snapshot rpool/off/images@$TIMESTAMP
Now it seems nothing is being written to the dataset anymore:
$ zfs list -po written rpool/off/images
WRITTEN
0
So the guess about atime was right.
I synced back to off2:
zfs send -i rpool/off/images@20230523165657 rpool/off/images@20230531083016 |ssh off2.openfoodfacts.org zfs recv zfs-hdd/off/images -F
Now that we have a stable ZFS dataset on ovh3, we can activate the syncoid sync.
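Assuming syncoid is driven by a systemd timer (unit names hypothetical), that would be something like:

systemctl enable --now syncoid.timer
journalctl -u syncoid.service -f    # watch the first run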
Let's mount the images ZFS dataset from off2 on off1.
I first install the NFS server on off2 and enable NFS sharing on the dataset, restricted to off1's private address (assumed here to be 10.0.0.1):
apt install nfs-kernel-server
zfs set sharenfs="rw=@10.0.0.1/32" zfs-hdd/off/images
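On Linux, ZFS implements sharenfs through the kernel NFS server, so the resulting export can be checked with:

exportfs -v | grep images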
Then on off1: to install NFS, I had to update the sources list in /etc/apt/sources.list, replacing fr2.ftp.debian.org with archive.debian.org (off1 runs an old Debian release whose packages have moved to the archive).
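For instance, as a one-liner (assuming the mirror only appears in that file):

sed -i 's/fr2\.ftp\.debian\.org/archive.debian.org/g' /etc/apt/sources.list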
Then:
apt update
apt install nfs-common
Trying to mount:
mkdir /mnt/off2
mkdir /mnt/off2/off-images
mount -t nfs -o rw "10.0.0.2:/zfs-hdd/off/images" /mnt/off2/off-images
ls /mnt/off2/off-images
Adding to /etc/fstab:
…
# off2 NFS mounts
10.0.0.2:/zfs-hdd/off/images /mnt/off2/off-images nfs rw,nolock,nofail 0 0
…
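The entry can then be applied and verified without rebooting:

mount -a
findmnt /mnt/off2/off-images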
I did the rsync manually for one folder to see the time it takes (it returned some 3?? code). It showed the second pass was fast enough.
Starting from migration.sh, I wrote scripts/zfs/migration-images.sh
and copied it to off1.
Running it works fine, but it had to be relaunched several times because rsync sometimes ended with a non-zero exit code.
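The relaunching can be automated with a simple retry loop (a sketch; it retries until the script exits cleanly):

until ./migration-images.sh; do
    echo "migration script failed, retrying in 60s" >&2
    sleep 60
done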
The EAN8 folders are the longest to migrate due to their sequentiality. Overall, the migration took around 6 days.
EAN8 folders are continually being created, so I ended up with a single command to do the last syncs and switch the link to the NFS mount:
./migration-images.sh && \
./migration-images.sh && \
unlink /srv/off/html/images/products && \
ln -s /mnt/off2/off-images/products /srv/off/html/images/products
and do a last check that we have no non-migrated directories left:
find /srv2/off/html/images/products.old/ -maxdepth 1 -type d -not -name '*.old'