It’s essential to have a proper backup mechanism for your virtualization infrastructure. The backup and recovery method should be tested at least once every three months to validate that everything works as expected, and documenting the recovery procedure helps avoid surprises when an actual recovery is needed. Organizations should be ready to address unexpected failures at any time.
For Oracle Linux Virtualization Manager (OLVM) 4+ environments, you can use API v4 to invoke all backup-related tasks. The import/export mode defines how backups and restores are performed. OLVM (with API v4) supports 3 modes:
1. Disk Attachment
Exports VM metadata (in OVF format) with separate disk files (in RAW format) via a Proxy VM with the Node installed.
2. Disk Image Transfer
Exports VM metadata (in OVF format) with disk snapshot chains as separate files (in QCOW2 format).
3. SSH Transfer
Transfers all data directly from the hypervisor over the SSH protocol.
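As a quick illustration of driving API v4, the sketch below lists VMs and active image transfers with plain curl. This is a minimal example; the engine FQDN, user, and the `API_PASSWORD` variable are placeholders, not values from this environment:

```shell
# Hypothetical engine FQDN and user -- replace with your own values.
ENGINE_FQDN="olvm-engine.example.com"
API_USER="admin@internal"
API_URL="https://${ENGINE_FQDN}/ovirt-engine/api"

# List VMs (the starting point for any export/backup task).
# "|| true" keeps the sketch runnable when no engine is reachable.
curl -ks -u "${API_USER}:${API_PASSWORD}" "${API_URL}/vms" || true

# List active image transfers (used by the Disk Image Transfer mode).
curl -ks -u "${API_USER}:${API_PASSWORD}" "${API_URL}/imagetransfers" || true
```

The same endpoints are what third-party backup tools call under the hood when they use the Disk Image Transfer mode.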
The URL below helps you filter all the backup tools that support Oracle Linux Virtualization Manager.
Supported third-party backup tools:
https://apexapps.oracle.com/pls/apex/f?p=10263:17::::::
Figure 1: Third-party Backup Tools
In some cases, if the network connection is disturbed during a disk image transfer, the disk will be stuck in the Finalizing state.
Note: While disks are in the Finalizing state, you cannot put the KVM host into maintenance mode.
Figure 2: Try to put KVM into Maintenance Mode
You can get a clear understanding of disk image transfer by referring to the following URL: https://storware.gitbook.io/backup-and-recovery/protecting-virtual-machines/virtual-machines/oracle-linux-virtualization-manager
Disk Image Transfer API:
This API allows exporting individual snapshots directly from the OLVM manager. So instead of installing multiple Proxy VMs, you can have a single external Node installation that invokes APIs via the OLVM manager.
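Before resorting to database edits, it can be worth asking the engine itself to cancel a stuck transfer: the API v4 imagetransfers service exposes cancel and finalize actions. This is a hedged sketch; the engine URL, password, and transfer ID are placeholders (the real ID comes from the image_transfers query shown later in this article):

```shell
# Placeholder engine URL and transfer ID -- substitute your own.
API_URL="https://olvm-engine.example.com/ovirt-engine/api"
TRANSFER_ID="dcc47178-ebb1-47c1-900b-bc9753e12378"

# Inspect the transfer; the phase is returned as a readable value here.
# "|| true" keeps the sketch runnable when no engine is reachable.
curl -ks -u "admin@internal:${API_PASSWORD}" \
  "${API_URL}/imagetransfers/${TRANSFER_ID}" || true

# Ask the engine to cancel the stuck transfer cleanly.
curl -ks -u "admin@internal:${API_PASSWORD}" \
  -X POST -H 'Content-Type: application/xml' -d '<action/>' \
  "${API_URL}/imagetransfers/${TRANSFER_ID}/cancel" || true
```

If the engine can still act on the transfer, this avoids touching the database at all; the manual phase update below is the fallback when it cannot.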
In this article, I will cover how to recover when a disk is stuck in the Finalizing state.
The procedure follows the My Oracle Support note: OLVM: Unable to put KVM host to maintenance mode due to Image transfer in progress (Doc ID 2915392.1).
As shown in Figure 3, this is how it looks when disks are stuck in the Finalizing state.
Figure 3: Disk stuck in the Finalizing state
The best approach is to query the disk state in the OLVM engine database. This helps you identify which disks are stuck in the Finalizing status.
Note: All the commands should be executed from the OLVM engine server.
[root@local-olvm-engine ~]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "select command_id, phase, disk_id, last_updated from image_transfers;"
command_id | phase | disk_id | last_updated
--------------------------------------+-------+--------------------------------------+----------------------------
dcc47178-ebb1-47c1-900b-bc9753e12378 | 7 | 37d4046a-2705-4b55-9005-65567e50620c | 2023-04-29 23:55:31.678-04
77a820ab-c580-4b46-9c0c-22102a0ce706 | 7 | a9b5b747-2fae-4b32-b839-2ea03dfcf35e | 2023-04-28 20:28:21.358-04
(2 rows)
[root@local-olvm-engine ~]#
As per the My Oracle Support note, you can update image transfers stuck in phase 7 to either phase 9 (failed) or phase 10 (completed), depending on the situation.
[root@local-olvm-engine ~]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "UPDATE image_transfers SET phase = '10' WHERE command_id = 'dcc47178-ebb1-47c1-900b-bc9753e12378'; "
UPDATE 1
[root@local-olvm-engine ~]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "UPDATE image_transfers SET phase = '10' WHERE command_id = '77a820ab-c580-4b46-9c0c-22102a0ce706'; "
UPDATE 1
[root@local-olvm-engine ~]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "select command_id, phase, disk_id, last_updated from image_transfers;"
command_id | phase | disk_id | last_updated
--------------------------------------+-------+--------------------------------------+----------------------------
dcc47178-ebb1-47c1-900b-bc9753e12378 | 10 | 37d4046a-2705-4b55-9005-65567e50620c | 2023-04-29 23:55:31.678-04
77a820ab-c580-4b46-9c0c-22102a0ce706 | 10 | a9b5b747-2fae-4b32-b839-2ea03dfcf35e | 2023-04-28 20:28:21.358-04
(2 rows)
[root@local-olvm-engine ~]#
Execute the command below to validate the disk status. The disk should also change to the OK state in the OLVM administration portal.
[root@sofe-olvm-01 ~]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "select command_id, phase, disk_id, last_updated from image_transfers;"
command_id | phase | disk_id | last_updated
--------------------------------------+-------+--------------------------------------+----------------------------
dcc47178-ebb1-47c1-900b-bc9753e12378 | 10 | 37d4046a-2705-4b55-9005-65567e50620c | 2023-04-29 23:55:31.678-04
77a820ab-c580-4b46-9c0c-22102a0ce706 | 10 | a9b5b747-2fae-4b32-b839-2ea03dfcf35e | 2023-04-28 20:28:21.358-04
(2 rows)
When an organization hosts critical VM servers in an OLVM virtualization environment, it needs to plan its backup method. There can be situations where you have to recover a VM from backup, so backup and recovery need to be tested and documented.
Resolving disk state errors requires updating the PostgreSQL database, so I recommend backing up the OLVM engine before making any changes. It is also better to consult an Oracle engineer for a more precise understanding before changing the image_transfers phase.
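As that precaution, a full engine backup can be taken with the standard engine-backup utility before touching the image_transfers table. The paths below are illustrative, not prescribed:

```shell
# Illustrative backup location -- pick a path with enough free space.
BACKUP_FILE="/backup/olvm-engine-$(date +%Y%m%d).tar.gz"

# Full engine backup (configuration plus databases).
# "|| true" keeps the sketch runnable where engine-backup is absent.
engine-backup --mode=backup \
  --file="${BACKUP_FILE}" \
  --log=/backup/olvm-engine-backup.log || true
```

Keeping the resulting archive off the engine host means you can restore the engine even if the manual phase update goes wrong.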