It has been ages since Oracle released its own hypervisor, Oracle VM (OVM). OVM is based on paravirtualization and uses a Xen-based hypervisor. The latest available release is 3.4.6.3. Oracle announced extended support for OVM: the support period began in March 2021 and will end on March 31, 2024.
If you need more information about OVM extended support, read the article below: https://blogs.oracle.com/virtualization/post/announcing-oracle-vm-3-extended-support/
This is the end of the OVM tree; after this, there will be no further OVM releases. The technology going forward is Oracle KVM, which is much more stable than OVM and gives you more flexibility in the virtualization environment. If you are still planning to stay on-premises, I would say this is the right time to plan your journey to KVM.
In this article, I will cover an issue we faced recently in an OVM environment.
We faced a new issue in an OVM cluster environment, caused by a sudden data center power outage. Once everything was back online, we were not able to start one of the OVM hypervisors, so we had to perform a complete reinstallation of that node.
When I tried to add the node back to the cluster, we ran into an issue mounting the repositories. The next option was to remove the node from the cluster again; this action was performed via the GUI.
I validated the configuration and realized the cluster entries were still present on node02; there were stale entries on both the master node and node02.
The following My Oracle Support note covers the stale-entry issue:
OVM – How To Remove Stale Entry of the Oracle VM Server which was Removed from The Pool (Doc ID 2418834.1)
First, validate the o2cb status. This is the cluster service that holds all the information about the cluster. The node information appears in the "Nodes in O2CB cluster" line of the output.
[root@ovm-node02 ~]# service o2cb status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Stack glue driver: Loaded
Stack plugin "o2cb": Loaded
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster "f6f6b47b38e288e0": Online
Heartbeat dead threshold: 61
Network idle timeout: 60000
Network keepalive delay: 2000
Network reconnect delay: 2000
Heartbeat mode: Global
Checking O2CB heartbeat: Active
0004FB0000050000B705B4397850AAD6 /dev/dm-2
Nodes in O2CB cluster: 0 1
Debug file system at /sys/kernel/debug: mounted
Now let's check the entries from node02. If the node had been removed from the cluster correctly, you would see only one entry; here, however, there are two.
[root@ovm-node02 ovm-node02]# ls -lrth /sys/kernel/config/cluster/f6f6b47b38e288e0/node/
total 0
drwxr-xr-x 2 root root 0 Jun 23 09:28 ovm-node02
drwxr-xr-x 2 root root 0 Jun 23 09:33 ovm-node01
[root@ovm-node02 ovm-node02]#
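A quick way to spot leftovers like this is to count the node directories under configfs and compare against the number of servers that should be in the pool. The sketch below is only an illustration; the configfs path and expected count are assumptions taken from the output above.

```shell
#!/bin/sh
# Minimal sketch (assumption: pool UUID f6f6b47b38e288e0, standard
# O2CB configfs layout). Counts registered node directories and flags
# stale leftovers when more nodes are registered than expected.
count_cluster_nodes() {
    node_dir="$1"   # e.g. /sys/kernel/config/cluster/f6f6b47b38e288e0/node
    expected="$2"   # how many servers should remain in the pool
    count=$(ls -1 "$node_dir" 2>/dev/null | wc -l)
    if [ "$count" -gt "$expected" ]; then
        echo "stale entries: $count registered, expected $expected"
    else
        echo "clean: $count registered"
    fi
}

# On a live node you would run something like:
# count_cluster_nodes /sys/kernel/config/cluster/f6f6b47b38e288e0/node 1
```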
The next step is to validate the database entries from the master node (ovm-node01). The dump shows two addresses in pool_member_ip_list.
[root@ovm-node01]# ovs-agent-db dump_db server
{'cluster_state': 'DLM_Ready',
'clustered': True,
'fs_stat_uuid_list': ['0004fb000005000015c1fb14ef761f40',
'0004fb000005000079ae03177c3edc7e',
'0004fb000005000065985109f8834e8b'],
'is_master': True,
'manager_event_url': 'https://192.168.85.152:7002/ovm/core/wsapi/rest/internal/Server/08:00:20:ff:ff:ff:ff:ff:ff:ff:00:10:e0:ef:de:6a/Event',
'manager_ip': '192.168.85.152',
'manager_statistic_url': 'https://192.168.85.152:7002/ovm/core/wsapi/rest/internal/Server/08:00:20:ff:ff:ff:ff:ff:ff:ff:00:10:e0:ef:de:6a/Statistic',
'manager_uuid': '0004fb0000010000c8ecbd219dc6b1ee',
'node_number': 0,
'pool_alias': 'EclipsysOVM',
'pool_master_ip': '192.168.85.177',
'pool_member_ip_list': ['192.168.85.177', '192.168.85.178'],
'pool_uuid': '0004fb0000020000f6f6b47b38e288e0',
'poolfs_nfsbase_uuid': '',
'poolfs_target': '/dev/mapper/36861a6fddaa0481ec0dd3584514a8d62',
'poolfs_type': 'lun',
'poolfs_uuid': '0004fb0000050000b705b4397850aad6',
'registered_hostname': 'ovm-node01',
'registered_ip': '192.168.85.177',
'roles': set(['utility', 'xen'])}
[root@ovm-node01 ~]#
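To avoid eyeballing the full dump every time, you can filter out just the membership-related fields. A minimal sketch, assuming only the key names visible in the dump above:

```shell
#!/bin/sh
# Minimal sketch: filter the membership-related fields out of an
# "ovs-agent-db dump_db server" dump read on stdin. Key names are
# taken from the dump shown above.
filter_pool_fields() {
    grep -E "pool_master_ip|pool_member_ip_list|is_master"
}

# Usage on a live node:
# ovs-agent-db dump_db server | filter_pool_fields
```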
Now we can remove the stale ovm-node02 entry from node01.
[root@ovm-node01]# o2cb remove-node f6f6b47b38e288e0 ovm-node02
After removing node02, we can see only one entry left.
[root@ovm-node01]# ls /sys/kernel/config/cluster/f6f6b47b38e288e0/node/
ovm-node01
[root@ovm-node01]#
Next, restart the ovs-agent on both nodes and validate the agent status from node01.
[root@ovm-node01]# service ovs-agent restart
Stopping Oracle VM Agent: [ OK ]
Starting Oracle VM Agent: [ OK ]
[root@ovm-node01 ~]# service ovs-agent status
log server (pid 32442) is running...
notificationserver server (pid 32458) is running...
remaster server (pid 32464) is running...
monitor server (pid 32466) is running...
ha server (pid 32468) is running...
stats server (pid 32470) is running...
xmlrpc server (pid 32474) is running...
fsstats server (pid 32476) is running...
apparentsize server (pid 32477) is running...
[root@ovm-node01 ~]#
Also, I would recommend restarting node02 after the node removal. Once the node is back online, validate /etc/ocfs2/cluster.conf:
[root@ovm-node01 ~]# cat /etc/ocfs2/cluster.conf
cluster:
	heartbeat_mode = global
	node_count = 1
	name = f6f6b47b38e288e0

node:
	number = 0
	cluster = f6f6b47b38e288e0
	ip_port = 7777
	ip_address = 10.110.110.101
	name = ovm-node01

heartbeat:
	cluster = f6f6b47b38e288e0
	region = 0004FB0000050000B705B4397850AAD6
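A sanity check worth scripting at this point: the node_count declared in the cluster stanza should match the number of node: stanzas in the file. A minimal sketch, assuming the standard cluster.conf layout shown above (the path argument is only there so it can be tested against a copy):

```shell
#!/bin/sh
# Minimal sketch: cross-check node_count against the number of
# "node:" stanzas in an OCFS2 cluster.conf. Assumes the standard
# stanza layout shown above.
check_cluster_conf() {
    conf="$1"   # usually /etc/ocfs2/cluster.conf
    declared=$(awk -F'= *' '/node_count/ {gsub(/ /,"",$2); print $2}' "$conf")
    stanzas=$(grep -c '^node:' "$conf")
    if [ "$declared" = "$stanzas" ]; then
        echo "consistent: $stanzas node stanza(s)"
    else
        echo "mismatch: node_count=$declared but $stanzas node stanza(s)"
    fi
}

# check_cluster_conf /etc/ocfs2/cluster.conf
```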
Note: an ovs-agent restart has no impact on running virtual machines.
Finally, dump the agent database again; pool_member_ip_list now shows only one member.
[root@ovm-node01]# ovs-agent-db dump_db server
{'cluster_state': 'DLM_Ready',
'clustered': True,
'fs_stat_uuid_list': ['0004fb000005000015c1fb14ef761f40',
'0004fb000005000079ae03177c3edc7e',
'0004fb000005000065985109f8834e8b'],
'is_master': True,
'manager_event_url': 'https://192.168.85.152:7002/ovm/core/wsapi/rest/internal/Server/08:00:20:ff:ff:ff:ff:ff:ff:ff:00:10:e0:ef:de:6a/Event',
'manager_ip': '192.168.85.152',
'manager_statistic_url': 'https://192.168.85.152:7002/ovm/core/wsapi/rest/internal/Server/08:00:20:ff:ff:ff:ff:ff:ff:ff:00:10:e0:ef:de:6a/Statistic',
'manager_uuid': '0004fb0000010000c8ecbd219dc6b1ee',
'node_number': 0,
'pool_alias': 'EclipsysOVM',
'pool_master_ip': '192.168.85.177',
'pool_member_ip_list': ['192.168.85.177'],
'pool_uuid': '0004fb0000020000f6f6b47b38e288e0',
'poolfs_nfsbase_uuid': '',
'poolfs_target': '/dev/mapper/36861a6fddaa0481ec0dd3584514a8d62',
'poolfs_type': 'lun',
'poolfs_uuid': '0004fb0000050000b705b4397850aad6',
'registered_hostname': 'ovm-node01',
'registered_ip': '192.168.85.177',
'roles': set(['utility', 'xen'])}
[root@ovm-node01]#
There can be situations where the GUI will not remove the entries from the OVM hypervisor. Always validate the OVM database entries before retrying the node addition to the cluster, and make sure the cluster-shared repositories mount automatically.
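One way to script that last check, assuming the standard /OVS/Repositories layout of an OVM server (both paths here are assumptions; adjust them for your environment):

```shell
#!/bin/sh
# Minimal sketch: confirm each repository directory under the base path
# is actually a mount point according to the mount table. Assumes the
# default /OVS/Repositories layout of an OVM server.
check_repo_mounts() {
    repo_base="$1"   # usually /OVS/Repositories
    mtab="$2"        # usually /proc/mounts
    for repo in "$repo_base"/*; do
        [ -d "$repo" ] || continue
        if grep -q " $repo " "$mtab"; then
            echo "mounted: $repo"
        else
            echo "NOT mounted: $repo"
        fi
    done
}

# check_repo_mounts /OVS/Repositories /proc/mounts
```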