This post dissects the issues we encountered, the steps taken to resolve them, and the broader implications for DBAs.
The Initial Conundrum
Our journey began when a client reported that certain database activities on Oracle EXACC were not proceeding as expected. The problem first surfaced when we ran a dbaascli command to retrieve database information for an Oracle home:
[oracle@testvm1 ~]$ sudo dbaascli dbhome getDatabases --oracleHome /u02/app/oracle/product/12.2.0/dbhome_22
DBAAS CLI version 24.1.1.0.0
Executing command dbhome getDatabases --oracleHome /u02/app/oracle/product/12.2.0/dbhome_22
Job id: jhgfd678-iuyt-765fg-678jhg-drftghyuji78jhgfd99
Session log: /var/opt/oracle/log/dbHome/getDatabases/dbaastools_2024-02-01_04-03-11-PM_6490.log
[FATAL] [DBAAS-60231] Unable to detect CRS home from inventory. CAUSE: Either CRS home does not exist or CRS home information is missing in the inventory. ACTION: Make sure that CRS is configured and registered to central inventory.
This error pointed to an issue with detecting the CRS (Cluster Ready Services) home from the inventory, indicating a possible misconfiguration or corruption of the inventory files. Further investigation revealed a concerning discovery:
[root@testvm1 ContentsXML]# cat inventory.xml
[root@testvm1 ContentsXML]# ls -l inventory.xml
-rw-rw---- 1 grid oinstall 0 Jun 27 22:07 inventory.xml
The inventory.xml file, a cornerstone for Oracle software tracking and management, was found to be zeroed out, raising alarms about the integrity of the Oracle Inventory.
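A quick way to spot this condition is to scan the ContentsXML directory for zero-byte XML files. The following is a minimal sketch, not part of the original troubleshooting session; the function name find_empty_inventory_xml is hypothetical, and the default path is simply the inventory location seen in this incident.

```shell
# List zero-byte XML files directly under an inventory ContentsXML directory.
# The default path is the one from this incident; pass another directory as $1.
find_empty_inventory_xml() {
    local dir="${1:-/u01/app/oraInventory/ContentsXML}"
    # -size 0 matches only empty files; -maxdepth 1 avoids descending further
    find "$dir" -maxdepth 1 -type f -name '*.xml' -size 0
}
```

Calling find_empty_inventory_xml with no arguments prints one path per empty file and prints nothing when every XML file has content.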
Diagnosis and Initial Recovery
The output provided details on the Oracle Home but underscored the continued challenges posed by the missing inventory data. The resolution path was clear: restore the inventory.xml file from the latest backup:
find . -name inventory.xml
/u01/app/oraInventory/backup/2024-03-03_07-35-21PM/ContentsXML/inventory.xml
mv /u01/app/oraInventory/ContentsXML/inventory.xml /u01/app/oraInventory/ContentsXML/inventory.xml_bkp
mv /u01/app/oraInventory/backup/2024-03-03_07-35-21PM/ContentsXML/inventory.xml /u01/app/oraInventory/ContentsXML/
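When several backup directories exist, it helps to pick the most recent copy that is not itself empty. The sketch below assumes GNU find (for -printf) and the backup root used in this incident; the function name latest_nonempty_backup is hypothetical. It orders candidates by file modification time rather than by directory name.

```shell
# Print the most recently modified non-empty copy of a file under a backup root.
# $1: file name (e.g. inventory.xml); $2: backup root (defaults to the path above).
latest_nonempty_backup() {
    local name="$1"
    local backup_root="${2:-/u01/app/oraInventory/backup}"
    # -size +0 keeps non-empty files; %T@ is the mtime in epoch seconds (GNU find)
    find "$backup_root" -type f -name "$name" -size +0 -printf '%T@ %p\n' \
        | sort -rn | head -n 1 | cut -d' ' -f2-
}
```

For example, latest_nonempty_backup inventory.xml prints a single path, or nothing if no usable backup copy is found.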
To confirm the impact and scope of this issue, we executed another dbaascli command:
[root@testvm1 ContentsXML]# dbaascli dbhome getDetails --oracleHome /u02/app/oracle/product/12.2.0/dbhome_22
DBAAS CLI version 24.1.1.0.0
Executing command dbhome getDetails --oracleHome /u02/app/oracle/product/12.2.0/dbhome_22
Job id: 67567fghg-7657fghgh-76546fghjg-5678fgh-56789gfdghj
Session log: /var/opt/oracle/log/dbHome/getDetails/dbaastools_2024-03-12_06-37-17-PM_338480.log
{
"homePath" : "/u02/app/oracle/product/12.2.0/dbhome_22",
"homeName" : "OraHome22",
"version" : "12.2.0.1.220419",
"createTime" : 5678909876877,
"updateTime" : 1707679045000,
"unifiedAuditEnabled" : false,
"ohNodeLevelDetails" : {
"testvm2" : {
"nodeName" : "testvm2",
"version" : "12.2.0.1.220419",
"patches" : [ "33904781", "33028462", "33613833", "33613829", "33912887", "33810224", "32327201", "31335037", "30171171", "33829783", "25292893", "33878460", "33871679" ]
},
"testvm1" : {
"nodeName" : "testvm1",
"version" : "12.2.0.1.220419",
"patches" : [ "33904781", "33028462", "33613833", "33613829", "33912887", "33810224", "32327201", "31335037", "30171171", "33829783", "25292893", "33878460", "33871679" ]
}
},
"messages" : [ ]
}
dbaascli execution completed
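Since the original DBAAS-60231 error complained that the CRS home could not be detected from the inventory, it can also be worth confirming that inventory.xml actually lists a Grid home. The sketch below assumes the Grid Infrastructure home is flagged with a CRS="true" attribute on its HOME entry, which is the usual layout of a central inventory but was not shown in this incident; the function name crs_home_entries is hypothetical.

```shell
# Print the HOME entries in inventory.xml that are flagged as the CRS home.
# Assumption: the Grid home entry carries a CRS="true" attribute.
# $1: path to inventory.xml (defaults to the location used in this incident).
crs_home_entries() {
    local inv="${1:-/u01/app/oraInventory/ContentsXML/inventory.xml}"
    # grep exits non-zero when no matching line is found
    grep -i 'CRS="true"' "$inv"
}
```

A non-zero exit status from crs_home_entries means no CRS home entry was found in the file.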
A New Challenge Emerges
Just when it seemed the situation was under control, another issue surfaced, further complicating the scenario:
Execution of check_patch_conflicts failed
[FATAL] [DBAAS-60022] Command '/u01/app/19.0.0.0/grid/OPatch/opatchauto apply -analyze -oh /u01/app/19.0.0.0/grid -phBaseDir /u02/exapatch/patches/ru/656788976 ' has failed on nodes [testvm1].
*MORE DETAILS*
Result of node:testvm1
ERRORS:
java.lang.NullPointerException
...
[OPATCHAUTO-72083: Performing bootstrap operations failed., OPATCHAUTO-72083: The bootstrap execution failed because failed to detect Grid Infrastructure setup due to java.lang.NullPointerException., OPATCHAUTO-72083: Fix the reported problem and re-run opatchauto. In case of standalone SIDB installation and Grid is not installed re-run with -sidb option., OPatchauto session completed at Fri Mar 10 10:23:53 2024, Time taken to complete the session 0 minute, 28 seconds, opatchauto bootstrapping failed with error code 255.]
This error, stemming from an attempt to apply a patch, indicated deeper issues with the Oracle Inventory and the configuration of the Oracle Grid Infrastructure.
Back in the ContentsXML directory, we found that comps.xml and libs.xml were also empty, just as inventory.xml had been, and that both had last been modified on the same day:
[oracle@testvm1 ContentsXML]$ ls -l
total 12
-rw-rw---- 1 grid oinstall 0 Feb 27 15:01 comps.xml
-rw-r----- 1 grid oinstall 5335 Mar 15 15:13 inventory.xml
-rw-r----- 1 root root 0 Mar 6 18:21 inventory.xml_bkp
-rw-rw---- 1 grid oinstall 0 Feb 27 15:01 libs.xml
-rw-rw---- 1 grid oinstall 174 Jan 31 15:47 oui-patch.xml
The Final Resolution
Understanding the gravity of these corrupted XML files for the operation and management of the Oracle infrastructure, we proceeded with the restoration of libs.xml and comps.xml from backups, mirroring the approach taken with inventory.xml. This comprehensive restoration effort aimed to bring the Oracle Inventory back to a consistent, operational state.
mv /u01/app/oraInventory/ContentsXML/libs.xml /u01/app/oraInventory/ContentsXML/libs.xml_bkp
mv /u01/app/oraInventory/ContentsXML/comps.xml /u01/app/oraInventory/ContentsXML/comps.xml_bkp
cp /u01/app/oraInventory/backup/2024-03-03_07-35-21PM/ContentsXML/libs.xml /u01/app/oraInventory/ContentsXML/libs.xml
cp /u01/app/oraInventory/backup/2024-03-03_07-35-21PM/ContentsXML/comps.xml /u01/app/oraInventory/ContentsXML/comps.xml
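The same move-aside-then-copy pattern can be wrapped in a small helper that refuses to restore from an empty backup and checks the result. This is a sketch under assumptions rather than the exact procedure we ran: the function name restore_inventory_file is hypothetical, and cp -p is used so the backup copy stays in place with its mode and timestamps preserved.

```shell
# Restore one inventory file from a backup directory, keeping the damaged copy.
# $1: file name; $2: backup ContentsXML directory; $3: target directory (optional).
restore_inventory_file() {
    local name="$1" backup_dir="$2"
    local target_dir="${3:-/u01/app/oraInventory/ContentsXML}"
    local src="$backup_dir/$name" dst="$target_dir/$name"

    # Refuse to restore from a missing or empty backup copy
    [ -s "$src" ] || { echo "backup $src is missing or empty" >&2; return 1; }

    # Keep the damaged file aside with a _bkp suffix
    if [ -e "$dst" ]; then
        mv "$dst" "${dst}_bkp" || return 1
    fi

    # cp -p preserves mode and timestamps (and ownership when run as root)
    cp -p "$src" "$dst" || return 1

    # Succeed only if the restored file is non-empty
    [ -s "$dst" ]
}
```

For instance, restore_inventory_file libs.xml /u01/app/oraInventory/backup/2024-03-03_07-35-21PM/ContentsXML would cover the libs.xml case; ownership and permissions of the restored file should still be compared against the other files in the directory.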
Key Takeaways and Best Practices
This experience underscores the importance of regular backups. Regular, comprehensive backups of the Oracle Inventory and related files are a lifeline in the event of corruption or accidental deletion.
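As one possible way to put this into practice, a scheduled job could take timestamped snapshots of ContentsXML, but only when the key files are non-empty, so that a corrupted inventory never overwrites the notion of a "latest good" copy. The sketch below is illustrative only: the function name snapshot_inventory and the default snapshot location /u01/app/oraInventory_snapshots are hypothetical and were not part of this environment.

```shell
# Copy the ContentsXML XML files into a timestamped snapshot directory.
# $1: source ContentsXML directory; $2: snapshot root (hypothetical default).
snapshot_inventory() {
    local src="${1:-/u01/app/oraInventory/ContentsXML}"
    local dest_root="${2:-/u01/app/oraInventory_snapshots}"
    local f

    # Skip the snapshot entirely if any key file is missing or empty
    for f in inventory.xml comps.xml libs.xml; do
        [ -s "$src/$f" ] || { echo "skip snapshot: $src/$f missing or empty" >&2; return 1; }
    done

    local dest="$dest_root/$(date +%Y-%m-%d_%H-%M-%S)/ContentsXML"
    # Print the snapshot path on success so callers can log it
    mkdir -p "$dest" && cp -p "$src"/*.xml "$dest"/ && echo "$dest"
}
```

Run from cron or a similar scheduler, a non-zero exit from snapshot_inventory doubles as an early warning that the inventory files have been emptied.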