Understanding ASM Capacity and Reservation of Free Space in Exadata (Doc ID 1551288.1)
APPLIES TO:
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Gen 1 Exadata Cloud at Customer (Oracle Exadata Database Cloud Machine) - Version N/A and later
Oracle Database - Enterprise Edition - Version 11.2.0.3 to 19.3.0.0.0 [Release 11.2 to 19]
Information in this document applies to any platform.

PURPOSE
This note discusses the concepts of ASM disk group failure coverage, how to reserve free space to cover disk or cell failure in an Exadata environment, and how to determine the resulting capacity of a disk group (DG).

SCOPE
This note applies primarily to Exadata, which uses ASM for storage management. The information is intended for architects, database administrators, and systems administrators who plan and manage storage in an Exadata environment. The concepts are believed to apply to non-Exadata environments as well, but the assertions in this note have not been tested outside Exadata.

DETAILS

Overview
Exadata uses ASM for storage management and disk space allocation. The implementation of ASM is the same as in non-Exadata environments, but there are specific things to consider in an Exadata environment.
This note discusses the following:

* ASM redundancy types
* Failure coverage
* Calculating reserve space and disk group capacity
* Exadata Smart Rebalance for high redundancy disk groups
* Considerations when adding space to an existing disk group
* Oracle Cloud considerations
ASM redundancy types

ASM disk groups in Exadata are defined as either normal or high redundancy (external redundancy is NOT supported with Exadata storage). Normal redundancy provides two copies of file extents (database files are stored as one or more "file extents" in ASM); high redundancy provides three copies of file extents. Each disk is partnered with a set of other disks in other failure groups to ensure that file extent copies are stored in separate failure groups, so that the disk group can tolerate the loss of one disk or one cell with normal redundancy, or two disks or two cells with high redundancy. In all current production releases as of the update date of this MOS note, each disk has 8 disk partners spread across, when possible, 4 failure groups (cell partners). For example, in a 5-cell configuration a given disk is partnered with two other disks on each of four cells. When the CONTENT_TYPE attribute is used, ASM partners disks so that the loss of all partners of a disk will not cause the loss of another disk group. For example, DATA is partnered differently than RECO, so a full partner failure will not cause the loss of both disk groups.

Oracle recommends high redundancy disk groups to maintain redundancy during storage cell updates (for example, during monthly security updates); a high redundancy disk group can tolerate a single disk failure on a partner cell while a cell is offline for the software update. With database consolidation and the growing sizes of most databases on Exadata, the impact of any downtime becomes significant, so high redundancy provides the correct level of protection for business and mission critical applications. Development and test databases can also benefit from high redundancy due to consolidation density or the sheer number of databases that can reside on the same Exadata cluster. In these environments the cost of downtime is very high because it affects many developers, so high redundancy disk groups are imperative for these types of clusters as well. Normal redundancy may be suitable for development/test databases that can tolerate downtime either: 1) for storage maintenance, since it is safer to do an offline storage update when configured with normal redundancy, or 2) to restore and recover the cluster and test/development databases in the case of losing a disk group.

Failure Coverage

Failure coverage refers to the amount of free space in a disk group that is reserved to re-mirror data after a failure. Disk failure coverage (DFC) refers to having enough free space to allow data to be re-mirrored (and rebalanced) after a single disk failure in a normal redundancy disk group, or a single or dual disk failure in a high redundancy disk group. Cell failure coverage (CFC) refers to having enough free space to allow data to be re-mirrored after the loss of one entire cell (i.e., an ASM failure group). Double cell failures are extremely rare and are not considered in this note. Cell failures themselves are very rare, and rebalancing after a cell failure is typically not desired; it is better to simply repair the cell and bring it back online. Therefore, CFC is not discussed in the remainder of this note. If for some reason you must drop a single cell, each disk group should have FREE_MB greater than one cell's worth of the total disk group space plus an additional 5% of that space.
For example, consider a disk group with a total size of 2048 GB spread across 4 cells. If one cell is to be dropped, FREE_MB should be at least:

(2048 GB / 4) * 1.05 = 538 GB

Oracle recommends the use of high redundancy disk groups on high capacity or flash disks, with space reserved for disk failure coverage. This gives excellent availability and performance along with good capacity. Greater availability can be achieved by having a standby database in addition to using high redundancy disk groups. To learn more about availability, see the Maximum Availability Architecture resources on OTN.
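If you want to approximate this check directly from ASM, the following SQL sketch (not part of the original note) compares FREE_MB for each disk group against one cell's worth of space plus the 5% margin described above. It assumes the number of ASM failure groups equals the number of storage cells, which is the typical Exadata layout; on systems with quorum failure groups the failure group count may need adjusting. Only standard V$ASM_DISKGROUP and V$ASM_DISK columns are used.

-- Sketch only: minimum FREE_MB needed before dropping one cell,
-- assuming one failure group per storage cell.
SELECT dg.name,
       dg.free_mb,
       ROUND((dg.total_mb / fg.fg_count) * 1.05) AS min_free_mb_to_drop_one_cell,
       CASE WHEN dg.free_mb >= (dg.total_mb / fg.fg_count) * 1.05
            THEN 'OK' ELSE 'INSUFFICIENT' END    AS cell_drop_check
FROM   v$asm_diskgroup dg,
       (SELECT group_number, COUNT(DISTINCT failgroup) AS fg_count
        FROM   v$asm_disk
        WHERE  group_number > 0
        GROUP  BY group_number) fg
WHERE  dg.group_number = fg.group_number;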
Calculating Reserve Space and Disk Group Capacity

Reserving space in the disk group means that you monitor the disk group to ensure that FREE_MB never goes below the minimum amount needed for disk failure coverage.

Calculating Reserve Space for Failure Coverage

For DFC, to enable rebalancing after the loss of a single disk, Oracle recommends having free space in the disk group equal to or greater than the percentage of the total disk group capacity as follows:
Prior to the fix for bug 32166950, REQUIRED_MIRROR_FREE_MB in V$ASM_DISKGROUP did not reflect the values shown in table 1 above. The fix for bug 32166950 changed REQUIRED_MIRROR_FREE_MB to directly show the values recommended in table 1. After the advent of Smart Rebalance (see the section "Exadata Smart Rebalance for High Redundancy Disk Groups" below for details) and with the fix for bug 35177768, REQUIRED_MIRROR_FREE_MB is set to zero for high redundancy disk groups in version 19.x and higher. We recommend applying the collection of improvements available in BLR 35394920 to obtain the latest recommended values and other fixes related to ASM space management.

Please note that Oracle has supplied an Exachk check for many years to ensure systems have sufficient free space to rebalance disk groups after a disk failure. The check is called "Verify there is enough diskgroup freespace for a rebalance operation". To validate that your disk groups have sufficient space, run the specific Exachk check after upgrading to AHF release 23.4 or higher:

#exachk -check B516E6BD64DC3012E0431EC0E50A83E8,655C2F41EBD4D8D2E053D398EB0A46B7 -excludeprofile switch

An example of the output for the check is:

PASS => There is enough disk group free space for a rebalance operation

DATA FROM EXADB01 - VERIFY THERE IS ENOUGH DISK GROUP FREE SPACE FOR A REBALANCE OPERATION

SUCCESS: Disk group DATAC1 has 68.6% free space which is enough to complete a rebalance operation if a disk fails
Disk group DATAC1 HIGH Redundancy, Required Minimum Percent Free = 11%, Required Minimum Free MB = 41246064
Disk group DATAC1 has Total MB = 374964224, Free MB = 257196660, Usable MB = 71983531
Number of failgroups = 5

SUCCESS: Disk group RECOC1 has 94.4% free space which is enough to complete a rebalance operation if a disk fails
Disk group RECOC1 HIGH Redundancy, Required Minimum Percent Free = 11%, Required Minimum Free MB = 10312248
Disk group RECOC1 has Total MB = 93747712, Free MB = 88452120, Usable MB = 26046623
Number of failgroups = 5

Model : Oracle Corporation ORACLE SERVER X10-2L_EXTREME_FLASH
Eighth Rack : FALSE

Oracle recommends running the above Exachk check frequently to stay up-to-date with disk group capacity.
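If you prefer to spot-check this outside of Exachk, the following SQL sketch (not part of the original note) applies the same style of check against V$ASM_DISKGROUP. The 11% reserve used here is simply the value shown in the example output above for a high redundancy disk group with 5 failure groups; substitute the percentage from table 1 that matches your own redundancy type and failure group count.

-- Sketch only: compare FREE_MB against an assumed reserve percentage.
-- Replace 0.11 with the required percentage for your configuration.
SELECT name,
       type,
       total_mb,
       free_mb,
       ROUND(total_mb * 0.11)                    AS required_min_free_mb,
       ROUND(100 * free_mb / total_mb, 1)        AS pct_free,
       CASE WHEN free_mb >= total_mb * 0.11
            THEN 'PASS' ELSE 'WARNING' END       AS rebalance_space_check
FROM   v$asm_diskgroup;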
If your disk group does not have sufficient free space, Oracle recommends:
Calculating Disk Group Capacity

Usable disk group capacity is calculated by taking into account the reserve space as well as mirroring for each kind of failure coverage:

Disk Usable File MB = (FREE_MB - Disk Required Mirror Free MB) / 2   (or divide by 3 for high redundancy)

where:

* FREE_MB is the raw free space in the disk group in MB.
* Disk Required Mirror Free MB is the amount of space that should be reserved for disk failure coverage (as explained above in Calculating Reserve Space for Failure Coverage). Compute this as:

Disk Required Mirror Free MB = Required % Free space (from table 1 or 2 above) X size of the disk group (from V$ASM_DISKGROUP.TOTAL_MB)
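As an illustration only (not part of the original note), the formula above can be evaluated directly against V$ASM_DISKGROUP. The reserve percentage is an input you supply from table 1 or 2, shown here as a SQL*Plus substitution variable named reserve_pct.

-- Sketch only: usable capacity after subtracting the disk failure reserve.
-- &reserve_pct is the "Required % Free space" expressed as a fraction (e.g., 0.11).
SELECT name,
       type,
       free_mb,
       ROUND(total_mb * &reserve_pct)            AS disk_required_mirror_free_mb,
       ROUND((free_mb - total_mb * &reserve_pct) /
             CASE type WHEN 'HIGH' THEN 3 ELSE 2 END) AS disk_usable_file_mb
FROM   v$asm_diskgroup;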
Please note that you must monitor V$ASM_DISKGROUP.FREE_MB to ensure that it never goes below the required amount of free space (Disk Required Mirror Free MB). If you need to change the space allocation of your disk groups, see the Oracle Exadata documentation for instructions on adding or resizing disk groups to increase space in one disk group while shrinking another.

Exadata Smart Rebalance for High Redundancy Disk Groups

The Exadata Smart Rebalance feature adjusts how Exadata handles disk failure when there is insufficient space to rebalance the affected disk groups. This feature effectively increases the usable space in disk groups by eliminating the need to reserve free space to accommodate a rebalance after a disk failure. Smart Rebalance works in conjunction with high redundancy disk groups only, and is available starting with Exadata System Software version 19.1 and Oracle Grid Infrastructure version 19.3.

Without Smart Rebalance, Exadata responds to a disk failure by forcibly dropping the disk and performing a rebalance operation. If the disk group contains insufficient free space, the rebalance terminates with an ORA-15041 error, and the disk group is left in a potentially unbalanced state and without full redundancy.

With Smart Rebalance, Exadata software first determines whether there is sufficient free space to complete the rebalance operation. If there is enough space, Exadata automatically drops the disk and proceeds with the rebalance. If there is insufficient free space, Exadata instructs ASM to OFFLINE the disk, which avoids a failed rebalance operation. Then, after physical disk replacement, Exadata directs ASM to perform a REPLACE DISK operation, which efficiently reconstructs just the lost contents of the failed disk. Finally, ASM automatically brings the disk online after the disk replacement operation completes.

Smart Rebalance applies only to disk failures, not to cell failures. If a cell fails and failgroup_repair_timer expires, a rebalance is attempted regardless of whether Smart Rebalance can be used for disk failures.

Note that affected disk groups must run with reduced redundancy for the whole time that the disk is offline. During this time, failure of additional partner disks would further compromise redundancy, and in the worst case data loss is inevitable if all partner disks fail. Consequently, as a best practice, Oracle recommends maintaining sufficient free space to conduct a rebalance operation. This ensures that triple redundancy is restored after a disk failure and gives you greater resiliency in the event of a disk failure or an online storage cell planned maintenance operation (which temporarily reduces redundancy to two copies while a cell is offline). Smart Rebalance is still used regardless of whether the following recommendations are heeded.
If you choose to rely on the Smart Rebalance feature to maximize storage capacity, then you should meet the following recommendations to mitigate the risk of failure of additional partner disks and a possible lost disk group:

1. The disks must be less than five (5) years old and must be 8 TB or greater in size (see MOS Document 2075007.1).
2. Oracle ASR must be fully configured for both ILOM and OS usage.
3. Your operations staff must be responsive to disk failures and maintain up-to-date operational procedures.
4. You must replace failed disks within 48 hours. To determine how quickly you historically replace failed disks, you can analyze the cell alert history by using the following command:

dcli -g cell_group -l root "cellcli -e list alerthistory" | grep -i 'hard disk' | egrep -i 'fail|normal'
Note that the above command shows events retained in each cell alert history, which may only cover a brief period, depending on the frequency of alert generation and the size of the alert history. In this case, you may need to consult your Oracle Service Request history or other maintenance logs for the required information. You are also encouraged to maintain knowledge across your personnel regarding the location and usage of on-site disk “spares” kits. Requirements 1 and 2 are considered the basic recommendations for using Smart Rebalance. Requirements 3 and 4 must be assessed and maintained by you in addition to the basic recommendations. If you cannot meet these recommendations, Oracle recommends that you do not rely on Smart Rebalance.
In summary, here are the pros and cons of Exadata Smart Rebalance:
* MAA-recommendation

Considerations when Adding Space to an Existing Disk Group

Storage cells can be added when additional storage is needed. Only a small amount of free space is needed in each disk group to ensure new storage can be added; this is computed as:

Minimum free space needed in each disk group when adding storage = 64 MB X rebalance power X total number of disks currently in the disk group

where:

* "rebalance power" is the rebalance power used when adding the storage
* "total number of disks" is the total number of disks currently in the disk group (e.g., for a quarter rack there are 3 cells with 12 HC disks each, for a total of 36 disks)

For example, if we are adding storage to an existing rack with 3 HC storage servers (12 disks per storage server), for a total of 36 disks in each disk group, and we plan to add the new disks using rebalance power 4, the free space needed would be:

Minimum free space needed in each disk group for adding storage = 64 MB X 4 X 36 = 9216 MB

If your system does not have the fix for bug 33317279, then you should compute an additional amount of free space per disk group using this formula:

Additional free space needed in each disk group when adding storage = (Number of disks being added X disk size in TB / 1.73 TB + 2) X Number of mirrors (either 2 or 3) X 4 MB

This additional free space requirement is added to the "Minimum free space needed in each disk group when adding storage" computation above for the total amount of free space that is needed in each disk group.
After applying the fix for bug 33317279, Oracle recommends performing a rebalance on all disk groups and ensuring they complete successfully prior to attempting to add additional storage. Once the fix is applied and a rebalance operation completes without error on all disk groups, you will not need to add this additional free space for future storage additions.
For example, if we are adding a single storage server to an existing rack that has 3 storage servers, and the ASM disk size for one of the high redundancy disk groups is 16 TB, then the additional space for that disk group would be:

Additional free space needed in each disk group when adding storage = (12 X 16 TB / 1.73 TB + 2) X 3 mirrors X 4 MB = (12 X 9.25 + 2) X 12 = (111 + 2) X 12 = 1356 MB

This calculation would need to be repeated for each disk group and added to the result of the "Minimum free space needed in each disk group when adding storage" computation above.
For the examples above, the total minimum free space needed for the disk group would be: 9216 MB + 1356 MB = 10572 MB. If you have just applied the fix for bug 33317279 but have not yet successfully rebalanced the disk groups, then you should include this additional space requirement.
Do not attempt to add storage unless your disk groups have the minimum amount of free space indicated above. Once the additional storage is available, you should ensure that all disk groups are sized to have at least enough free space to accommodate the DFC requirements mentioned above.
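As a convenience, the first formula above ("Minimum free space needed in each disk group when adding storage") can be evaluated with a simple query. This sketch is not part of the original note; it takes the planned rebalance power as a SQL*Plus substitution variable (rebal_power) and does not include the additional space described for systems without the fix for bug 33317279.

-- Sketch only: 64 MB x rebalance power x current number of disks per disk group.
SELECT dg.name,
       d.disk_count,
       64 * &rebal_power * d.disk_count AS min_free_mb_to_add_storage,
       dg.free_mb
FROM   v$asm_diskgroup dg,
       (SELECT group_number, COUNT(*) AS disk_count
        FROM   v$asm_disk
        WHERE  group_number > 0
        GROUP  BY group_number) d
WHERE  dg.group_number = d.group_number;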
Oracle Cloud Considerations

When storage is added in the Oracle cloud, the storage is added in two phases:
Additional Considerations
List of Recommended Bug Fixes
REFERENCES

NOTE:1467056.1 - Resizing Grid Disks in Exadata: Examples
NOTE:1070954.1 - Oracle Exadata Database Machine Exachk
NOTE:1465230.1 - Resizing Grid Disks in Exadata: Example of Recreating RECO Grid Disks in a Rolling Manner
NOTE:1464809.1 - Script to Calculate New Grid Disk and Disk Group Sizes in Exadata


