Best Practices and Recommendations for RAC databases with SGA size over 100GB (Doc ID 1619155.1)
In this Document
| Purpose |
| Scope |
| Details |
| References |
APPLIES TO:
Oracle Database Cloud Exadata Service - Version N/A and later
Oracle Database Cloud Service - Version N/A and later
Oracle Database - Enterprise Edition - Version 11.2.0.3 and later
Oracle Database Backup Service - Version N/A and later
Oracle Database Cloud Schema Service - Version N/A and later
Information in this document applies to any platform.
PURPOSE
The goal of this note is to provide best practices and recommendations to users of Oracle Real Application Clusters (RAC) databases using very large SGA (e.g. 100GB) per instance (note that RAC assumes homogeneously sized SGAs across the cluster). This document is compiled and maintained based on Oracle's experience with its global RAC customer base.
This is not meant to replace or supplant the Oracle Documentation set, but rather, it is meant as a supplement to the same. It is imperative that the Oracle Documentation be read, understood, and referenced to provide answers to any questions that may not be clearly addressed by this note.
All recommendations should be carefully reviewed by your own operations group and should only be implemented if the potential gain as measured against the associated risk warrants implementation. Risk assessments can only be made with a detailed knowledge of the system, application, and business environment.
SCOPE
This article applies to all new and existing RAC implementations.
It is intended for RAC databases only, because most of the parameters listed here apply only to RAC.
DETAILS
Note that the recommendations presented in this note are based on experience with databases that have SGAs of 1 TB and 2.6 TB.
However, databases with SGAs of 100GB and 300GB also benefited from the recommendations.
Also, some recommendations have been removed for 18.1 and above, so check whether each recommendation is applicable to your database.
Download the latest AHF. Refer to Autonomous Health Framework (AHF) - Including TFA and ORAchk/EXAchk Document 2550798.1.
init.ora parameters:
Setting this will prevent some timeouts during reconfiguration and DRM. It's a static parameter and rolling restart is supported.
b. Set shared_pool_size to 15% or larger of the total SGA size.
For example, if SGA size is 1 TB, the shared pool size should be at least 150 GB. It's a dynamic parameter.
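The sizing rule above is simple arithmetic. The following is a minimal sketch of it, not an Oracle utility: the function name is hypothetical, and it assumes sizes are expressed in whole GB with 1 TB taken as 1000 GB, which matches the 150 GB figure in the example.

```python
import math


def min_shared_pool_gb(sga_gb: int, percent: int = 15) -> int:
    """Return the smallest shared pool size (GB) that is at least
    `percent` percent of the SGA size (GB), rounded up to a whole GB.

    Hypothetical helper for illustration; 15% is the figure from this note.
    """
    if sga_gb <= 0:
        raise ValueError("SGA size must be positive")
    return math.ceil(sga_gb * percent / 100)


# The note's example: a 1 TB SGA (taken here as 1000 GB).
print(min_shared_pool_gb(1000))  # prints 150
```

The result is a lower bound. The recommendation is "15% or larger", so a larger value may be appropriate for your workload.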
c. Set _gc_policy_minimum to 15000
There is no need to set _gc_policy_minimum if DRM is disabled by setting _gc_policy_time = 0. _gc_policy_minimum is a dynamic parameter; _gc_policy_time is a static parameter for which rolling restart is not supported. To disable DRM, use _lm_drm_disable instead of _gc_policy_time, because it is dynamic.
Note: 15000 is the new default in 23c, 19c DBRU JUL '23, and 19c ADB, so customers no longer need to tune this parameter in those releases or later.
This is due to internal bug 34729755.
d. Set _lm_tickets to 5000 (this recommendation is valid only for databases that are 12.2 and lower)
Default is 1000. Allocating more tickets (used for sending messages) avoids running out of tickets during reconfiguration. It's a static parameter and rolling restart is supported. When increasing the parameter, a rolling restart is fine, but a cold restart can be necessary when decreasing it.
e. Set gcs_server_processes to twice the default number of lms processes that are allocated. (this recommendation is valid only for databases that are 12.2 and lower)
The fix is also included in the 12.2.0.1 JUL 2018 database RU, so this does apply to the database that is running on 12.2.0.1 JUL 2018 or higher.
The default number of lms processes depends on the number of CPUs/cores that the server has, so please refer to the gcs_server_processes init.ora parameter section in the Oracle Database Reference Guide for the default number of lms processes for your server. Please make sure that the total number of lms processes of all databases on the server is less than the total number of CPUs/cores on the server. Please refer to Document 558185.1.
It's a static parameter and rolling restart is supported.
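The two rules in item e (double the default, and keep the server-wide lms total below the CPU/core count) can be checked with a few lines of arithmetic. This is a minimal sketch with hypothetical function names. The default lms count is an input here because it depends on the server and must be looked up in the Oracle Database Reference Guide.

```python
from typing import Sequence


def recommended_gcs_server_processes(default_lms: int) -> int:
    """Twice the default number of lms processes, per item e of this note."""
    if default_lms < 1:
        raise ValueError("default lms count must be at least 1")
    return 2 * default_lms


def lms_total_fits(lms_per_database: Sequence[int], cpu_count: int) -> bool:
    """True if the lms processes of all databases on the server total
    strictly fewer than the number of CPUs/cores on the server."""
    return sum(lms_per_database) < cpu_count


# Illustrative numbers only: a default of 4 lms processes (assumed),
# two databases on the same server, 32 cores.
per_db = recommended_gcs_server_processes(4)
print(per_db, lms_total_fits([per_db, per_db], cpu_count=32))
```

If the check returns False, the doubled value would oversubscribe the server, and the setting should be reviewed against Document 558185.1 before it is applied.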
f. Set TARGET_PDBS to the number of PDBs that are planned to be running in the CDB. Do not include the seed and root containers in this count. (This recommendation is valid for 12.2 databases and higher)
The default value of TARGET_PDBS, especially for databases with a large sga_target setting, is known to cause performance and instance eviction issues.
For a detailed description of issues related to target_pdbs, refer to Document 2644243.1.
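As an illustration of how the dynamic and static parameters above differ when applied, the sketch below generates ALTER SYSTEM statements for items b, c, and d. It is not an Oracle-supplied script, and it rests on several assumptions: the instances use a shared spfile, SID='*' is used to apply the change to all instances, dynamic parameters are set with SCOPE=BOTH, and static parameters are set with SCOPE=SPFILE so that they take effect at the next restart. The function name and the 150G value are illustrative.

```python
def alter_system(name: str, value, dynamic: bool) -> str:
    """Build an ALTER SYSTEM statement for one init.ora parameter.

    Underscore parameters are enclosed in double quotes. Dynamic parameters
    use SCOPE=BOTH; static ones use SCOPE=SPFILE (assumes an spfile is in use).
    """
    quoted = f'"{name}"' if name.startswith("_") else name
    scope = "BOTH" if dynamic else "SPFILE"
    return f"ALTER SYSTEM SET {quoted}={value} SCOPE={scope} SID='*';"


# (parameter, value, is_dynamic) with the dynamic/static status stated in this note.
settings = [
    ("shared_pool_size", "150G", True),    # item b, value for a 1 TB SGA
    ("_gc_policy_minimum", 15000, True),   # item c
    ("_lm_tickets", 5000, False),          # item d, needs a (rolling) restart
]

for name, value, dynamic in settings:
    print(alter_system(name, value, dynamic))
```

Review the generated statements before running them, and check each item's release applicability first, since some recommendations do not apply to newer versions.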
In other words, setting up hugepages when SGA is large is a critical recommendation.
For other platforms, consider using large pages if possible.
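To give a sense of scale for the hugepages recommendation, the sketch below estimates how many hugepages are needed to back a given amount of SGA memory. It assumes the common Linux x86-64 hugepage size of 2 MiB and treats the input as the combined SGA size of all instances on the server. The function name is hypothetical, and the result is only a starting point for the vm.nr_hugepages kernel setting, not a substitute for Oracle's own sizing guidance.

```python
def hugepages_needed(total_sga_bytes: int, hugepage_bytes: int = 2 * 1024**2) -> int:
    """Number of hugepages required to hold `total_sga_bytes`, rounded up.

    The 2 MiB default is an assumption; check Hugepagesize in /proc/meminfo.
    """
    if total_sga_bytes <= 0 or hugepage_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division avoids floating-point rounding on large values.
    return -(-total_sga_bytes // hugepage_bytes)


GIB = 1024**3
print(hugepages_needed(100 * GIB))   # 100 GiB SGA -> 51200 pages of 2 MiB
print(hugepages_needed(1024 * GIB))  # 1 TiB SGA   -> 524288 pages of 2 MiB
```

With regular 4 KiB pages, the same 1 TiB SGA would span 268,435,456 pages, which illustrates why large pages matter at these sizes.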
The following patches are recommended:
11.2.0.3.5 DB PSU or above is highly recommended to address known issues with large SGA sizes.
BUG 12747740 - RAC PERF: NODE JOIN RECONFIGURATION (PCMREPLAY) DOES NOT SCALE WITH MORE LMS'S
BUG 14193240 - LMS SIGNALED ORA-600[KGHLKREM1] DURING BEEHIVE LOAD
BUG 16392068 - MSGQ: LMS0 HITS ORA-600 [KJBMPOCR:DSB]
BUG 17232014 - INITIAL ALLOCATION FOR KJBR&KJBL ARE TOO LOW W/ LARGE CACHES DUE TO UB4 OVERFLOW
BUG 17257445 - RAC PERF: DRM OPTIMIZATION (BUG 14558880) SHOULD ALSO WORK FOR RECONFIGURATION
BUG 17314971 - RAC PERF: RM/PT LATCH REDUCTION FOR RCFG (17257445) SHOULD BE ENABLED FOR SYNC7
For an SGA larger than 4 TB on the Linux platform:
BUG 18780342 - LINUX SUPPORT FOR > 4TB SGA
REFERENCES
NOTE:558185.1 - LMS and Real Time Priority in Oracle RAC 10g and 11g
NOTE:1392248.1 - Auto-Adjustment of LMS Process Priority in Oracle RAC with 11.2.0.3 and later
NOTE:2550798.1 - Autonomous Health Framework (AHF) - Including TFA and ORAchk/EXAchk
NOTE:2644243.1 - Performance Issues when using PDBs with Oracle RAC 19c and 18c