
Thursday, March 27, 2014

ORA-00494: enqueue [CF] held for too long (more than 900 seconds)

Problem:
For one of our clients, we take an RMAN hot backup every night of all the EBS environments, which run Oracle Database 11.2.0.3.

Last night, however, the RMAN hot backup of our test environment failed, or rather was killed, with the following error messages:
RMAN-03009: failure of backup command on dsk_9 channel at 03/27/2014 03:19:39
ORA-00028: your session has been killed
ORA-00028: your session has been killed
continuing other job steps, job failed will not be re-run

At the same time, we found the following error message in the alert log:
Thu Mar 27 03:18:11 2014
Errors in file /mnt/TSTapps/oracle/trctstdb/11.2.0/dbhome_1/log/diag/rdbms/trctst/TRCTST/trace/TRCTST_ora_70736.trc  (incident=234835):
ORA-00494: enqueue [CF] held for too long (more than 900 seconds) by 'inst 1, osid 66183'
Incident details in: /mnt/TSTapps/oracle/trctstdb/11.2.0/dbhome_1/log/diag/rdbms/trctst/TRCTST/incident/incdir_234835/TRCTST_ora_70736_i234835.trc

Cause:
Possible causes of this behavior are high server load, high concurrency on resources, or I/O waits, any of which can keep the Oracle background processes from getting the resources they need.
We checked with the sar -u 2 5 command and found high I/O waits on our server. This is probably why the kill blocker interface killed our RMAN hot backup process.
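
The I/O wait check above can be sketched as a small shell helper. This is only an illustration, not part of the original diagnosis: the function name avg_iowait is made up here, and it assumes the Linux sysstat layout of sar -u output with an English "Average:" summary line.

```shell
#!/bin/sh
# Print the average %iowait from `sar -u` output read on stdin.
# The column is located by its header name instead of a fixed position,
# because the timestamp takes one or two fields depending on the locale
# (24h clock vs. 12h clock with AM/PM).
avg_iowait() {
    awk '
        # Header line: remember how far %iowait is from the end of the line.
        !found {
            for (i = 1; i <= NF; i++)
                if ($i == "%iowait") { off = NF - i; found = 1 }
        }
        # Summary line: print the value at the same offset from the end.
        $1 == "Average:" && found { print $(NF - off) }
    '
}

# Example (assumes sysstat is installed):
#   LC_ALL=C sar -u 2 5 | avg_iowait
```

A consistently high value during the backup window points to the I/O subsystem rather than to RMAN itself. What counts as "high" depends on the server, so no threshold is suggested here.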

Solutions:
The kill blocker interface behind ORA-00494 was introduced in 10.2.0.4. This mechanism kills *any* kind of blocking process, background or non-background. If the blocking process is a background process, the instance crashes as well.
This behavior can be avoided in two ways:
1. To prevent any blocker (background or non-background) from being killed, set:
  •          _kill_controlfile_enqueue_blocker=false
             With this setting, no type of blocker is killed anymore.
2. To prevent only a background blocker from being killed, set the following init.ora parameter to 1 (the default is 3):
  •          _kill_enqueue_blocker=1
             With this setting, an enqueue holder that is a background process is not killed, so the instance does not crash.
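
For an instance that uses an spfile rather than an init.ora file, the same settings would go in through ALTER SYSTEM. The sketch below only prints the statements so they can be reviewed first. The helper name hidden_param_sql is made up for this post, and SCOPE=SPFILE (which takes effect at the next restart) is an assumption, since the notes referenced here do not say whether these parameters can be changed on a running instance. Both are hidden parameters, so confirm with Oracle Support before setting either one.

```shell
#!/bin/sh
# Emit an ALTER SYSTEM statement for a hidden (underscore) parameter.
# Hidden parameter names have to be double-quoted in SQL.
# SCOPE=SPFILE is an assumption: the change is deferred to the next restart.
hidden_param_sql() {
    # $1 = parameter name, $2 = value
    printf 'ALTER SYSTEM SET "%s"=%s SCOPE=SPFILE;\n' "$1" "$2"
}

# Option 1: never kill the holder of the CF enqueue.
hidden_param_sql _kill_controlfile_enqueue_blocker FALSE

# Option 2: do not kill a background process that holds the enqueue.
hidden_param_sql _kill_enqueue_blocker 1
```

Pick one of the two options, then run the reviewed statement as SYSDBA, for example by pasting it into a sqlplus session.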

NOTE: A common root cause of ORA-494 is very frequent log switches, which can be seen in the alert.log. In those cases the enqueue is normally held by the CKPT process.
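
To see how frequent the log switches are, the alert log can be summarized per hour. This is a rough sketch: the function name log_switches_per_hour is made up here, and it assumes the 11g text alert log layout, where a timestamp line such as "Thu Mar 27 03:18:11 2014" comes before each message and a switch is logged as "Thread 1 advanced to log sequence ...".

```shell
#!/bin/sh
# Count log switches per hour in a text alert log (file argument or stdin).
log_switches_per_hour() {
    awk '
        # Timestamp line, e.g. "Thu Mar 27 03:18:11 2014": keep date and hour.
        NF == 5 && $4 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
            hour = $2 " " $3 " " $5 " " substr($4, 1, 2) ":00"
            next
        }
        # Log switch message: count it under the most recent timestamp.
        /advanced to log sequence/ {
            if (!(hour in n)) order[++k] = hour
            n[hour]++
        }
        # Print the hours in the order they first appeared.
        END { for (i = 1; i <= k; i++) print order[i], n[order[i]] }
    ' "$@"
}

# Example (the alert log file name is an assumption based on the SID):
#   log_switches_per_hour alert_TRCTST.log
```

Each output line shows the date, the hour, and the number of switches in that hour. Hours with many switches around the time of the ORA-00494 error would support the CKPT explanation in the note above.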

References:
Ora-00494: Enqueue [Cf] Held For Too Long Causing Database To Crash (Doc ID 1101862.1)
PHYSICAL: ORA-00494: enqueue [CF] held for too long after Node Crash (Doc ID 747071.1)
ORA-00494 Or ORA-600 [2103] During High Load After 10.2.0.4 Upgrade (Doc ID 779552.1)
Database Crashes With ORA-00494 (Doc ID 753290.1)
