Monitoring the Physical Standby Progress

In order to be able to fail over to a physical standby…you must be assured that the REDO data being shipped is being applied. I run a script daily on all databases involved in a DATAGUARD configuration to check for any differences. If the numbers don’t match when this script is run (at the same time on all sites) then something is wrong…this is assuming that you DON’T want a delay between production and the standby. There are certain reasons to use a delay: network bandwidth availability, protection from certain types of corruption, protection from enduser error on the production site such as dropping a table, etc.
SQL> select max(sequence#) from v$log_history;
If a gap develops then these are some of the steps to get things going again. DATAGUARD helps with shipping and applying of the redo logs resolving gaps when they happen which works flawlessly 99% of the time. Sometimes a gap will develop that DATAGUARD can’t seem to resolve…that can happen when you delete a archive log on the primary site before it gets shipped to the standby site or when there is a network latency or break in communication between sites.
1. If you are using DATAGUARD. How do I know if I am using DATAGUARD? If this parameter is true then you are running DATAGUARD.
PRIMARY> show parameter dg_broker
dg_broker_start boolean TRUE
2.  On the standby site see what the current status of all the process(s).  The APPLYING_LOG status for MRP0 is waiting for the next log, there is a gap.  Manually copy over the log number it it looking for. 

 SQL> select process, status, client_process, sequence#, block# from v$managed_standby;

STANDBY> PROCESS STATUS CLIENT_P SEQUENCE# BLOCK#

——— ———— ——– ———- ———-

ARCH CONNECTED ARCH 0 0

ARCH CONNECTED ARCH 0 0

MRP0 APPLYING_LOG N/A 7476 1

3. Redo both queries again, to see if any progress has occurred.  If it looks similar to the output below things are working again…if it doesn’t go on to the next step.

STANDBY> PROCESS STATUS CLIENT_P SEQUENCE# BLOCK#

——— ———— ——– ———- ———-

ARCH CONNECTED ARCH 0 0

ARCH CLOSING ARCH 7538 1

MRP0 APPLYING_LOG N/A 7488 177450

RFS IDLE UNKNOWN 7490 81921

RFS IDLE LGWR 7540 6338

 

4. Overview: Stop the broker on the primary site, remove the DATAGUARD configuration completely and create a new DATAGUARD configuration.  This is a last resort measure if you are having problems with adding/removing sites to a DATAGUARD or resolving gaps between sites.  

I use Grid Control and Oracle Enterprise Manager to monitor DATAGUARD configurations and have had an occasional issue where a standby site no longer appears in the monitoring list and can’t be added due to the complaint that the standby database is already in the configuration.  This tells me that the $ORACLE_HOME/dbs/dr1SID.dat is broken, this is where the DATAGUARD configuration information is stored.  Don’t alter the file manually.

Stop the DATAGUARD broker on the primary site first. 

PRIMARY> alter system set dg_broker_start = ‘FALSE’ scope =both;

Rename the $ORACLE_HOME/dbs/dr1SID.dat and dr2SID.dat on all sites involved in this configuration.

mv $ORACLE_HOME/dbs/dr1SID.dat $ORACLE_HOME/dbs/dr1SID.dat.bak

mv $ORACLE_HOME/dbs/dr2SID.dat $ORACLE_HOME/dbs/dr2SID.dat.bak  

Now restart the DATAGUARD broker on the primary.

PRIMARY> alter system set dg_broker_start = ‘TRUE’ scope =both;

Readd all of the sites back into the configuration…either with GC/OEM or the dgmgrl utility.

 

 
 
 
 

 

 
 
 
 

 

 

Advertisements

About April C Sims

Oracle DBA for over a decade...enough said.
This entry was posted in DATA GUARD, Logical Standby, Physical Standby and tagged , , . Bookmark the permalink.

2 Responses to Monitoring the Physical Standby Progress

  1. Linda says:

    Hi April,

    I was trying to run a daily/houly script to monitor whether there is a GAP betweben a primary and physical standby database as below:

    num_behind=`$ORACLE_BIN/sqlplus -s ‘/ as sysdba’ << __EOF__
    set heading off
    set feedback off
    –select max(al.sequence#)-max(lh.sequence#) from v\$archived_log al, v\$log_history lh;
    select max(sequence#) from v\$archived_log where applied='YES';
    exit
    __EOF

    When I test the script, it always says 'RA-01219: database not open: queries allowed on fixed tables/views only: integer expression'….

    The query ran fine when I connect to the database alone, any idea? would you mind how you check the GAP through a script?

    Thanks.

    Linda

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s