When I talk to people about disaster recovery and high availability in the Oracle database world, I sometimes hear key terms misused or misunderstood. This happens mostly in conversations with non-IT people, but it also happens inside IT, and even with some DBAs. So I thought I would clear some of these up (note that the scope here is restricted to Oracle Data Guard physical standby databases):
- Primary Database is the database that is currently running and open for transaction processing.
- Standby Database is any one of the copies of the primary database. A standby can be in any state other than open for transaction processing.
- Production (a.k.a. PR/PRD/PROD) generally refers to the main environment, where the primary database normally runs.
- Disaster Recovery (a.k.a. DR) refers to the secondary, usually remote, environment, where a standby database normally runs.
- Failover is the act of making a standby become the primary without (note the emphasis on without) the participation of the current primary.
- Switchover is the act of exchanging roles between the current primary and any one of the available, synchronized standby databases.
The most common misunderstanding happens after a switchover. The primary database can be left running in DR for a long time, and most people forget this. When that happens, people tend to equate primary with production and standby with DR.
This can have dire consequences: you ask the Linux engineer to reboot the standby environment and they go straight to the DR boxes, even though you pasted the PR server names into the request.
Because of this, I prefer to refer to production and DR by the name of the datacenter they’re in (they are usually in different ones), as in “the standby database running in ABC” or “our primary database is in the XYZ datacenter”.
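To make the point concrete, here is a minimal Python sketch that keeps the site and the role as two separate attributes. It is only an illustration: the `Database` class, the `databases_with_role` helper, and the names `orcl_abc` and `orcl_xyz` are all made up for this example and are not part of any Oracle tool.

```python
from dataclasses import dataclass


@dataclass
class Database:
    name: str        # hypothetical database name
    datacenter: str  # physical site; does not change on a role transition
    role: str        # "PRIMARY" or "STANDBY"; changes on switchover or failover


def databases_with_role(databases, role):
    """Return (name, datacenter) for every database currently holding `role`."""
    return [(db.name, db.datacenter) for db in databases if db.role == role]


# Hypothetical state after a switchover: the primary now runs in the usual DR site.
databases = [
    Database("orcl_abc", "ABC", "STANDBY"),  # ABC is the usual production site
    Database("orcl_xyz", "XYZ", "PRIMARY"),  # XYZ is the usual DR site
]

standby_targets = databases_with_role(databases, "STANDBY")
print(standby_targets)
```

In this state, a request to reboot "the standby" resolves to the box in ABC, the usual production site, which is exactly where someone who equates standby with DR would not look.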
Another mix-up I hear a lot is between failover and switchover. Because failover is a much better-known term than switchover, you hear the former when people are in fact asking about the latter. I normally just translate in my head, but here are the details on the differences, this one time only:
On Oracle databases, once you do a failover, the old primary database is no longer synchronized with the new primary, because the old one may contain transactions that never reached the new one. Once the new primary starts generating redo, that redo cannot be applied to the old primary. That means the old primary is lost (yes, there are exceptions if you have at least 10gR2, an FRA, Flashback Database, and so on). Lost as in: you will have to restore it from a backup and resynchronize it before you have a standby again. This also means that there is no such thing as a failback.
With switchover, on the other hand, the transition of roles is controlled. Both databases agree on an SCN (system change number, a unique sequential identifier) at which the primary stops running transactions, the standby applies logs up to that SCN, the roles are switched, and the new standby (the old primary) is able to receive logs generated by the new primary. A switchback is just a subsequent switchover: the same process, with the ends inverted.
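The difference between the two transitions can be shown with a toy Python model. This is a sketch under simplifying assumptions, not how Data Guard is implemented: each database is reduced to a list of change labels, and the `Db` class and the `switchover`, `failover`, and `can_follow` functions are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Db:
    name: str
    role: str
    changes: list = field(default_factory=list)  # changes this database holds


def switchover(primary, standby):
    # Controlled: the standby first catches up to the agreed point, then roles swap.
    standby.changes = list(primary.changes)
    primary.role, standby.role = "STANDBY", "PRIMARY"


def failover(standby):
    # The old primary takes no part; the standby opens with whatever it has.
    standby.role = "PRIMARY"


def can_follow(old_primary, new_primary):
    # The old primary can serve as a standby only if its history is a prefix
    # of the new primary's history.
    n = len(old_primary.changes)
    return new_primary.changes[:n] == old_primary.changes


# Failover: "t3" never reached the standby, so the histories diverge.
a = Db("a", "PRIMARY", ["t1", "t2", "t3"])
b = Db("b", "STANDBY", ["t1", "t2"])
failover(b)
b.changes.append("t4")
after_failover = can_follow(a, b)

# Switchover: the standby catches up first, so the old primary can follow.
c = Db("c", "PRIMARY", ["t1", "t2", "t3"])
d = Db("d", "STANDBY", ["t1", "t2"])
switchover(c, d)
d.changes.append("t4")
after_switchover = can_follow(c, d)

print(after_failover, after_switchover)
```

In this model the old primary can follow the new one after a switchover but not after the failover, which is why the failover case ends in a restore and a switchback is simply another switchover.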