ORA-29702 – Starting RAC instance in non-rac mode

Today I was trying to create a simple database to troubleshoot one issue. It happens I was using an old server which had CRS and ASM installed and used by an old database. While my test database was not meant to be cluster-aware, I wasn’t able to start the instance, not even in nomount mode. As the old database was decommed, so was the shared storage attached to this server, thus leaving me no easy way I could have CRS started. Best option was to have Oracle acknowledge it was now a single instance/single server – no cluster.

Quickly googling it, I found this nice post that showed me how to solve it by relinking the executable:

Oracle Home: ORA-29702 – Starting RAC instance in non-rac mode.

Thanks Sanjay for sharing the information in your blog.

Posted in Oracle, Other Blogs | Tagged , , , , | Leave a comment

Troubleshooting ORA-600[qsmoDropPlans:1] on SPM

Some time ago I got this error message:

Errors in file /u01/app/oracle/diag/rdbms/mydb/mydb1/trace/mydb1_m000_15032.trc  (incident=354123):
ORA-00600: internal error code, arguments: [qsmoDropPlans:1], [], [], [], [], [], [], [], [], [], [], []
ORA-06512: at "SYS.DBMS_SPM", line 2444
ORA-06512: at "SYS.DBMS_SPM_INTERNAL", line 775

As the error message points to DBMS_SPM and qsmoDropPlans, I assume there is some issue with the procedure that drops unused baselines after they expire. Let’s check.

A quick select on dba_sql_management_config shows that PLAN_RETENTION_WEEKS is set to the default value of 53 weeks.

Related parameters:

optimizer_capture_sql_plan_baselines is FALSE.
optimizer_use_sql_plan_baselines is TRUE.

To then check the baseline repository contents with this query:

select trunc(LAST_EXECUTED,'mm'),count(*)
from dba_sql_plan_baselines
group by trunc(LAST_EXECUTED,'mm')
order by trunc(LAST_EXECUTED,'mm')

And find out there are some four million baselines stored, of which only 500k were used in the last six months and another 500k are older than 53 weeks. In some point the optimizer_capture_sql_plan_baselines parameter was set to TRUE, capturing plans that aged out.

Although the definitive solution would be to log a SR with Oracle Support and wait for the bugfix for the automatic purging procedure, I chose the faster way and used this code to manually purge the plans not used in the last 180 days. The most recent 180 days worth of baselines I’m keeping as they are in use by the database.

Posted in Troubleshoot | Tagged , , , , | Leave a comment

How to confine a set of users to the Active Data Guard DB only

Active Data Guard is a new (paid) feature of Oracle 11g where a physical standby database can be open for read-only access while Redo Apply is simultaneously active.

With the objective of relieving the production database from a group of users which run reports only, you open the standby database for reporting purposes.

Despite any and all communication attempts, however, there will always be users which will fail to change their configs to connect to the newly available database.

How can you ensure these users will only connect to the standby, thereby effectively making room in production to the transactional users?

With a combination of at least one role and a database logon trigger, this is easily achievable in a few steps:

  1. Create a role whose name contains a specific substring that no other role in the database has. I chose RPTG, and created the role as MYAPPLICATION_RPTG_ROLE. No specific privileges must be granted to the role.
  2. Grant the new role to the users you want to deny connections on the primary DB.
  3. Edit the trigger script Reporting_Role_Filter_Logon_Trigger by replacing RPTG with the substring you chose.
  4. (Optional) This trigger features a list of users that you DON’T want to deny connections even though they may have that role granted to them. (SYS,SYSTEM,app owner,…) I also recommend to add any accounts with high new connection rates there to avoid the overhead of running the trigger’s two selects too frequently.
  5. Compile the trigger.

Done! Any users that have the role granted, except for those listed in the trigger’s exception list, will get an error when connecting to the primary database.

Bonus: this trigger is switchover/failover aware, so you don’t need to bother – if the standby becomes primary, users will be denied over there (and vice-versa).

Posted in Oracle, Performance | Tagged , , , , | Leave a comment

Quick how-to: enable Active Data Guard

If you have the required additional license, with Oracle 11gR1 onward you can open your physical standby database for read-only operations. If correctly configured, it gives you the benefit of relieving the primary database from the burden of running those long, heavy reports users are always wanting.

The database section is very easy to implement:

ALTER DATABASE OPEN;    (it will open read-only, don't worry)

Now, OPEN_MODE in V$DATABASE should read “READ ONLY WITH APPLY”. That’s it, users are able to connect using the DR database name. But to enhance manageability I recommend that you create a specific service for this usage. The commands are

srvctl add service -d mydb_dr -s mydb_rptg.mydomain.com -r mydb1 -a mydb2 -l PHYSICAL_STANDBY
srvctl start service -d mydb_dr -s mydb_rptg.mydomain.com

If you go and create the service only on the standby side, however, you’ll get an error like I got:

PRCD-1084 : Failed to start service mydb_rptg.mydomain.com
PRCR-1079 : Failed to start resource ora.mydb_dr.mydb_rptg.mydomain.com.svc
CRS-5017: The resource action "ora.mydb_dr.mydb_rptg.mydomain.com.svc start" encountered the following error:
ORA-44317: database open read-only
ORA-06512: at "SYS.DBMS_SERVICE", line 478
ORA-06512: at "SYS.DBMS_SERVICE", line 229
ORA-06512: at line 1

CRS-2674: Start of 'ora.mydb_dr.mydb_rptg.mydomain.com.svc' on 'myDRserver01' failed
CRS-5017: The resource action "ora.mydb_dr.mydb_rptg.mydomain.com.svc clean" encountered the following error:
ORA-44304: service mydb_rptg.mydomain.com does not exist
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 86
ORA-06512: at "SYS.DBMS_SERVICE", line 452
ORA-06512: at "SYS.DBMS_SERVICE", line 400
ORA-06512: at line 1

This happens because for a service to be activated some records regarding it must exist in the data dictionary of the database, and as the standby cannot write it’s own dictionary this error is raised.

The solution is to briefly enable the same service in the primary database:

srvctl add service -d mydb -s mydb_rptg.mydomain.com -r mydb1 -a mydb2 -l PHYSICAL_STANDBY
srvctl start service -d mydb -s mydb_rptg.mydomain.com
srvctl stop service -d mydb -s mydb_rptg.mydomain.com

By doing so the service will be registered in the database dictionary and the records transferred to standby via log transfer. Then you can enable the service on standby. Also, it’s wise to keep the same set of services in both clusters, so you won’t need to worry about them on a failover or switchover.

The value on the -l switch will make this service enable itself only when the database has the physical standby role.

Documentation references:
Use Oracle Active Data Guard Best Practices
Server Control Utility Reference (srvctl)

Posted in Availability, Oracle | Tagged | 3 Comments

Database Consolidation at Dell IT

A little more than a month and a quarter ago, in parallel with the daily production support routine, I had the pleasure to join the team running the database consolidation project in Dell IT. I was assigned as technical team lead for a group of Oracle DBAs with a goal of turning off a few hundred physical database servers each year.

Our team has to find the most suitable targets for consolidation, communicate with application teams, negotiate timing, execute the consolidations, decommission the old databases and make sure the old infrastructure is released/removed.

By the end of 2010 we had accomplished and exceeded the year’s targets. This year we’re also ahead of the game. It’s great to work with a strong team, to have support from other teams and thus be part of great results.

Special thanks to the PM which skillfully orchestrated the teams. Also to the database grid team which has been supportive at all times, provisioning, supporting, everything with an excellent service level.

In order to accomplish all this, Dell’s database practice team deployed 16-node grids with RAC and ASM, and is using Dell’s own blade servers and storage for it. Each grid handles up to 300 databases! More detail on this project was published by Dell as a case study and can be found in their site, following this link.

Posted in Best Practices, Oracle, Personal | Tagged , , , | Leave a comment