Managing Oracle on Linux Certified Expert

I’m very happy to share that I’m now an OCE, in addition to my long-standing OCP titles.

The full certification title is Oracle Database 10g Managing Oracle on Linux Certified Expert.

To achieve it, I passed exam 1Z0-046 Oracle Database 10g: Managing Oracle on Linux for DBAs.

In order to study for the exam, in addition to my previous work experience installing and managing Oracle databases on Linux, I read these two manuals:

Oracle 10gR2 Database Installation Guide for Linux x86

Oracle 10gR2 Administrator’s Reference for UNIX-Based Operating Systems

Good luck if you’re thinking about taking this exam.


Eventual bad performance for a query with a good plan

The user calls complaining that a recurring report, which usually completes in 15-20 minutes, has now been running for a few hours. You request some more information from the user about the report and its database session, and start troubleshooting:

    • Servers are healthy, no hardware failures.
    • Server loads are within acceptable thresholds.
    • ASM performance numbers are acceptable.
    • Database alert logs show no recent incidents.
    • The plan for the query looks good, with all tables accessed through indexes (no full table scans) and a low cost.
    • AWR shows the plan for the query’s SQL_ID hasn’t changed in recent days.
    • A handful of V$SESSION samples show the query is progressing (no locks, no cluster events).
    • A dozen calls to V$SESSION_WAIT_HISTORY show a large number of “db file sequential read” events, which is confirmed by the session history in Oracle Enterprise Manager (screenshot: session showing high sequential reads in OEM); a sampling sketch follows this list.

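In case it helps, here is a minimal sketch of that sampling, assuming the report session’s SID is known (:sid is a placeholder bind). V$SESSION_WAIT_HISTORY only keeps the last ten waits per session, so it needs to be run repeatedly:

select event, count(*) samples
  from v$session_wait_history
 where sid = :sid
 group by event
 order by samples desc;
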
So, what’s the catch? Why is an indexed, low-cost plan doing a lot of sequential reads?

By taking samples of the P1 column in V$SESSION_WAIT_HISTORY (which for the “db file sequential read” event holds the data file number) and querying V$DATAFILE, you can see which files are being read:

select name filenames from v$datafile
  where file# in (206,633,729,717,716,98,400,376,1191,752);
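
Both steps can also be combined into a single query (again a sketch, with a placeholder :sid bind):

select d.name, count(*) waits
  from v$session_wait_history h, v$datafile d
 where h.sid = :sid
   and h.event = 'db file sequential read'
   and d.file# = h.p1
 group by d.name
 order by waits desc;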

Instead of index and table segments, the database is reading undo segments. Why is that?

When a data block is changed, Oracle stores the old values of the data in undo segments. This is done so queries can retrieve a consistent image of the data as it was when the query (or its transaction) began. This Oracle feature is called Multiversion Read Consistency and is detailed in the Database Concepts manual:

Whenever a user modifies data, Oracle Database creates undo entries, … Thus, multiple versions of the same data, all at different points in time, can exist in the database.

As it happens, one of the tables read by this query receives frequent updates to a set of rows. After the query started, that data changed once, twice, and eventually many times.

In this situation, the query’s server process has to fetch multiple undo blocks for each data block it wants, until it finds the version of the block consistent with the time the query started, and this is a time- and resource-consuming operation.
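
One way to confirm this from another session is to watch the report session’s consistent-read statistics (a sketch; :sid is a placeholder for the report session’s SID). A large and fast-growing “undo records applied” figure is the telltale sign:

select n.name, s.value
  from v$sesstat s, v$statname n
 where n.statistic# = s.statistic#
   and s.sid = :sid
   and n.name in ('consistent gets',
                  'data blocks consistent reads - undo records applied');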

The worst cases may even lead to the dreaded “ORA-01555: snapshot too old” error.
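
A rough way to gauge exposure to ORA-01555 is to compare the longest recent query against the undo retention the instance actually achieved, as in this sketch (V$UNDOSTAT covers roughly the last four days in 10-minute buckets; TUNED_UNDORETENTION exists from 10gR2 on):

select max(maxquerylen) longest_query_seconds,
       min(tuned_undoretention) lowest_tuned_retention_seconds
  from v$undostat;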

The best solution for this situation is to schedule the report and the data changes to run at separate times. As a workaround, killing the report and running it again will yield good results, unless the data changes again while it runs.

If you want to go deeper on this subject (beyond Oracle’s own documentation), there is excellent further reading in these two blog posts:

Feel free to leave a comment if you’d like to ask anything about this topic.


Predicting Oracle Transaction Rollback Time

While waiting for a large transaction to roll back, I’ve cooked up this spreadsheet, which can, to some extent (not precise, just an estimate; use at your own risk), predict how long the rollback will take to complete.

The spreadsheet takes as input the number of undo records used by the transaction at two points in time (the USED_UREC column of V$TRANSACTION) and the times you captured both values; it then calculates the speed at which the rollback is progressing and makes a prediction.
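
For reference, a sketch of how the inputs can be captured; run it twice, some minutes apart, while the transaction is rolling back:

select addr, used_urec, to_char(sysdate, 'HH24:MI:SS') sampled_at
  from v$transaction;

The estimate is then just the current USED_UREC divided by the rate of decrease: remaining_seconds = used_urec / ((urec_sample1 - urec_sample2) / elapsed_seconds).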

Click to download the Rollback Prediction Spreadsheet, and please let me know if you like it. Thanks!


11g: handling error ORA-00752 – lost write of a data block in the PRIMARY database

With the recent (11gR1) addition of lost write detection techniques, Oracle has one more tool to help avoid data loss.

Prerequisites are a physical standby database applying redo and the DB_LOST_WRITE_PROTECT parameter enabled in both the primary and the standby databases.
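
As a sketch, enabling it looks like this, run on both the primary and the standby (TYPICAL checks read-write tablespaces; FULL also checks read-only ones, at the cost of some extra redo):

alter system set db_lost_write_protect = typical scope=both;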

Lost writes may happen both in the primary and in the standby database. Although both kinds of messages are raised on the standby database, it is very important to distinguish the two situations and react accordingly.

The easier issue to solve is a lost write that happened on the standby database itself. The error message in this case shows code ORA-00600 [3020], and the solution is to restore a good backup of the affected datafile to the standby.

When the error message includes the ORA-00752 error code, however, it means that the primary database lost a write. In this case, copying the datafile from primary to standby will let log apply continue, relieving the symptoms, but it will not solve the underlying issue: your primary will still have at least one block that lost a write, requiring additional work.
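
A rough sketch of that datafile copy, assuming a hypothetical /staging path reachable from both hosts and that managed recovery is stopped on the standby first:

On the primary:
RMAN> backup as copy datafile 8 format '/staging/user_index_df8.cpy';

Then transfer the copy to the standby host and, on the standby:
RMAN> catalog datafilecopy '/staging/user_index_df8.cpy';
RMAN> restore datafile 8;

Afterwards, restart managed recovery so log apply can continue.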

Below is a test case of the error messages in the alert log of a standby database:
Thu Abc 01 01:01:01 0101
Slave exiting with ORA-752 exception
Errors in file /oracle/diag/rdbms/mydb_dr/mydb2/trace/mydb2_pr03_30502.trc:
ORA-00752: recovery detected a lost write of a data block
ORA-10567: Redo is inconsistent with data block (file# 8, block# 2751496, file offset is 1065418752 bytes)
ORA-10564: tablespace USER_INDEX
ORA-01110: data file 8: '+DATA_1/mydb_dr/datafile/user_index.318.761516453'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 28737
Thu Abc 01 01:01:01 0101
Shutting down instance (abort)

Note that suffering a lost write does not necessarily result in a corrupt block, so traditional corruption verification techniques may not find it. It may just mean that data the user believes was written and committed to the database is in fact not there.

Oracle documentation indicates a failover as the solution. That is not always the best option, as a failover will lose all the transactions that happened after the lost write was detected.

In the above example, the affected block belonged to an index, so to solve the issue I copied the datafile from primary to standby and recreated the affected index on the production database.
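
As a sketch, the object can be identified from the data object# reported in the trace (28737 here), and the index then rebuilt on the primary; the owner and index names below are hypothetical. An online rebuild reads the table rather than the suspect index blocks:

select owner, object_name, object_type
  from dba_objects
 where data_object_id = 28737;

alter index appuser.my_affected_index rebuild online;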

If the lost write happens to be in a table, solving it gets complicated, depending on what transactions occurred and how long it took for the lost write to be discovered. For that case I don’t have a general suggestion; it has to be troubleshot case by case.

More information can be found in the Oracle documentation under Enabling Lost Write Detection and Recovering From Lost-Write Errors on a Primary Database.


ASM MEMBER disks showing as CANDIDATE

Some time ago I spent a few hours troubleshooting an issue where some 20 ASM member disks all went to either CANDIDATE or PROVISIONED status after the servers were rebooted. That’s odd, because if the disks had really been removed from their groups, they would appear as FORMER, as indicated in the ASM documentation.

Some context: this happened in a RAC cluster running Oracle on 64-bit Linux. A handful of the disks were new to the cluster, having been added to the servers a few days earlier.

When the servers were restarted, the ASM instances failed to mount the disk groups automatically as they normally would. The error in the ASM alert log was:

Sun Abc 01 01:01:01 CST 0101
ERROR: no PST quorum in group 2: required 2, found 0
Sun Abc 01 01:01:01 CST 0101
NOTE: cache dismounting group 2/0xDB297CEB (DATA_1)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DATA_1 was not mounted

A subsequent query of V$ASM_DISK showed HEADER_STATUS values of CANDIDATE or PROVISIONED:

GROUP_NUMBER DISK_NUMBER MODE_STATUS STATE    TOTAL_MB FREE_MB PATH                   HEADER_STATUS
------------ ----------- ----------- -------- -------- ------- ---------------------- -------------
0 5 ONLINE NORMAL 276210 0 /oradata/asm/datalun01 PROVISIONED
0 2 ONLINE NORMAL 276210 0 /oradata/asm/datalun02 PROVISIONED
0 6 ONLINE NORMAL 276210 0 /oradata/asm/datalun03 CANDIDATE
0 7 ONLINE NORMAL 276210 0 /oradata/asm/datalun04 CANDIDATE

But V$ASM_DISKGROUP returned nothing (no rows selected).

The disks were reachable by ASM, so the asm_diskstring parameter was good. Disk permissions on Linux were also correct.

The Oracle-supplied kfod utility also showed the disks as CANDIDATE:

$ kfod a='/oradata/asm/*' di=all _asm_a=FALSE n=TRUE op=DISKS status=TRUE
276210 CANDIDATE /oradata/asm/datalun01
276210 CANDIDATE /oradata/asm/datalun02

But then, a direct read from the disk with the strings command showed actual content:

$ strings /oradata/asm/datalun01 |head

So I knew the data was there and was reachable; ASM was just not seeing it the way it was supposed to. After searching the web and the Oracle Support site and finding nothing definitive, I opened a Service Request with Oracle and got a quick response.

The Oracle representative had me build the kfed utility (it ships by default from 11.1 on, but can be built on 10.2), and with it we fixed the disk checksums, making the disks appear as MEMBERs again and allowing the disk groups to mount. For some reason the disks’ internal checksums had gone bad, so ASM was not recognizing them.
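
For reference, this is roughly how kfed is built on 10.2 (a sketch; reading a disk header with it is harmless, unlike the repair writes mentioned below):

$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk ikfed
$ kfed read /oradata/asm/datalun01 | head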

Then, just to make sure, we ran “alter diskgroup data_1 check all norepair;” on each disk group; all of its output goes to the ASM alert log. Thankfully there were no further issues with the groups, so the database was open shortly after.

I’ll refrain from writing the exact commands used to fix the issue here, as they write to the disks in an unusual manner which I believe one should only use under Oracle Support supervision.

While writing this blog post, however, I found a forum thread in which “alp” found his own way out of a similar situation on completely different hardware and OS. Use at your own discretion.

The Oracle Support note “ASM REACTING TO PARTITION ERRORS [ID 1062954.1]” also describes similar conditions and mentions the recent addition of new disks, but does not give a definitive root cause for the issue.

I hope this information helps you in some way, incomplete though it is.
