2007/05/29

a short break

Due to family reasons, I'll be away from posting for a while.
Someone very close passed away and I had to travel extensively
and on short notice.
Will try to resume normality as soon as possible, folks.

Thanks to all who sent their condolences via other means or here.

2007/05/11

Oracle v7 bad blocks, disk capacity and other things

Been a while, folks. Busy as heck here.

A coupla weeks ago we found that one of our interim reporting databases, left over from a third party venture a few years ago, was playing up. A few ODBC Access reports were erroring off with a strange error message.

After a lot of crystal ball gazing we found the problem. A very old v7.3 database in an equally old Windows NT4 server had a bad block in a datafile. Worse: only daily exports, no archive log mode on, weekly cold backups, etcetc.

The usual: some third party concern comes in, installs a "canned solution" and then we're left to pick up the pieces if something cracks later on. Meanwhile: no doco, no information, nothing. Heck, we didn't even know this was a v7 database or where this server was!

Typical large company IT...

Time to break out the old notes: it's been > 10 years since I last recovered a release 7 database, not exactly the freshest of subjects in my mind...

I did have a bell going off somewhere that there was an event that would let us bypass bad blocks. At least that should allow us to export the good stuff and perhaps then we could figure out what was missing from previous intact backups.

After a bit of searching in Metaclick, Note:33405.1 showed up.
YES! Event 10231 was definitely what we needed!

A bit of juggling of the init.ora file, a couple of bounces later and bang: the export quietly extracted all possible data minus one corrupt data block.

OK, time to re-create the database in a new vmware instance of the NT server - fun and games there too. Perhaps I'll blog about that later... - and import the recovered export. No problems.

Then we pulled in an old export of the database prior to the corruption and imported the table in question from that one. A comparison of the two showed there were 16 rows missing from older sections of the new table. So we pulled in those 16 from the old table and bang: 100% recovery from a corrupt block, without archive redo log mode!



Before we had time to catch our breath, another request comes in. Turns out one of the outfeeds from our DW database needs to be able to recover to any point in time for at least a year.

Sure: after all disk is cheap, no?

Except of course we'd need about 1500TB of it to keep all our DW data for a year. Given that we already manage a 500TB(!) data backup capacity at any day of the year, that would really put a spanner in our expansion plans...

Turns out of course that the real solution is to daily move the required data to build the outfeed into a transportable read-only tablespace that we can compress/archive and recover at the drop of a hat with RMAN from a suitable FRA. Then all we need is to keep the FRA for a year. Not really as bad as 1.5Petabytes, but still a very respectable 50TB overall.

Kevin Closson is so right in his rants over structured vs unstructured data volumes! We have about 3TB of structured databases and nearly 20TB of unstructured data. All this is backed up daily and kept for indefinite periods on various schedules and sub-areas. All in all a total capacity of nearly 500TB!

Yes Virginia: we're well on our way to the first Petabyte...



Ah well, some nice stuff from a new lens with a new film and a new location:

I call it "late afternoon fireworks". You can of course call it whatever you want! :-)


sometimes I have to wonder if we haven't got our priorities wrong:

what I wouldn't give to be in that bloke's skin!


catchyalata folks!