2014/03/06

Unintended, but interesting consequences

It's interesting how from time to time something happens that makes sense and seems logical afterwards, but at the time it causes a bit of a surprise.  Part of the fun of working with this type of software!

A few days ago we had an incident in an Oracle DW database when a developer tried to load an infinitely big file from a very large source.  Yeah, you got it: big-data-ish! 

Suffice to say: 180GB and still going before it hit the limits I've set for that particular tablespace.
Yes, I do use auto-extend. It's a very effective way to ensure things don't get out of control while still allowing for the odd mis-hap without any major consequences.  Particularly when the storage is a SAN and any considerations about block clustering and disk use distribution are mostly irrelevant.

And no: if carefully setup it really does not impact performance that much.  Yes, I know it CAN be proven in certain circumstances to cause problems.  "can" != "must", last time I looked!

Anyways, the developer did the right thing in that the table was loaded without indexes, no-logging, etcetc.  It just ran out of space, due to my tight control of that resource.

So the developer did the next best thing: he killed the session and waited for the rollback.
And waited...
And waited...
...waited...

You got it: one of those "never-ending" single-threaded rollbacks that Oracle goes into in some circumstances. The tablespace is locally managed and uniform size 40MB, so there were quite a few things pending for the rollback to go through.

No biggie, seen a LOT worse!  But of course, it needed attention. And I didn't really want to crash the instance and do a startup parallel rollback - read about that here .

Upon investigation, I noticed that indeed the session was marked "KILLED" and upon checking v$transaction (yes, I know about v$longops: that doesn't work in this case!) it looked like we were up for another 4 hours at least of single-threaded rollback!

Pain!  And of course we needed the space after the rollback returned the tablespace to nearly empty, to run other tasks of the DW.

I also noticed that the session's background process (Unix, single-thread dedicated) was still there.

Now, when a session gets killed the background process eventually goes away once the rollback is done.
In this case, it of course didn't - rollback still going.

But it wasn't clocking up any CPU time.  If it was doing anything, it'd have been totally I/O bound.  PMON and DBWR on the other hand were quite busy.  To be expected - that is the correct behavior for these conditions.

Trouble is: PMON and DBWR do this in a single-thread fashion.  Which takes FOREVER!

Now, here comes the interesting bit.

I decided based on past experience to actually kill the original background process.  As in "kill -9" after getting its "pid" through a suitable v$session/v$process query.  This, because in theory SMON should then take over the session and roll it back, freeing up any contention around PMON and DBWR.

Now, SMON is ALSO the process that does this if the database crashes and a startup rollback takes place. The thing is: it does it in parallel to make the startup faster, if one has FAST_START_PARALLEL_ROLLBACK set to LOW or HIGH. We have it set to LOW. My hope was that it would also do a parallel recovery here.

And indeed it did!  While watching the proceedings via OEM monitors, I saw the same session number show up in a totally different colour (pink) in the Top Activity screen, this time associated with the SMON process.  After clicking on the session, I saw that a number of PQ processes were actually active associated with it!

And guess what?  The rollback finished in around 30 minutes.  Instead of 4 hours!

Now, THAT is what I call a somewhat interesting outcome.  I did not know for sure SMON would do a "startup recovery" with full parallelism.  My hope was it would take over DBWR work and do a partial parallel rollback.  As it turns out, it did a LOT better than that!

And that is indeed something to be happy about!

Now, before anyone transforms this into another "silver bullet" to be performed everytime a rollback of a killed session takes too long:


  1. This worked for Aix 7.1 and 11.2.0.3 patched up. Don't go around trying this in your 7.3.4 old db!!!
  2. It was a decision taken not lightly, after consulting the manuals and checking with the developer the exact conditions of what the program had done and how and for how long.
  3. This database is under full recovery control with archive logging.  Worst came to worst, I could just lose the tablespace, recover it online from backup and roll it forward to get things back into shape without major outages of the whole lot.

As such, think before taking off on tangents off this. If I get some free time, I'll try and investigate a bit more about it and how far it can be taken.  But given the current workload, fat chance!  So if anyone else wants to take over the checking, be my guest: there is enough detail above to construct a few test cases and it doesn't need to be 180GB! A third of that should be more than enough to already cause a few delays!

Anyways, here it is for future reference and who knows: one day it might actually help someone?

Speaking of which: our Sydney Oracle Meetup is about to restart its operations for this year.  We'll be trying as hard as usual to make it interesting and fun to attend.  And we need someone to replace Yury, who is leaving us for the Californian shores.  So, if you like to contribute to the community and want to be part of a really active techno-geek group, STEP UP!






Now, for the lighter side of my posts.  There is such a thing as a lighter side of life!



Almost every Summer we get these beautiful guests in our garden.  They are StAndrews spiders, around 3 inches in size.  Mostly they build their web in protected areas of the garden where wind and birds can't easily get to them.  And like this one, they pick up a partner and lay heaps of eggs. The partner unfortunately ends up as fodder...   
They are also some of the best weather predictors I've ever seen: want to know if tomorrow is going to rain?  Check the spider late evening.  If it is building a new web, it won't rain. Otherwise, it's virtually guaranteed it will!  100% hit rate, better than Connor's "Custom BCHR"!



 

Folks who follow me on FB know that we have a little pooch.  Above is Tiago, our now fully grown Great Dane.  All 75Kgs of him, against a young Irish Wolfhound who particularly loves to play with him at the local dog park.  He is indeed one huge dog, and quite docile to boot!



Perhaps the best way of conveying his size is to picture him next to something folks can relate to.  That is my daughter's Toyota 4Runner.  For those who know these cars, they are BIG.  Check Tiago against it!



He's also a major sook. Gentle as gentle can be, thank the Gods! Otherwise, something this big would be completely unmanageable.  Lovely animal, we love it to pieces. Came from Manning Danes and his dad is Ghostwalker, an absolute champion.  We don't show him - he's family.




Catchyalata folks, and keep smiling!

2014/03/04

Latest for the folks who have to deal with Peoplesoft

Dang, been a while since the last posts!  A lot of water under the bridge since then.

We've ditched a few people that were not really helping anything, and are now actively looking at cloud solutions, "big data" use, etcetc.

Meanwhile, there is the small detail that business as usual has to continue: it's very easy to parrot about the latest gimmick/feature/funtastic technology that will revolutionize how to boil water.
But if the monthly payroll and invoicing and purchasing fail, I can promise you a few HARD and concrete financial surprises!

Speaking of payroll...

A few years ago I made some posts here on the problem with PAYCALC runs in Peoplesoft and how hard it is to have correct CBO statistics for the multitude of scratchpad tables this software uses to store intermediate results during a pay calculation run - be it daily, weekly or monthly.  Those posts are regularly visited and comments added to them by folks experiencing the same problem.

(Fancy that!  A problem that is common to others and not caused by yours truly  - the "bad dba", according to marketeers from so many companies hellbent on selling us con-sultancy!)

This lack of statistics is mostly caused by the fact that the software truncates the tables at the end of the pay calc process. And does not calculate statistics after re-populating them with intermediate results during the next pay cycle!

With the obvious result that no matter how often and regularly a dba recalculates statistics on these tables, they cannot possibly ever be correct at the most important time!

One way around this I mentioned at the time was to actually turn on debugging for the modules that use those tables - assuming it's easy to find them, not all folks know how to do that!

The end result is Peoplesoft code will detect the debug flag and fire off a ANALYZE just after re-populating those tables.  Not perfect ( analyze if anything is a deprecated command!) but better than no stats!   At the end of the run, the output of the debug is simply thrown away.  Slow, but much better than before!

Another way was to force estimates to be taken for those tables by the CBO.  Again, not perfect.  But still better than pay calc runtimes in the tens of hours!

Enter our upgrade last year to Peoplesoft HCM and EPM 9.2!  Apart from using the "fusion" architecture for screen handling (slow as molasses...) this release introduces a CAPITAL improvement to all pay calc runs!

What Oracle has done is very simple: from 9.2 onwards, they do NOT truncate the scratchpad tables at the end of each pay calc!

So instead of guessing/analyzing stats for empty tables, the dbas can now analyze them with significant, relevant and timely data contents, do histograms if that is their penchant, etcetcetc!  With the immediate result that instead of the previous pot luck runtime for the paycalcs, now they are reliably constant in their duration and much, much faster! In simple terms: the CBO finally has SOMETHING accurate to base its "cost" on!


Man!  You mean to tell me it took Peoplesoft to become available in Oracle cloud for them to realize what was being done before was eminently STUPID and making it impossible for any dba to have a predictable paycalc execution time?


Apparently when it became their cloud responsibility, the solution was quickly found and implemented...


Of course, if they had actually LISTENED to the frequent and insistent complaints by dbas who actually deal with their software day to day for years on end rather than playing "ace" cards, things would have perhaps progressed a lot faster!... 



But let's ignore stupidity and rejoice: there is a light at the end of the tunnel and it's not incoming!

Get those upgrade juices going, folks, and try to convince your management that 9.2 is THE release of Peoplesoft to aim at.

It'll help everyone. (Including Oracle - but I don't think their marketing is smart enough to detect that...)





Anyways, on to funnier and hopefully nicer things.



As some know, I go to East Timor for holidays every time I can afford it.  Mostly because I grew up for a while there, many of my school time friends are still there and we are still very close.  But also because I get a chance to view things like the ones below, and actually visit these places.






Hope you enjoy them as much as I did!




The above was taken from Fazenda Algarve.  It belongs to the well known Carrascalão family and it is likely one of the largest coffee farms in the world.  Some of my good friends already departed are buried nearby. 
The flat empty concrete floor you see on the bottom is where the coffee beans are sun dried and lose their skin and pulp (coffee beans are the seed of a small, very sweet fruit).  Once dry, they are put in the tanks on the left and shaken until the dry husks fall off and the beans can be packed and sent for sale.
I don't know about you but with a view like this I wouldn't be doing much work!  The far mountains are around 40Kms away already in Indonesian territory and the dry river bed is the largest "river" of East Timor.  It's called "Lois".  I first saw this as a young teenager.  To now come back and see this magnificent, stunning  view once again is indeed a rare, truly cherished privilege!



And looking directly back from the previous image, is the above view.  That peak in the distance (around 20Ks) is Tata-Mai-Lau, aka Mt Ramelau, the highest point in the whole island: >3000m
A truly magnificent mountain from where the whole island can be seen at sunrise on a good, clear day.
I have not yet been back there.  It's a very long climb done at night and I'm no spring chicken anymore, so I have to pass on climbing it again without mechanical aid.  But recently a friend of mine climbed it for the first time and had an epiphany experience at the top during sunrise.  I don't think she'll ever forget it.  It is one of the most impressive views available anywhere in the world!




 Dunno about you, but I'd definitely like to have an office with a view like this!  But don't be deceived - that guy is not having fun, he's very likely collecting dinner for his family.  Otherwise the only thing they'll eat on that night will be boiled rice and coconut.  It's a hard life for the population over there, and why I always leave the place with a sharp pain in my heart...


Speaking of heart pain, that's my Yggdrasil.  Look it up, Norse mythology.  Part of my paternal family is descendent from a Norse crusader who joined the first crusade and who fought in Jerusalem and after settled in Portugal.  I lived for 3 years right behind this tree and there was not one morning I did not marvel at its size.  If anything it is even bigger now.
For the curious, it's a Banyan tree. Banyans are the national tree of the Republic of India, and for good reason: they are venerated over there as almost sacred.  The name was originally a Portuguese word.  It later was adopted into English and there are quite a few members of this family of trees around the world, including at least 3 varieties in Australia, one specifically from Sydney.  A really large Indian Banyan tree can grow to an hectare in width, by spreading across other trees as a parasite.
In Timor, they are known as "gondoeiro" and are thought to harbor good spirits from folks dead a long time ago.  Dunno, but everytime I visit Díli I go to this tree, hug it and talk to it.  It always feels warm and soft to the touch and against it I can only hear the breeze in its foliage.  Hallucination?  Maybe.  Don't care. It's my Yggdrasil and it's the closest thing in my life to true religion.



Catchyalata, folks. Do me a favour: smile and have fun!