Is Oracle RAC itself becoming a corner case?

October 10th, 2011 09:00

Given that

     vSphere 5 now scales to 32 CPUs per VM

     vSphere 5 HA can bring up an instance in the time it takes to reboot a VM + ~20 seconds for HA

     RAC is rather expensive on top of Oracle Enterprise Edition

Is RAC even appropriate for the vast majority of organizations? I see two major selling points with Oracle RAC:

     1. When your instance requires more CPUs than you physically have on a host

     2. When even the downtime associated with a VMware HA event is unacceptable to the business

Are there any others? I haven't seen 70,000 Oracle-on-EMC customers the way EMC itself has, but I suspect very few businesses need more than 32 CPUs for an instance. In fact, looking at slides 65 and 66 from BCA2320, it appears even EMC's environment may not need 32 CPUs (the UCS shown has 128 cores and CPU usage appears to be about 10%), and that's one of the 5 largest Oracle EBS environments in the world, so clearly not typical.

It seems that as the single-VM limits of vSphere climb higher and higher, RAC becomes less and less interesting, except in those cases where the time to bring up an instance after an HA event is unacceptable...

Am I missing something?    

28 Posts

November 1st, 2011 14:00

Your post inspired me to dive in much deeper. Check out my blog on this topic: Is Oracle RAC Becoming More of a Corner Case?

28 Posts

October 11th, 2011 12:00

You are correct. We require RAC for both scalability and availability for our eBusiness Suite. While 10% utilization of 128 cores would mathematically translate to 40% utilization of 32 cores, in practice it does not work out that way. Let's get into the fuzzy math. Keep in mind that since 10g, Oracle uses 7 individual shared pools per instance; with 4 nodes, I have 28 shared pools. I think you can extrapolate from there. The other problem we have is PGA: with 15,000 concurrent sessions and up to 7,000 sessions per node (batch and online sessions are separated), our PGA approaches 90GB, so memory is part of the equation.
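To put rough numbers on that fuzzy math, here is a quick back-of-the-envelope sketch using only the figures in this thread (the per-session PGA figure is derived from them, not measured):

```python
# Back-of-the-envelope math from the figures in this thread.

total_cores = 128        # 4-node RAC, 32 cores per node (slides 65/66)
observed_util = 0.10     # ~10% CPU utilization observed
busy_cores = total_cores * observed_util         # ~12.8 cores of real work

single_host_cores = 32   # vSphere 5 maximum vCPUs per VM
naive_util = busy_cores / single_host_cores      # ~40% on a single host
print(f"Naive single-host utilization: {naive_util:.0%}")

# Memory side: 15,000 concurrent sessions, aggregate PGA approaching 90GB.
sessions = 15_000
pga_gb = 90
per_session_mb = pga_gb * 1024 / sessions        # ~6 MB/session (derived)
print(f"Implied PGA per session: {per_session_mb:.1f} MB")

# Shared pool subpools: 7 per instance since 10g, so 4 nodes -> 28.
print(f"Shared pool subpools across the cluster: {7 * 4}")
```

The CPU math is the easy part; as the post above notes, the shared pool and PGA overheads are what keep the naive 40% figure from telling the whole story.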

40 Posts

October 11th, 2011 13:00

Hopefully you're OK with me asking a couple of follow-up questions on the environment. I don't get to see systems that large.

1) In the video and comments on this page ( http://virtualgeek.typepad.com/virtual_geek/2011/10/update-on-work-life-of-chad-emc-oraclemsftsap.html#comments ) Chad mentions you are going to 11gR2 as part of moving the instance under vSphere 5. Given the changes to memory management (AMM) in 11g, which can move memory back and forth between the SGA and PGA as needed, are there any plans to leverage AMM (or even 10g ASMM) to better manage memory usage? You mentioned that your PGA approaches 90GB; I believe a single B440 M2 scales up to 512GB of RAM, so I guess I'm not understanding the full equation. Theoretically, why wouldn't a 32-CPU, 500GB-RAM single instance work? I get the desire for availability, and for EMC the RAC cost for 128 cores might be financially justifiable (especially as it's a sunk cost), but is that it?
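For context on the memory question, here is a minimal sketch of the kind of check I have in mind, using the cx_Oracle driver; the connection details are placeholders, and the commented parameter values are illustrative, not recommendations:

```python
# Sketch: the two memory-management regimes under discussion, plus a quick
# check of actual PGA consumption via the standard v$pgastat view.
import cx_Oracle

# ASMM (10g): SGA auto-tuned, PGA sized separately:
#   ALTER SYSTEM SET sga_target = 100G SCOPE=SPFILE;
#   ALTER SYSTEM SET pga_aggregate_target = 90G SCOPE=SPFILE;
# AMM (11g): one target, memory flows between SGA and PGA:
#   ALTER SYSTEM SET memory_target = 200G SCOPE=SPFILE;

conn = cx_Oracle.connect("system", "***", "dbhost/ebsdb")  # placeholder DSN
cur = conn.cursor()
cur.execute("""
    SELECT name, ROUND(value / 1024 / 1024 / 1024, 1) AS gb
    FROM   v$pgastat
    WHERE  name IN ('aggregate PGA target parameter',
                    'total PGA allocated',
                    'maximum PGA allocated')
""")
for name, gb in cur:
    print(f"{name}: {gb} GB")
```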

2) I realize that EMC can probably implement / use whatever amount of storage it wants. Looking at slide 65 from that slide deck, I see no mention of any SSDs / Flash Cache in the environment. Was there a particular reason for this? Also, given that the production EBS system is 8TB (slide 63) and, if it's like any other EBS environment I've been to, there are numerous clones of it for dev/test, has there been any consideration of leveraging Oracle's Advanced Compression to reduce the physical space usage (and I/O) of the database? I realize Advanced Compression does use a bit more CPU, but it appears EMC has a huge amount of spare CPU capacity.
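For reference, enabling Advanced Compression is a per-segment operation. A hedged sketch of what it looks like (the table name and connection details are hypothetical, and the feature requires the Advanced Compression license):

```python
# Sketch: enabling OLTP compression on one segment (11g syntax).
# A MOVE rewrites existing rows compressed but marks indexes UNUSABLE.
import cx_Oracle

conn = cx_Oracle.connect("apps", "***", "dbhost/ebsdb")  # placeholders
cur = conn.cursor()

cur.execute("ALTER TABLE some_history_table MOVE COMPRESS FOR OLTP")

# Rebuild any indexes invalidated by the move.
cur.execute("""
    SELECT index_name FROM user_indexes
    WHERE  table_name = 'SOME_HISTORY_TABLE' AND status = 'UNUSABLE'
""")
for (idx,) in cur.fetchall():
    conn.cursor().execute(f"ALTER INDEX {idx} REBUILD")
```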

3) Are there plans to leverage SRM + RecoverPoint for the EBS environment as part of a DR strategy once it's virtualized? If not, why not? Perhaps a better question: of the 4 possible storage options mentioned in the BCA2320 slide deck (slide 26), are you going with Config 1 (block / RDM)?

28 Posts

October 11th, 2011 14:00

Sorry for the delay; I had not seen Chad's video, so I wanted to see what he said. A main reason for not being virtual today is the 11gR2 requirement for support. While Oracle has made great strides in automatic memory management, it still lacks the efficiency needed for high-performance databases; we only use it for databases where performance is not very important. The primary reason is the tremendous overhead of reclaiming memory allocated to the shared_pool to give to the buffer cache. The reverse is also true, but to a much lesser degree. Through some fairly cursory checks (see the sketch below), a DBA should be able to derive the proper settings for very efficient use of database memory. For 10g, PGA memory is a different topic, but we do utilize automatic PGA memory management. We have tested varying workloads utilizing AMM, but we have not found it to be responsive enough.
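The cursory checks I mean are along these lines: query Oracle's advice views for the estimated payoff of each candidate component size, then pin the sizes manually. A rough sketch (connection details are placeholders; v$db_cache_advice and v$shared_pool_advice are the standard views):

```python
# Sketch: deriving manual memory settings from Oracle's advice views
# instead of relying on AMM to shuffle memory at runtime.
import cx_Oracle

conn = cx_Oracle.connect("system", "***", "dbhost/ebsdb")  # placeholder DSN
cur = conn.cursor()

# Estimated physical reads at each candidate buffer cache size.
cur.execute("""
    SELECT size_for_estimate, estd_physical_reads
    FROM   v$db_cache_advice
    WHERE  name = 'DEFAULT' AND block_size = 8192
    ORDER  BY size_for_estimate
""")
print("buffer cache MB -> estimated physical reads")
for size_mb, reads in cur.fetchall():
    print(f"{size_mb:>8} -> {reads}")

# Estimated library cache load time saved at each shared pool size.
cur.execute("""
    SELECT shared_pool_size_for_estimate, estd_lc_time_saved
    FROM   v$shared_pool_advice
    ORDER  BY shared_pool_size_for_estimate
""")
print("shared pool MB -> estimated load time saved (s)")
for size_mb, saved in cur.fetchall():
    print(f"{size_mb:>8} -> {saved}")
```

Once those curves flatten out, the knee is a reasonable place to set shared_pool_size and db_cache_size explicitly.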

I will respond to follow-ups 2 and 3 tomorrow.

28 Posts

October 11th, 2011 17:00

#2) The focus of the presentation was not storage best practices, hence the missing tiering detail. We are using EFDs; specifically, 2TB out of 12TB is flash. For eBusiness Suite, the objects best suited to flash are the JTF schema tables (not all of them, but most of them benefit the most from it). Also, since we are on RAC, most of the concurrent manager tables are good candidates as well, not due to random reads, but to minimize contention over the interconnect.
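Mechanically, the placement is just a tablespace move. A rough sketch (the object and tablespace names here are illustrative, not our actual ones; FLASH_TS would be a tablespace whose datafiles sit on the EFD LUNs):

```python
# Sketch: relocating a hot segment onto an EFD-backed tablespace.
# A MOVE marks the table's indexes UNUSABLE, so rebuild them afterward.
import cx_Oracle

conn = cx_Oracle.connect("system", "***", "dbhost/ebsdb")  # placeholders
cur = conn.cursor()

cur.execute("ALTER TABLE jtf.jtf_tasks_b MOVE TABLESPACE flash_ts")

cur.execute("""
    SELECT owner, index_name FROM dba_indexes
    WHERE  table_owner = 'JTF' AND table_name = 'JTF_TASKS_B'
""")
for owner, idx in cur.fetchall():
    conn.cursor().execute(
        f"ALTER INDEX {owner}.{idx} REBUILD TABLESPACE flash_ts"
    )
```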

We do not use Advanced Compression for our eBusiness Suite. We have chosen to implement aggressive archive and purge to control our database size. For instance, only the quotes that have been active in the last 6 months are in the database; the older ones have been archived using a tool from Informatica. This allows the users to bring them back if they later decide they need them.

28 Posts

October 12th, 2011 06:00

#3) We are not currently running SRM for our CRM eBusiness Suite app tiers, as we run them active-active between data centers. This gives us fewer points of failure and a greater level of confidence in our DR, and it allows us to respond to a DR event much faster.

We are also not planning to utilize SRM for our database tier as we virtualize it. Since our DR site is extremely far from our primary data center, we utilize a bunker site in our replication. SRM does not support this, nor does it support a managed recovery database. Utilizing the bunker gives us a zero-data-loss solution for a localized DR event, and rolling forward a managed recovery database utilizing a combination of SRDF synchronous and asynchronous replication gives us a reporting database, an extra level of protection from database corruption, and confidence that the DR database is ready for action. This is also why we use pRDMs, as referred to in the presentation.
