December 14th, 2011 17:00

HA comparison EMC products versus Oracle exadata

Recently I got a chance to do high availability testing on an Exadata machine, and I was surprised to see that the data was intact even after pulling 7 of the 84 disks. I started wondering: is Exadata outperforming EMC in terms of HA? I don't know how Oracle gets this done, but one thing I do know is that the disk group is configured with three-way mirroring.

What is so special here that I don't know about yet? What do you guys think?

225 Posts

December 20th, 2011 00:00

http://blog.enkitec.com/wp-content/uploads/2011/02/Enkitec-Exadata-Storage-Layout11.pdf

This article covers the Exadata storage layout and redundancy. Simply speaking, Exadata uses a layered abstraction, Cell --> 12 physical disks (hard disks) --> LUNs --> Cell disks --> Grid disks --> ASM disk groups, and ASM failure groups to spread data across cells and disks and to ensure that mirror copies of the same data never end up on the same cell or disk.
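For intuition, here is a minimal Python sketch of failure-group-aware placement (the function name and numbers are illustrative, not Oracle's actual allocator): each extent's copies go to disks in distinct cells, so a mirror never shares a cell with its primary.

```python
import random

def place_extents(num_extents, num_cells, disks_per_cell, copies):
    """Assign each extent's copies to disks in *distinct* cells (failure groups)."""
    placement = {}
    for ext in range(num_extents):
        cells = random.sample(range(num_cells), copies)  # distinct failure groups
        placement[ext] = [(c, random.randrange(disks_per_cell)) for c in cells]
    return placement

# 7 cells x 12 disks, triple mirroring (high redundancy)
layout = place_extents(1000, num_cells=7, disks_per_cell=12, copies=3)

# no extent ever has two copies in the same cell
assert all(len({cell for cell, _ in locs}) == 3 for locs in layout.values())
```

The key point is only the `random.sample` step: copies are spread over *different* failure groups first, and a disk is picked within each group second.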

Eddy

161 Posts

December 14th, 2011 17:00

Exadata is a really hot topic these days. You mean pulling 7 disks randomly out of the array? And the data stayed intact, seamlessly, without any rebuild overhead?

1.3K Posts

December 14th, 2011 18:00

Randomly, but in this case 7 out of the 12 disks on one cell node (like a DAE with 12 disks and its own OS). If I got it correctly, Oracle officially assures data availability with ONLY 2 disk failures (a limitation?), so we were expecting the DB to go down as soon as I pulled the third one. So far I have not gotten a logical explanation from Oracle, though.

161 Posts

December 14th, 2011 19:00

Okay. One answer: in Oracle, only the loss of datafiles from the SYSTEM tablespace, the control files, or the current redo log file will bring the DB down. Otherwise the DB stays online, though some functions stop working.

One question: where did you get the statement about availability with ONLY 2 disk failures? We doubt Oracle would state it in terms of disk failures, because the redundancy mechanism in ASM is defined by failure groups: ASM automatically stores redundant copies of file extents in separate failure groups.

Lu

1.3K Posts

December 15th, 2011 03:00

I knew about external, normal, and high redundancy in ASM, but I had never heard of a "failure group"... now it all makes sense.

We even completely powered off 2 whole cell nodes, and the DB was still up; this is due to high redundancy and failure group distribution. Now I can see where this "can sustain 2 disk failures" comes from. So if we happened to remove three disks from different cell nodes, there would be no guaranteed data protection.

I have my answers, but I'll wait for other possible inputs before rating the thread.

46 Posts

December 15th, 2011 03:00

You could lose all disks in one cell node without losing data. Exadata uses ASM normal or high redundancy, and the failure groups are configured so that an ASM extent and its mirror never live on the same cell.

So you could lose a whole storage cell without losing data (performance impact is a different issue).

But pull two random disks from two *different* storage cells and you're likely in very big trouble. Unless you use high redundancy; then you need to pull 3 disks from different cells to cause data loss.
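The survival rule above can be sketched as a toy model (not ASM's real extent maps): data is intact as long as every extent keeps at least one copy outside the set of pulled disks. Disks are named `(cell, slot)` pairs here, purely for illustration.

```python
def data_intact(placement, pulled_disks):
    """True iff every extent still has a copy on a disk that was not pulled."""
    return all(any(loc not in pulled_disks for loc in locs)
               for locs in placement.values())

# tiny hand-built layout, normal redundancy (2 copies), disks named (cell, slot)
layout = {
    "extent0": [(0, 0), (1, 0)],   # mirrored across cells 0 and 1
    "extent1": [(1, 3), (2, 5)],   # mirrored across cells 1 and 2
}

# losing *all* of cell 0: every extent still has a mirror elsewhere
print(data_intact(layout, {(0, slot) for slot in range(12)}))   # True

# one disk each in two *different* cells can hit both copies of one extent
print(data_intact(layout, {(0, 0), (1, 0)}))                    # False
```

This is exactly why a whole-cell failure is survivable while two "unlucky" disks in different cells may not be.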

A bigger issue on Exadata is data integrity. Exadata has JBOD (just a bunch of disks) in the back end: no dual-ported drives, no disk scrubbing or checksumming, etc. I just posted a blog article yesterday explaining why that is important.

Ask Oracle whether Exadata is compliant with T10 DIF and how they detect and repair silent data corruption. They will probably answer that it's not an issue. That's partly true: using DB block checksums they can *detect* corrupt blocks, but then they probably don't know how to fix them...
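The detect-vs-repair distinction can be shown with a toy checksum check (CRC32 here as a stand-in; Oracle's actual block checksum is a different algorithm):

```python
import zlib

def checksum(block: bytes) -> int:
    """Compute a checksum to store alongside the block at write time."""
    return zlib.crc32(block)

block = b"some 8KB database block payload"
stored = checksum(block)                       # recorded when the block was written

# simulate silent corruption: a single flipped bit on disk
corrupted = bytes([block[0] ^ 0x01]) + block[1:]

print(checksum(corrupted) != stored)           # True: corruption is *detected*
# but a checksum alone cannot reconstruct the original bytes; repair requires
# a known-good copy (a mirror in another failure group, or a backup)
```

That last comment is the whole point: detection is cheap, repair needs redundancy that is itself verified.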

46 Posts

December 15th, 2011 05:00

Just out of curiosity: did you do the testing at Oracle's labs? With guidance from Oracle people?

Because they should have been able to explain how high availability works in Exadata.

Asking because I have a few customers who tested Exadata at Oracle briefing centers, and Oracle made a few strange twists in the proofs of concept (in order to make things look better than they really are)...

1.3K Posts

December 15th, 2011 05:00

Not at Oracle's labs; we own one now and I was testing it from a storage HA perspective. Pulling disks from the same cell node doesn't make sense now; if I had known about failure groups I would have pulled them from different cell nodes. Instead, I was told that all the disks were part of a single ASM disk group (like a large pool or disk group in EMC).

1.3K Posts

December 15th, 2011 10:00

I pulled the disks vertically this time (one disk from each cell node, but the same slot number) and the database is fine. The only thing I can imagine now is that, somehow, the mirror copies of the same data were not on those seven disks I pulled.

Pretty interesting, but as said earlier, Oracle can't guarantee surviving more than 2 disk failures.
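A back-of-the-envelope way to see why a vertical pull can survive (a toy model assuming uniformly random extent placement; it ignores ASM's partner-disk rules and any rebalance that restores redundancy between pulls): with one disk pulled per cell, an extent is lost only if *all* of its copies happen to sit on pulled disks.

```python
# one disk pulled in every cell; 12 disks per cell; high redundancy = 3 copies,
# each copy on a different cell, on a uniformly random disk within that cell
disks_per_cell = 12
copies = 3

# each copy lands on its cell's one pulled disk with probability 1/12,
# independently, so an extent loses all three copies with probability (1/12)^3
p_extent_lost = (1 / disks_per_cell) ** copies
print(p_extent_lost)   # ~0.00058: any given extent almost certainly survives
```

Across a database with a very large number of extents those small odds do add up, so whether the whole database survives also depends on how much data is placed and on whether ASM rebalanced between pulls.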
