August 7th, 2015 16:00

Help with a cx700 that is stuck transitioning/rebuilding

Hi Folks,

Hoping someone here can lend a hand. I am stuck and at my wits' end.

1. I have a CX700.

2. Last Friday a disk failed in my 500 GB LUN (RAID 5 group with a hot spare).

3. Fortunately the hot spare kicked in and started rebuilding the RAID group.

4. While the transition was happening, another disk failed.

5. I changed out both failed disks on Sunday morning, but I knew the damage was done. It is RAID 5, so the data was lost, but that's OK because I was going to kill it off soon anyway.

6. I thought the transition would stop, but it never did. It got stuck at 2% (I'm guessing this is where it was when the second disk died).

Now my CX700 is stuck in a transitioning state.

I tried pulling all of the hot spares. This faulted the hot spare RAID groups and the LUN that was rebuilding. I then tried to destroy the LUN, but Navisphere would not let me; it kept giving me errors. The only thing I was able to do was remove it from the storage group, which I did hoping that would release whatever was using the LUN. No luck. Now the LUN is a private LUN, and I still can't unbind it or destroy the RAID group. Please help.

thank you,

Jason

I ran the getlun stack command and got nothing back.

superman# ./navicli -h 192.168.10.10 getlun -messner 1 -stack

Listed Driver:            

superman#

The failed disk reports:

superman# ./navicli -h 192.168.10.10 getdisk 0_0_11

Bus 0 Enclosure 0  Disk 11

Vendor Id:               SEAGATE

Product Id:              ST373405 CLAR72

Product Revision:        4A3C

Lun:                     2

Type:                    2: RAID5

State:                   Enabled

Hot Spare:               2: NO

Prct Rebuilt:            2: 2

Prct Bound:              2: 100

Serial Number:           3EK10MQF

Sectors:                 126652416 (61842)

Capacity:                68238

Private:                 2: 13099082

Bind Signature:          0x2897, 0, 11

Hard Read Errors:        0

Hard Write Errors:       0

Soft Read Errors:        0

Soft Write Errors:       0

Read Retries:     N/A

Write Retries:    N/A

Remapped Sectors:        N/A

Number of Reads:         924

Number of Writes:        369

Number of Luns:          1

Raid Group ID:           0

Clariion Part Number:    DG118032243 

Request Service Time:    N/A

Read Requests:           924

Write Requests:          369

Kbytes Read:             41717

Kbytes Written:          6438

Stripe Boundary Crossing: 0

superman#
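For anyone who wants to check this kind of output from a script rather than by eye, here is a minimal sketch that parses the "Key: value" lines shown above. It assumes the per-LUN fields are printed as "LUN: value" (for example "Prct Rebuilt: 2: 2" meaning LUN 2 is 2% rebuilt), which is my reading of the output rather than something documented in this thread. The function names are hypothetical helpers, and the sample text is a trimmed excerpt of the getdisk output in this post.

```python
# Trimmed excerpt of the getdisk output posted above.
SAMPLE = """\
Bus 0 Enclosure 0  Disk 11
Type:                    2: RAID5
State:                   Enabled
Hot Spare:               2: NO
Prct Rebuilt:            2: 2
Prct Bound:              2: 100
Raid Group ID:           0
"""


def parse_getdisk(text):
    """Collect 'Key: value' lines into a dict; lines without a colon are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # the "Bus 0 Enclosure 0  Disk 11" header has no colon
            fields[key.strip()] = value.strip()
    return fields


def per_lun_percent(value):
    """Return the percentage from a value such as '2: 2' (assumed 'LUN: percent')."""
    return int(value.rpartition(":")[2])


def rebuild_incomplete(fields):
    """True when the disk reports a rebuild percentage below 100."""
    return per_lun_percent(fields["Prct Rebuilt"]) < 100


disk = parse_getdisk(SAMPLE)
print(disk["State"], per_lun_percent(disk["Prct Rebuilt"]), rebuild_incomplete(disk))
```

Run against the excerpt, this reports the disk as Enabled with a 2% rebuild that has not completed, which matches the stuck state described above. It only reads pasted text; it does not call navicli.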

The hot spare reports this from the getdisk command:

Bus 0 Enclosure 1  Disk 14

Vendor Id:               SEAGATE

Product Id:              ST314680 CLAR146

Product Revision:        7A0A

Lun:                     223

Type:                    223: Hot Spare

State:                   Rebuilding

Hot Spare:               223: YES

Hot Spare Replacing:     0_0_7

Prct Rebuilt:            223: 100

Prct Bound:              223: 100

Serial Number:           3HYY406N

Sectors:                 280278656 (136854)

Capacity:                68238

Private:                 223: 69704

Bind Signature:          0x6ebe, 1, 14

Hard Read Errors:        0

Hard Write Errors:       0

Soft Read Errors:        0

Soft Write Errors:       0

Read Retries:     N/A

Write Retries:    N/A

Remapped Sectors:        N/A

Number of Reads:         999

Number of Writes:        373

Number of Luns:          1

Raid Group ID:           223

Clariion Part Number:    DG118032462

Request Service Time:    N/A

Read Requests:           999

/

Number of Reads:         5

Number of Writes:        0

Number of Luns:          5

Raid Group ID:           8

Clariion Part Number:    DG118032260

Request Service Time:    N/A

Read Requests:           5

Write Requests:          0

Kbytes Read:             200

Kbytes Written:          0

Stripe Boundary Crossing: 0

August 11th, 2015 22:00

To resolve the issue, these steps were followed, as per Article Number 000028251:

  1. Logged on to the array and attempted to unbind the hot spare currently replacing drive 1_1_15. Removing the hot spare should have caused the Flare driver to detect the hot spare being removed and then detect that drive 1_1_15 was present and start a rebuild directly to drive 1_1_15. This did not work.
  2. Attempted to unbind the LUN (67) from flarecons on SP-A, but since the LUN was currently owned by SP-B this failed.
  3. Trespassed the LUN to SP-A. After the trespass completed, a rebuild on drive 1_1_7 started and completed. After the rebuild to 1_1_7 completed, the equalize to drive 1_1_5 started and ultimately completed. The LUN was then unbound and its RAID group was destroyed.

Also refer to the Primus article below:

https://emc--c.na5.visual.force.com/apex/KB_BreakFix_1?id=kA17000000017hX

August 12th, 2015 07:00

When one disk in a RAID 5 group fails, the hot spare should kick in and the rebuild to it should start. If a second disk fails during this time, the rebuild stops. Sometimes you can get the rebuild started again by re-seating the second disk that failed. If the second disk does not come online, then you're facing a double-faulted RAID group. EMC recommends that, if you remove the disks, you mark each disk with its slot location in case it needs to be re-inserted.
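To illustrate why a second failure during the rebuild is fatal, here is a simplified sketch of single-parity (XOR) protection. It is a toy model with made-up 4-byte blocks and a fixed parity position, not how the array's own driver lays out or rebuilds data, and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: four made-up data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)
stripe = data + [parity]


def rebuild_one(stripe, lost):
    """Rebuild the single block at index `lost` by XOR-ing every surviving block."""
    survivors = [block for i, block in enumerate(stripe) if i != lost]
    return xor_blocks(survivors)


# Single fault: the lost block comes back exactly.
print(rebuild_one(stripe, 1) == data[1])

# Double fault: the survivors only yield the XOR of the two lost blocks,
# which cannot be split back into the two originals.
survivors = [block for i, block in enumerate(stripe) if i not in (1, 3)]
leftover = xor_blocks(survivors)
print(leftover == xor_blocks([data[1], data[3]]))
```

Both checks print True: with one block missing the parity is enough to recover it, while with two missing the remaining blocks carry only their combined XOR, so neither can be recovered on its own.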

Recovering or destroying the RAID group requires EMC service to access the array directly, so you'll need to open a service request with EMC.

https://support.emc.com/kb/5548 How Do I Recover From a Double Faulted RAID Group On Celerra/CLARiiON/NS600

https://support.emc.com/kb/33430 Unable to destroy a RAID group

glen
