Alasdair on Everything


The case for RAIDZ2

April 10th, 2010

We have an old x4500 knocking around that is getting on for 3 years old now. At the beginning of last month we ran a scrub, and to our horror discovered checksum errors on almost all the drives:

  pool: pool01
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 23h0m with 0 errors on Wed Mar  3 12:55:36 2010
config:

        NAME         STATE     READ WRITE CKSUM
        pool01       DEGRADED     0     0     0
          raidz1-0   ONLINE       0     0     0
            c11t3d0  ONLINE       0     0     4  2.50K repaired
            c10t3d0  ONLINE       0     0     0
            c13t3d0  ONLINE       0     0     4  1.50K repaired
            c7t1d0   ONLINE       0     0     0
            c8t3d0   ONLINE       0     0     5  1K repaired
            c7t3d0   ONLINE       0     0     4  2K repaired
            c10t2d0  ONLINE       0     0     3  1K repaired
            c13t2d0  ONLINE       0     0     2  1K repaired
            c11t6d0  ONLINE       0     0     3  1K repaired
            c8t2d0   ONLINE       0     0    16  7K repaired
            c7t2d0   ONLINE       0     0     4  2.50K repaired
          raidz1-1   DEGRADED     0     0     0
            c11t7d0  ONLINE       0     0     6  64K repaired
            c10t7d0  DEGRADED     0     0    58  too many errors
            c13t7d0  ONLINE       0     0     4  3.50K repaired
            c12t7d0  ONLINE       0     0     3  7K repaired
            c8t7d0   ONLINE       0     0     2  4.50K repaired
            c7t7d0   ONLINE       0     0     4  11.5K repaired
            c10t6d0  ONLINE       0     0     4  11K repaired
            c13t6d0  ONLINE       0     0     8  86K repaired
            c12t6d0  ONLINE       0     0     0
            c8t6d0   ONLINE       0     0     2  1K repaired
            c7t6d0   ONLINE       0     0     2  2.50K repaired
          raidz1-2   DEGRADED     0     0     0
            c11t5d0  ONLINE       0     0     1  9K repaired
            c10t5d0  ONLINE       0     0     1  13K repaired
            c13t5d0  ONLINE       0     0     2  1.50K repaired
            c12t5d0  ONLINE       0     0     1  1K repaired
            c8t5d0   DEGRADED     0     0   135  too many errors
            c7t5d0   ONLINE       0     0     2  1.50K repaired
            c10t4d0  ONLINE       0     0     8  44K repaired
            c13t4d0  ONLINE       0     0     3  5K repaired
            c12t4d0  ONLINE       0     0     3  2K repaired
            c8t4d0   ONLINE       0     0     2  6.50K repaired
            c7t4d0   ONLINE       0     0     2  13.5K repaired

errors: No known data errors

Thankfully it’s not used for production, so this didn’t bother us a huge amount. ZFS repaired the data errors without issue (hurrah for ZFS!), and we have been replacing the worst affected disks. We’re now doing weekly scrubs to keep the data “fresh” and stop it rotting away.
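If you are scrubbing weekly, it helps to pull out just the devices that reported checksum errors rather than eyeballing the whole listing. Here is a minimal sketch that does so from the text of `zpool status`. It assumes the NAME / STATE / READ / WRITE / CKSUM column layout shown in the output above; the function name and the list of states are my own choices for illustration, and rows whose counters are abbreviated (for example with a K suffix) are skipped rather than guessed at.

```python
import re

# Matches rows such as "c8t2d0   ONLINE   0   0   16  7K repaired".
# Assumption: columns are NAME STATE READ WRITE CKSUM, as in the output above.
ROW = re.compile(
    r"^\s+(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)(?:\s|$)"
)

def devices_with_cksum_errors(status_text):
    """Return {name: cksum_count} for every row with a non-zero CKSUM column.

    Pool and vdev rows (e.g. pool01, raidz1-2) are included if they are non-zero.
    """
    found = {}
    for line in status_text.splitlines():
        m = ROW.match(line)
        if m and int(m.group(5)) > 0:
            found[m.group(1)] = int(m.group(5))
    return found

# A few rows copied from the scrub output above.
sample = """
        NAME         STATE     READ WRITE CKSUM
        pool01       DEGRADED     0     0     0
          raidz1-0   ONLINE       0     0     0
            c11t3d0  ONLINE       0     0     4  2.50K repaired
            c10t3d0  ONLINE       0     0     0
            c8t2d0   ONLINE       0     0    16  7K repaired
"""

print(devices_with_cksum_errors(sample))
```

In practice you would feed it the captured output of `zpool status pool01` instead of the pasted sample.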

However, one interesting issue cropped up. We’re using RAIDZ1, which only stores enough parity to survive 1 disk being out of service. Since ZFS uses the parity data to reconstruct blocks with checksum errors, if you’re one disk down and hit a block with a checksum error, you’re in trouble: ZFS can’t repair it and your data is corrupted.

So when you replace a failed disk in a RAIDZ1 set, you had better hope you don’t encounter any checksum errors on the other disks during the resilver process. Because ZFS has to read in all the data from the other disks to resilver the new disk, you’re at a high risk of encountering checksum errors, especially in our situation where the disks are wearing out.
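To make the failure mode concrete, here is a toy model of a single-parity stripe. It is a simplified sketch, not ZFS's actual on-disk layout: it assumes three data columns plus one XOR parity column, and uses SHA-256 as a stand-in for the block checksum. All names are made up for illustration. With every disk present, a silently corrupted column can be found and rebuilt by trying each column in turn and checking the checksum. With one disk already missing, the parity is used up rebuilding that disk, so a second bad column cannot be repaired.

```python
import hashlib
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def checksum(data):
    # Stand-in for the stored block checksum; SHA-256 is an assumption here.
    return hashlib.sha256(data).digest()

def reconstruct(columns, parity, missing):
    """Rebuild the column at index `missing` from the other columns and parity."""
    survivors = [c for i, c in enumerate(columns) if i != missing]
    return xor_blocks(survivors + [parity])

# One stripe: three data columns and their XOR parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
expected = checksum(b"".join(data))

# Column 1 has rotted on disk.
damaged = [data[0], b"BxBB", data[2]]

# Case 1: all disks present. Treat each column as the suspect in turn and
# accept the rebuild whose checksum matches.
repaired = None
for i in range(len(damaged)):
    candidate = damaged[:i] + [reconstruct(damaged, parity, i)] + damaged[i + 1:]
    if checksum(b"".join(candidate)) == expected:
        repaired = candidate
        break

# Case 2: disk 0 is being resilvered AND column 1 is bad. The rebuild of
# column 0 has to use the bad column, so the result fails its checksum.
rebuilt0 = reconstruct(damaged, parity, 0)
one_disk_down = [rebuilt0, damaged[1], damaged[2]]
unrecoverable = checksum(b"".join(one_disk_down)) != expected

print("repaired with all disks present:", repaired == data)
print("unrecoverable while one disk down:", unrecoverable)
```

A second, independent parity column is what lets a double-parity layout survive case 2.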

And this is precisely what happened next. We replaced a failed disk, and during the resilver, ZFS encountered checksum errors on the other disks it couldn’t repair, and we started to lose data:

  pool: pool01
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver completed after 15h47m with 219 errors on Sat Apr 10 16:14:59 2010
config:

        NAME         STATE     READ WRITE CKSUM
        pool01       DEGRADED     0     0   331
          raidz1-0   ONLINE       0     0     0
            c11t3d0  ONLINE       0     0     0
            c10t3d0  ONLINE       0     0     0
            c13t3d0  ONLINE       0     0     0
            c8t5d0   ONLINE       0     0     0
            c8t3d0   ONLINE       0     0     0
            c7t3d0   ONLINE       0     0     0
            c10t2d0  ONLINE       0     0     0
            c13t2d0  ONLINE       0     0     0
            c11t6d0  ONLINE       0     0     0
            c8t2d0   ONLINE       0     0     0
            c7t2d0   ONLINE       0     0     0
          raidz1-1   ONLINE       0     0     0
            c11t7d0  ONLINE       0     0     0
            c11t2d0  ONLINE       0     0     0
            c13t7d0  ONLINE       0     0     0
            c12t7d0  ONLINE       0     0     0
            c8t7d0   ONLINE       0     0     1
            c7t7d0   ONLINE       0     0     0
            c10t6d0  ONLINE       0     0     0
            c13t6d0  ONLINE       0     0     0
            c12t6d0  ONLINE       0     0     0
            c8t6d0   ONLINE       0     0     0
            c7t6d0   ONLINE       0     0     0
          raidz1-2   DEGRADED     0     0   888
            c11t5d0  DEGRADED     0     0     0  too many errors
            c10t5d0  DEGRADED     0     0     0  too many errors
            c13t5d0  DEGRADED     0     0     0  too many errors
            c12t5d0  ONLINE       0     0     0  401G resilvered
            c12t3d0  DEGRADED     0     0     0  too many errors
            c7t5d0   DEGRADED     0     0     0  too many errors
            c10t4d0  DEGRADED     0     0     0  too many errors
            c13t4d0  DEGRADED     0     0     0  too many errors
            c12t4d0  DEGRADED     0     0     0  too many errors
            c8t4d0   DEGRADED     0     0     0  too many errors
            c7t4d0   DEGRADED     0     0     0  too many errors

errors: 219 data errors, use '-v' for a list

Ouch! 219 data errors.

Thankfully ZFS knows precisely which files are affected, and you can just delete/replace/restore the affected files/snapshots and it keeps on running.

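However, after this, I’m sold on RAIDZ2. With two disks’ worth of parity, a set that is one disk down can still repair a block with a checksum error. I don’t think I’ll be using RAIDZ1 again - the risk of losing data when you’re replacing a failed disk is just too high.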

3 Comments

  • 1. Ytsejamer1  |  April 2nd, 2011 at 4:40 pm

    Just curious…we too have a 4500 and have been using it for non-critical backups. We do have RAIDZ2 set up on our zpool, but one of the disks appears to have too many checksum errors. I ran a zpool clear and let it run again, and indeed, it looks like the disk may be defective. I have NO idea how to tell which disk it actually is in the system. In the iLom, everything looks peachy…no errors found. The disk in question is c13t3d0. Do you know any way to ID the drive?

  • 2. Alasdair  |  April 2nd, 2011 at 4:41 pm

    Hi There,

    On the Tools & Drivers CD for the x4500 there is a utility which can do this. For more info, please see:

    http://download.oracle.com/docs/cd/E19962-01/820-1120-22/chapter2.html

  • 3. Ytsejamer1  |  April 2nd, 2011 at 5:58 pm

    awesome!!! Thanks for the link! Regards!
