The case for RAIDZ2
April 10th, 2010
We have an old x4500 knocking around which is getting on for 3 years old now. At the beginning of last month, we did a scrub, and to our horror discovered checksum errors on almost all the drives:
pool: pool01
state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
scrub: scrub completed after 23h0m with 0 errors on Wed Mar 3 12:55:36 2010
config:
NAME STATE READ WRITE CKSUM
pool01 DEGRADED 0 0 0
raidz1-0 ONLINE 0 0 0
c11t3d0 ONLINE 0 0 4 2.50K repaired
c10t3d0 ONLINE 0 0 0
c13t3d0 ONLINE 0 0 4 1.50K repaired
c7t1d0 ONLINE 0 0 0
c8t3d0 ONLINE 0 0 5 1K repaired
c7t3d0 ONLINE 0 0 4 2K repaired
c10t2d0 ONLINE 0 0 3 1K repaired
c13t2d0 ONLINE 0 0 2 1K repaired
c11t6d0 ONLINE 0 0 3 1K repaired
c8t2d0 ONLINE 0 0 16 7K repaired
c7t2d0 ONLINE 0 0 4 2.50K repaired
raidz1-1 DEGRADED 0 0 0
c11t7d0 ONLINE 0 0 6 64K repaired
c10t7d0 DEGRADED 0 0 58 too many errors
c13t7d0 ONLINE 0 0 4 3.50K repaired
c12t7d0 ONLINE 0 0 3 7K repaired
c8t7d0 ONLINE 0 0 2 4.50K repaired
c7t7d0 ONLINE 0 0 4 11.5K repaired
c10t6d0 ONLINE 0 0 4 11K repaired
c13t6d0 ONLINE 0 0 8 86K repaired
c12t6d0 ONLINE 0 0 0
c8t6d0 ONLINE 0 0 2 1K repaired
c7t6d0 ONLINE 0 0 2 2.50K repaired
raidz1-2 DEGRADED 0 0 0
c11t5d0 ONLINE 0 0 1 9K repaired
c10t5d0 ONLINE 0 0 1 13K repaired
c13t5d0 ONLINE 0 0 2 1.50K repaired
c12t5d0 ONLINE 0 0 1 1K repaired
c8t5d0 DEGRADED 0 0 135 too many errors
c7t5d0 ONLINE 0 0 2 1.50K repaired
c10t4d0 ONLINE 0 0 8 44K repaired
c13t4d0 ONLINE 0 0 3 5K repaired
c12t4d0 ONLINE 0 0 3 2K repaired
c8t4d0 ONLINE 0 0 2 6.50K repaired
c7t4d0 ONLINE 0 0 2 13.5K repaired
errors: No known data errors
Thankfully it’s not used for production, so this didn’t bother us a huge amount. ZFS repaired the data errors without issue (hurrah for ZFS!), and we have been replacing the worst affected disks. We’re now doing weekly scrubs to keep the data “fresh” and stop it rotting away.
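To decide which disks to replace first, it helps to rank them by their CKSUM count. The following is a minimal sketch, not the tooling we actually used: it assumes the status text has the column layout shown above (NAME STATE READ WRITE CKSUM) and Solaris-style cXtYdZ disk names. The function name worst_disks and the min_cksum parameter are made up for illustration, and the sample rows are copied from the raidz1-1 and raidz1-2 sections of the output above.

```python
import re
from typing import NamedTuple


class DeviceErrors(NamedTuple):
    name: str
    state: str
    read: int
    write: int
    cksum: int


# Matches leaf-disk rows such as "c8t2d0 ONLINE 0 0 16 7K repaired".
# Assumption: disks use cXtYdZ names and plain integer counters,
# as in the zpool status output shown in this post.
ROW = re.compile(r"^\s*(c\d+t\d+d\d+)\s+(\w+)\s+(\d+)\s+(\d+)\s+(\d+)\b")


def worst_disks(status_text: str, min_cksum: int = 1) -> list[DeviceErrors]:
    """Return disks with at least min_cksum checksum errors, worst first."""
    rows = []
    for line in status_text.splitlines():
        m = ROW.match(line)
        if m:
            dev = DeviceErrors(m[1], m[2], int(m[3]), int(m[4]), int(m[5]))
            if dev.cksum >= min_cksum:
                rows.append(dev)
    return sorted(rows, key=lambda d: d.cksum, reverse=True)


# A few rows taken from the first status output above.
SAMPLE = """\
raidz1-2 DEGRADED 0 0 0
c11t5d0 ONLINE 0 0 1 9K repaired
c8t5d0 DEGRADED 0 0 135 too many errors
c10t4d0 ONLINE 0 0 8 44K repaired
c12t6d0 ONLINE 0 0 0
"""

for dev in worst_disks(SAMPLE):
    print(dev.name, dev.state, dev.cksum)
```

In practice you would feed it the captured text of 'zpool status pool01' instead of SAMPLE. The vdev rows (raidz1-0 and so on) are skipped on purpose, since only leaf disks can be replaced.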
However, one interesting issue cropped up. We’re using RAIDZ1, which only stores enough parity to survive 1 disk being out of service. Since ZFS uses the parity data to reconstruct blocks with checksum errors, if you’re one disk down and hit a block with a checksum error, you’re in trouble - ZFS can’t repair it and your data is corrupted.
So when you replace a failed disk in a RAIDZ1 set, you had better hope you don’t encounter any checksum errors on the other disks during the resilver process. Because ZFS has to read in all the data from the other disks to resilver the new disk, you’re at a high risk of encountering checksum errors, especially in our situation where the disks are wearing out.
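The reasoning above can be put into rough numbers. This is only a back-of-the-envelope sketch under loud assumptions: every surviving disk independently returns bad data for a given stripe with the same probability q, every stripe spans all the surviving disks (real RAIDZ stripes vary in width), and the value of q below is invented for illustration, not measured on our x4500. The function name p_stripe_lost is hypothetical.

```python
from math import comb


def p_stripe_lost(surviving_disks: int, q: float, spare_parity: int) -> float:
    """Probability that one stripe cannot be reconstructed during a resilver.

    surviving_disks: disks still in the vdev while one is being replaced.
    q: assumed chance that a given disk returns bad data for this stripe.
    spare_parity: how many further bad disks the stripe can still absorb
                  (0 for RAIDZ1 with one disk out, 1 for RAIDZ2 with one out).
    """
    # The stripe survives if at most spare_parity disks return bad data.
    p_ok = sum(
        comb(surviving_disks, k) * q**k * (1 - q) ** (surviving_disks - k)
        for k in range(spare_parity + 1)
    )
    return 1 - p_ok


# Our vdevs have 11 disks, so 10 are left while one is being resilvered.
SURVIVING = 10
Q = 1e-6  # purely illustrative per-disk, per-stripe error probability

raidz1 = p_stripe_lost(SURVIVING, Q, spare_parity=0)
raidz2 = p_stripe_lost(SURVIVING, Q, spare_parity=1)
print(f"RAIDZ1, one disk out: {raidz1:.3e} per stripe")
print(f"RAIDZ2, one disk out: {raidz2:.3e} per stripe")
```

Under this toy model, a RAIDZ1 stripe is lost as soon as any one of the surviving disks misreads, whereas RAIDZ2 needs two of them to misread the same stripe, which is far less likely when q is small. Worn disks make q bigger, which hurts RAIDZ1 roughly linearly.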
And this is precisely what happened next. We replaced a failed disk, and during the resilver, ZFS encountered checksum errors on the other disks it couldn’t repair, and we started to lose data:
pool: pool01
state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scrub: resilver completed after 15h47m with 219 errors on Sat Apr 10 16:14:59 2010
config:
NAME STATE READ WRITE CKSUM
pool01 DEGRADED 0 0 331
raidz1-0 ONLINE 0 0 0
c11t3d0 ONLINE 0 0 0
c10t3d0 ONLINE 0 0 0
c13t3d0 ONLINE 0 0 0
c8t5d0 ONLINE 0 0 0
c8t3d0 ONLINE 0 0 0
c7t3d0 ONLINE 0 0 0
c10t2d0 ONLINE 0 0 0
c13t2d0 ONLINE 0 0 0
c11t6d0 ONLINE 0 0 0
c8t2d0 ONLINE 0 0 0
c7t2d0 ONLINE 0 0 0
raidz1-1 ONLINE 0 0 0
c11t7d0 ONLINE 0 0 0
c11t2d0 ONLINE 0 0 0
c13t7d0 ONLINE 0 0 0
c12t7d0 ONLINE 0 0 0
c8t7d0 ONLINE 0 0 1
c7t7d0 ONLINE 0 0 0
c10t6d0 ONLINE 0 0 0
c13t6d0 ONLINE 0 0 0
c12t6d0 ONLINE 0 0 0
c8t6d0 ONLINE 0 0 0
c7t6d0 ONLINE 0 0 0
raidz1-2 DEGRADED 0 0 888
c11t5d0 DEGRADED 0 0 0 too many errors
c10t5d0 DEGRADED 0 0 0 too many errors
c13t5d0 DEGRADED 0 0 0 too many errors
c12t5d0 ONLINE 0 0 0 401G resilvered
c12t3d0 DEGRADED 0 0 0 too many errors
c7t5d0 DEGRADED 0 0 0 too many errors
c10t4d0 DEGRADED 0 0 0 too many errors
c13t4d0 DEGRADED 0 0 0 too many errors
c12t4d0 DEGRADED 0 0 0 too many errors
c8t4d0 DEGRADED 0 0 0 too many errors
c7t4d0 DEGRADED 0 0 0 too many errors
errors: 219 data errors, use '-v' for a list
Ouch! 219 data errors.
Thankfully ZFS knows precisely which files are affected, and you can just delete/replace/restore the affected files/snapshots and it keeps on running.
After this, though, I’m sold on RAIDZ2. I don’t think I’ll be using RAIDZ1 again - the risk of losing data while you’re replacing a failed disk is just too high.

3 Comments
1. Ytsejamer1 | April 2nd, 2011 at 4:40 pm
Just curious…we too have a 4500 and been using it for non-critical backups. We do have RAIDZ2 setup on our zpool but one of the disks appears to have too many checksum errors. I ran a zpool clear to have it rescan it and indeed, it looks like it may be defective. I have NO idea how to tell what disk it actually is in the system. In the iLom, everything looks peachy…no errors found. The disk I had in question is c13t3d0. Do you know any way to ID the drive?
2. Alasdair | April 2nd, 2011 at 4:41 pm
Hi There,
On the Tools & Drivers CD for the x4500 there is a utility which can do this, for more info please see:
http://download.oracle.com/docs/cd/E19962-01/820-1120-22/chapter2.html
3. Ytsejamer1 | April 2nd, 2011 at 5:58 pm
awesome!!! Thanks for the link! Regards!