Archive for April, 2010
OpenSolaris 2010.03 (2010.04)
Like many people, I’ve been waiting eagerly and anxiously for OpenSolaris 2010.03 to come out. Over the past 9 months, OpenSolaris has come on in leaps and bounds, with a never-ending procession of new features and enhancements such as ZFS Deduplication and COMSTAR. While the OpenSolaris dev builds provide access to these new features, and are generally stable enough for production use if you pick the right build, it’s obviously best to stick to stable releases for production environments.
However, March came and went, we’re now one third of the way through April, and there is still no sign of 2010.03. This has caused a lot of people to become quite anxious regarding OpenSolaris’ future.
Oracle have stated they’re going to invest more in Solaris than Sun did, and have stated their intention to keep OpenSolaris going. However they have also recently revoked the free version of Solaris 10, making it a 90 day trial. My understanding is that now, to run Solaris 10, you need an entitlement to do so, which comes from having a valid support contract for Solaris on Sun^H^H^HOracle hardware.
This has serious implications for many businesses, including EveryCity. Our current platform is Solaris 10 based, and while we started off using Sun servers, their Intel Nehalem range is simply too expensive, so we’ve been purchasing Dell R410 and R610 machines.
Thankfully, we have been planning to move to OpenSolaris for some time now, and our forthcoming platform will be OpenSolaris based. So this is no real issue for us - it just means Solaris 10 update 8 will be the last update to our Solaris 10 platform, and at some point down the line all our Solaris 10 Zones will become branded zones under OpenSolaris.
The future is uncertain, but thankfully OpenSolaris shows no sign of going away. Oracle’s culture is quite different from the one at Sun; it’s clear they’re very corporate, and rather secretive. They have also stated that they take a very “hands off” approach to their User Groups, so for example the LoSUG group now has to be managed by non-Oracle employees, and there’s no longer any free food/drink for attending. I was tempted to volunteer to help organise LoSUG, but unfortunately I just don’t have the time at present. Hopefully that will change in the future.
Oracle are hell bent on making their investment profitable, and as long as OpenSolaris continues to develop at the pace it has been going thus far, and continues to be free and open, I’m happy. If anything Oracle’s behaviour may bring the OpenSolaris community outside of Sun/Oracle closer together, and foster more community involvement and development, which can only be a GoodThing[tm].
And while there’s been a lot of suggestion on the grapevine that OpenSolaris 2010.03 may never materialise, this is clearly not the case, and it seems it may be just round the corner:
Re: osol-discuss - So who has been able to update to the b136 image?
by Alan Coopersmith alan.coopersmith@oracle... 2010-04-07T13:58:54+00:00

Chad Welsh wrote:
> I have run update from IPS package manager and from the
> pkg image-update from the command line and nothing. are
> the packages only in the ON Gate? If so when will they be
> released into the wild for us to use?

Packages for the later builds have been built for all the gates, not just ON, but not published to pkg.opensolaris.org while the 2010.03 release is being finished.
Hi Sarah, Due to some security concerns and other issues, I'd flipped a coin and await the forthcoming snv_b138 kernel release. As for any independent distro releases between now and before snv_b138, my opinion is to wait on doing any major system upgrades, 'production' related migrations, or journalistic reviews until snv_138 is officially released. This is for mainly commented for current users using OpenSolaris for 'very' high-end production-grade audio/video workstations or high-availability servers with several TBs of in-flux data. If you are having ANY major issues with a prior OpenSolaris release, just give OSOL 2010.03 until April 16th or await the snv_138 kernel release. You'll be 'very' glad you did. ~ Ken Mays
1 comment April 10th, 2010
The case for RAIDZ2
We have an old x4500 knocking around which is getting on for 3 years old now. At the beginning of last month, we did a scrub, and to our horror discovered checksum errors on almost all the drives:
  pool: pool01
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 23h0m with 0 errors on Wed Mar 3 12:55:36 2010
config:

        NAME          STATE     READ WRITE CKSUM
        pool01        DEGRADED     0     0     0
          raidz1-0    ONLINE       0     0     0
            c11t3d0   ONLINE       0     0     4  2.50K repaired
            c10t3d0   ONLINE       0     0     0
            c13t3d0   ONLINE       0     0     4  1.50K repaired
            c7t1d0    ONLINE       0     0     0
            c8t3d0    ONLINE       0     0     5  1K repaired
            c7t3d0    ONLINE       0     0     4  2K repaired
            c10t2d0   ONLINE       0     0     3  1K repaired
            c13t2d0   ONLINE       0     0     2  1K repaired
            c11t6d0   ONLINE       0     0     3  1K repaired
            c8t2d0    ONLINE       0     0    16  7K repaired
            c7t2d0    ONLINE       0     0     4  2.50K repaired
          raidz1-1    DEGRADED     0     0     0
            c11t7d0   ONLINE       0     0     6  64K repaired
            c10t7d0   DEGRADED     0     0    58  too many errors
            c13t7d0   ONLINE       0     0     4  3.50K repaired
            c12t7d0   ONLINE       0     0     3  7K repaired
            c8t7d0    ONLINE       0     0     2  4.50K repaired
            c7t7d0    ONLINE       0     0     4  11.5K repaired
            c10t6d0   ONLINE       0     0     4  11K repaired
            c13t6d0   ONLINE       0     0     8  86K repaired
            c12t6d0   ONLINE       0     0     0
            c8t6d0    ONLINE       0     0     2  1K repaired
            c7t6d0    ONLINE       0     0     2  2.50K repaired
          raidz1-2    DEGRADED     0     0     0
            c11t5d0   ONLINE       0     0     1  9K repaired
            c10t5d0   ONLINE       0     0     1  13K repaired
            c13t5d0   ONLINE       0     0     2  1.50K repaired
            c12t5d0   ONLINE       0     0     1  1K repaired
            c8t5d0    DEGRADED     0     0   135  too many errors
            c7t5d0    ONLINE       0     0     2  1.50K repaired
            c10t4d0   ONLINE       0     0     8  44K repaired
            c13t4d0   ONLINE       0     0     3  5K repaired
            c12t4d0   ONLINE       0     0     3  2K repaired
            c8t4d0    ONLINE       0     0     2  6.50K repaired
            c7t4d0    ONLINE       0     0     2  13.5K repaired

errors: No known data errors
Thankfully it’s not used for production, so this didn’t bother us a huge amount. ZFS repaired the data errors without issue (hurrah for ZFS!), and we have been replacing the worst affected disks. We’re now doing weekly scrubs to keep the data “fresh” and stop it rotting away.
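When picking out the worst affected disks from a listing this long, it can help to rank the device rows by their CKSUM column. The following is a minimal Python sketch of that idea, not anything we actually ran: the `worst_disks` helper and its regular expression are made up for illustration, and it assumes device rows look exactly like the ones in the output above (name, state, then three plain integer counters).

```python
import re

# Matches device rows such as "c8t5d0 DEGRADED 0 0 135 too many errors".
# Assumption: counters are plain integers, as in the listing above.
DISK_LINE = re.compile(r"^\s*(c\d+t\d+d\d+)\s+(\w+)\s+(\d+)\s+(\d+)\s+(\d+)")


def worst_disks(status_text, limit=3):
    """Return up to `limit` (cksum, name, state) tuples, highest CKSUM first."""
    disks = []
    for line in status_text.splitlines():
        match = DISK_LINE.match(line)
        if match:
            name, state, _read, _write, cksum = match.groups()
            disks.append((int(cksum), name, state))
    return sorted(disks, reverse=True)[:limit]


# A few rows copied from the scrub output above.
sample = """\
c8t2d0 ONLINE 0 0 16 7K repaired
c10t7d0 DEGRADED 0 0 58 too many errors
c12t6d0 ONLINE 0 0 0
c8t5d0 DEGRADED 0 0 135 too many errors
"""

for cksum, name, state in worst_disks(sample):
    print(f"{name:10} {state:9} {cksum}")
```

In practice you would feed it the saved text of a full 'zpool status' run rather than a hand-picked sample.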
However, one interesting issue cropped up. We’re using RAIDZ1, which only stores enough parity for 1 disk to be out of service. Since ZFS uses the parity data to reconstruct blocks with checksum errors, if you’re one disk down and have a block with a checksum error, you’re in trouble: it can’t repair it and your data is corrupted.
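To make that concrete, here is a toy illustration in Python using plain XOR parity over three made-up data blocks. It is only a sketch of the single-parity idea, not ZFS’s actual on-disk layout or code; the block contents and the `xor_blocks` helper are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks (one per disk) plus a single parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Case 1: the disk holding data[1] is missing.
# XORing the surviving blocks with the parity rebuilds it exactly.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])

# Case 2: the same disk is missing AND another disk returns a corrupted block.
# The parity is already "used up", so the rebuild comes out wrong.
corrupted = b"CCCX"
bad_rebuild = xor_blocks([data[0], corrupted, parity])
print(bad_rebuild == data[1])
```

In the second case ZFS would still notice the problem, because the rebuilt block fails its checksum, but with no parity left it has nothing to repair it with.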
So when you replace a failed disk in a RAIDZ1 set, you had better hope you don’t encounter any checksum errors on the other disks during the resilver process. Because ZFS has to read in all the data from the other disks to resilver the new disk, you’re at a high risk of encountering checksum errors, especially in our situation where the disks are wearing out.
And this is precisely what happened next. We replaced a failed disk, and during the resilver, ZFS encountered checksum errors on the other disks it couldn’t repair, and we started to lose data:
  pool: pool01
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver completed after 15h47m with 219 errors on Sat Apr 10 16:14:59 2010
config:

        NAME          STATE     READ WRITE CKSUM
        pool01        DEGRADED     0     0   331
          raidz1-0    ONLINE       0     0     0
            c11t3d0   ONLINE       0     0     0
            c10t3d0   ONLINE       0     0     0
            c13t3d0   ONLINE       0     0     0
            c8t5d0    ONLINE       0     0     0
            c8t3d0    ONLINE       0     0     0
            c7t3d0    ONLINE       0     0     0
            c10t2d0   ONLINE       0     0     0
            c13t2d0   ONLINE       0     0     0
            c11t6d0   ONLINE       0     0     0
            c8t2d0    ONLINE       0     0     0
            c7t2d0    ONLINE       0     0     0
          raidz1-1    ONLINE       0     0     0
            c11t7d0   ONLINE       0     0     0
            c11t2d0   ONLINE       0     0     0
            c13t7d0   ONLINE       0     0     0
            c12t7d0   ONLINE       0     0     0
            c8t7d0    ONLINE       0     0     1
            c7t7d0    ONLINE       0     0     0
            c10t6d0   ONLINE       0     0     0
            c13t6d0   ONLINE       0     0     0
            c12t6d0   ONLINE       0     0     0
            c8t6d0    ONLINE       0     0     0
            c7t6d0    ONLINE       0     0     0
          raidz1-2    DEGRADED     0     0   888
            c11t5d0   DEGRADED     0     0     0  too many errors
            c10t5d0   DEGRADED     0     0     0  too many errors
            c13t5d0   DEGRADED     0     0     0  too many errors
            c12t5d0   ONLINE       0     0     0  401G resilvered
            c12t3d0   DEGRADED     0     0     0  too many errors
            c7t5d0    DEGRADED     0     0     0  too many errors
            c10t4d0   DEGRADED     0     0     0  too many errors
            c13t4d0   DEGRADED     0     0     0  too many errors
            c12t4d0   DEGRADED     0     0     0  too many errors
            c8t4d0    DEGRADED     0     0     0  too many errors
            c7t4d0    DEGRADED     0     0     0  too many errors

errors: 219 data errors, use '-v' for a list
Ouch! 219 data errors.
Thankfully ZFS knows precisely which files are affected, and you can just delete/replace/restore the affected files/snapshots and it keeps on running.
However after this, I’m sold on RAIDZ2. I don’t think I’ll be using RAIDZ1 again - the risk of losing data when you’re replacing a failed disk is just too high.
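For a rough feel of how much the second parity disk buys you, here is a back-of-the-envelope Python sketch. It rests on loud assumptions that are not measurements from our x4500: every block read during a resilver is bad independently with the same probability `p_bad` (worn disks like ours almost certainly fail in a more correlated way), and the value of `p_bad` is made up. The `p_stripe_unrepairable` helper is likewise invented for this example. It models an 11-disk vdev with one disk being replaced, so 10 disks are read per stripe.

```python
from math import comb


def p_stripe_unrepairable(disks_read, p_bad, parity_left):
    """Probability that more than `parity_left` of the blocks read are bad.

    Simple binomial model: each of the `disks_read` blocks in a stripe is bad
    independently with probability `p_bad`. The stripe can still be repaired
    while the number of bad blocks is at most `parity_left`.
    """
    p_repairable = sum(
        comb(disks_read, k) * p_bad**k * (1 - p_bad) ** (disks_read - k)
        for k in range(parity_left + 1)
    )
    return 1 - p_repairable


p_bad = 1e-4  # hypothetical per-block error probability, purely illustrative

# One disk is already out, so RAIDZ1 has no parity left and RAIDZ2 has one.
raidz1 = p_stripe_unrepairable(disks_read=10, p_bad=p_bad, parity_left=0)
raidz2 = p_stripe_unrepairable(disks_read=10, p_bad=p_bad, parity_left=1)

print(f"RAIDZ1 during resilver: {raidz1:.3e} per stripe")
print(f"RAIDZ2 during resilver: {raidz2:.3e} per stripe")
```

Under this model, RAIDZ1 loses a stripe as soon as any one surviving block is bad, while RAIDZ2 needs two bad blocks in the same stripe, which is far less likely for small `p_bad`.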
3 comments April 10th, 2010
