Signing Windows Drivers
We were signing some Open Source GPL Windows drivers to use on Windows Server 2008 x64 edition (which only accepts signed drivers) and were encountering the following Windows boot error after installing our supposedly successfully signed drivers:
0xc0000428 Windows cannot verify digital signature for this file
We have an official Verisign code signing certificate, and were a bit stumped/confused. The files seemed to be signed and everything seemed to be okay.
The problem was that we were not adding a cross-certificate. A cross certificate basically provides a chain of authority so that Windows is able to trust your certificate. You can find out more information about this from this helpful blog post over here.
If we had bothered to RTFM, the Code Signing walkthrough does kind of tell you that you need to sign your drivers with a cross-certificate, so we probably could have saved ourselves a lot of time by reading this first.
Anyway, you can obtain the cross certs from here, and the option to use with signtool.exe is /ac, so for example you’d type:
signtool.exe sign /v /ac MSCV-VSClass3.cer /s my /n "Every City Limited" /t http://timestamp.verisign.com/scripts/timestamp.dll xenusb\%BUILDDIR%\blah.cat
Hopefully this post will help other people save some time, as we spent all day trying to figure this one out.
Add comment July 6th, 2010
The Oracle OpenSolaris Farce
Well it was meant to be OpenSolaris 2010.03. Then Oracle happened, and they stated 2010.1H (first half). This has come and gone.
Is it ever going to arrive?
According to the rumour mill September 2010 is a possible new target date. Or just "Sometime in 2010". But who knows.
Some are even starting to wonder if Oracle are turning OpenSolaris 2010.03 into Solaris 11 proper. If they supported this on Dell+HP hardware in addition to their own (As Sun did prior to the acquisition), that would be an extremely enticing proposition.
Time will tell. Oracle are not stupid. But this silence and complete lack of communication is insanely frustrating.
Some might be wondering, why all the hoohaa anyway? It’s just an operating system, right? Well.. because OpenSolaris is evolving at an incredible rate, and the new features list since OpenSolaris 2009.06 came out is absolutely enormous. Those who have used the development branches have tasted the forbidden fruit, and they want it. They really god damn want it.
OpenSolaris 2009.06 was an interesting toy. But OpenSolaris 2010.? has finally reached the prime time. Solaris 10 containers, COMSTAR iSCSI/FCoE framework, the new Crossbow Networking stack, the latest ZFS features including separate cache and log devices, plus of course ZFS Deduplication, Xen 3.4 support via xVM.. the feature list is enormous.
In the meantime, Nexenta is chugging away, churning out new releases with backported fixes from the latest sources. I might have to check out Nexenta again. I never thought I’d say this, but I actually much prefer IPS and beadm to apt-get and apt-clone. Plus I dread to think how they’ve implemented Zones on Nexenta. But the full GNU Userland is rather appealing.
My advice to Oracle: Don’t fuck this up.
2 comments July 1st, 2010
UK OpenSolaris IPS Mirror
This is just a quick post to let people know that we run a UK OpenSolaris IPS mirror, including the full /dev branch.
To use it, simply run:
# Add the mirror repository
pkg set-authority -m http://pkg.osol.mirror.everycity.co.uk opensolaris.org
It’s also worth mentioning that when using the mirror, pkg search/install still requests the metadata from the origin server, which by default is in North America. Sun/Oracle run a European mirror in the Czech Republic (which is faster from the UK). You can set your publisher by doing:
# Set new European origin
pkg set-publisher -g http://pkg-eu-2.opensolaris.org/dev opensolaris.org
# Remove USA origin
pkg set-publisher -G http://pkg.opensolaris.org/dev opensolaris.org
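If you need to do this on more than one box, the commands above can be wrapped in a small shell function. This is only a sketch: the function name use_uk_mirror and the DRY_RUN variable are made up for illustration, and it assumes the same pkg set-authority / pkg set-publisher syntax shown above. By default it only prints the commands so you can check them before running anything.

```shell
#!/bin/sh
# Sketch only: wraps the pkg commands from this post.
# use_uk_mirror and DRY_RUN are hypothetical names, not part of IPS.
# With DRY_RUN unset (or set to 1) the commands are printed, not executed.
use_uk_mirror() {
    publisher="${1:-opensolaris.org}"
    run="echo"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        run=""
    fi
    # Add the UK mirror for package content
    $run pkg set-authority -m http://pkg.osol.mirror.everycity.co.uk "$publisher"
    # Fetch metadata from the European origin instead of the USA one
    $run pkg set-publisher -g http://pkg-eu-2.opensolaris.org/dev "$publisher"
    $run pkg set-publisher -G http://pkg.opensolaris.org/dev "$publisher"
}
```

Calling use_uk_mirror on its own prints the three commands; DRY_RUN=0 use_uk_mirror would actually run them as root.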
What IPS mirrors won’t do for you
As I mentioned, unfortunately a mirror doesn’t mirror the metadata, so when you do a search or request an install, pkg still connects to the origin server. This kind of defeats the purpose of running a mirror, as the metadata operations are often the slowest bit, and is yet another stupid limitation of IPS as a system.
If you like to live life dangerously, there’s an abusive method of creating your own package authority/origin by tearing the packages out of the repo, but it hammers the origin server for each package request and is frowned upon. Don’t let that stop you - perhaps the IPS gods will get the message that they need to implement this feature for people who want to deploy IPS mirrors in closed networks.
I’m not going to run our own origin until there’s an official supported method of doing this, as I’m not sure of the consequences. Sucking down all the packages this way resets all the timestamps, and I’m not sure if this might cause problems later down the line. Hopefully an official method of creating your own origin will spring up soon.
Fellow LoSUG presenter Andrew Watkins wrote up rather good details of how to create your own origin server using the unsupported method over here. I believe he had pretty good success with it, and presented this method at his LoSUG talk on the Automated Installer. His article is a tidied up version of a post by Christopher Kampmeier. I’ve posted a comment on his article asking for comments on this method.
Running your own mirror
Running your own mirror isn’t too hard, with instructions here. As of the date of this blog post, the whole mirror is 56GB on a compression=on ZFS dataset. Dedupe might reduce this significantly, but I’m not that brave.
If you create a mirror and your IPS Repo reports "0 packages", it’s because you’ve started a full repo server and not a mirror. Double check you did this bit:
# svccfg -s pkg/server
> add mirror
> select mirror
> addpg pkg application
> addpg start method
> setprop pkg/mirror = boolean: true
> setprop pkg/inst_root = astring: "/export/pkg"
> setprop pkg/threads = count: 50
> exit
# svcadm refresh pkg/server:mirror
# svcadm enable pkg/server:mirror
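A quick way to double check the "0 packages" case is to read the property back and see whether it really is true. The helper below is a rough sketch: is_mirror_mode is a hypothetical name, and the svcprop invocation in the comment assumes the instance is called pkg/server:mirror as in the steps above.

```shell
#!/bin/sh
# Sketch only: is_mirror_mode is a hypothetical helper name.
# It reads the value of the pkg/mirror property on stdin and
# succeeds only if that value is exactly "true".
is_mirror_mode() {
    value=""
    read -r value || :
    [ "$value" = "true" ]
}

# Intended use on the repo server (not run here):
#   svcprop -p pkg/mirror pkg/server:mirror | is_mirror_mode \
#       && echo "mirror mode" || echo "full repo server"
```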
Enjoy!
Add comment June 26th, 2010
More Solaris Broadcom Driver Information
Update 2010-07-01: Sun got back to one of the blog commenters regarding the issue with Broadcom NICs dropping out on HP servers and stated the issue relates to the HP-supplied Broadcom drivers, and Sun recommended using these. So HP people may be seeing a different issue. Please see this blog comment for details. Many thanks for passing this information on, Daniel!
As previously mentioned, we’ve been having a nightmare with Broadcom NICs suddenly dropping out / hanging / freezing. All network traffic ceases / halts, despite the interfaces being up and showing no signs of any issues. This issue started affecting us after rolling out an upgrade to Solaris 10 update 8, but it also affects recent OpenSolaris builds. This has been on Dell R410 servers and R710 servers, and we’ve heard about people on HP servers having the same issue.
We thankfully found a workaround for it, which basically consists of disabling C-States in the BIOS. This is a power saving feature and support for it was added into Solaris 10 update 8, which is where we’re seeing the issue.
However prior to finding this workaround, I contacted Broadcom via their “Submit a support request” feature on their website. Nobody got back to me, and we were getting rather desperate so I was rather naughty and dropped one of their Kernel driver engineers a direct email. I won’t say who as he probably doesn’t want others mailing him directly.
The chap replied promptly, which was very impressive. He was very polite and explained that he couldn’t really help customers directly, as the OEM suppliers get upset, but he did offer some hints/tips. He mentioned that MSI-X was causing issues on Linux and suggested disabling it if we’re using v5.2.3 drivers or later. We’re not, we’re on 5.2.2 and 5.2.2 is the newest release available on the Broadcom website, so that was quite interesting.
He attached the release notes for the 6.0.1 driver which isn’t publicly available yet. Here is a snippet of the contents:
Broadcom NetXtreme II Gigabit Ethernet Driver
For Solaris 10 for i386 platform
Copyright (c) 2000-2010 Broadcom Corporation
All rights reserved.
Version 6.0.1 (21 May, 2010)
============================
Fixes
-----
1) Problem : default MTU now set to 1500, fixed jumboframe
and vlan issues.
Cause : buffer sizes weren't being allocated properly
to account for MAC header overhead w/ vlan tags
Change : allocations are now correct
2) Problem : when MSIX interrupt allocation failed driver
fails to attach
Cause : code didn't exist to revert down to Fixed
Change : driver now reverts to Fixed when MSIX interrupt
allocation fails
Version 5.2.3 (23 March, 2010)
==============================
Enhancements
------------
1) Change : Reworked interrupt code to no longer use deprecated
Solaris interrupt APIs.
2) Change : Added support for MSI-X interrupts. MSI-X is now used
by default and can be turned off via "disable_msix"
inside bnx.conf. When MSI-X is disabled then Fixed
level interrupts are used.
2) Change : Added a new "statistics" group to kstat which contains
driver version and interrupt information.
Version 5.2.2 (14 December, 2009)
=================================
Fixes
-----
1) Problem : Kernel Panic in the send routine:
assertion failed: umpacket->mp == NULL,
Cause : The umpacket->mp was not scrubbed properly because
the umpacket never went through the
bnx_xmit_ring_reclaim() function.
Change : After recycling the packet in the TX routine,
the packet is now reclaimed before it is being used.
The 6.x driver for Solaris 10 should hopefully be available later this year. The one that’s in OpenSolaris unfortunately can’t be used with Solaris 10 due to network stack differences.
But the interesting thing is that there *is* a newer 5.2.3 driver out there that came out in March this year. So I had a google, and it looks like this driver has been supplied to OEMs but still isn’t available from Broadcom directly. So I downloaded an IBM driver ISO image that contains this newer driver, and it installs fine. We’re going to be using this in conjunction with disabling C-States and I’ll report back on how that combination is going.
After discovering the C-States workaround for the NIC dropouts I mailed the Broadcom guy again to let him know, and stated we’d be disabling C-States to see if it fixes the issue. He replied with:
Please let me know if this works for you so that I can pass it on to our Solaris developers. I checked with them to see if this was a known issue and they replied that they had been trying to duplicate the problem but had not been successful to date. When performance testing we often disable certain CPU features in order to maximize Ethernet throughput so it may be that the system BIOS settings are the key difference here.
So this is very encouraging - hopefully this tip will enable the Broadcom Solaris engineers to reproduce the issue and fix it.
One final thing - to keep all our servers identical, in addition to flashing the system BIOS, DRAC firmware and LSI/SAS6i firmware, we’ve now started upgrading the firmware on all the Broadcom NICs too.
This is easier said than done. My method involved producing a 2.88MB DOS boot image with the appropriate files, taken from various places. I nabbed the latest Dell Broadcom NIC Firmware Linux package to get the firmware files. I then pinched the DOS uxdiag.exe tool from the Broadcom diagnostics ISO to do the upgrades. I then produced a .bat file which runs:
uxdiag -c 1 -t abcd -F -fbc bc09x50b.bin
uxdiag -c 2 -t abcd -F -fbc bc09x50b.bin
uxdiag -c 1 -t abcd -F -fncsi ncsifw_x.205
uxdiag -c 2 -t abcd -F -fncsi ncsifw_x.205
uxdiag -c 1 -t abcd -F -fib_ipv4n6 ib6btv41.06
uxdiag -c 2 -t abcd -F -fib_ipv4n6 ib6btv41.06
uxdiag -c 1 -t abcd -F -fmba bxmba508.nic
uxdiag -c 2 -t abcd -F -fmba bxmba508.nic
uxdiag -c 1 -t abcd -mfw 0
uxdiag -c 2 -t abcd -mfw 0
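If your boxes have more than two ports, the batch file can be generated rather than typed by hand. The following is a sketch only: gen_uxdiag_bat is a hypothetical helper, the uxdiag flags and firmware file names are simply copied from the list above, and it assumes extra ports are numbered 3, 4 and so on. It writes CRLF line endings on the assumption that DOS wants them.

```shell
#!/bin/sh
# Sketch only: gen_uxdiag_bat is a hypothetical helper that prints the
# same uxdiag command lines as the batch file above, one set per port.
# The flags and file names are copied from this post, nothing is verified.
gen_uxdiag_bat() {
    ports="${1:-2}"
    for spec in "-fbc bc09x50b.bin" "-fncsi ncsifw_x.205" \
                "-fib_ipv4n6 ib6btv41.06" "-fmba bxmba508.nic"; do
        port=1
        while [ "$port" -le "$ports" ]; do
            printf 'uxdiag -c %s -t abcd -F %s\r\n' "$port" "$spec"
            port=$((port + 1))
        done
    done
    # Final step from the original batch file, once per port
    port=1
    while [ "$port" -le "$ports" ]; do
        printf 'uxdiag -c %s -t abcd -mfw 0\r\n' "$port"
        port=$((port + 1))
    done
}
```

Something like gen_uxdiag_bat 2 > update.bat (the file name is arbitrary) reproduces the ten lines above.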
What a lot of faffing about. You’d think Dell would make this stuff easier to do. Anyway, if you’re interested, please feel free to download my Broadcom DOS Firmware update disk image.
12 comments June 26th, 2010
Update to Broadcom NIC Dropping out on Solaris issue
Update 2010-07-01: Sun got back to one of the blog commenters regarding the issue with Broadcom NICs dropping out on HP servers and stated the issue relates to the HP-supplied Broadcom drivers, and Sun recommended using these. So HP people may be seeing a different issue. Please see this blog comment for details. Many thanks for passing this information on, Daniel!
BREAKING NEWS - 2010-06-25 11:30 BST (GMT+1): I’ve just spoken with a chap called mui on #opensolaris on irc.freenode.net who reports that this issue relates to “C States”. Disabling “C States” in the BIOS (It’s in “Processor Settings” on Dell boxes) supposedly will work-around the issue. C States support was added in Solaris 10 update 8, so this is probably why our Solaris 10 update 7 boxes are unaffected.
Supposedly Sun/Oracle have a patch internally they can supply to you for Solaris 10 if you have a support contract. If you’re on OpenSolaris, Mui has made this package available that works with snv_134. DISCLAIMER: Please test this prior to putting it into production as it’s provided with no warranty. Alternatively you might be able to grab the latest 6.0.1 BNX driver from the on-closed-bins.i386.tar.bz2 package on the OpenSolaris website.
Here’s the rest of the (now somewhat out of date) post…
Right, I have an update on the Broadcom NIC issue.
It seems the BIOS was a bit of a red herring, the Broadcom FW is completely independent of the system BIOS and downgrading this doesn’t change the Broadcom FW version. Pretty obvious really - I have no idea where I read that the two were linked.
Anyway, I did find a broadcom firmware tool called lnxfwnx2 which Dell distributes in the Broadcom firmware update packages. It’s a Linux tool and it lets you save out/restore firmware from Broadcom NICs.
Unfortunately I couldn’t find 4.x.x Firmware releases for the card, only 5.x.x releases. It’s highly frustrating Broadcom don’t provide these things directly.
However we have two Dell R410 boxes running Solaris 10 update 7 which have been running for over 200 days and never had any network issues at all. They have the 4.6.4 firmware on them. I am planning on taking one of these out of service, saving out the Broadcom firmware with the tool, and then loading this firmware onto the new misbehaving Dells.
I’ll also copy across the same BRCMbnx driver package from the boxes that haven’t had any issues as well. I’m also planning on putting the same Dell System BIOS on the new machines as the working ones. This way the Broadcom FWs will match, the System BIOS will match, and the Drivers will match. The only difference will be Solaris 10 update 7 vs Solaris 10 update 8.
We can then see if the new boxes behave themselves…
The Dell package is here: ftp://ftp.us.dell.com/network/NETW_FRMW_LX_R259547.BIN
I couldn’t get it to run on Ubuntu/Debian based distros, but it runs fine on the CentOS 5 32bit live CD:
http://mirror.sov.uk.goscomb.net/centos/5/isos/i386/CentOS-5.5-i386-LiveCD-Release2.iso
Once you’ve booted the LiveCD, configure the network, then do:
# wget ftp://ftp.us.dell.com/network/NETW_FRMW_LX_R259547.BIN
# chmod 755 NETW_FRMW_LX_R259547.BIN
# ./NETW_FRMW_LX_R259547.BIN --extract r259
# cd r259
# ./lnxfwnx2
It’s an interactive tool and you can type “help” to get a list of commands.
Here’s an example of saving/restoring:
0> dumpnvram nic-fw-backup.bin
0> restorenvram new-nic-frmw.bin
I got these instructions from here. It’s also possible saving the NVRAM will save all options, including the MAC address, so double check this when restoring the NVRAM on a different machine.
It also looks like the DOS-based diagnostics ISO from Broadcom’s website has a similar tool called uxdiag.exe, which can program the firmware and turn various features of the card on/off, such as WOL (Wake on LAN), the ‘mba’ (MultiBoot Agent) and the ‘management firmware’ (still don’t know what this does). You can get the ISO from:
http://www.broadcom.com/support/ethernet_nic/driver-sla.php?driver=NX2-diag
The boot menu gives the option “Install FreeDOS to harddisk” which is the option you want - you can opt later on not to do this but to run FreeDOS from the CD. A bit confusing. The uxdiag tool has a manual here.
I also spotted this thread on forums.sun.com which suggests a lot of HP people are having the same issue, irrespective of the FW version. So it remains to be seen what the root cause actually is.
Add comment June 25th, 2010
Broadcom NICs dropping out on Solaris 10
Update 2010-07-01: Sun got back to one of the blog commenters regarding the issue with Broadcom NICs dropping out on HP servers and stated the issue relates to the HP-supplied Broadcom drivers, and Sun recommended using these. So HP people may be seeing a different issue. Please see this blog comment for details. Many thanks for passing this information on, Daniel!
BREAKING NEWS - 2010-06-25 11:30 BST (GMT+1): I’ve just spoken with a chap called mui on #opensolaris on irc.freenode.net who reports that this issue relates to “C States”. Disabling “C States” in the BIOS (It’s in “Processor Settings” on Dell boxes) supposedly will work-around the issue. C States support was added in Solaris 10 update 8, so this is probably why our Solaris 10 update 7 boxes are unaffected.
Supposedly Sun/Oracle have a patch internally they can supply to you for Solaris 10 if you have a support contract. If you’re on OpenSolaris, Mui has made this package available that works with snv_134. DISCLAIMER: Please test this prior to putting it into production as it’s provided with no warranty. Alternatively you might be able to grab the latest 6.0.1 BNX driver from the on-closed-bins.i386.tar.bz2 package on the OpenSolaris website.
Here’s the rest of the (now somewhat out of date) post…
We’ve encountered this bug quite a few times and up until I found these bug reports, we weren’t sure what was causing the issue:
S10 bnx NICs randomly hang/drop out of the network
The symptoms are basically that the server loses network connectivity - traffic just stalls. Because this keeps happening on production boxes, we have to reboot pretty damn quickly, so we haven’t had an opportunity to diagnose the issue in detail. We tried a number of fixes to no avail, and I was at my wits’ end until I encountered the above bug report.
Our servers are Dell R410 machines and we’ve seen this happening on Dell R710 machines as well, with Solaris 10 update 8. We’re running with the latest Solaris 10 patches and the latest Broadcom drivers from the Broadcom website (5.2.2). I believe we’ve seen this issue with the stock drivers shipped with Solaris 10 update 8 as well.
From the bug reports, the issue seems related to the firmware running on the cards - version 5* is affected, version 4* isn’t. I believe the Firmware is tied to the Dell BIOS running on the machine. Here’s the output from one of our affected boxes:
# prtdiag | head -n 2
System Configuration: Dell Inc. PowerEdge R410
BIOS Configuration: Dell Inc. 1.3.9 04/07/2010
# grep -i BCM /var/adm/mes*
/var/adm/messages:Jun 12 03:21:38 bnx: [ID 995108 kern.info] NOTICE: bnx0: BCM5709 device with F/W Ver500000b is initialized.
/var/adm/messages:Jun 12 03:21:38 bnx: [ID 995108 kern.info] NOTICE: bnx1: BCM5709 device with F/W Ver500000b is initialized.
Here is the output from a machine that’s not affected:
# prtdiag | head -n 2
System Configuration: Dell Inc. PowerEdge R410
BIOS Configuration: Dell Inc. 1.1.5 07/29/2009
# grep BCM /var/adm/messages*
/var/adm/messages.2:May 27 15:11:43 bnx: [ID 995108 kern.info] NOTICE: bnx1: BCM5709 device with F/W Ver4060004 is initialized.
/var/adm/messages.2:May 27 15:11:43 bnx: [ID 995108 kern.info] NOTICE: bnx0: BCM5709 device with F/W Ver4060004 is initialized.
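If you have a lot of boxes to check, pulling the firmware version out of the log lines by hand gets tedious. Here is a rough sketch that does it with sed and awk. The function name bnx_fw_versions is made up, and treating the first digit after "Ver" as the major version (5 for Ver500000b, 4 for Ver4060004) is an assumption based purely on the two samples above.

```shell
#!/bin/sh
# Sketch only: bnx_fw_versions is a hypothetical helper.
# Reads /var/adm/messages style lines on stdin and prints, per matching line:
#   <interface> <raw firmware string> <assumed major version>
# The "major version = first digit" rule is an assumption, not documented.
bnx_fw_versions() {
    sed -n 's|.*NOTICE: \(bnx[0-9][0-9]*\): .* F/W Ver\([0-9a-fA-F]*\) is initialized.*|\1 \2|p' |
        awk '{ print $1, $2, substr($2, 1, 1) }'
}

# Intended use on a Solaris box (not run here):
#   cat /var/adm/messages* | bnx_fw_versions | sort -u
```

Lines that don’t look like the bnx initialisation notice are silently skipped.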
My understanding is that the fix is to downgrade the BIOS of the machine to a previous release that uses a 4* Broadcom Firmware release. We haven’t yet tested this but should be able to later this week. So far it doesn’t look like Sun/Oracle have released a publicly available patch to address the issue.
Update: 2010-06-25 - Upgrading/Downgrading the system BIOS makes no difference to the Broadcom FW (duh! silly me). I’ve written an updated post with more information here: http://blogs.everycity.co.uk/alasdair/2010/06/update-to-broadcom-nic-dropping-out-on-solaris-issue/
13 comments June 14th, 2010
Rebooting a Dell DRAC
Strangely Dell don’t provide a way to reset the DRAC card via the web interface. I got into a situation with one where I couldn’t upload a firmware update, and I figured rebooting the DRAC would sort the issue.
So to get around this, I enabled IPMI access over the LAN via the (thankfully still working) web interface, and then issued the following ipmitool command:
ipmitool -H ipaddress -U root mc reset cold
This rebooted the drac which then behaved itself :-)
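After a cold reset the DRAC takes a while to come back, so if you are scripting this it helps to poll until it answers again. The helper below is a generic retry loop, offered as a sketch: retry_until is a hypothetical name and nothing in it is specific to ipmitool or Dell.

```shell
#!/bin/sh
# Sketch only: retry_until is a hypothetical helper, not part of ipmitool.
# Usage: retry_until TRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or TRIES attempts have been made.
retry_until() {
    tries="$1"
    delay="$2"
    shift 2
    attempt=1
    while [ "$attempt" -le "$tries" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, retry_until 30 10 ipmitool -H ipaddress -U root -E mc info would try every 10 seconds for about five minutes. The -E option makes ipmitool read the password from the IPMI_PASSWORD environment variable rather than prompting on each attempt; the retry count and delay are arbitrary guesses, not measured DRAC reboot times.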
Add comment May 25th, 2010
Apple Mac Keyboard Shortcuts
With my trusty Dell XPS M1330 on its last legs, I decided to get with the times and buy a MacBook Pro 13". I’m very, very happy with it. However, switching from the PC, there were a few main keyboard shortcuts I struggled to find until I asked around/googled. For the benefit of others, they are:
PC: ctrl-shift-left/right
Mac: alt-shift-left/right
Selects whole words, useful when deleting the last few words you typed
PC: Page Down Key
Mac: Fn-Down
Useful: Apple - ~
Cycle between windows of same app
Gosh I felt there were more than these. May add more to this as I figure them out :-)
Update: I also found that you can make dialogue box widgets (OK/Cancel buttons etc.) tabbable by going to System Preferences -> Keyboard -> Keyboard Shortcuts and, at the bottom of the page, clicking the All controls radio button. Very handy.
Also Ben Summers helpfully recommended a useful tool called Alfred which I’d recommend checking out - helps you quickly launch things by hitting alt-spacebar.
2 comments May 9th, 2010
OpenSolaris 2010.03 (2010.04)
Like many people, I’ve been waiting eagerly and anxiously for OpenSolaris 2010.03 to come out. Over the past 9 months, OpenSolaris has come along in leaps and bounds with a never-ending procession of new features and enhancements, such as ZFS Deduplication and COMSTAR. While the OpenSolaris dev builds provide access to these new features and are generally stable enough for production use if you pick the right build, it’s obviously best to stick to stable releases for production environments.
However March came and went, and we’re now one third of the way through April, and still no sight of 2010.03. This has caused a lot of people to become quite anxious regarding OpenSolaris’ future.
Oracle have stated they’re going to invest more in Solaris than Sun did, and have stated their intention to keep OpenSolaris going. However they have also recently revoked the free version of Solaris 10, making it a 90 day trial. My understanding is that now, to run Solaris 10, you need an entitlement to do so, which comes from having a valid support contract for Solaris on Sun^H^H^HOracle hardware.
This has serious implications for many businesses, including EveryCity. Our current platform is Solaris 10 based, and while we started off using Sun servers, their Intel Nehalem range is simply too expensive, so we’ve been purchasing Dell R410 and R610 machines.
Thankfully, we have already planned to move to OpenSolaris for some time now, and our forthcoming platform will be OpenSolaris based. So this is no real issue for us - it just means Solaris 10 update 8 will be the last update to our Solaris 10 platform, and at some point down the line all our Solaris 10 Zones will become branded zones under OpenSolaris.
The future is uncertain, but thankfully OpenSolaris shows no sign of going away. Oracle’s culture is quite different from the one at Sun; it’s clear they’re very corporate, and rather secretive. They have also stated that they take a very “hands off” approach to their User Groups, so for example the LoSUG group now has to be managed by non-Oracle employees, and there’s no longer any free food/drink for attending. I was tempted to volunteer to help organise LoSUG, but unfortunately I just don’t have the time at present. Hopefully that will change in the future.
Oracle are hell bent on making their investment profitable, and as long as OpenSolaris continues to develop at the pace it has been going thus far, and continues to be free and open, I’m happy. If anything Oracle’s behaviour may bring the OpenSolaris community outside of Sun/Oracle closer together, and foster more community involvement and development, which can only be a GoodThing[tm].
And while there’s been a lot of suggestion on the grapevine that OpenSolaris 2010.03 may never materialise, this is clearly not the case, and it seems it may be just round the corner:
Re: osol-discuss - So who has been able to update to the b136 image?
by Alan Coopersmith alan.coopersmith@oracle... 2010-04-07T13:58:54+00:00.
Chad Welsh wrote:
> I have run update from IPS package manager and from the
> pkg image-update from the command line and nothing. are
> the packages only in the ON Gate? If so when will they be
> released into the wild for us to use?
Packages for the later builds have been built for all the gates, not just ON, but not published to pkg.opensolaris.org while the 2010.03 release is being finished.
Hi Sarah, Due to some security concerns and other issues, I'd flipped a coin and await the forthcoming snv_b138 kernel release. As for any independent distro releases between now and before snv_b138, my opinion is to wait on doing any major system upgrades, 'production' related migrations, or journalistic reviews until snv_138 is officially released. This is for mainly commented for current users using OpenSolaris for 'very' high-end production-grade audio/video workstations or high-availability servers with several TBs of in-flux data. If you are having ANY major issues with a prior OpenSolaris release, just give OSOL 2010.03 until April 16th or await the snv_138 kernel release. You'll be 'very' glad you did. ~ Ken Mays
1 comment April 10th, 2010
The case for RAIDZ2
We have an old x4500 knocking around which is getting on for 3 years old now. At the beginning of last month, we did a scrub, and to our horror discovered checksum errors on almost all the drives:
pool: pool01
state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
scrub: scrub completed after 23h0m with 0 errors on Wed Mar 3 12:55:36 2010
config:
NAME STATE READ WRITE CKSUM
pool01 DEGRADED 0 0 0
raidz1-0 ONLINE 0 0 0
c11t3d0 ONLINE 0 0 4 2.50K repaired
c10t3d0 ONLINE 0 0 0
c13t3d0 ONLINE 0 0 4 1.50K repaired
c7t1d0 ONLINE 0 0 0
c8t3d0 ONLINE 0 0 5 1K repaired
c7t3d0 ONLINE 0 0 4 2K repaired
c10t2d0 ONLINE 0 0 3 1K repaired
c13t2d0 ONLINE 0 0 2 1K repaired
c11t6d0 ONLINE 0 0 3 1K repaired
c8t2d0 ONLINE 0 0 16 7K repaired
c7t2d0 ONLINE 0 0 4 2.50K repaired
raidz1-1 DEGRADED 0 0 0
c11t7d0 ONLINE 0 0 6 64K repaired
c10t7d0 DEGRADED 0 0 58 too many errors
c13t7d0 ONLINE 0 0 4 3.50K repaired
c12t7d0 ONLINE 0 0 3 7K repaired
c8t7d0 ONLINE 0 0 2 4.50K repaired
c7t7d0 ONLINE 0 0 4 11.5K repaired
c10t6d0 ONLINE 0 0 4 11K repaired
c13t6d0 ONLINE 0 0 8 86K repaired
c12t6d0 ONLINE 0 0 0
c8t6d0 ONLINE 0 0 2 1K repaired
c7t6d0 ONLINE 0 0 2 2.50K repaired
raidz1-2 DEGRADED 0 0 0
c11t5d0 ONLINE 0 0 1 9K repaired
c10t5d0 ONLINE 0 0 1 13K repaired
c13t5d0 ONLINE 0 0 2 1.50K repaired
c12t5d0 ONLINE 0 0 1 1K repaired
c8t5d0 DEGRADED 0 0 135 too many errors
c7t5d0 ONLINE 0 0 2 1.50K repaired
c10t4d0 ONLINE 0 0 8 44K repaired
c13t4d0 ONLINE 0 0 3 5K repaired
c12t4d0 ONLINE 0 0 3 2K repaired
c8t4d0 ONLINE 0 0 2 6.50K repaired
c7t4d0 ONLINE 0 0 2 13.5K repaired
errors: No known data errors
Thankfully it’s not used for production, so this didn’t bother us a huge amount. ZFS repaired the data errors without issue (hurrah for ZFS!), and we have been replacing the worst affected disks. We’re now doing weekly scrubs to keep the data “fresh” and stop it rotting away.
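Since we’re scrubbing weekly now, it’s handy to have something that picks the non-zero CKSUM rows out of the zpool status output rather than eyeballing 33 disks. The following is a sketch only: cksum_errors is a hypothetical name, it assumes the column layout shown above (NAME STATE READ WRITE CKSUM), and it only handles plain integer counts. The cron line in the comment is likewise just an example schedule.

```shell
#!/bin/sh
# Sketch only: cksum_errors is a hypothetical helper.
# Reads `zpool status` output on stdin and prints "<name> <cksum>" for
# every row whose CKSUM column (5th field) is a non-zero integer.
# Pool and raidz rows are reported too if they carry a non-zero count.
cksum_errors() {
    awk '$2 ~ /^(ONLINE|DEGRADED|FAULTED)$/ && $5 ~ /^[0-9]+$/ && $5 + 0 > 0 { print $1, $5 }'
}

# Intended use (not run here):
#   zpool status pool01 | cksum_errors
# Example root crontab entry for a weekly scrub, Sundays at 03:00:
#   0 3 * * 0 /usr/sbin/zpool scrub pool01
```

Run against the output above, this would list every disk except the three showing 0 in the CKSUM column.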
However, one interesting issue cropped up. We’re using RAIDZ1, which only stores enough parity for 1 disk to be out of service. Since ZFS uses the parity data to reconstruct blocks with checksum errors, if you’re one disk down and have a block with a checksum error, you’re in trouble - it can’t repair it and your data is corrupted.
So when you replace a failed disk in a RAIDZ1 set, you had better hope you don’t encounter any checksum errors on the other disks during the resilver process. Because ZFS has to read in all the data from the other disks to resilver the new disk, you’re at a high risk of encountering checksum errors, especially in our situation where the disks are wearing out.
And this is precisely what happened next. We replaced a failed disk, and during the resilver, ZFS encountered checksum errors on the other disks it couldn’t repair, and we started to lose data:
pool: pool01
state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scrub: resilver completed after 15h47m with 219 errors on Sat Apr 10 16:14:59 2010
config:
NAME STATE READ WRITE CKSUM
pool01 DEGRADED 0 0 331
raidz1-0 ONLINE 0 0 0
c11t3d0 ONLINE 0 0 0
c10t3d0 ONLINE 0 0 0
c13t3d0 ONLINE 0 0 0
c8t5d0 ONLINE 0 0 0
c8t3d0 ONLINE 0 0 0
c7t3d0 ONLINE 0 0 0
c10t2d0 ONLINE 0 0 0
c13t2d0 ONLINE 0 0 0
c11t6d0 ONLINE 0 0 0
c8t2d0 ONLINE 0 0 0
c7t2d0 ONLINE 0 0 0
raidz1-1 ONLINE 0 0 0
c11t7d0 ONLINE 0 0 0
c11t2d0 ONLINE 0 0 0
c13t7d0 ONLINE 0 0 0
c12t7d0 ONLINE 0 0 0
c8t7d0 ONLINE 0 0 1
c7t7d0 ONLINE 0 0 0
c10t6d0 ONLINE 0 0 0
c13t6d0 ONLINE 0 0 0
c12t6d0 ONLINE 0 0 0
c8t6d0 ONLINE 0 0 0
c7t6d0 ONLINE 0 0 0
raidz1-2 DEGRADED 0 0 888
c11t5d0 DEGRADED 0 0 0 too many errors
c10t5d0 DEGRADED 0 0 0 too many errors
c13t5d0 DEGRADED 0 0 0 too many errors
c12t5d0 ONLINE 0 0 0 401G resilvered
c12t3d0 DEGRADED 0 0 0 too many errors
c7t5d0 DEGRADED 0 0 0 too many errors
c10t4d0 DEGRADED 0 0 0 too many errors
c13t4d0 DEGRADED 0 0 0 too many errors
c12t4d0 DEGRADED 0 0 0 too many errors
c8t4d0 DEGRADED 0 0 0 too many errors
c7t4d0 DEGRADED 0 0 0 too many errors
errors: 219 data errors, use '-v' for a list
Ouch! 219 data errors.
Thankfully ZFS knows precisely which files are affected, and you can just delete/replace/restore the affected files/snapshots and it keeps on running.
However after this, I’m sold on RAIDZ2. I don’t think I’ll be using RAIDZ1 again - the risk of losing data when you’re replacing a failed disk is just too high.
Add comment April 10th, 2010
