Archive for July, 2010
ZFS runs *really* slowly when disk usage goes above 80%
You’re sat at your desk, sipping a nice beverage, all is well in the world. Your busy database is sat happily in the background as always. Suddenly, out of the blue, performance drops significantly. You’re perplexed - nothing has changed. The DBAs go wild - fingers get pointed. They claim the DB is fine - the hardware/OS is at fault. You investigate, you see a lot of IO, but you can’t pin it down. What’s going on?
Well, you could very well just have hit 80% disk space usage, and now your disk performance has gone through the toilet.
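A quick way to see whether a pool is near that line is to filter the output of zpool list. The following is a minimal sketch, not a definitive tool: the function name pools_over is made up for illustration, and it assumes the tab-separated "name, capacity%" format that zpool list -H -o name,capacity prints. Note that this reports pool-wide usage, whereas the tunable discussed below works per space map, so treat it as a rough indicator only.

```shell
# Print pools whose used capacity is at or above a threshold (default 80).
# Reads lines of "name<TAB>capacity%" on stdin, i.e. the output of:
#   zpool list -H -o name,capacity
# (pools_over is a hypothetical helper name, not part of ZFS.)
pools_over() {
    threshold=${1:-80}
    awk -v t="$threshold" '{
        pct = $2
        sub(/%$/, "", pct)              # strip the trailing percent sign
        if (pct + 0 >= t + 0) print $1, $2
    }'
}

# Usage on a live system:
#   zpool list -H -o name,capacity | pools_over 80
# To read the current value of the tunable (read-only, no -w):
#   echo "metaslab_df_free_pct/D" | mdb -k
```

Reading from stdin rather than calling zpool directly keeps the function easy to try out with canned input before pointing it at a real system.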
You can fix the issue by running this:
echo "metaslab_df_free_pct/W 4" | mdb -kw
And you can make it permanent by doing:
echo "set zfs:metaslab_df_free_pct=4" >> /etc/system
What does this do? Well, ZFS normally uses a “first fit” block allocation policy. When you hit 80% disk space usage, it switches to “best fit”. To quote the source code:
 * The minimum free space, in percent, which must be available
 * in a space map to continue allocations in a first-fit fashion.
 * Once the space_map's free space drops below this level we dynamically
 * switch to using best-fit allocations.
All current Solaris 10 releases, and versions of OpenSolaris prior to 22-Nov-2009, use a default of 30 for this value. How does a value of 30 equate to 80% disk space usage? I have no idea - I’ve never figured that one out. All I know is, run the above commands, and the problem goes away. Perhaps someone with more knowledge of how metaslabs work can enlighten me :)
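For anyone unfamiliar with the two policies, here is a toy sketch of the difference. It is not the ZFS allocator, just an illustration under simple assumptions: free space is a flat list of segment sizes, and the function names first_fit and best_fit are invented for the example. First fit stops at the first segment that is big enough; best fit has to look at every candidate to find the tightest match.

```shell
# first_fit WANT SEG...  -> prints the 1-based index of the first free
# segment that is at least WANT units long; returns 1 if none fits.
first_fit() {
    want=$1; shift
    i=1
    for seg in "$@"; do
        if [ "$seg" -ge "$want" ]; then
            echo "$i"
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}

# best_fit WANT SEG...  -> prints the 1-based index of the smallest free
# segment that is still at least WANT units long; returns 1 if none fits.
best_fit() {
    want=$1; shift
    i=1; best=0; best_size=0
    for seg in "$@"; do
        if [ "$seg" -ge "$want" ]; then
            if [ "$best" -eq 0 ] || [ "$seg" -lt "$best_size" ]; then
                best=$i
                best_size=$seg
            fi
        fi
        i=$((i + 1))
    done
    if [ "$best" -eq 0 ]; then
        return 1
    fi
    echo "$best"
}

# Example: free segments of 64, 12, 10 and 30 units, request for 10 units.
#   first_fit 10 64 12 10 30   # picks segment 1 (the 64-unit one)
#   best_fit  10 64 12 10 30   # picks segment 3 (the exact 10-unit one)
```

In this toy version the extra work for best fit is one full pass over the list per allocation; how that cost plays out inside real metaslabs is beyond what this sketch can show.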
1 comment July 18th, 2010
Using mbuffer to speed up slow zfs send | zfs receive
So, you find yourself doing a zfs send | receive, perhaps a large incremental send. You find the transfer process going really slowly - trickling along at less than 1MB/sec. Yet, you know the sender and the receiver are capable of far far more than this. What’s the deal?
Well, basically, zfs receive is bursty - it can spend ages computing something, doing no receiving, then blat the data out to disk. The issue with this is that it stalls the sender, resulting in a bursty and slow transfer process.
The solution is to add mbuffer into the mix. mbuffer buffers the stream, and you can run it at both ends. While zfs receive can’t receive, mbuffer buffers; when zfs receive can receive, mbuffer feeds it data as fast as it can. Let’s see an example:
# Start the receiver first. This listens on port 9090, has a 1GB buffer,
# and uses 128kb chunks (same as zfs):
mbuffer -s 128k -m 1G -I 9090 | zfs receive data/filesystem
# Now we send the data, also sending it through mbuffer:
zfs send -i data/filesystem@1 data/filesystem@2 | mbuffer -s 128k -m 1G -O 10.0.0.1:9090
You’ll get a lovely output such as:
in @ 15.8 MB/s, out @ 8923 kB/s, 1.0 GB total, buffer 6% full
And hopefully, you’ll find your zfs send|receive suddenly going a lot quicker.
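The "buffer N% full" figure in that status line is the useful bit when working out where the bottleneck is. On the receiving side, a buffer that sits near 100% means zfs receive is the slow part; one that sits near 0% means data isn't arriving fast enough. Below is a small sketch for pulling the number out of a captured status line. The function name buffer_pct is made up, and it assumes the line has the same shape as the sample output above.

```shell
# Extract the "buffer N% full" percentage from an mbuffer status line
# read on stdin. Prints just the number, or nothing if no match.
# (buffer_pct is a hypothetical helper, not part of mbuffer.)
buffer_pct() {
    sed -n 's/.*buffer *\([0-9][0-9]*\)% full.*/\1/p'
}

# Example with the status line shown above:
#   echo 'in @ 15.8 MB/s, out @ 8923 kB/s, 1.0 GB total, buffer 6% full' | buffer_pct
```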
Here is the link to download the useful mbuffer program.
I personally found this approach decreased my send time for 40GB from over 4 hours down to around 30 minutes. Hurrah! :)
9 comments July 18th, 2010
Solaris 10 for free, or on Non-Sun hardware, is dead
Update, 28th July: I was perhaps a bit premature in declaring this. HP is now selling Solaris Support once again, and I believe you can still get Solaris support on some Dell models. IBM however are no longer offering it.
So the title should more accurately read: Solaris 10 for free for production use, is dead
I think quite a lot of us have been living in denial about this, even after Oracle altered the Solaris 10 license to make the free download a 90 day trial. People sort of shrugged and said “Well, you can still buy a support license from HP and Dell”. Even after Oracle cancelled the HP deal, people were still hopeful. “Perhaps this was a negotiating tactic!” I heard people cry on IRC.
Well, the fact is, the truth should have been obvious as far back as February. On February 23rd this year, Dan Roberts, Director of Solaris Product Management at Oracle, told the OpenSolaris Governing Board:
Q - PT - What about support on third-party hardware?
A - DR - At this point Oracle is very focused on places where they can make revenue and margin. Unfortunately for us, we have not seen a good uptake on those standalone subscriptions. Has seen more emails on the topic than the total number of systems sold. Hard to make a case. At this point, there are no plans to support non-Sun systems. We will continue to honor existing contracts for the term of that contract. Over time, we hope to move folks over to Sun hardware.
Q - PT - What about regular Solaris?
A - DR - Same answer as above.
Q - PT - Will the ability to download and run it without support continue?
A - DR - Look at the licenses carefully. Production deployments will require a support agreement which is sold on Sun systems only.
In plain English, Oracle has no intention of providing support for OpenSolaris, nor for Solaris on non-Sun hardware. Nor will you be allowed to run Solaris 10 on a production system without a support contract.
This relegates the OpenSolaris distribution to a useless toy not fit for production, and means if you want to use Solaris 10 you have to buy a Sun server from Oracle and buy a support contract.
This effectively makes Solaris 10 unviable for a large number of users. While Oracle’s Sun Servers are beautiful pieces of engineering, they are vastly overpriced, and you can get equivalent Dell kit for half the price.
Dan did say some somewhat positive things, stating:
* Oracle is increasing investment in Solaris and Oracle considers OpenSolaris a part of Solaris.
* Will continue to support the community.
* Will continue to contribute to the source base.
* Plan to continue OpenSolaris releases.
* Solaris releases will continue.
* What will Oracle do to support OpenSolaris as a distribution? We will continue to support Solaris offerings and we will continue to include OpenSolaris. The form will change. We will no longer offer independent support offerings for Solaris or OpenSolaris. They will be part of Systems Support Offerings that include Sun hardware.
If the community wants to continue to be able to run some form of Solaris on their non-Sun hardware, (or on their Sun hardware but without a support contract), the community is going to have to step up and do something.
I have very strong reason to believe the community is about to do just that. I can’t provide details just yet, but something big may be coming RealSoonNow[tm]. Stay tuned.
4 comments July 17th, 2010
OpenSolaris - July Update
Well, the OpenSolaris Governing Board has given Oracle an ultimatum: Make contact by August 16th, or they resign and hand control of the community back to Oracle.
To quote the above linked forum post..
"Without the Oracle part of the partnership at the table, there is effectively nothing for the OGB - or development community - to do. The flagship OpenSolaris distro is absent, the IPS repositories are stagnant, the build instructions no longer work for the sources that exist, even the architectural reviews of community-developed components are being held behind Oracle’s closed doors. It is as if the spirit of open, collaborative development centered around the Solaris operating system has died."
Nobody really knows what Oracle are up to, but their decision not to even talk to the OpenSolaris Governing Board strongly suggests Oracle are uninterested in the health of the community. My personal opinion, based on what I’ve read and observed, is that Larry wants Solaris for Oracle’s enterprise systems at the top end, and doesn’t give two shits about OpenSolaris or the community.
As such, the best the community can hope for is that Oracle will continue to provide the source code to OpenSolaris. Worst case, this disappears. I don’t even want to contemplate this, as it essentially means we’ll have to formulate a “Solaris Exit Plan”. Effectively this means NetApp and Ubuntu.
Anyway, let’s assume for now Oracle will continue to provide the OpenSolaris source code to the community. If they do, then I have some opinions on what the community should do.
Here is what I posted to the OpenSolaris discuss and ogb mailing lists:
IMHO, The Oracle/Sun provided OpenSolaris reference distribution (henceforth referred to as Indiana to avoid confusion) has done the community a disservice, in the sense that it has prevented a community from producing something itself.
All the other OpenSolaris based distributions such as Schillix, Nexenta etc. all cater for particular niches, but what’s needed is a community produced version of Indiana. One with the same (or at least, similar) goals with an identical/similar architecture including aspects such as IPS, Automated Installer, Zones, etc.
As long as Oracle/Sun continue to release their own distribution, the community has no real reason to do so. Well, perhaps now is the time for this to happen. Perhaps what is needed is an agreement with Oracle along the lines of:
1. Oracle agrees to continue to provide the source code for OpenSolaris (nevada), along with constituent parts (such as IPS/pkg). Oracle continue to provide bug and security fixed updates to the closed source binaries.
2. OpenSolaris 2010.xx is never released, but becomes Solaris Next.
3. The community steps up and produces its own version of Indiana, tracking Solaris Next as best it can in a binary and package compatible way.
4. The community maintains its own source code repository that developers can commit to, and Oracle takes community improvements that they want.
This frees Oracle from their obligation to the community, and allows them to maintain their secrecy and radio silence. But it forges an even stronger community that can stand on its own legs.
Obviously the issue the community has is that we’ve never had the ability to produce the distribution itself. We don’t have the ability to build all the packages that go into the IPS repo, nor produce the Live CD, nor do we have an installer. And of course, finding people to do the actual work would present a significant challenge.
The good news is that there is a community out there. There are the community members who have been involved with the OpenSolaris derived distributions. There are ex Sun/Oracle staff who have moved to other companies, such as Nexenta. There are projects such as OSUnix who are trying to produce their own OS from the OpenSolaris codebase by replacing the closed binaries/code (such as the internationalised bits of libc).
Not to mention, there’s Blastwave and OpenCSW who are already building large amounts of software for Solaris/OpenSolaris, and if one/both decided to contribute, we have a huge source of software packages for the community based distro.
If the fragmented OpenSolaris community rallied round and came together, I’m quite confident a community based distribution could thrive. Indeed, if Solaris Next does become an “Oracle Hardware Only” OS, then an entire company providing support for the community based distribution would definitely have legs, and this could potentially afford to pay staff to work on building the distribution full time. Solaris is run by a very large number of people on Dell/HP/etc kit and these users would no doubt be eager to jump onto such a distribution.
I’m going to be talking about my thoughts on this at the London OpenSolaris Users Group later this month, if anyone is in London and wants to come along. And of course I’d appreciate people’s comments here on this thread.
Alasdair
1 comment July 15th, 2010
Signing Windows Drivers
We were signing some Open Source GPL Windows drivers to use on Windows Server 2008 x64 edition (which only accepts signed drivers) and were encountering the following Windows boot error after installing our supposedly successfully signed drivers:
0xc0000428 Windows cannot verify digital signature for this file
We have an official Verisign code signing certificate, and were a bit stumped/confused. The files seemed to be signed and everything seemed to be okay.
The problem was that we were not adding a cross-certificate. A cross certificate basically provides a chain of authority so that Windows is able to trust your certificate. You can find out more information about this from this helpful blog post over here.
If we had bothered to RTFM, the Code Signing walkthrough does kind of tell you that you need to sign your drivers with a cross certificate, so we probably could have saved ourselves a lot of time by reading this first.
Anyway, you can obtain the cross certs from here, and the option to use with signtool.exe is /ac, so for example you’d type:
signtool.exe sign /v /ac MSCV-VSClass3.cer /s my /n "Every City Limited" /t http://timestamp.verisign.com/scripts/timestamp.dll xenusb\%BUILDDIR%\blah.cat
Hopefully this post will help other people save some time, as we spent all day trying to figure this one out.
Add comment July 6th, 2010
The Oracle OpenSolaris Farce
Well it was meant to be OpenSolaris 2010.03. Then Oracle happened, and they stated 2010.1H (first half). This has come and gone.
Is it ever going to arrive?
According to the rumour mill September 2010 is a possible new target date. Or just "Sometime in 2010". But who knows.
Some are even starting to wonder if Oracle are turning OpenSolaris 2010.03 into Solaris 11 proper. If they supported this on Dell+HP hardware in addition to their own (as Sun did prior to the acquisition), that would be an extremely enticing proposition.
Time will tell. Oracle are not stupid. But this silence and complete lack of communication is insanely frustrating.
Some might be wondering, why all the hoohaa anyway? It’s just an operating system, right? Well.. because OpenSolaris is evolving at an incredible rate, and the new features list since OpenSolaris 2009.06 came out is absolutely enormous. Those who have used the development branches have tasted the forbidden fruit, and they want it. They really god damn want it.
OpenSolaris 2009.06 was an interesting toy. But OpenSolaris 2010.? has finally reached the prime time. Solaris 10 containers, COMSTAR iSCSI/FCoE framework, the new Crossbow Networking stack, the latest ZFS features including separate cache and log devices, plus of course ZFS Deduplication, Xen 3.4 support via xVM.. the feature list is enormous.
In the meantime Nexenta is chugging away churning out new releases with backported fixes from the latest sources. I might have to check out Nexenta again. I never thought I’d say this, but I actually much prefer IPS and beadm to apt-get and apt-clone. Plus I dread to think how they’ve implemented Zones on Nexenta. But the full GNU Userland is rather appealing.
My advice to Oracle: Don’t fuck this up.
2 comments July 1st, 2010
