<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Adjusting drive timeouts with mdb on Solaris or OpenIndiana</title>
	<atom:link href="http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/</link>
	<description></description>
	<lastBuildDate>Thu, 16 May 2013 16:31:38 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4.1</generator>
	<item>
		<title>By: Alasdair</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/#comment-765</link>
		<dc:creator>Alasdair</dc:creator>
		<pubDate>Mon, 16 Jul 2012 16:33:17 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=429#comment-765</guid>
		<description>Hi Chris,

Sorry for taking a while to respond!

We are using Crucial SSDs, www.crucial.com. We&#039;ve found them to be reliable, reasonably fast and cost-effective, and they come with a 3-year warranty. A good all-rounder, even for enterprise workloads.</description>
		<content:encoded><![CDATA[<p>Hi Chris,</p>
<p>Sorry for taking a while to respond!</p>
<p>We are using Crucial SSDs, <a href="http://www.crucial.com" rel="nofollow">http://www.crucial.com</a>. We&#8217;ve found them to be reliable, reasonably fast and cost-effective, and they come with a 3-year warranty. A good all-rounder, even for enterprise workloads.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Chris</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/#comment-759</link>
		<dc:creator>Chris</dc:creator>
		<pubDate>Thu, 31 May 2012 03:34:19 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=429#comment-759</guid>
		<description>Hi Alasdair, out of curiosity, what type of 256GB SSDs are you using on that server? We are working on a config almost exactly like yours and I happened to stumble on your post.</description>
		<content:encoded><![CDATA[<p>Hi Alasdair, out of curiosity, what type of 256GB SSDs are you using on that server? We are working on a config almost exactly like yours and I happened to stumble on your post.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alasdair</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/#comment-752</link>
		<dc:creator>Alasdair</dc:creator>
		<pubDate>Thu, 22 Mar 2012 05:36:37 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=429#comment-752</guid>
		<description>Hi Jon,

It&#039;s not too hard if you&#039;re using NFS since NFS is stateless. You also have to use SAS drives and SAS arrays as you can plug a dual-head SAS array into two physical servers and both nodes can see the disks at the same time.

You then just need to write some scripts to:

a. Detect failure (we use NRPE to check that writes succeed on an NFS mount mounted via a loopback cable between the primary and the secondary node)
b. Have the secondary pull the plug on the primary (we do this via IPMI)
c. Forcibly import the zpool and bring up the IPs

To improve the failover time in step c we use VNICs with the same MAC on both machines, so hosts don&#039;t have to learn new MAC addresses.

It&#039;s not a huge amount of code.

With a big array (20TB+) it&#039;s not fast enough to store virtual machine disk images, as the failover can take &gt; 180 seconds, which is long enough for Windows and Linux to declare their disks dead and get very unhappy. But it&#039;s fine for NFS clients, which recover quite happily.</description>
		<content:encoded><![CDATA[<p>Hi Jon,</p>
<p>It&#8217;s not too hard if you&#8217;re using NFS since NFS is stateless. You also have to use SAS drives and SAS arrays as you can plug a dual-head SAS array into two physical servers and both nodes can see the disks at the same time.</p>
<p>You then just need to write some scripts to:</p>
<p>a. Detect failure (we use NRPE to check that writes succeed on an NFS mount mounted via a loopback cable between the primary and the secondary node)<br />
b. Have the secondary pull the plug on the primary (we do this via IPMI)<br />
c. Forcibly import the zpool and bring up the IPs</p>
<p>To improve the failover time in step c we use VNICs with the same MAC on both machines, so hosts don&#8217;t have to learn new MAC addresses.</p>
<p>It&#8217;s not a huge amount of code.</p>
<p>With a big array (20TB+) it&#8217;s not fast enough to store virtual machine disk images, as the failover can take &gt; 180 seconds, which is long enough for Windows and Linux to declare their disks dead and get very unhappy. But it&#8217;s fine for NFS clients, which recover quite happily.</p>
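<p>As a rough illustration of steps b and c above, here is a minimal Python sketch, not the actual scripts. It assumes ipmitool is installed on the secondary and that the BMC password lives in a file; the function names, the password file and the ip_commands argument (whatever commands bring up the service IPs on your system) are all hypothetical. The one property it tries to show is ordering: the pool is only imported after fencing has succeeded.</p>

```python
import subprocess


def failover_plan(pool, bmc_host, bmc_user, bmc_password_file, ip_commands):
    """Build the ordered command list: fence the primary, import the pool, bring up IPs."""
    # Step b: power off the primary through its BMC (assumes ipmitool over lanplus).
    fence = [
        "ipmitool", "-I", "lanplus",
        "-H", bmc_host, "-U", bmc_user, "-f", bmc_password_file,
        "chassis", "power", "off",
    ]
    # Step c: forcibly import the pool, then run the caller-supplied IP commands.
    plan = [fence, ["zpool", "import", "-f", pool]]
    plan.extend(ip_commands)
    return plan


def run_failover(plan, runner=subprocess.check_call):
    """Run each command in order; the first failure raises and stops the sequence."""
    for command in plan:
        runner(command)
```

<p>Because subprocess.check_call raises on a non-zero exit status, a failed fence stops the sequence before zpool import -f runs, which is what keeps both heads from having the pool imported at once. The runner argument is only there so the sequence can be dry-run with a fake.</p>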
]]></content:encoded>
	</item>
	<item>
		<title>By: Alasdair</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/#comment-751</link>
		<dc:creator>Alasdair</dc:creator>
		<pubDate>Thu, 22 Mar 2012 05:31:07 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=429#comment-751</guid>
		<description>Hi Aaron,

Basically the setting didn&#039;t help the situation I was experiencing. The sd timeouts don&#039;t seem to be used by the mpt_sas driver. So adjusting them is completely pointless.

The way we reduced the possibility of bad disks impacting our storage is to preemptively zpool offline disks that show any signs of misbehaviour: we take a drive out of service if it exhibits a single hard error or shows any SMART errors. So far this has worked quite well.

But ultimately the storage subsystem and/or drivers in the OS need improving. There are bugs open about this:

https://www.illumos.org/issues/1553
https://www.illumos.org/issues/1069</description>
		<content:encoded><![CDATA[<p>Hi Aaron,</p>
<p>Basically the setting didn&#8217;t help the situation I was experiencing. The sd timeouts don&#8217;t seem to be used by the mpt_sas driver. So adjusting them is completely pointless.</p>
<p>The way we reduced the possibility of bad disks impacting our storage is to preemptively zpool offline disks that show any signs of misbehaviour: we take a drive out of service if it exhibits a single hard error or shows any SMART errors. So far this has worked quite well.</p>
<p>But ultimately the storage subsystem and/or drivers in the OS need improving. There are bugs open about this:</p>
<p><a href="https://www.illumos.org/issues/1553" rel="nofollow">https://www.illumos.org/issues/1553</a><br />
<a href="https://www.illumos.org/issues/1069" rel="nofollow">https://www.illumos.org/issues/1069</a></p>
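<p>For anyone wanting to automate the hard-error check, a minimal Python sketch follows. It is an illustration under assumptions rather than our tooling: it assumes the per-device summary lines printed by iostat -En ("Soft Errors: N Hard Errors: N Transport Errors: N"), the function names are hypothetical, and it only builds the zpool offline commands instead of running them. SMART checks are not covered.</p>

```python
import re

# Matches the per-device summary line of `iostat -En` (format assumed), e.g.
# "c1t1d0   Soft Errors: 0 Hard Errors: 1 Transport Errors: 0"
ERROR_LINE = re.compile(
    r"^(\S+)\s+Soft Errors:\s*(\d+)\s+Hard Errors:\s*(\d+)\s+Transport Errors:\s*(\d+)"
)


def disks_with_hard_errors(iostat_output):
    """Return device names whose hard error counter is non-zero."""
    bad = []
    for line in iostat_output.splitlines():
        match = ERROR_LINE.match(line)
        if match and int(match.group(3)) != 0:
            bad.append(match.group(1))
    return bad


def offline_commands(pool, disks):
    """Build one `zpool offline` command per suspect disk, without running it."""
    return [["zpool", "offline", pool, disk] for disk in disks]
```

<p>Keeping the command building separate from execution makes it easy to review what would be offlined first, for example to avoid taking a second disk out of a vdev that is already degraded.</p>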
]]></content:encoded>
	</item>
	<item>
		<title>By: Jon Strabala</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/#comment-729</link>
		<dc:creator>Jon Strabala</dc:creator>
		<pubDate>Fri, 10 Feb 2012 17:18:53 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=429#comment-729</guid>
		<description>I second this - can you write an article detailing how to setup a “NFS cluster” with OpenIndiana?  I would rather start off with something rather than roll my own understanding the &quot;risks&quot; are all mine. 

Yes I read your comment in the oi_a51a release: &quot;if you need clustering, then you should be able to justify the budget for it. Clustering on the cheap is a recipe for disaster&quot;

My justification same reason both illumos and openindiana exist - need I say more?

Thanks in Advance</description>
		<content:encoded><![CDATA[<p>I second this &#8211; can you write an article detailing how to set up an “NFS cluster” with OpenIndiana?  I would rather start from something than roll my own, understanding that the &#8220;risks&#8221; are all mine.</p>
<p>Yes, I read your comment in the oi_a51a release: &#8220;if you need clustering, then you should be able to justify the budget for it. Clustering on the cheap is a recipe for disaster&#8221;</p>
<p>My justification is the same reason both illumos and OpenIndiana exist &#8211; need I say more?</p>
<p>Thanks in advance</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Aaron Knodel</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/#comment-677</link>
		<dc:creator>Aaron Knodel</dc:creator>
		<pubDate>Mon, 12 Dec 2011 15:50:19 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=429#comment-677</guid>
		<description>Hi Alasdair,

Can you please comment further on the failure of this setting using mpt_sas? It sounds like what you&#039;re saying is, this setting should in theory work, but the driver is causing issues and it never gets to the ZFS level to use this setting. Is that right? Did you test using the known bad drive from the beginning of the article? Any other details would be appreciated if you have them.

Thanks</description>
		<content:encoded><![CDATA[<p>Hi Alasdair,</p>
<p>Can you please comment further on the failure of this setting using mpt_sas? It sounds like what you&#8217;re saying is, this setting should in theory work, but the driver is causing issues and it never gets to the ZFS level to use this setting. Is that right? Did you test using the known bad drive from the beginning of the article? Any other details would be appreciated if you have them.</p>
<p>Thanks</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt Connolly</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/#comment-649</link>
		<dc:creator>Matt Connolly</dc:creator>
		<pubDate>Tue, 29 Nov 2011 11:09:02 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=429#comment-649</guid>
		<description>Interesting that this can affect small setups as well as large:  I&#039;ve seen this problem on my home NAS running OpenIndiana with two mirrored drives. In my case, one of them was a Western Digital green drive which slowed the whole machine to a near-halt by being busy.

I solved the problem by yanking that drive out and throwing it in the bin... Didn&#039;t know about this then! :)</description>
		<content:encoded><![CDATA[<p>Interesting that this can affect small setups as well as large:  I&#8217;ve seen this problem on my home NAS running OpenIndiana with two mirrored drives. In my case, one of them was a Western Digital green drive which slowed the whole machine to a near-halt by being busy.</p>
<p>I solved the problem by yanking that drive out and throwing it in the bin&#8230; Didn&#8217;t know about this then! :)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alasdair</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/#comment-622</link>
		<dc:creator>Alasdair</dc:creator>
		<pubDate>Mon, 31 Oct 2011 23:33:40 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=429#comment-622</guid>
		<description>Hi Richard,

Thanks for the comment about the timeout, it&#039;s a good suggestion.

However, in testing with failing hard drives (on mpt_sas anyway), we see that the sd timeouts are completely ignored, so my entire post above is moot!</description>
		<content:encoded><![CDATA[<p>Hi Richard,</p>
<p>Thanks for the comment about the timeout, it&#8217;s a good suggestion.</p>
<p>However, in testing with failing hard drives (on mpt_sas anyway), we see that the sd timeouts are completely ignored, so my entire post above is moot!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Richard Elling</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/#comment-619</link>
		<dc:creator>Richard Elling</dc:creator>
		<pubDate>Fri, 14 Oct 2011 15:10:47 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=429#comment-619</guid>
		<description>For most HDD suppliers, 5 seconds is too low. If you consult the HDD specifications, the proper value is usually documented. For example, most Seagate nearline SAS models have a &lt;7 second specification. Recommend setting to 8 seconds instead of 5 for two reasons: 1) fits the specs for many low-cost &quot;hardware&quot; RAID cards, 2) avoids false positives.

 -- richard</description>
		<content:encoded><![CDATA[<p>For most HDD suppliers, 5 seconds is too low. If you consult the HDD specifications, the proper value is usually documented. For example, most Seagate nearline SAS models have a &lt;7 second specification. I recommend setting it to 8 seconds instead of 5 for two reasons: 1) it fits the specs for many low-cost &#8220;hardware&#8221; RAID cards, 2) it avoids false positives.</p>
<p> &#8212; richard</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: whatever</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/#comment-617</link>
		<dc:creator>whatever</dc:creator>
		<pubDate>Sun, 04 Sep 2011 19:11:56 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=429#comment-617</guid>
		<description>Hey, Alasdair:
Can you write an article detailing how to set up an &quot;NFS cluster&quot; with OpenIndiana?

As far as I know, when Oracle closed down OpenSolaris, they stopped contributing to OpenHA, which only worked with OpenSolaris 2009.06.  The source code of OpenHA has now moved to Illumos gate, but activity in that project is low.  You can&#039;t even compile it nowadays.  Considering Oracle Solaris Cluster 3.3u1 doesn&#039;t run on Illumos-based distros, or even Solaris 11 Express, I wonder which software package you used to set up HA ZFS and HA NFS for VM storage.  (The only thing I can think of is to license RSF-1 just like NexentaStor HA)

Thanks</description>
		<content:encoded><![CDATA[<p>Hey, Alasdair:<br />
Can you write an article detailing how to set up an &#8220;NFS cluster&#8221; with OpenIndiana?</p>
<p>As far as I know, when Oracle closed down OpenSolaris, they stopped contributing to OpenHA, which only worked with OpenSolaris 2009.06.  The source code of OpenHA has now moved to Illumos gate, but activity in that project is low.  You can&#8217;t even compile it nowadays.  Considering Oracle Solaris Cluster 3.3u1 doesn&#8217;t run on Illumos-based distros, or even Solaris 11 Express, I wonder which software package you used to set up HA ZFS and HA NFS for VM storage.  (The only thing I can think of is to license RSF-1 just like NexentaStor HA)</p>
<p>Thanks</p>
]]></content:encoded>
	</item>
</channel>
</rss>
