<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>Alasdair on Everything</title>
	<atom:link href="http://blogs.everycity.co.uk/alasdair/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.everycity.co.uk/alasdair</link>
	<description></description>
	<pubDate>Wed, 01 Feb 2012 13:56:48 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.3</generator>
	<language>en</language>
			<item>
		<title>Compiling QT with Webkit on Solaris 10</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/11/compiling-qt-with-webkit-on-solaris-10/</link>
		<comments>http://blogs.everycity.co.uk/alasdair/2011/11/compiling-qt-with-webkit-on-solaris-10/#comments</comments>
		<pubDate>Fri, 25 Nov 2011 11:12:34 +0000</pubDate>
		<dc:creator>Alasdair</dc:creator>
		
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=453</guid>
		<description><![CDATA[Getting QT on Solaris 10 to build is a PITA, but getting it to build with Webkit enabled is even worse. But fret not, after some Googling the patches can be found.
You can find our build recipe for it over here:
http://hg.openindiana.org/users/aszeszo/s10-userland/file/c473cd11bbd3/components/qt4
We&#8217;re using GCC 4.4 to build QT which, although not the officially supported compiler on [...]]]></description>
			<content:encoded><![CDATA[<p>Getting QT on Solaris 10 to build is a PITA, but getting it to build with Webkit enabled is even worse. But fret not, after some Googling the patches can be found.</p>
<p>You can find our build recipe for it over here:</p>
<p><a href="http://hg.openindiana.org/users/aszeszo/s10-userland/file/c473cd11bbd3/components/qt4">http://hg.openindiana.org/users/aszeszo/s10-userland/file/c473cd11bbd3/components/qt4</a></p>
<p>We&#8217;re using GCC 4.4 to build QT. Although it is not the officially supported compiler on Solaris platforms (they specify Sun Studio), it works just fine.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.everycity.co.uk/alasdair/2011/11/compiling-qt-with-webkit-on-solaris-10/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Fixing &#8220;No active dataset&#8221; on zone attach</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/10/fixing-no-active-dataset-on-zone-attach/</link>
		<comments>http://blogs.everycity.co.uk/alasdair/2011/10/fixing-no-active-dataset-on-zone-attach/#comments</comments>
		<pubDate>Mon, 31 Oct 2011 23:30:27 +0000</pubDate>
		<dc:creator>Alasdair</dc:creator>
		
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=450</guid>
		<description><![CDATA[When moving zones between OpenIndiana (and OpenSolaris) hosts, you can often end up with the following dreaded error:

# zoneadm -z zonename attach -U
Log File: /var/tmp/zonename.attach_log.B8aWed
ERROR: no active dataset.
                    Result: Attach Failed.

This can happen for a variety [...]]]></description>
			<content:encoded><![CDATA[<p>When moving zones between OpenIndiana (and OpenSolaris) hosts, you can often end up with the following dreaded error:</p>
<pre>
# zoneadm -z zonename attach -U
Log File: /var/tmp/zonename.attach_log.B8aWed
ERROR: no active dataset.
                    Result: Attach Failed.
</pre>
<p>This can happen for a variety of reasons, such as not detaching the zone before moving it, and not transferring the ZFS properties with the zone. But personally I blame the half-arsed zone attach scripts that could do with some work.</p>
<p>To get around it, here is a super-quick/dirty script that should allow the zone to attach:</p>
<pre>
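#!/bin/bash
# Usage: pass the ZFS filesystem the zone lives in (the parent of ROOT)

zfsfs=$1
root=${zfsfs}/ROOT
zbe=${root}/zbe

# Clear any locally set zoned/mountpoint values on all three datasets
for i in $zbe $root $zfsfs ; do
  for j in zoned mountpoint ; do
    zfs inherit $j $i
  done
done

# Set the properties on ROOT and on the zone boot environment
zfs set mountpoint=legacy $root
zfs set zoned=on $root
zfs set canmount=noauto $zbe
zfs set org.opensolaris.libbe:active=on $zbe

# Look up the uuid of the global zone's current boot environment...
rbe=`zfs list -H -o name /`
uuid=`zfs get -H -o value org.opensolaris.libbe:uuid $rbe`

# ...and record it as the parent of the zone boot environment
zfs set org.opensolaris.libbe:parentbe=$uuid $zbe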
</pre>
<p>The script takes one argument, the zfs filesystem the zone lives in (the parent of &#8220;ROOT&#8221; for the zone). Ignore any errors about &quot;dataset is used in a non-global zone&quot;, and once it has run, manually mount the dataset and attach it with:</p>
<pre>
mount -F zfs dataset/ROOT/zbe /zones/zonename/root
zoneadm -z zonename attach
</pre>
<p>This guide is pretty rough but should hopefully set people in roughly the right direction.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.everycity.co.uk/alasdair/2011/10/fixing-no-active-dataset-on-zone-attach/feed/</wfw:commentRss>
		</item>
		<item>
		<title>vasprintf and asprintf on Solaris 10</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/07/vasprintf-and-asprintf-on-solaris-10/</link>
		<comments>http://blogs.everycity.co.uk/alasdair/2011/07/vasprintf-and-asprintf-on-solaris-10/#comments</comments>
		<pubDate>Tue, 19 Jul 2011 13:32:08 +0000</pubDate>
		<dc:creator>Alasdair</dc:creator>
		
		<category><![CDATA[Solaris]]></category>

		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=446</guid>
		<description><![CDATA[Update: Martin in the comments suggested using the vasprintf definition in the OpenSolaris source.

If you get errors such as this on Solaris 10, it&#8217;s due to a lack of modern helpful string functions (which thankfully were added to OpenSolaris, so no problem here on OpenIndiana):

Undefined           [...]]]></description>
			<content:encoded><![CDATA[<p><b>Update</b>: Martin in the comments suggested using the vasprintf definition in the <a href="http://hg.openindiana.org/upstream/oracle/onnv-gate/file/b23a4dab3d50/usr/src/lib/libc/port/print/asprintf.c">OpenSolaris source</a>.</p>
<hr />
<p>If you get errors such as this on Solaris 10, it&#8217;s due to a lack of modern helpful string functions (which thankfully were added to OpenSolaris, so no problem here on OpenIndiana):</p>
<pre>
Undefined                       first referenced
 symbol                             in file
asprintf                            ../../bin/gcc/libgpac.so
</pre>
<p>There&#8217;s quite a nice implementation (no idea how safe it is to use, but at least the program now compiles!) over at Stack Overflow - <a href="http://stackoverflow.com/questions/4899221/substitute-or-workaround-for-asprintf-on-aix">http://stackoverflow.com/questions/4899221/substitute-or-workaround-for-asprintf-on-aix</a>. Take the latter one, by Jonathan Leffler.</p>
<p>You can drop it in like so:</p>
<pre>
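#if (defined (__SVR4) &#038;&#038; defined (__sun))
#include &lt;stdarg.h&gt;  /* va_list, va_copy */
#include &lt;stdio.h&gt;   /* vsnprintf */
#include &lt;stdlib.h&gt;  /* malloc, free */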
int vasprintf(char **ret, const char *format, va_list args)
{
        va_list copy;
        va_copy(copy, args);

        /* Make sure it is determinate, despite manuals indicating otherwise */
        *ret = 0;

        int count = vsnprintf(NULL, 0, format, args);
        if (count >= 0) {
                char* buffer = malloc(count + 1);
                if (buffer != NULL) {
                        count = vsnprintf(buffer, count + 1, format, copy);
                        if (count < 0)
                                free(buffer);
                        else
                                *ret = buffer;
                }
        }
        va_end(copy);  // Each va_start() or va_copy() needs a va_end()

        return count;
}

int asprintf(char **strp, const char *fmt, ...)
{
        int size;
        va_list args;
        va_start(args, fmt);
        size = vasprintf(strp, fmt, args);
        va_end(args);
        return size;
}
#endif
</pre>
<p>Since the code I&#8217;m compiling will not be facing the internet and will only be run by trusted users, I&#8217;m not too worried about how safe this code is against buffer overflows. If you are concerned about that, you might want to take a look at gnulib, which has a nice properly portable version, although it&#8217;s a lot bigger.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.everycity.co.uk/alasdair/2011/07/vasprintf-and-asprintf-on-solaris-10/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Adjusting drive timeouts with mdb on Solaris or OpenIndiana</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/</link>
		<comments>http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/#comments</comments>
		<pubDate>Sat, 14 May 2011 11:37:53 +0000</pubDate>
		<dc:creator>Alasdair</dc:creator>
		
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=429</guid>
		<description><![CDATA[Update: These timeouts don&#8217;t work nearly as well as one would hope, unfortunately the sd timeouts get passed to the driver which in the case of mpt/mpt_sas, appear to do very little with them. I have raised this as an issue within the Illumos community and the debate was quite polarising; the kernel developers deny [...]]]></description>
			<content:encoded><![CDATA[<p><b>Update:</b> These timeouts don&#8217;t work nearly as well as one would hope. Unfortunately the sd timeouts get passed to the driver, which in the case of mpt/mpt_sas appears to do very little with them. I have raised this as an issue within the Illumos community and the debate was quite polarising; the kernel developers deny there is a problem or disagree on how to solve it, despite lots of people complaining of the same symptoms. Unfortunately I think it&#8217;s a difficult problem to solve due to the wide variety of hardware types that ZFS/Illumos is deployed on.</p>
<p>Our way of coping with dodgy drives is to preempt their failure via trigger happy SMART/iostat monitoring scripts that zpool offline bad drives before they fail.</p>
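<p>As a rough illustration of the iostat half of that kind of monitoring (a sketch only, not the scripts mentioned above), the function below prints devices whose busy percentage is at or above a threshold. The name busy_disks is made up for this example, and it assumes the 11-column data lines of &quot;iostat -xn&quot;, with %b in column 10 and the device name in column 11. It only lists candidates; deciding when to actually zpool offline a drive is left out.</p>
<pre>
#!/bin/bash
# Sketch only: read "iostat -xn" style output on stdin and print the
# device name of every line whose %b column is at or above the limit.
# Assumes 11 whitespace-separated columns: %b is column 10, device is 11.
busy_disks() {
  awk -v limit="$1" 'NF == 11 { if ($10 ~ /^[0-9]+$/) if ($10 + 0 >= limit) print $11 }'
}

# Hypothetical usage: iostat -xn 5 2 | busy_disks 95
</pre>
<p>Header lines are skipped because their %b column is not numeric.</p>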
<hr />
<p><img src="http://p.gzhls.at/404661.jpg" align="left" style="padding-right: 7px"></p>
<p>Yesterday we suffered our first disk failure in our shiny new NFS cluster that has been operating flawlessly for 3 months. The NFS cluster we have is quite nice - it consists of a pair of NFS servers (96GB of RAM, Dual Intel E5620 CPUs) dual-attached to a set of LSI SAS 6Gbps JBOD arrays, with lots of Seagate Constellation ES 2TB enterprise SAS drives. For good measure there&#8217;s 1.5TB of SSD cache (6&#215;256GB SSDs) acting as a read cache (L2ARC), and a ZeusRAM SSD acting as the write cache (ZIL). It runs a custom build of <a href="http://www.openindiana.org">OpenIndiana</a>.</p>
<p>Ordinarily a disk failure would result in at most a few minutes of stall while the OS waits for the drive to recover, and gives up. However, this drive decided simply to run glacially slowly, so it didn&#8217;t get removed in a timely fashion. In fact, it didn&#8217;t get removed at all, resulting in all IO to the SAN being stuck, causing a rather severe outage. 45 minutes in total.</p>
<p>When things became unresponsive, we logged in, and &quot;iostat -xn&quot; showed a 100% busy time on one of the disks, while the others did nothing. We attempted to &quot;zpool offline baddisk&quot;. Nothing much happened, presumably because the OS thought the drive was fine and was waiting on some queued IO finishing, or something along those lines. We had no immediate way of yanking the disk out, so we decided to failover the cluster from the primary NFS node to the secondary. This consists of powering off the primary node and letting the cluster software import the ZFS zpool and bring NFS services online.</p>
<p>When the secondary NFS node started importing the zpool, iostat once again showed a 100% busy time on the bad disk. Crap. Andrzej had the bright idea of deleting the disk entries from /dev, and sure enough this prompted ZFS to think the drive had disappeared, and the pool finally imported.</p>
<p>So immediately the question springs to mind, why did the OS not take this bad disk out of service? We consulted with our upstream vendor (contacted the folks over at <a href="http://www.illumos.org">Illumos</a>) and all became clear.</p>
<p>The answer lies in the defaults in the Solaris SCSI subsystem. The default timeout for IO is 60 seconds with 5 retries (or 3 retries if it&#8217;s fibre channel/eSAS). For a storage array like ours, this is a 3 minute timeout for a single IO (60 seconds x 3 retries) - or in other words, a very long time. Since the disk was accepting a trickle of IO, this timeout was never really reached.</p>
<p>Thankfully the timeouts can be adjusted, and Garrett D&#8217;Amore, the founder of Illumos and one of the lead developers who works at Nexenta, strongly suggested tuning the timeout to 5 seconds, with 3 retries.</p>
<p>Setting the timeout value is quite easy - it&#8217;s the system wide tunable sd_io_time. Keep in mind this will affect all disks. Edit /etc/system and drop in:</p>
<pre>
set sd:sd_io_time=5
</pre>
<p>If you have desktop SATA drives you&#8217;ll probably want a higher timeout, especially if you don&#8217;t have TLER (Time limited error recovery) on them, which limits error recovery to around 7 seconds.</p>
<p>The number of retries is set in /kernel/drv/sd.conf via sd-config-list, which allows settings to be applied per disk type. To get 3 retries, the variable would be &quot;retries-timeout:3&quot;. The format of this file is a bit weird; here is an example for two disks:</p>
<pre>
sd-config-list = "STEC    ZeusRAM         ", "throttle-max:32, disksort:false, cache-nonvolatile:true",
                 "SEAGATE ST32000444SS    ", "retries-timeout:3";
</pre>
<p>The bit where you define the disk type is a fixed length field, consisting of 8 characters for the vendor, and 16 characters for the product. So you have to pad the field out to the correct length with spaces.</p>
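<p>Counting spaces by hand is error-prone, so a small shell helper can do the padding. This is just a sketch; sd_device_id is a hypothetical name for this example and not part of any Solaris tool.</p>
<pre>
#!/bin/bash
# Hypothetical helper: build the quoted 24-character disk type field,
# padding the vendor to 8 characters and the product to 16 with spaces.
sd_device_id() {
  printf '"%-8s%-16s"' "$1" "$2"
}

sd_device_id SEAGATE ST32000444SS ; echo
sd_device_id STEC ZeusRAM ; echo
</pre>
<p>The output can be pasted straight into sd-config-list as the first half of each pair.</p>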
<p>Once these are set, reboot to activate. You can check the values are set by doing:</p>
<pre>
## Print system wide sd_io_time timeout value:
# echo "sd_io_time::print" | mdb -k
0x3c

## Print per-disk timeout and retry values:
# echo "::walk sd_state | ::grep '.!=0' | ::sd_state" | mdb -k | egrep "^un|un_retry_count|un_cmd_timeout"
un: ffffff093239d9c0
    un_retry_count = 0x3
    un_cmd_timeout = 0x5
un: ffffff093239d380
    un_retry_count = 0x3
    un_cmd_timeout = 0x5
...
</pre>
<p>The return values are in hexadecimal, so for example 0x3c is 60 seconds.</p>
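<p>If you don&#8217;t fancy converting hexadecimal in your head, the shell can do it. A tiny sketch, with hex2dec being a made-up name for this example:</p>
<pre>
#!/bin/bash
# Convert a hexadecimal value as printed by mdb into decimal.
hex2dec() {
  printf '%d\n' "$1"
}

hex2dec 0x3c
hex2dec 0x5
</pre>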
<h3>Adjusting values without rebooting</h3>
<p>We have a number of storage servers in production, some of which we really didn&#8217;t want to reboot just to change the timeout value. After discussions with some of the Illumos kernel developers, we worked out how to set the property at runtime using the modular Solaris debugger, mdb. This allows editing kernel values at runtime.</p>
<p>The system wide sd_io_time is used to populate a per-disk timeout value which is also stored in the same structure as the per-disk retry count. So changing the values is pretty similar.</p>
<p>First, we want to obtain the memory addresses of the settings we wish to edit:</p>
<pre>
# echo "::walk sd_state | ::grep '.!=0' | ::print -a struct sd_lun un_cmd_timeout" | mdb -k > /tmp/un_cmd_timeouts

# cat /tmp/un_cmd_timeouts
ffffff0d347a3a7c un_cmd_timeout = 0x3c
ffffff0d247983bc un_cmd_timeout = 0x3c
ffffff0d3429d3fc un_cmd_timeout = 0x3c
ffffff0d55daf37c un_cmd_timeout = 0x3c
...
</pre>
<p>Now that we have the addresses in /tmp/un_cmd_timeouts, we can set the value using mdb -kw:</p>
<pre>
# for i in `cat /tmp/un_cmd_timeouts  | awk '{print $1}'` ; do echo ${i}/W 0x5 | mdb -kw ; done
</pre>
<p>We can then check the value was set by re-running:</p>
<pre>
# echo "::walk sd_state | ::grep '.!=0' | ::print -a struct sd_lun un_cmd_timeout" | mdb -k
</pre>
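<p>Since these loops write straight into kernel memory, it can be worth doing a dry run first that only prints what would be sent to mdb -kw. The sketch below assumes the saved file has the address in the first column, as in the listing earlier; mdb_write_cmds is a hypothetical name for this example.</p>
<pre>
#!/bin/bash
# Dry run: turn "address field = value" lines on stdin into the
# "address/W newvalue" commands the loop would pipe into mdb -kw.
# Nothing is written to the kernel here; the commands are only printed.
mdb_write_cmds() {
  awk -v val="$1" '{ print $1 "/W " val }'
}

# Hypothetical usage: cat /tmp/un_cmd_timeouts | mdb_write_cmds 0x5
</pre>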
<p>Now we can do the same for un_retry_count:</p>
<pre>
# echo "::walk sd_state | ::grep '.!=0' | ::print -a struct sd_lun un_retry_count" | mdb -k > /tmp/un_retry_count
# for i in `cat /tmp/un_retry_count  | awk '{print $1}'` ; do echo ${i}/W 0x3 | mdb -kw ; done
</pre>
<p>Hey presto, we just adjusted boot time kernel parameters on the fly :-)</p>
<p>If you need to know which disk is which, you can assume the output from mdb is ordered, and do:</p>
<pre>
echo "::walk sd_state | ::grep '.!=0' | ::print struct sd_lun un_sd | ::print struct scsi_device sd_dev | ::devinfo -q" | mdb -k
</pre>
<p>This returns the sd instance id, which can be seen from &quot;iostat -E&quot;. StackOverflow has some answers for <a href="http://stackoverflow.com/questions/555427/map-sd-sdd-names-to-solaris-disk-names">mapping from sd to device name</a> should you need to.</p>
<h3>Concluding Remarks</h3>
<p>With these values in place, our timeout is reduced from upwards of 3 minutes, to a mere 15 seconds. This is far more likely to cause the OS to offline dodgy disks like the one we were experiencing issues with.</p>
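<p>The arithmetic behind those figures is simply timeout multiplied by retries, which is how the numbers in this post were worked out. A quick sketch (worst_case is a made-up name):</p>
<pre>
#!/bin/bash
# Worst-case wait for a single IO in seconds: timeout x retries.
worst_case() {
  echo $(( $1 * $2 ))
}

worst_case 60 3   # the defaults described above for fibre channel/eSAS
worst_case 5 3    # the suggested tuning
</pre>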
<p>There has been some recent discussion on the Illumos mailing lists regarding the default sd_io_time value, suggesting that the default should be lowered to 8 seconds. This has caused a bit of a furore, as people using Solaris with fibre channel disk arrays require higher timeouts, say 180 seconds. So there are people on both sides of the fence. But one thing is for sure - it&#8217;s a setting more people should know about.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Autoconf, Automake and Libtoolized version of bzip2</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/03/autoconf-automake-and-libtoolized-version-of-bzip2/</link>
		<comments>http://blogs.everycity.co.uk/alasdair/2011/03/autoconf-automake-and-libtoolized-version-of-bzip2/#comments</comments>
		<pubDate>Mon, 28 Mar 2011 16:28:14 +0000</pubDate>
		<dc:creator>Alasdair</dc:creator>
		
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=424</guid>
		<description><![CDATA[Autoconf, Automake and libtool are 3 utilities designed to simultaneously help and hinder those of us that have to compile software. They together produce the familiar &#8220;./configure ; make ; make install&#8221; procedure most of us have used time and time again.
Although these tools are universally hated for being overly complex, slow and hard to [...]]]></description>
			<content:encoded><![CDATA[<p>Autoconf, Automake and libtool are 3 utilities designed to simultaneously help and hinder those of us that have to compile software. They together produce the familiar &#8220;./configure ; make ; make install&#8221; procedure most of us have used time and time again.</p>
<p>Although these tools are universally hated for being overly complex, slow and hard to use, thankfully most projects use them, because the alternative (usually some shitty Makefile that only works on Linux) is far far far worse.</p>
<p>BZip2 is one of those very simple system utilities we all require, where the author only ships a Makefile. Thankfully a helpful SuSE developer has <a href="http://ftp.suse.com/pub/people/sbrabec/bzip2/">autoconfized it</a>. So grab a copy of those files into your bzip2-1.0.6 folder, and run autogen.sh</p>
<p>If only someone would do this for libxvid and ffmpeg&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.everycity.co.uk/alasdair/2011/03/autoconf-automake-and-libtoolized-version-of-bzip2/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Lame, nasm, and text relocations (textrels)</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/03/lame-nasm-and-text-relocations-textrels/</link>
		<comments>http://blogs.everycity.co.uk/alasdair/2011/03/lame-nasm-and-text-relocations-textrels/#comments</comments>
		<pubDate>Sat, 26 Mar 2011 13:40:08 +0000</pubDate>
		<dc:creator>Alasdair</dc:creator>
		
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=420</guid>
		<description><![CDATA[Well, this took some debugging.
I&#8217;ve filed it all in a nasm bug report. To cut a long story short, if you compile LAME with Nasm 2.09, you&#8217;ll end up with TEXTRELs in the resultant libmp3lame.so.
What is a TEXTREL you may ask? Something bad! It stops the code being fully PIC (position independent), which stops the [...]]]></description>
			<content:encoded><![CDATA[<p>Well, this took some debugging.</p>
<p>I&#8217;ve filed it all in a nasm <a href="https://sourceforge.net/tracker/?func=detail&#038;aid=3246990&#038;group_id=6208&#038;atid=106208">bug report</a>. To cut a long story short, if you compile LAME with Nasm 2.09, you&#8217;ll end up with TEXTRELs in the resultant libmp3lame.so.</p>
<p>What is a TEXTREL you may ask? Something bad! It stops the code being fully PIC (position independent), which stops the shared object being loaded into memory once and mapped multiple times. But worse, it causes Solaris ld to explode when linking:</p>
<pre>
gcc -shared -Wl,-h -Wl,libmp3lame.so.0 -o .libs/libmp3lame.so.0.0.0 .libs/VbrTag.o .libs/bitstream.o .libs/encoder.o .libs/fft.o .libs/gain_analysis.o .libs/id3tag.o .libs/lame.o .libs/newmdct.o .libs/presets.o .libs/psymodel.o .libs/quantize.o .libs/quantize_pvt.o .libs/reservoir.o .libs/set_get.o .libs/tables.o .libs/takehiro.o .libs/util.o .libs/vbrquantize.o .libs/version.o .libs/mpglib_interface.o -Wl,-z -Wl,allextract ../libmp3lame/i386/.libs/liblameasmroutines.a ../libmp3lame/vector/.libs/liblamevectorroutines.a ../mpglib/.libs/libmpgdecoder.a -Wl,-z -Wl,defaultextract -lm -lsocket -lnsl -lc -maccumulate-outgoing-args
Text relocation remains referenced
against symbol offset in file
&lt;unknown&gt; 0x6e ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
&lt;unknown&gt; 0x75 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
&lt;unknown&gt; 0x9a ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
&lt;unknown&gt; 0xa1 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
&lt;unknown&gt; 0xa8 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
&lt;unknown&gt; 0x12b ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
&lt;unknown&gt; 0x133 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
&lt;unknown&gt; 0x1a0 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
&lt;unknown&gt; 0x1aa ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
&lt;unknown&gt; 0x1b4 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
&lt;unknown&gt; 0x1c2 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
&lt;unknown&gt; 0x24c ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
&lt;unknown&gt; 0x25d ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
&lt;unknown&gt; 0x39 ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
&lt;unknown&gt; 0x56 ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
&lt;unknown&gt; 0x128 ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
&lt;unknown&gt; 0x142 ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
&lt;unknown&gt; 0x26e ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
&lt;unknown&gt; 0x2b9 ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
&lt;unknown&gt; 0x2d6 ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
&lt;unknown&gt; 0x398 ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
&lt;unknown&gt; 0x4ce ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
&lt;unknown&gt; 0x2c ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
&lt;unknown&gt; 0x7a ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
&lt;unknown&gt; 0x88 ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
&lt;unknown&gt; 0xc4 ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
&lt;unknown&gt; 0xd9 ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
&lt;unknown&gt; 0xe7 ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
&lt;unknown&gt; 0x1d0 ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
&lt;unknown&gt; 0x1e4 ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
&lt;unknown&gt; 0x20b ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
&lt;unknown&gt; 0x219 ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
t1l 0x189 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
largetbl 0xde ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
largetbl 0x105 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
largetbl 0x10f ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
table23 0x245 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
table56 0x256 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
ld: fatal: relocations remain against allocatable but non-writable sections
collect2: ld returned 1 exit status
</pre>
<p>The way to fix the problem is to use NASM 2.08 or earlier, or wait until the bug gets fixed (although they might point their finger at LAME). I&#8217;m going to try yasm instead of nasm and see if that works, as an alternative.</p>
<p>If you don&#8217;t care about TEXTRELs, on Linux you don&#8217;t have to do anything (GNU ld allows them by default), but on Solaris you can tell the Solaris linker to allow impure text segments by adding &quot;-mimpure-text -lrt&quot; to your LDFLAGS. Or, you can use the GNU linker. This is quite hard, but I wrote a <a href="http://blogs.everycity.co.uk/alasdair/2011/03/using-the-gnu-ld-linker-on-solaris/">blog post</a> about it.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.everycity.co.uk/alasdair/2011/03/lame-nasm-and-text-relocations-textrels/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Using the GNU ld Linker on Solaris</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/03/using-the-gnu-ld-linker-on-solaris/</link>
		<comments>http://blogs.everycity.co.uk/alasdair/2011/03/using-the-gnu-ld-linker-on-solaris/#comments</comments>
		<pubDate>Fri, 25 Mar 2011 16:40:18 +0000</pubDate>
		<dc:creator>Alasdair</dc:creator>
		
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=417</guid>
		<description><![CDATA[On Solaris, GCC by default is compiled with the option &#8211;with-ld=/usr/ccs/bin/ld, telling it to use the Solaris linker. Unfortunately GCC uses this value above all else, meaning it will ignore LD= environment variables to set an alternative linker, such as /usr/sfw/bin/gld
Although tools like libtool/autoconf will pick up your LD= environment variable, and detect which options [...]]]></description>
			<content:encoded><![CDATA[<p>On Solaris, GCC by default is compiled with the option &#8211;with-ld=/usr/ccs/bin/ld, telling it to use the Solaris linker. Unfortunately GCC uses this value above all else, meaning it will ignore LD= environment variables to set an alternative linker, such as /usr/sfw/bin/gld</p>
<p>Although tools like libtool/autoconf will pick up your LD= environment variable, and detect which options the linker supports (and whether it&#8217;s GNU ld or not), libtool unfortunately still calls gcc for the linking stage, which then ignores LD=. This makes it near-impossible to use GNU ld without actually doing a nasty hack, like &#8220;mv /usr/ccs/bin/ld /usr/ccs/bin/ld.off ; ln -s /usr/sfw/bin/gld /usr/ccs/bin/ld&#8221;. Yuck!</p>
<p>However, today when trying to get lame to compile using nasm (which generates objects that refuse to link with Solaris LD), I found Solaris LD accepts a very useful environment variable. The variable is <b>LD_ALTEXEC</b>.</p>
<p>Solaris LD will actually re-exec the value of LD_ALTEXEC, meaning that if you set LD_ALTEXEC to /usr/sfw/bin/gld, when /usr/ccs/bin/ld gets called, it immediately instead calls /usr/sfw/bin/gld with the arguments passed on. Thus, you can use whatever linker you wish. Hurrah! :-)</p>
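<p>In practice that boils down to exporting the variable before kicking off the build. A minimal sketch, assuming GNU ld lives at /usr/sfw/bin/gld as above; adjust the path if yours is elsewhere:</p>
<pre>
#!/bin/bash
# Tell the Solaris linker to hand over to GNU ld whenever it is invoked.
# Assumption: GNU ld is installed at /usr/sfw/bin/gld.
LD_ALTEXEC=/usr/sfw/bin/gld
export LD_ALTEXEC

# Then build as usual from the same shell, e.g. ./configure ; make
</pre>
<p>Because it is an ordinary environment variable, it only affects builds started from that shell, which is a lot less invasive than renaming the system linker.</p>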
]]></content:encoded>
			<wfw:commentRss>http://blogs.everycity.co.uk/alasdair/2011/03/using-the-gnu-ld-linker-on-solaris/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Building IPS / pkg5 on Solaris 10</title>
		<link>http://blogs.everycity.co.uk/alasdair/2011/01/building-ips-pkg5-on-solaris-10/</link>
		<comments>http://blogs.everycity.co.uk/alasdair/2011/01/building-ips-pkg5-on-solaris-10/#comments</comments>
		<pubDate>Sun, 23 Jan 2011 00:24:47 +0000</pubDate>
		<dc:creator>Alasdair</dc:creator>
		
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=409</guid>
		<description><![CDATA[IPS/pkg5 is the native package manager on OpenSolaris, and thus by extension on OpenIndiana (The OpenSolaris fork I started last year). Over the past 6 months I&#8217;ve become familiar with IPS, and I can honestly say I&#8217;ve fallen in love with it. It&#8217;s very powerful, useful and fairly easy to use (If you forgive its [...]]]></description>
			<content:encoded><![CDATA[<p>IPS/pkg5 is the native package manager on OpenSolaris, and thus by extension on OpenIndiana (The OpenSolaris fork I started last year). Over the past 6 months I&#8217;ve become familiar with IPS, and I can honestly say I&#8217;ve fallen in love with it. It&#8217;s very powerful, useful and fairly easy to use (if you forgive its obscure error messages).</p>
<p>It was developed from scratch to be cross-platform, allowing Sun to deliver packages to other systems such as Linux, AIX, and Solaris 10. I decided it might be a good idea for us to roll it out on our Solaris 10 cloud for use with managing software. Our clients love the power of Solaris 10, but they sure do hate the lack of native package management, so IPS could really be a big win for us.</p>
<p>But boy, getting it working on Solaris 10 is no easy task. IPS itself is written mostly in Python, however the dependency list is huge, and some of the packages are a real pain to compile. The IPS build system also makes a few assumptions that aren&#8217;t correct on Solaris 10, which complicated things.</p>
<p>Whilst other guides, such as <a href="http://probably.co.uk/howto-build-ips-on-solaris-10.html">this one here</a>, bypass a lot of the problems by using OpenCSW/Blastwave packages, I wanted a little self-contained &quot;/opt/pkg&quot; directory with its own Python install and any dependencies. The whole point of my deployment of IPS is to get away from OpenCSW/Blastwave and friends, which introduce a whole other stack of software you have to keep up to date.</p>
<p>While I don&#8217;t have time to go into the build process in detail, I can offer some hints to help out.</p>
<p>I found I had to build the following packages (the ordering here is completely incorrect, sorry):</p>
<pre>
gettext
expat
rarian
intltool
python2.6
setuptools
swig
pyOpenSSL
gnome-doc-utils
libxml2
libxml2-python
libxslt
</pre>
<p>You&#8217;ll want to skip building the gui tools, update manager and the brand stuff, so in pkg-gate/src/Makefile, change the SUBDIRS variable as such:</p>
<pre>
#SUBDIRS=web gui um po util/misc brand
SUBDIRS=web util/misc
</pre>
<p>Also remember to set PYTHON= to your new python.</p>
<p>I had to patch M2Crypto - it uses SWIG to generate Python bindings, and assumes ENGINE_load_openssl is present in the OpenSSL library. When running pkg I was getting:</p>
<pre>
ImportError: ld.so.1: python2.6: fatal: relocation error: file /opt/pkg/python26/lib/python2.6/site-packages/M2Crypto/__m2crypto.so: symbol ENGINE_load_openssl: referenced symbol not found
</pre>
<p>This is because the Solaris 10 OpenSSL install is missing the ENGINE_load_openssl function - it has been yanked out for crypto export reasons (that now probably don&#8217;t apply as OpenSolaris contains it). I removed references to it, and managed to coerce it to work. The patches for M2Crypto are here:</p>
<pre>
# pwd
/root/pkg-gate/src/patch/M2Crypto
# cat pkg-gate_m2c.patch
--- SWIG/_engine.i.orig 2011-01-22 23:32:17.583271086 +0000
+++ SWIG/_engine.i      2011-01-22 23:32:50.478960838 +0000
@@ -26,9 +26,6 @@
 %rename(engine_load_dynamic) ENGINE_load_dynamic;
 extern void ENGINE_load_dynamic(void);

-%rename(engine_load_openssl) ENGINE_load_openssl;
-extern void ENGINE_load_openssl(void);
-
 %rename(engine_cleanup) ENGINE_cleanup;
 extern void ENGINE_cleanup(void);

# cat setup.patch
--- setup.py.orig       2011-01-22 23:49:21.466821165 +0000
+++ setup.py    2011-01-22 23:49:32.286055614 +0000
@@ -40,7 +40,7 @@
             self.openssl = 'c:\\pkg'
         else:
             self.libraries = ['ssl', 'crypto']
-            self.openssl = '/usr'
+            self.openssl = '/usr/sfw'

     def finalize_options(self):
</pre>
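<p>Before patching, it can be worth confirming which ENGINE symbols a given libcrypto actually exports. The following is a minimal sketch using Python&#8217;s ctypes module. The library path is an assumption based on the /usr/sfw prefix used in setup.patch above, and exports_symbol is a hypothetical helper name, not part of pkg5 or M2Crypto.</p>
<pre>
```python
import ctypes
import ctypes.util


def exports_symbol(library_path, symbol):
    """Return True if the shared library at library_path exports symbol."""
    library = ctypes.CDLL(library_path)
    # ctypes looks the name up lazily and raises AttributeError when the
    # symbol is missing, so hasattr() doubles as an existence check.
    return hasattr(library, symbol)


if __name__ == '__main__':
    # Assumed location of the Solaris 10 bundled OpenSSL; adjust as needed.
    path = '/usr/sfw/lib/libcrypto.so'
    for name in ('ENGINE_load_dynamic', 'ENGINE_load_openssl'):
        try:
            print(name, exports_symbol(path, name))
        except OSError as error:
            print('could not load', path, error)
```
</pre>
<p>If ENGINE_load_openssl is reported as missing, the relocation error shown earlier is what you should expect from an unpatched M2Crypto build.</p>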
<p>Lastly, some tips: the NetBSD pkgsrc system contains useful patches for getting some of the above dependencies to compile on Solaris 10. I can&#8217;t remember which ones I used, but they did come in handy. And don&#8217;t forget about your CFLAGS/LDFLAGS/PATH. I also found I had to temporarily rename the Solaris patch binary to patch.off and symlink gpatch in its place to get pkg5 to auto-patch M2Crypto, as it assumes GNU patch flags. You may also need to add -lintl and -lsocket to your LDFLAGS at some point during the dependency build process (I can&#8217;t remember where).</p>
<p>I&#8217;m delighted to have pkg5 working on Solaris 10 now. I&#8217;ll report back at a later date on how I&#8217;m getting on. For those that want to cheat, I have a tar&#8217;d version you can stick at /opt/pkg <a href="http://blogs.everycity.co.uk/alasdair/pub/pkg5-20100123.tgz">here</a>. It&#8217;s a strange layout - forgive me. And keep in mind that I haven&#8217;t tested it much yet.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.everycity.co.uk/alasdair/2011/01/building-ips-pkg5-on-solaris-10/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Obtaining the serial number for disks on LSI RAID cards via CentOS Linux</title>
		<link>http://blogs.everycity.co.uk/alasdair/2010/11/obtaining-the-serial-number-for-disks-on-lsi-raid-cards-via-centos-linux/</link>
		<comments>http://blogs.everycity.co.uk/alasdair/2010/11/obtaining-the-serial-number-for-disks-on-lsi-raid-cards-via-centos-linux/#comments</comments>
		<pubDate>Wed, 17 Nov 2010 10:19:49 +0000</pubDate>
		<dc:creator>Alasdair</dc:creator>
		
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=406</guid>
		<description><![CDATA[This is just a quick reminder for myself, basically. To get the serial numbers of the disks on a CentOS system, you can do:

yum install lsscsi sg3_utils
modprobe sg
/usr/bin/lsscsi -g
smartctl -a /dev/sg0

Unfortunately, I couldn&#8217;t find a way to see the serial number via lsiutil, although lsiutil is still very useful.
]]></description>
			<content:encoded><![CDATA[<p>This is just a quick reminder for myself, basically. To get the serial numbers of the disks on a CentOS system, you can do:</p>
<pre>
yum install lsscsi sg3_utils
modprobe sg
/usr/bin/lsscsi -g
smartctl -a /dev/sg0
</pre>
<p>Unfortunately, I couldn&#8217;t find a way to see the serial number via lsiutil, although lsiutil is still very useful.</p>
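<p>With more than a handful of disks, running smartctl by hand against each /dev/sgN gets tedious. Here is a rough sketch of how the steps above could be scripted, assuming smartctl (from the smartmontools package) is installed and the script runs as root. The function names are made up for illustration, and the parsing only relies on the last column of lsscsi -g and the serial number line in the smartctl output.</p>
<pre>
```python
import re
import subprocess

# smartctl prints "Serial Number:" for ATA disks and "Serial number:" for SCSI.
SERIAL = re.compile(r'^Serial number:\s*(\S+)', re.IGNORECASE | re.MULTILINE)


def sg_devices(lsscsi_output):
    """Return the /dev/sgN paths from the last column of `lsscsi -g` output."""
    devices = []
    for line in lsscsi_output.splitlines():
        fields = line.split()
        if fields and fields[-1].startswith('/dev/sg'):
            devices.append(fields[-1])
    return devices


def parse_serial(smartctl_output):
    """Return the serial number found in `smartctl -a` text, or None."""
    match = SERIAL.search(smartctl_output)
    return match.group(1) if match else None


def disk_serials():
    """Map each SCSI generic device to its serial number."""
    listing = subprocess.check_output(['lsscsi', '-g'], universal_newlines=True)
    serials = {}
    for device in sg_devices(listing):
        # smartctl can exit non-zero even when it prints useful output,
        # so read stdout directly instead of using check_output.
        process = subprocess.Popen(['smartctl', '-a', device],
                                   stdout=subprocess.PIPE,
                                   universal_newlines=True)
        output, _ = process.communicate()
        serials[device] = parse_serial(output)
    return serials
```
</pre>
<p>Calling disk_serials() returns a dictionary keyed by device path. A value of None means no serial number line was found for that device.</p>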
]]></content:encoded>
			<wfw:commentRss>http://blogs.everycity.co.uk/alasdair/2010/11/obtaining-the-serial-number-for-disks-on-lsi-raid-cards-via-centos-linux/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Weird OpenSolaris/Crossbow issue with Aggregations and VLANs/VNICs</title>
		<link>http://blogs.everycity.co.uk/alasdair/2010/10/weird-opensolariscrossbow-issue-with-aggregations-and-vlansvnics/</link>
		<comments>http://blogs.everycity.co.uk/alasdair/2010/10/weird-opensolariscrossbow-issue-with-aggregations-and-vlansvnics/#comments</comments>
		<pubDate>Thu, 21 Oct 2010 15:48:59 +0000</pubDate>
		<dc:creator>Alasdair</dc:creator>
		
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blogs.everycity.co.uk/alasdair/?p=398</guid>
		<description><![CDATA[Update: Seems this one is already known about in defect 15870 and bug 6950788. Should have googled/looked myself.

Another interesting issue. In this issue, I had a server, and when I pinged it from other hosts on the same network, I was getting duplicate ping responses:

64 bytes from 10.1.0.1: icmp_seq=1640 ttl=255 time=0.074 ms
64 bytes from 10.1.0.1: [...]]]></description>
			<content:encoded><![CDATA[<p><b>Update:</b> Seems this one is already known about in defect <a href="https://defect.opensolaris.org/bz/show_bug.cgi?id=15870">15870</a> and bug <a href="http://bugs.opensolaris.org/view_bug.do?bug_id=6950788">6950788</a>. Should have googled/looked myself.</p>
<hr />
<p>Another interesting issue. I had a server which, when pinged from other hosts on the same network, returned duplicate ping responses:</p>
<pre>
64 bytes from 10.1.0.1: icmp_seq=1640 ttl=255 time=0.074 ms
64 bytes from 10.1.0.1: icmp_seq=1640 ttl=255 time=0.079 ms (DUP!)
64 bytes from 10.1.0.1: icmp_seq=1641 ttl=255 time=0.081 ms
64 bytes from 10.1.0.1: icmp_seq=1641 ttl=255 time=0.087 ms (DUP!)
64 bytes from 10.1.0.1: icmp_seq=1642 ttl=255 time=0.079 ms
64 bytes from 10.1.0.1: icmp_seq=1642 ttl=255 time=0.082 ms (DUP!)
</pre>
<p>Something not quite right there. The setup was:</p>
<pre>
  network-switch
     |      |
   igb0    igb1
     \__  __/
        \/
       aggr0
         |--- aggr0vlan1
         |--- aggr0vlan2
         |--- aggr0vlan3
         ...
</pre>
<p>In the diagram above, we have a Cisco 2960G with a Port-Channel set up across two ports, which are attached to two network interfaces (igb0 and igb1) on the server. This is an LACP Ethernet aggregation, used to provide extra bandwidth and/or redundancy to a server.</p>
<p>On OpenSolaris b134 (and on OpenIndiana b147), the aggr0 interface was created with:</p>
<pre>
dladm create-aggr -l igb0 -l igb1 -P L4 -L active aggr0
</pre>
<p>I then had a collection of VNICs, provisioned off the aggregation, for example:</p>
<pre>
dladm create-vnic -l aggr0 -v 1 aggr0vlan1
dladm create-vnic -l aggr0 -v 2 aggr0vlan2
dladm create-vnic -l aggr0 -v 3 aggr0vlan3
...
</pre>
<p>I then stuck an IP address on each of these VNICs using the usual:</p>
<pre>
ifconfig aggr0vlan1 plumb 10.1.0.1/16 up
ifconfig aggr0vlan2 plumb 10.2.0.1/16 up
ifconfig aggr0vlan3 plumb 10.3.0.1/16 up
...
</pre>
<p>There were no other interfaces configured, nothing else fancy at all.</p>
<p>When this was all set up, I got the duplicate ICMP ping packets. Very odd. I used snoop to track things down: the sending host sent one ICMP packet and received two back. When I snooped aggr0 on the server configured here, it was receiving two ICMP packets, and hence sending two replies.</p>
<p>The strangest thing was, this issue would only occur at boot time! If I deleted all the VNICs, then set them up from scratch, no duplicate packets. If I rebooted the box, they came back. Weird!</p>
<p>So I simplified things down to one VNIC, aggr0vlan1. I rebooted, no duplicate packets. So I configured a second, rebooted. Then the duplicate packets were back.</p>
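<p>Checking for duplicates after every reboot is easy to script. The following is a small sketch that scans saved ping output for icmp_seq numbers answered more than once; duplicated_sequences is a hypothetical helper, and the sample text is taken from the ping output shown earlier.</p>
<pre>
```python
import re
from collections import Counter

ICMP_SEQ = re.compile(r'icmp_seq=(\d+)')

SAMPLE = """\
64 bytes from 10.1.0.1: icmp_seq=1640 ttl=255 time=0.074 ms
64 bytes from 10.1.0.1: icmp_seq=1640 ttl=255 time=0.079 ms (DUP!)
64 bytes from 10.1.0.1: icmp_seq=1641 ttl=255 time=0.081 ms
64 bytes from 10.1.0.1: icmp_seq=1641 ttl=255 time=0.087 ms (DUP!)
"""


def duplicated_sequences(ping_output):
    """Return the sorted icmp_seq numbers that were answered more than once."""
    counts = Counter(int(m.group(1)) for m in ICMP_SEQ.finditer(ping_output))
    # Every sequence number seen has a count of at least one, so anything
    # other than exactly one reply means a duplicate arrived.
    return sorted(seq for seq, count in counts.items() if count != 1)


if __name__ == '__main__':
    print(duplicated_sequences(SAMPLE))
```
</pre>
<p>An empty list means every reply arrived exactly once, which is what you want to see after a reboot.</p>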
<p>Looks like a bug in Crossbow or the IGB driver to me. I tested on OpenIndiana b147, which has a kernel six months newer than OpenSolaris b134, but this didn&#8217;t fix the issue.</p>
<p>Then this morning, on the bus on the way to work (this all took place yesterday), I remembered that Crossbow has two internal constructs for doing VLAN-tagged virtual interfaces - a &#8220;vnic&#8221; interface with VLAN tagging enabled (what I was using above), and a &#8220;vlan&#8221; interface. The syntax for the two commands is virtually identical (&quot;dladm create-vnic -l link0 -v 1 link0vlan1&quot; vs &quot;dladm create-vlan -l link0 -v 1 link0vlan1&quot;), and as far as I&#8217;m aware they should be logically the same, but I remember one of the Oracle engineers mentioning at a previous LOSUG that the implementations are different inside the kernel.</p>
<p>So instead of creating a vnic, I tried again with a vlan. BOOM! Fixed! No duplicate packets!</p>
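<p>For reference, the workaround amounts to swapping create-vnic for create-vlan while keeping the same link, VLAN ID and interface name. Below is a small sketch that prints the commands for the three VLANs used here; dladm_commands is a made-up helper, and the aggr0vlanN naming simply follows the scheme used earlier in this post.</p>
<pre>
```python
def dladm_commands(link, vlan_ids, kind='vlan'):
    """Build one dladm command per VLAN ID.

    kind is 'vlan' (the workaround) or 'vnic' (the original setup).
    """
    if kind not in ('vlan', 'vnic'):
        raise ValueError('kind must be vlan or vnic')
    template = 'dladm create-{0} -l {1} -v {2} {1}vlan{2}'
    return [template.format(kind, link, vlan_id) for vlan_id in vlan_ids]


if __name__ == '__main__':
    # Prints the commands only; nothing is executed.
    for command in dladm_commands('aggr0', [1, 2, 3]):
        print(command)
```
</pre>
<p>The commands are only printed, so they can be reviewed and then run by hand once the existing VNICs have been removed.</p>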
<p>Very weird indeed. Glad I was able to work around the issue, but it did consume a fair whack of time. Perhaps someone with a bit more knowledge of Crossbow might be able to shed some light on this.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.everycity.co.uk/alasdair/2010/10/weird-opensolariscrossbow-issue-with-aggregations-and-vlansvnics/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>

