Windows Updates slow, takes hours after reboot updating registry under QEMU KVM
Hi Folks,
Another blog post primarily aimed at helping people searching for a particular problem. I really struggled to find anything on this, and I hate that, so when I found the problem and the fix, I thought I’d share.
Basically, if you’re running Windows Server 2008 under QEMU KVM, sometimes after a Windows Update it can get stuck for hours updating the registry. It will sit there for hours fiddling with things under Registry Machine Schema:
7444/17432 (\Registry\machine\Schema\wcm://Microsoft-windows...)

The problem isn’t with CPU or Disk IO, but actually with the default cirrus VGA driver QEMU uses. It has a pathological performance issue in text mode, resulting in it taking a long time for console messages to be written out. Since Windows Update’s post-reboot registry update will report progress by writing to the console potentially hundreds of thousands of times, it can take many hours for the update to finish.
To fix this, use the "std" graphics driver instead. On SmartOS you can do this by running:
vmadm update [UUID] vga=std
This significantly improved the performance – the registry updates took under 30 seconds after a reboot.
Add comment May 19th, 2013
Two Factor SSH Authentication with Yubico Yubikeys on SmartOS
Two factor authentication is a method of increasing the security of logging into a service (such as logging into a website or to your computer), by typically requiring "something you know" combined with "something you have". The idea here is that an attacker can’t get in unless they have both elements, meaning in addition to sniffing your password, they’d have to physically mug you too.

Traditionally, Two Factor authentication has been achieved using an expensive corporate solution such as RSA SecurID tokens. These are a bit outside the reach of individuals or small businesses.
Thankfully, Yubico popped up with a fantastic product called the Yubikey that generates one-time-passwords (OTP) via an innovative USB Key. These one time passwords are cryptographically secure and unique to your key. When you go to log in to a service secured with Yubikey, you simply touch your Yubikey and it generates the OTP, which gets authenticated via Yubico’s server. If it’s correct, it lets you in. Combine this with the service asking you for a password (something you know), and you’ve got two-factor authentication.
They’re incredibly cheap, at around $25-$40 each. They come in a bunch of flavours, my favourite being the Nano edition, which is absolutely tiny and fits flush in your USB port, with just a tiny bit poking out for you to touch to authenticate. One nice feature of the Yubikey is that it presents itself as a simple USB keyboard, and thus works with any operating system out there.
Yubikeys can be used with an awful lot of services, including the browser plugin LastPass, which securely stores all your frequently used website passwords (I highly recommend using LastPass if you’re not already – it’s far more secure than using the same password for every website).
We’ve now added out of the box support for Yubikeys to our SmartOS package repo ec-userland, allowing you to do two-factor authentication to log into your server (or just log in with the Yubikey itself). To set this up (once you’ve bought and received your Yubikey), there are three bits – getting a Yubico API account, configuring PAM, and configuring SSH.
Getting your Yubico API Key
The Yubikey PAM configuration guides talk a lot about the “Client ID” and “Client Key”; these are your API ID and key. You can get them by visiting:
https://upgrade.yubico.com/getapikey/
You’ll need to enter your email address, and touch your Yubikey in the "YubiKey one-time password" field.
This gives you two things back, your Client ID and your Secret Key.
Installing the module and configuring PAM
To install the Yubikey PAM module, simply run:
pkg install -v library/security/yubico/yubico-pam
Then edit the /etc/pam.conf file, and place the following at the bottom:
sshd auth requisite /ec/lib/security/pam_yubico.so authfile=/etc/yubikey_mappings id=XXXX key=YYYY
sshd auth requisite pam_authtok_get.so.1
Change the XXXX to your Client ID from above, and YYYY to your Secret Key.
The requisite keyword makes the method mandatory, and generates an immediate termination of the login process on failure. There are other options such as sufficient or required, see "man pam.conf" for more PAM options, and the yubico-pam module README for options specific to pam_yubico.so.
Next, edit the /etc/yubikey_mappings file, where you map Unix accounts to Yubikey IDs (the ID of your Yubikey itself). Your Yubikey ID is the first 12 characters of your OTP, so you just open a text editor, touch your Yubikey, and copy the first 12 characters. The file looks like:
alasdair: xxxxxxxxxxxx,xxxxxxxxxxxx
bob: xxxxxxxxxxxx
...
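As a rough sketch of the "first 12 characters" step, the helper below pulls the Yubikey ID out of a pasted OTP. The function name yubikey_id and the sample OTP are made up for illustration, and the check assumes a default-configured key that emits a 44-character OTP in Yubico's "modhex" alphabet; a reprogrammed key may use a different ID length.

```shell
# Hypothetical helper: print the 12-character Yubikey ID from a full OTP.
# Assumes a default-configured key: 44 modhex characters, of which the
# first 12 are the public ID.
yubikey_id() {
    otp=$1
    if ! printf '%s\n' "$otp" | grep -Eq '^[cbdefghijklnrtuv]{44}$'; then
        echo "not a 44-character modhex OTP" >&2
        return 1
    fi
    printf '%s\n' "$otp" | cut -c1-12
}

# Made-up OTP for illustration only (not from a real key)
yubikey_id ccccccbcgujhcbdefghijklnrtuvcbdefghijklnrtuv
```

The printed value is what would go after the username in /etc/yubikey_mappings.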
That’s pretty much it as far as yubikey configuration goes.
Setting up SSH
On the EveryCity ec-userland based SmartOS images, we use OpenSSH rather than SunSSH, so this guide is for OpenSSH. You will need to edit /ec/etc/ssh/sshd_config and ensure the following are set:
PasswordAuthentication no
ChallengeResponseAuthentication yes
UsePAM yes
UsePAM ensures that OpenSSH uses PAM, and "PasswordAuthentication no" ensures OpenSSH doesn’t prompt for a password separately to PAM (which has its own password prompting). ChallengeResponseAuthentication allows the use of the Yubikey.
You may also want to set "PubkeyAuthentication no", as unfortunately OpenSSH treats a successful public key login as sufficient on its own, regardless of the PAM auth configuration.
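Before restarting sshd, it can be handy to confirm the three directives really are in effect. The following is only a sketch: check_sshd_yubikey is a hypothetical name, and it relies on sshd using the first uncommented occurrence of each keyword and on an awk that provides tolower() (on older Solaris that may mean nawk rather than /usr/bin/awk).

```shell
# Hypothetical helper: verify the three sshd_config settings from this guide.
# Keywords are matched case-insensitively; the first uncommented hit is used.
check_sshd_yubikey() {
    conf=$1
    rc=0
    for want in "PasswordAuthentication no" \
                "ChallengeResponseAuthentication yes" \
                "UsePAM yes"; do
        key=${want% *}
        val=${want#* }
        got=$(awk -v k="$key" \
            'tolower($1) == tolower(k) { print tolower($2); exit }' "$conf")
        if [ "$got" != "$val" ]; then
            echo "MISMATCH: want '$want', found '${got:-unset}'"
            rc=1
        fi
    done
    return $rc
}

# Usage:
#   check_sshd_yubikey /ec/etc/ssh/sshd_config && echo "sshd_config looks OK"
```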
If all goes well, you should be able to log in:

Good luck, and if you have any questions, let me know!
2 comments January 18th, 2013
Building gcc 4.7.2 on SmartOS (and friends)
If you hit this lovely error when building gcc 4.7.2, fret not:
/bin/sh ./libtool --tag=CC --mode=link /ws/ec-userland/components/gcc47/build/i86/./gcc/xgcc -B/ws/ec-userland/components/gcc47/build/i86/./gcc/ -B/usr/i386-pc-solaris2.11/bin/ -B/usr/i386-pc-solaris2.11/lib/ -isystem /usr/i386-pc-solaris2.11/include -isystem /usr/i386-pc-solaris2.11/sys-include -march=i486 -mtune=i386 -fomit-frame-pointer -Wall -Werror -Wc,-pthread -g -O2 -Wl,-M,/ws/ec-userland/components/gcc47/source/gcc-4.7.2/libitm/clearcap.map -o libitm.la -version-info 1:0:0 -Wl,-M,libitm.map-sun -rpath /usr/lib aatree.lo alloc.lo alloc_c.lo alloc_cpp.lo barrier.lo beginend.lo clone.lo eh_cpp.lo local.lo query.lo retry.lo rwlock.lo useraction.lo util.lo sjlj.lo tls.lo method-serial.lo method-gl.lo method-ml.lo x86_sse.lo x86_avx.lo
libtool: link: /ws/ec-userland/components/gcc47/build/i86/./gcc/collect-ld -r -o .libs/libitm.la-1.o .libs/aatree.o .libs/alloc.o .libs/alloc_c.o .libs/alloc_cpp.o .libs/barrier.o .libs/beginend.o .libs/clone.o .libs/eh_cpp.o .libs/local.o .libs/query.o .libs/retry.o .libs/rwlock.o .libs/useraction.o .libs/util.o .libs/sjlj.o .libs/tls.o .libs/method-serial.o .libs/method-gl.o .libs/method-ml.o .libs/x86_sse.o .libs/x86_avx.o
ld: fatal: relocation error: R_386_32: file .libs/beginend.o: section [23].rel.debug_frame: symbol .text._ZN3GTM7aa_treeIjNS_16gtm_alloc_actionEE7clear_1EPNS_7aa_nodeIjS1_EE (section): symbol has been discarded with discarded section: [11].text._ZN3GTM7aa_treeIjNS_16gtm_alloc_actionEE7clear_1EPNS_7aa_nodeIjS1_EE
make[5]: *** [libitm.la] Error 1
make[5]: Leaving directory `/ws/ec-userland/components/gcc47/build/i86/i386-pc-solaris2.11/libitm'
Simply add this lovely undocumented flag to ./configure: --disable-libitm
libitm is for transactional memory and not really needed. It seems buggered on Solaris. Unsure if this is due to a bug in Sun ld or not.
Add comment December 29th, 2012
Compiling QT with Webkit on Solaris 10
Getting QT on Solaris 10 to build is a PITA, but getting it to build with Webkit enabled is even worse. But fret not, after some Googling the patches can be found.
You can find our build recipe for it over here:
http://hg.openindiana.org/users/aszeszo/s10-userland/file/c473cd11bbd3/components/qt4
We’re using GCC 4.4 to build QT. Although it is not the officially supported compiler on Solaris platforms (they specify Sun Studio), it works just fine.
Add comment November 25th, 2011
Fixing “No active dataset” on zone attach
When moving zones between OpenIndiana (and OpenSolaris) hosts, you can often end up with the following dreaded error:
# zoneadm -z zonename attach -U
Log File: /var/tmp/zonename.attach_log.B8aWed
ERROR: no active dataset.
Result: Attach Failed.
This can happen for a variety of reasons, such as not detaching the zone before moving it, and not transferring the ZFS properties with the zone. But personally I blame the half-arsed zone attach scripts that could do with some work.
To get around it, here is a super-quick/dirty script that should allow the zone to attach:
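#!/bin/bash
# Usage: pass the ZFS filesystem the zone lives in (the parent of "ROOT").
zfsfs=$1
if [ -z "$zfsfs" ]; then
    echo "usage: $0 <zone-zfs-filesystem>" >&2
    exit 1
fi
root=${zfsfs}/ROOT
zbe=${root}/zbe
# Reset zoned and mountpoint to their inherited values on all three datasets
for i in "$zbe" "$root" "$zfsfs" ; do
    for j in zoned mountpoint ; do
        zfs inherit "$j" "$i"
    done
done
# Re-apply the properties the zone attach scripts expect
zfs set mountpoint=legacy "$root"
zfs set zoned=on "$root"
zfs set canmount=noauto "$zbe"
zfs set org.opensolaris.libbe:active=on "$zbe"
# Tie the zone boot environment to the global zone's current boot environment
rbe=$(zfs list -H -o name /)
uuid=$(zfs get -H -o value org.opensolaris.libbe:uuid "$rbe")
zfs set org.opensolaris.libbe:parentbe="$uuid" "$zbe"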
The script takes one argument, the zfs filesystem the zone lives in (the parent of “ROOT” for the zone). Ignore any errors about "dataset is used in a non-global zone", and once it has run, manually mount the dataset and attach it with:
mount -F zfs dataset/ROOT/zbe /zones/zonename/root
zoneadm -z zonename attach
This guide is pretty rough but should hopefully set people in roughly the right direction.
4 comments October 31st, 2011
vasprintf and asprintf on Solaris 10
Update: Martin in the comments suggested using the vasprintf definition in the OpenSolaris source.
If you get errors such as this on Solaris 10, it’s due to a lack of modern helpful string functions (which thankfully were added to OpenSolaris, so no problem here on OpenIndiana):
Undefined                       first referenced
 symbol                             in file
asprintf                            ../../bin/gcc/libgpac.so
There’s quite a nice implementation (no idea how safe it is to use, but at least the program now compiles!) over at Stack Overflow – http://stackoverflow.com/questions/4899221/substitute-or-workaround-for-asprintf-on-aix. Take the latter one, by Jonathan Leffler.
You can drop it in like so:
#if (defined (__SVR4) && defined (__sun))
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

int vasprintf(char **ret, const char *format, va_list args)
{
    va_list copy;
    int count;

    /* args is consumed by the sizing pass, so keep a copy for the real one */
    va_copy(copy, args);
    /* Make sure it is determinate, despite manuals indicating otherwise */
    *ret = 0;
    count = vsnprintf(NULL, 0, format, args);
    if (count >= 0) {
        char* buffer = malloc(count + 1);
        if (buffer == NULL) {
            count = -1; /* report allocation failure to the caller */
        } else {
            count = vsnprintf(buffer, count + 1, format, copy);
            if (count < 0)
                free(buffer);
            else
                *ret = buffer;
        }
    }
    va_end(copy); /* pairs with va_copy(); the caller ends args itself */
    return count;
}

int asprintf(char **strp, const char *fmt, ...)
{
    int size;
    va_list args;
    va_start(args, fmt);
    size = vasprintf(strp, fmt, args);
    va_end(args);
    return size;
}
#endif
Since the code I'm compiling will not be facing the internet and only run by trusted users, I'm not too worried about how buffer overflow safe this code is. If you are concerned about that, you might want to take a look at gnulib, which has a nice properly portable version, although it's a lot bigger.
2 comments July 19th, 2011
Adjusting drive timeouts with mdb on Solaris or OpenIndiana
Update (New): These timeouts don’t do squat because mpt_sas doesn’t honour the timeouts. This was recently uncovered by Nexenta and a patch to fix it is about to hit Illumos shortly. I’ll post when it does. Another patch is in progress which will further improve how mpt_sas handles failed drives. Thanks to Albert Lee for his work on them – you, sir, rock!
Update (Old): These timeouts don’t work nearly as well as one would hope, unfortunately the sd timeouts get passed to the driver which in the case of mpt/mpt_sas, appear to do very little with them. I have raised this as an issue within the Illumos community and the debate was quite polarising; the kernel developers deny there is a problem or disagree on how to solve it, despite lots of people complaining of the same symptoms. Unfortunately I think it’s a difficult problem to solve due to the wide variety of hardware types that ZFS/Illumos is deployed on.
Our way of coping with dodgy drives is to preempt their failure via trigger happy SMART/iostat monitoring scripts that zpool offline bad drives before they fail.
Yesterday we suffered our first disk failure in our shiny new NFS cluster that has been operating flawlessly for 3 months. The NFS cluster we have is quite nice – it consists of a pair of NFS servers (96GB of RAM, Dual Intel E5620 CPUs) dual-attached to a set of LSI SAS 6Gbps JBOD arrays, with lots of Seagate Constellation ES 2TB enterprise SAS drives. For good measure there’s 1.5TB of SSD cache (6x256GB SSDs) acting as a read cache (L2ARC), and a ZeusRAM SSD acting as the write cache (ZIL). It runs a custom build of OpenIndiana.
Ordinarily a disk failure would result in at most a few minutes of stall while the OS waits for the drive to recover, and gives up. However, this drive decided simply to run glacially slowly, so it didn’t get removed in a timely fashion. In fact, it didn’t get removed at all, resulting in all IO to the SAN being stuck, causing a rather severe outage. 45 minutes in total.
When things became unresponsive, we logged in, and “iostat -xn” showed a 100% busy time on one of the disks, while the others did nothing. We attempted to “zpool offline baddisk”. Nothing much happened, presumably because the OS thought the drive was fine and was waiting on some queued IO finishing, or something along those lines. We had no immediate way of yanking the disk out, so we decided to failover the cluster from the primary NFS node to the secondary. This consists of powering off the primary node and letting the cluster software import the ZFS zpool and bring NFS services online.
When the secondary NFS node started importing the zpool, iostat once again showed a 100% busy time on the bad disk. Crap. Andrzej had the bright idea of deleting the disk entries from /dev, and sure enough this prompted ZFS to think the drive had disappeared, and the pool finally imported.
So immediately the question springs to mind, why did the OS not take this bad disk out of service? We consulted with our upstream vendor (contacted the folks over at Illumos) and all became clear.
The answer lies in the defaults in the Solaris SCSI subsystem. The default timeout for IO is 60 seconds with 5 retries (or 3 retries if it’s fibre channel/eSAS). For a storage array like ours, this adds up to 3 minutes (60 seconds × 3 retries) for a single IO – or in other words, a very long time. Since the disk was accepting a trickle of IO, this timeout was never really reached.
Thankfully the timeouts can be adjusted, and Garrett D’Amore, the founder of Illumos and one of the lead developers who works at Nexenta, strongly suggested tuning the timeout to 5 seconds, with 3 retries.
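To put numbers on that, here is a back-of-the-envelope sketch. It uses the same simple model as the text (per-attempt timeout multiplied by the retry count); worst_case_secs is just an illustrative name, and whether the first attempt counts on top of the retries is not modelled.

```shell
# Rough model: worst case for one IO = per-attempt timeout x retry count.
worst_case_secs() {
    echo $(( $1 * $2 ))
}

worst_case_secs 60 3   # default timeout with 3 retries: prints 180
worst_case_secs 5 3    # suggested tuning: prints 15
```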
Setting the timeout value is quite easy – it’s the system-wide tunable sd_io_time. Keep in mind this will affect all disks. Edit /etc/system and drop in:
set sd:sd_io_time=5
If you have desktop SATA drives you’ll probably want a higher timeout, especially if you don’t have TLER (Time limited error recovery) on them, which limits error recovery to around 7 seconds.
The number of retries is set in /kernel/drv/sd.conf via sd-config-list, which allows the setting to be applied per disk type. To get 3 retries, the variable would be “retries-timeout:3”. The format of this file is a bit weird; here is an example for two disks:
sd-config-list = "STEC    ZeusRAM         ", "throttle-max:32, disksort:false, cache-nonvolatile:true",
"SEAGATE ST32000444SS    ", "retries-timeout:3";
The bit where you define the disk type is a fixed length field, consisting of 8 characters for the vendor, and 16 characters for the product. So you have to pad the field out to the correct length with spaces.
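Counting spaces by hand is error prone, so here is a small sketch that does the padding with printf. The helper name sd_device_id is made up; the 8 and 16 character widths are the ones described above.

```shell
# Hypothetical helper: build the 24-character device ID for sd-config-list
# (vendor padded to 8 characters, product padded to 16).
sd_device_id() {
    vendor=$1
    product=$2
    if [ "${#vendor}" -gt 8 ] || [ "${#product}" -gt 16 ]; then
        echo "vendor must be <= 8 and product <= 16 characters" >&2
        return 1
    fi
    printf '%-8s%-16s' "$vendor" "$product"
}

# Print a ready-to-paste entry; the quotes make the trailing spaces visible
printf '"%s", "retries-timeout:3";\n' "$(sd_device_id SEAGATE ST32000444SS)"
```

The vendor and product strings themselves have to match what the drive reports, which "iostat -E" should show.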
Once these are set, reboot to activate. You can check the values are set by doing:
## Print system wide sd_io_time timeout value:
# echo "sd_io_time::print" | mdb -k
0x3c
## Print per-disk timeout and retry values:
# echo "::walk sd_state | ::grep '.!=0' | ::sd_state" | mdb -k | egrep "^un|un_retry_count|un_cmd_timeout"
un: ffffff093239d9c0
un_retry_count = 0x3
un_cmd_timeout = 0x5
un: ffffff093239d380
un_retry_count = 0x3
un_cmd_timeout = 0x5
...
The return values are in hexadecimal, so for example 0x3c is 60 seconds.
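If you don’t fancy converting hex in your head, the shell can do it. A tiny sketch (hex_to_dec is an illustrative name):

```shell
# Convert a hex value as printed by mdb (e.g. 0x3c) to decimal
hex_to_dec() {
    printf '%d\n' "$1"
}

hex_to_dec 0x3c   # prints 60
hex_to_dec 0x5    # prints 5
```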
Adjusting values without rebooting
We have a number of storage servers in production, some of which we really didn’t want to reboot just to change the timeout value. After discussions with some of the Illumos kernel developers, we worked out how to set the property at runtime using the modular Solaris debugger, mdb. This allows editing kernel values at runtime.
The system wide sd_io_time is used to populate a per-disk timeout value which is also stored in the same structure as the per-disk retry count. So changing the values is pretty similar.
First, we want to obtain the memory addresses of the settings we wish to edit:
# echo "::walk sd_state | ::grep '.!=0' | ::print -a struct sd_lun un_cmd_timeout" | mdb -k > /tmp/un_cmd_timeouts
# cat /tmp/un_cmd_timeouts
ffffff0d347a3a7c un_cmd_timeout = 0x3c
ffffff0d247983bc un_cmd_timeout = 0x3c
ffffff0d3429d3fc un_cmd_timeout = 0x3c
ffffff0d55daf37c un_cmd_timeout = 0x3c
...
Now we have the addresses in /tmp/un_cmd_timeouts, we can set the value using mdb -kw:
# for i in `cat /tmp/un_cmd_timeouts | awk '{print $1}'` ; do echo ${i}/W 0x5 | mdb -kw ; done
We can then check the value was set by re-running:
# echo "::walk sd_state | ::grep '.!=0' | ::print -a struct sd_lun un_cmd_timeout" | mdb -k
Now we can do the same for un_retry_count:
# echo "::walk sd_state | ::grep '.!=0' | ::print -a struct sd_lun un_retry_count" | mdb -k > /tmp/un_retry_count
# for i in `cat /tmp/un_retry_count | awk '{print $1}'` ; do echo ${i}/W 0x3 | mdb -kw ; done
Hey presto, we just adjusted boot time kernel parameters on the fly :-)
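Since mdb -kw writes straight into kernel memory, it may be worth previewing the write commands before running them. The sketch below is one way to do that; mdb_write_cmds is a hypothetical helper, and it only accepts lines whose first field looks like a hex address.

```shell
# Hypothetical dry-run helper: read "::print -a" output on stdin and print
# the mdb write commands without executing them.
mdb_write_cmds() {
    value=$1
    awk -v v="$value" '$1 ~ /^[0-9a-f]+$/ { print $1 "/W " v }'
}

# Sample lines in the format produced by the ::print -a commands above
printf '%s\n' \
    'ffffff0d347a3a7c un_cmd_timeout = 0x3c' \
    'ffffff0d247983bc un_cmd_timeout = 0x3c' |
    mdb_write_cmds 0x5
```

Once the preview looks right, the same output can be fed to mdb -kw, for example with "mdb_write_cmds 0x5 < /tmp/un_cmd_timeouts | mdb -kw".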
If you need to know which disk is which, you can assume the output from mdb is ordered, and do:
echo "::walk sd_state | ::grep '.!=0' | ::print struct sd_lun un_sd | ::print struct scsi_device sd_dev | ::devinfo -q" | mdb -k
This returns the sd instance id, which can be seen from “iostat -E”. StackOverflow has some answers for mapping from sd to device name should you need to.
Concluding Remarks
With these values in place, our timeout is reduced from upwards of 3 minutes, to a mere 15 seconds. This is far more likely to cause the OS to offline dodgy disks like the one we were experiencing issues with.
There has been some recent discussion on the Illumos mailing lists regarding the default sd_io_time value, suggesting that the default should be lowered to 8 seconds. This has caused a bit of a furore, as people using Solaris with fibre channel disk arrays require higher timeouts, say 180 seconds. So there are people on both sides of the fence. But one thing is for sure – it’s a setting more people should know about.
10 comments May 14th, 2011
Autoconf, Automake and Libtoolized version of bzip2
Autoconf, Automake and libtool are 3 utilities designed to simultaneously help and hinder those of us that have to compile software. They together produce the familiar “./configure ; make ; make install” procedure most of us have used time and time again.
Although these tools are universally hated for being overly complex, slow and hard to use, thankfully most projects use them, because the alternative (usually some shitty Makefile that only works on Linux) is far far far worse.
BZip2 is one of those very simple system utilities we all require, where the author only ships a Makefile. Thankfully a helpful SuSE developer has autoconfized it. So grab a copy of those files into your bzip2-1.0.6 folder, and run autogen.sh
If only someone would do this for libxvid and ffmpeg…
Add comment March 28th, 2011
Lame, nasm, and text relocations (textrels)
Well, this took some debugging.
I’ve filed it all in a nasm bug report. To cut a long story short, if you compile LAME with Nasm 2.09, you’ll end up with TEXTRELs in the resultant libmp3lame.so.
What is a TEXTREL you may ask? Something bad! It stops the code being fully PIC (position independent), which stops the shared object being loaded into memory once and mapped multiple times. But worse, it causes Solaris ld to explode when linking:
gcc -shared -Wl,-h -Wl,libmp3lame.so.0 -o .libs/libmp3lame.so.0.0.0 .libs/VbrTag.o .libs/bitstream.o .libs/encoder.o .libs/fft.o .libs/gain_analysis.o .libs/id3tag.o .libs/lame.o .libs/newmdct.o .libs/presets.o .libs/psymodel.o .libs/quantize.o .libs/quantize_pvt.o .libs/reservoir.o .libs/set_get.o .libs/tables.o .libs/takehiro.o .libs/util.o .libs/vbrquantize.o .libs/version.o .libs/mpglib_interface.o -Wl,-z -Wl,allextract ../libmp3lame/i386/.libs/liblameasmroutines.a ../libmp3lame/vector/.libs/liblamevectorroutines.a ../mpglib/.libs/libmpgdecoder.a -Wl,-z -Wl,defaultextract -lm -lsocket -lnsl -lc -maccumulate-outgoing-args
Text relocation remains referenced against symbol offset in file
0x6e ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
0x75 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
0x9a ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
0xa1 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
0xa8 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
0x12b ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
0x133 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
0x1a0 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
0x1aa ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
0x1b4 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
0x1c2 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
0x24c ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
0x25d ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
0x39 ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
0x56 ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
0x128 ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
0x142 ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
0x26e ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
0x2b9 ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
0x2d6 ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
0x398 ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
0x4ce ../libmp3lame/i386/.libs/liblameasmroutines.a(fft3dn.o)
0x2c ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
0x7a ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
0x88 ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
0xc4 ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
0xd9 ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
0xe7 ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
0x1d0 ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
0x1e4 ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
0x20b ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
0x219 ../libmp3lame/i386/.libs/liblameasmroutines.a(fftsse.o)
t1l 0x189 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
largetbl 0xde ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
largetbl 0x105 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
largetbl 0x10f ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
table23 0x245 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
table56 0x256 ../libmp3lame/i386/.libs/liblameasmroutines.a(choose_table.o)
ld: fatal: relocations remain against allocatable but non-writable sections
collect2: ld returned 1 exit status
The way to fix the problem is to use NASM 2.08 or earlier, or wait until the bug gets fixed (although they might point their finger at LAME). I’m going to try yasm instead of nasm and see if that works, as an alternative.
If you don’t care about TEXTRELs, on Linux you don’t have to do anything (GNU ld allows them by default), but on Solaris you can tell the Solaris linker to allow impure text segments by adding "-mimpure-text -lrt" to your LDFLAGS. Or, you can use the GNU linker. This is quite hard, but I wrote a blog post about it.
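If you want to check whether a library you have already built contains TEXTRELs, one approach is to look for a TEXTREL entry in its dynamic section. This is only a sketch: has_textrel is a made-up helper, and it assumes a tool such as GNU binutils readelf (or elfdump -d on Solaris) is available to dump the dynamic section.

```shell
# Hypothetical helper: succeed if dynamic-section output on stdin
# mentions TEXTREL.
has_textrel() {
    grep -q TEXTREL
}

# Usage (assumes GNU binutils readelf is installed):
#   readelf -d .libs/libmp3lame.so.0.0.0 | has_textrel && echo "has TEXTRELs"
```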
Add comment March 26th, 2011
Using the GNU ld Linker on Solaris
On Solaris, GCC by default is compiled with the option --with-ld=/usr/ccs/bin/ld, telling it to use the Solaris linker. Unfortunately GCC uses this value above all else, meaning it will ignore LD= environment variables that set an alternative linker, such as /usr/sfw/bin/gld.
Although tools like libtool/autoconf will pick up your LD= environment variable, and detect which options the linker supports (and whether it’s GNU ld or not), libtool unfortunately still calls gcc for the linking stage, which then ignores LD=. This makes it near-impossible to use GNU ld without actually doing a nasty hack, like “mv /usr/ccs/bin/ld /usr/ccs/bin/ld.off ; ln -s /usr/sfw/bin/gld /usr/ccs/bin/ld”. Yuck!
However, today when trying to get lame to compile using nasm (which generates objects that refuse to link with Solaris LD), I found Solaris LD accepts a very useful environment variable. The variable is LD_ALTEXEC.
Solaris LD will actually re-exec the value of LD_ALTEXEC, meaning that if you set LD_ALTEXEC to /usr/sfw/bin/gld, when /usr/ccs/bin/ld gets called, it immediately instead calls /usr/sfw/bin/gld with the arguments passed on. Thus, you can use whatever linker you wish. Hurrah! :-)
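In practice you will probably want to set the variable for a single build rather than your whole session. A minimal sketch of a wrapper follows; with_gnu_ld and the GNU_LD override are names invented for illustration, and the default path is the gld location mentioned above.

```shell
# Hypothetical wrapper: run a command with LD_ALTEXEC pointing at GNU ld.
# GNU_LD is an illustrative override for systems where gld lives elsewhere.
with_gnu_ld() {
    env LD_ALTEXEC="${GNU_LD:-/usr/sfw/bin/gld}" "$@"
}

# Show what the child process sees
with_gnu_ld sh -c 'echo "$LD_ALTEXEC"'
```

A build would then be run as "with_gnu_ld make", leaving the rest of your environment untouched.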
5 comments March 25th, 2011


