It isn't a normal day at work when you discover that the only way to get an application to run is to manually invoke an alternate loader.

So for a bit of background, one of my colleagues uses a proprietary library from a vendor for part of his scientific computing. This is normally not a big deal, but due to differences in system upgrade cycles, a few versioning issues have cropped up. The first, and simplest, was that the vendor is using gfortran version 4.4. At work, most of our systems use Red Hat Enterprise Linux 4 (RHEL4), which is a bit old and doesn't ship with gfortran-4.4. A simple download and build, and we're all set: we can now link with this library and run code.

My colleague found a bug in the library and submitted a bug report. The vendor fixed the bug and sent back a new shared library. However, in the time between sending us the original library and building the new one, the vendor had upgraded their systems. This new .so (the shared library) requires GLIBC 2.5 or greater. RHEL4 ships with GLIBC 2.3. For whatever reason, the vendor is unwilling to ship us a .so that links against libc-2.3.so, nor are they willing to ship us a static library (.a). Had they shipped us a static library, we could just link statically on a RHEL5 machine and run on a RHEL4 box without issue. Because they only provide a dynamic library, we have to build a dynamic executable, so we depend a bit more on the system we run on.
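One way to see which glibc symbol versions a shared library references is to scan its dynamic symbol table. The sketch below assumes GNU binutils (`objdump`) is installed; `max_glibc_version` is a hypothetical helper name, not part of any tool. It only reports versioned-symbol requirements, which may not be the only thing an older loader objects to.

```shell
# Hypothetical helper: read text containing GLIBC_x.y[.z] tags on stdin
# (for example the output of `objdump -T`) and print the highest version seen.
max_glibc_version() {
    grep -o 'GLIBC_[0-9][0-9.]*' |
        sed 's/^GLIBC_//' |
        sort -t. -k1,1n -k2,2n -k3,3n |
        tail -n 1
}

# Usage, assuming the vendor library is in the current directory:
#   objdump -T ./libVENDOR.so | max_glibc_version
```

Tags without a number, such as GLIBC_PRIVATE, are ignored by the pattern.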

At this point, I was asked if there was anything I could do. Now, this seems like it should be fairly straightforward. I mean, in the end, it's just hunks of executable code. The trick is getting the libraries to merge right.

Step 1: Find a RHEL5 box. Luckily, we happen to have a machine in our lab running RHEL5 that can be used as a compile system, but it isn't sufficient for running actual jobs.

Step 2: Compile test program with gfortran-4.4 and link with vendor's library. To minimize the number of extraneous libraries, include -static-libgcc -static-libgfortran on the link line.

Step 3: Copy the binary over to our RHEL4 system and try running it:

rhel4> ./a.out
./a.out: error while loading shared libraries:  ./libVENDOR.so: requires glibc 2.5 or later dynamic loader.
rhel4>

Step 4: OK, so we NEED the GLIBC 2.5 loader to get this library into memory for some reason. Luckily, we just happen to know that the loader is /lib/ld-linux.so.2. Follow the symlinks on the RHEL5 box, and see that the real file is /lib/ld-2.5.so. Despite the .so extension, this is an executable that can be used to load other executables. Copy the loader over from the RHEL5 box and try using it to launch our executable.
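Following the symlinks can be done in one go. This is a minimal sketch that assumes `readlink -f` from GNU coreutils; `resolve_loader` is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: follow a chain of symlinks and print the real file
# behind it. Relies on `readlink -f` from GNU coreutils.
resolve_loader() {
    readlink -f "$1"
}

# Usage on the RHEL5 machine (path taken from the text above):
#   resolve_loader /lib/ld-linux.so.2
```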

rhel4> ./ld-2.5.so ./a.out
./a.out: relocation error: /lib/tls/libc.so.6: symbol _dl_out_of_memory, version GLIBC_PRIVATE not defined in file ld-linux.so.2 with link time reference
rhel4>

Well, shoot. Looks like we need to bring over the libc from our RHEL5 box. While we're at it, let's pull over ALL the libraries we need.

Step 5: Determine which libraries are used.

rhel5> ldd a.out
    linux-gate.so.1 =>  (0xffffe000)
    libVENDOR.so => ./libVENDOR.so (0xec983000)
    libm.so.6 => /lib/libm.so.6 (0x00b42000)
    libc.so.6 => /lib/libc.so.6 (0x0093d000)
    libpthread.so.0 => /lib/libpthread.so.0 (0x00a3200)
    /lib/ld-linux.so.2 (0x0091f000)
rhel5>

Step 6: Now, with some work, you might be able to convince the compiler and linker to statically link libm, libc and libpthread. For now, let's just copy those shared libs to our RHEL4 system.
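The list of files to copy can be pulled straight out of the ldd output. The sketch below is one way to do it, assuming the ldd format shown in Step 5; `ldd_paths` is a hypothetical helper name. It deliberately skips entries without an absolute path (linux-gate.so.1 and the relative ./libVENDOR.so).

```shell
# Hypothetical helper: read `ldd` output on stdin and print the absolute
# paths of the libraries it lists, one per line.
ldd_paths() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3; next }
         $1 ~ /^\//               { print $1 }'
}

# Usage on the RHEL5 machine. `readlink -f` resolves each symlink so the
# copies keep their real, versioned file names (libc-2.5.so and so on):
#   ldd ./a.out | ldd_paths | while read -r p; do
#       cp "$(readlink -f "$p")" ./bundle/
#   done
# ./bundle/ is a placeholder directory, not something from the steps above.
```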

Step 7: Attempt to run the binary on the RHEL4 system

rhel4> ./ld-2.5.so ./a.out
./a.out: relocation error: /lib/tls/libc.so.6: symbol _dl_out_of_memory, version GLIBC_PRIVATE not defined in file ld-linux.so.2 with link time reference
rhel4>

Step 8: Of course! We need to force the use of these libraries. Just having the copies in the current directory isn't enough: the loader still resolves libc.so.6 to the system's /lib/tls/libc.so.6, as the error shows. Let's use LD_PRELOAD to force these three system libraries to the versions we pulled over.

rhel4> env LD_PRELOAD="./libc-2.5.so ./libm-2.5.so ./libpthread-2.5.so" ./ld-2.5.so ./a.out
Hello, we have success!
rhel4>

And there you have it! A dynamic executable that avoids all support supplied by the base system (loader, system libraries, etc.). It is as if it were a statically linked program. Now, an exercise left to the reader is to figure out how to get gfortran to statically link libc, libm and libpthread in a dynamic executable so we can avoid the LD_PRELOAD environment variable.

Update:

By making sure to rename the system libraries to the names desired in the executable (libc-2.5.so -> libc.so.6, etc), you can avoid the need for LD_PRELOAD as long as the local directory is in your LD_LIBRARY_PATH environment variable, or you linked with -Wl,-rpath='$ORIGIN'.
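A minimal sketch of that naming step follows. It uses symlinks instead of actual renames, which is a substitution on my part, and `link_sonames` is a hypothetical helper name. It assumes the three copied files are called libc-2.5.so, libm-2.5.so and libpthread-2.5.so, as in Step 8.

```shell
# Hypothetical helper: inside directory $1, give each copied library the
# name the executable asks for (see the ldd output in Step 5).
link_sonames() {
    dir=$1
    ln -sf libc-2.5.so       "$dir/libc.so.6"
    ln -sf libm-2.5.so       "$dir/libm.so.6"
    ln -sf libpthread-2.5.so "$dir/libpthread.so.0"
}

# Usage from the run directory on RHEL4:
#   link_sonames .
#   env LD_LIBRARY_PATH=. ./ld-2.5.so ./a.out
```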

Update 2:

I've figured out a way to allow my colleague to build and run from the RHEL4 system. As long as he copies libc.so.6, libm.so.6, libpthread.so.0 and ld-2.5.so from the RHEL5 system to the run directory of the RHEL4 system, he can compile and link on RHEL4 with the following command line:

gfortran-4.4 -m32 -Wl,-rpath='$ORIGIN' -Wl,-dynamic-linker,./ld-2.5.so -Wl,--unresolved-symbols=ignore-in-shared-libs test.o libVENDOR.so -o test

-Wl,-dynamic-linker,./ld-2.5.so tells the linker to record a custom loader (the one copied from RHEL5) in the executable. --unresolved-symbols=ignore-in-shared-libs means that the linker shouldn't bother trying to resolve all the symbols that libVENDOR.so is looking for. This way, link time doesn't need to use the same versions of the libraries as run time.
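To check that the link line did what was intended, you can look at the interpreter recorded in the resulting binary. This is a sketch assuming `readelf` from GNU binutils; `elf_interpreter` is a hypothetical helper that just extracts the path from `readelf -l` output.

```shell
# Hypothetical helper: read `readelf -l` output on stdin and print the
# program interpreter (the loader path) recorded in the executable.
elf_interpreter() {
    sed -n 's/.*\[Requesting program interpreter: \(.*\)].*/\1/p'
}

# Usage, with the binary built by the command line above:
#   readelf -l ./test | elf_interpreter
#   readelf -d ./test | grep -E 'RPATH|RUNPATH'
```

Since the recorded loader path is relative (./ld-2.5.so), the program should be started from the directory that holds the loader.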

Now my colleague can build and run normally (i.e., ./test with no specialized-loader prefix) from RHEL4.