Wednesday, January 2, 2008

The Debian/Ubuntu preseed hack

Today I'm going to talk about preseed hacking in the Debian GNU/Linux and Ubuntu operating system installers.

If you've ever developed a package for any distribution, or had to hack an existing package, then there is a fairly good chance you know that a package can behave differently depending on when it's being installed. Our VirtualBox package in the previous post is one example: during the OS install it fails to install completely because of how it installs parts of itself, but once the installed system is up and running the package installs flawlessly.

This kind of behavior can make debugging and development a risky and painful process. How does one test a new version of a package without breaking any existing deployments of that package? On our production desktops we use apt-pinning to make apt select a package from our internal stable repository before it selects it from the testing repository. But if you're testing a new package, how do you MAKE it install the testing package without installing EVERY testing package? To do this we're going to make use of the Debian preseed option "late_command".

The rest of this post assumes you are comfortable with editing a preseed file, rerolling your initrd.gz, and then rebuilding and burning your bootable media image. If I just lost you, then please read on with caution and don't do anything you don't think you can fix on your own if your system breaks.

So let's start by defining what we're dealing with. Our infrastructure has two repositories: stable and testing. We want to test a package foo-2.5, and foo-2.4 is already in stable and deployed. Foo-2.4 was having problems with its installation during an OS install and your boss is pushing you to make it work on the first try now, so you're working on foo-2.5. Now you need to test foo-2.5 on a reinstall without breaking your network (which updates every hour to the latest and greatest stable packages).

When I had to deal with this problem I first looked at various options, like manually setting the version to download for a package inside the preseed, but I couldn't find anything about doing that. Another option I considered was creating a meta-package that would also auto-install for a given machine and would itself manually depend on the testing version, but I decided that was too messy. One other option, which I experimented with and had varying degrees of success with, was just running the command 'in-target rm /etc/apt/preferences' in my late_command. Because my infrastructure runs an automated install program we developed from our late_command, this would cause everything that apt gets to be the version with the greatest version number. That's not quite an ideal solution, since there are invariably going to be packages in our repositories that are truly UNstable and may just throw more wrenches in the gears. The solution I decided on affords the installer the most flexibility, creativity, and stability.

Because the late_command allows you to execute commands in a chroot-like manner on your target installation, you can wget, chmod, and run anything. Here's what we'll put into the preseed's late_command:
preseed preseed/late_command string \
    in-target wget your.server.tld/~uname/getNewFoo.sh; \
    in-target chmod +x /getNewFoo.sh; \
    in-target /getNewFoo.sh; \
    in-target autoinstall;

Line 1 is the basic preseed option line that all late_commands will start with. Line 2 fetches your custom script, which runs inside the chroot environment. Line 3 makes it executable. Line 4 runs the script, and line 5 is your automatic package installation program.

So what's in this magical getNewFoo script? Mine looks like this:
#!/bin/bash
# Back up the 'normal' preferences
cp /etc/apt/preferences /etc/apt/preferences.old

# Get the new temporary preferences; bail out if the download fails so
# the existing preferences are left in place
wget -O /tmp/preferences.new http://your.server.tld/~uname/preferences || exit 1

# Replace the old preferences with the new ones
mv /tmp/preferences.new /etc/apt/preferences

And my preferences file looks like this:
Package: *
Pin: release a=stable
Pin-Priority: 700

Package: *
Pin: release a=testing
Pin-Priority: 100

Package: foo
Pin: release a=testing
Pin-Priority: 1001

To go over this quickly: the first two entries tell apt to pull all packages from the repository called 'stable' before it pulls from the repository called 'testing'. These priority levels are indicated by "Pin-Priority: nn". The last entry is the new addition to the preferences file. It tells apt to give special consideration to the package 'foo' and to pull it from the repository called 'testing' regardless of version number. With this method I've been able to rebuild machines picking packages from any repository I want.
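
The getNewFoo script keeps a copy of the normal pins as preferences.old, so a test machine can be put back on them once the test install is done. Here is a minimal sketch of that step. The function name restore_preferences and its optional path argument are my own for illustration, not part of the setup described above; the argument only exists so the function can be tried against a scratch copy first.

```shell
#!/bin/bash
# Hypothetical helper: move the backup made by getNewFoo.sh back into place.
# The path defaults to the real preferences file; pass another path to
# try it out on a scratch copy.
restore_preferences() {
    local prefs="${1:-/etc/apt/preferences}"
    local backup="${prefs}.old"

    # Refuse to do anything if there is no backup to restore
    if [ ! -f "$backup" ]; then
        echo "no backup found at $backup" >&2
        return 1
    fi

    # Overwrite the temporary test pins with the saved 'normal' ones
    mv "$backup" "$prefs"
}
```

Calling restore_preferences with no argument acts on /etc/apt/preferences. Before and after, running 'apt-cache policy foo' on the machine shows which version apt considers the candidate and the priority of each source, which is a quick way to confirm the pin did what you expected.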

As a side effect this increased the stability of the overall network due to not flooding the stable repository with unstable packages and then rushing to patch any issues that crop up.

Monday, December 31, 2007

Packaging, during OS install and Post OS install

Introduction: In my work environment we use Ubuntu for essentially everything on our network. Our OS boot media is highly customized so that it installs software collections specific to each machine, based only on some LDAP queries performed during the install process. During OS installation, not all software packages behave as they do when installed after the OS install. The package I was investigating last Friday is VirtualBox (innotek). It has its own package in the Ubuntu repositories, and that's the version we're using.

The problem that has come up is that during an OS install VirtualBox fails to install completely. More specifically, the package installs all of its components but the kernel driver it provides does not get built. The other side of this story is that when VirtualBox is installed after the OS install, the package installs completely and works as it should every time.

To get to the bottom of this I started by rebuilding a machine and jumping onto a virtual terminal during its software installation process. I searched through /var/log/syslog until the word "virtualbox" showed up.
Dec 29 02:36:17 in-target: Messages emitted during module compilation will be
logged to /var/log/vbox-install.log.

I checked that file out and here's what I found.
Makefile:68: *** Error: unable to find the sources of your current Linux kernel.
Specify KERN_DIR= and run Make again.. Stop.


Now it's starting to make sense to me. The VirtualBox installer needs the kernel source to compile its own kernel driver against. But where IS it looking? A little more poking around and I found that it searches for the current kernel directory dynamically. It does a check for if [ -d /lib/modules/`uname -r` ]. Sure enough, that is the problem. Here's the output of a uname -r run in the installer environment: 2.6.20-15-generic. So what actually exists inside the /lib/modules directory during OS installation?
$ ls -l /lib/modules
2.6.20-16-generic
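
The same comparison can be wrapped up in a few lines of shell so it is easy to rerun from a virtual terminal during an install. This is only a sketch of the check I did by hand. The function name check_running_kernel and its optional directory argument are mine, not something VirtualBox or the installer provides.

```shell
#!/bin/bash
# Report whether the running kernel has a matching directory under the
# modules root. The root defaults to /lib/modules; the argument is only
# there so the check can be tried against a scratch directory.
check_running_kernel() {
    local modules_root="${1:-/lib/modules}"
    local running
    running="$(uname -r)"

    if [ -d "$modules_root/$running" ]; then
        echo "match: $running"
        return 0
    fi

    # No directory for the running kernel: show what is there instead
    echo "mismatch: running kernel is $running, but $modules_root contains:"
    ls "$modules_root"
    return 1
}
```

In the installer environment described here, I would expect this to report a mismatch, since the running kernel is 2.6.20-15-generic while only 2.6.20-16-generic is installed in the target.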

Not sure how to fix this one 100% yet. The basic logic behind the fix is going to be (in pseudo code): if [os install] -> then define KERN_DIR=/lib/modules/2.6.20-16-generic. The only way I can think of to check for an OS install is to add a few lines to the preseed file we use on our boot media that create a temp file in the target partition for the duration of the installation and remove it when the autoinstall is finished. This would allow any postinst scripts in our packages to check for that file's existence and perform the appropriate actions during their installation.
In the case of VirtualBox we can't edit the package directly due to licensing issues, but we have created a meta package that Depends: the VirtualBox package. This allows us to still have some level of control over the installation process. Using this meta package we may be able to do a check for that first time install file and then export the KERN_DIR directory to the shell. However, what I am not sure about is the scope and lifespan of shell variables defined during the preinst of a package in Debian.
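
To make the pseudo code a little more concrete, here is a rough sketch of the marker file check. Everything named in it is an assumption for illustration: the marker path /var/tmp/os-install-in-progress, the MARKER variable, and the function pick_kern_dir are made up, and I have not wired this into a real package. It prints the directory instead of exporting it, because each maintainer script runs as its own process and a variable exported in one would not normally be visible in another package's scripts.

```shell
#!/bin/bash
# Hypothetical marker file: the preseed would create it in the target at
# the start of the install and remove it once autoinstall has finished.
MARKER="${MARKER:-/var/tmp/os-install-in-progress}"

# Print the directory to use for KERN_DIR.
# During an OS install the running (installer) kernel differs from the one
# installed in the target, so pick the first kernel found under the
# modules root. Otherwise fall back to the running kernel.
pick_kern_dir() {
    local modules_root="${1:-/lib/modules}"
    local dir

    if [ -e "$MARKER" ]; then
        for dir in "$modules_root"/*/; do
            if [ -d "$dir" ]; then
                echo "${dir%/}"
                return 0
            fi
        done
        # Marker present but no kernel installed in the target yet
        return 1
    fi

    echo "$modules_root/$(uname -r)"
}
```

A script that needs the value could then run something like KERN_DIR="$(pick_kern_dir)" in the same shell that starts the module build. Whether the VirtualBox build actually picks the variable up from there is something I still have to test.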

Little Bits about Bytes

The goal of this blog is to start a technology-focused journal of things I learn and encounter while interacting with technology, be it in my work environment or in my own free hacking. I want to keep it structured as a Problem -> Investigation/Discussion -> Solution type entry system. With that said, let the blogging begin!