Canonical Voices

Jussi Pakkanen

I played around with btrfs snapshots and discovered two interesting new uses for them. The first deals with unreliable operations. Suppose you want to update a largish SVN checkout but your net connection is slightly flaky. The reason can be anything: bad wires, an overloaded server, electrical outages, and so on.

If SVN is interrupted mid-transfer, it will most likely leave your checkout in an inconsistent state that can’t be fixed even with ‘svn cleanup’. The common wisdom on the Internet is that the way to fix this is to delete or rename the erroneous directory and do an ‘svn update’, which will either work or not. With btrfs snapshots you can simply take a snapshot of your source tree before the update. If the update fails, nuke the broken directory and restore your snapshot. Then try again. If it works, just get rid of the snapshot dir.

What you essentially gain are atomic operations on non-atomic tasks (such as svn update). This has been possible before with ‘cp -r’ or similar hacks, but those are slow. Btrfs snapshots can be taken in the blink of an eye and, being copy-on-write, they take essentially no extra disk space.

The other use case is erroneous state preservation. Suppose you hack on your stuff and encounter a crashing bug in your tools (such as bzr or git). You file a bug on it and then get back to doing your own thing. A day or two later you get a reply on your bug report saying “what is the output of command X”. Since you don’t have the given directory tree state around any more, you can’t run the command.

But if you snapshot your broken tree and store it somewhere safe, you can run any analysis scripts on it any time in the future. Even possibly destructive ones, because you can always run the analysis scripts in a fresh snapshot. Earlier these things were not feasible because making copies took time and possibly lots of space. With snapshots they don’t.

Jussi Pakkanen

I work on, among other things, Chromium. It uses SVN as its revision control system. There are several drawbacks to this, which are well known (no offline commits etc). They are made worse by Chromium’s enormous size. An ‘svn update’ can easily take over an hour.

Recently I looked into using btrfs’s features to make things easier. I found that with very little effort you can make things much more workable.

First you create a btrfs subvolume.

btrfs subvolume create chromium_upstream

Then you check out Chromium to this directory using the guidelines given in their wiki. Now you have a pristine upstream SVN checkout. Then build it once. No development is done in this directory. Instead we create a new directory for our work.

btrfs subvolume snapshot chromium_upstream chromium_feature_x

And roughly three seconds later you have a fresh copy of the entire source tree and the corresponding build tree. Any changes you make to individual files in the new directory won’t cause a total rebuild (which also takes hours). You can hack with complete peace of mind knowing that in the event of failure you can start over with two simple commands.

sudo btrfs subvolume delete chromium_feature_x
btrfs subvolume snapshot chromium_upstream chromium_feature_x

Chromium upstream changes quite rapidly, so keeping up with it with SVN can be tricky. But btrfs makes it easier.

cd chromium_upstream
gclient sync # Roughly analogous to svn update.
cd ..
btrfs subvolume snapshot chromium_upstream chromium_feature_x_v2
cd chromium_feature_x/src && svn diff > ../../thingy.patch && cd ../..
cd chromium_feature_x_v2/src && patch -p0 < ../../thingy.patch && cd ../..
sudo btrfs subvolume delete chromium_feature_x

This approach can be taken with any tree of files: images, even multi-gigabyte video files. Thanks to btrfs’s design, multiple copies of these files take roughly the same amount of disk space as only one copy. It’s kind of like having backup/restore and revision control built into your file system.

Jussi Pakkanen

The four stages of command entry

Almost immediately after the first computers were invented, people wanted them to do as they were commanded. This process has gone through four distinct phases.

The command line

This was the original way. The user types his command in its entirety and presses enter. The computer then parses it and does what it is told. There was no indication of whether the written command was correct or not. The only way to test it was to execute it.

Command completion

An improvement to writing the correct command. The user types in a few letters from the start of the desired command or file name and presses tab. If there is only one choice that begins with those letters, the system autofills the rest. Modern autocompletion systems can fill in command line arguments, host names and so on.

Live preview

This is perhaps best known from IDEs. When the user types some letters, the IDE presents all choices that correspond to those letters in a pop up window below the cursor. The user can then select one of them or keep writing. Internet search sites also do this.

Live preview with error correction

One thing in common with all the previous approaches is that the input must be perfect. If you search for Firefox but accidentally type in “ifrefox”, the system returns zero matches. Error correcting systems try to find what the user wants even if the input contains errors. This is a relatively new approach, with examples including Unity’s new HUD and Google’s search (though the live preview does not seem to do error correction).
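Error-correcting lookup is typically built on an edit-distance measure such as Levenshtein distance: among the candidates, the one needing the fewest single-character insertions, deletions and substitutions to match the input wins. Here is a minimal sketch of that measure (the function name is my own; real systems use smarter indexes than brute-force comparison against every candidate):

```c
#include <stdlib.h>
#include <string.h>

/* Classic two-row dynamic-programming edit distance: the minimum number of
 * single-character insertions, deletions and substitutions needed to turn
 * string a into string b. */
static int edit_distance(const char *a, const char *b) {
    size_t m = strlen(a), n = strlen(b);
    int *prev = malloc((n + 1) * sizeof(int));
    int *cur = malloc((n + 1) * sizeof(int));
    for (size_t j = 0; j <= n; ++j)
        prev[j] = (int)j;            /* turning "" into b[0..j) costs j */
    for (size_t i = 1; i <= m; ++i) {
        cur[0] = (int)i;             /* turning a[0..i) into "" costs i */
        for (size_t j = 1; j <= n; ++j) {
            int sub = prev[j - 1] + (a[i - 1] != b[j - 1]);
            int del = prev[j] + 1;
            int ins = cur[j - 1] + 1;
            int best = sub < del ? sub : del;
            cur[j] = best < ins ? best : ins;
        }
        int *tmp = prev; prev = cur; cur = tmp;
    }
    int d = prev[n];
    free(prev);
    free(cur);
    return d;
}
```

With this, “ifrefox” is only two edits away from “firefox”, so an error-correcting system can still rank Firefox first instead of returning nothing.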

The future

What is the next phase in command entry? I really have no idea, but I’m looking forward to seeing it.

Jussi Pakkanen

Complexity kills

The biggest source of developer headache is complexity. Specifically unexpected complexity. The kind that pops out of nowhere from the simplest of settings and makes you rip your hair out.

As an example, here is a partial and simplified state machine for what should happen when using a laptop’s trackpad.

If you have an idea of what should happen in the states marked “WTF?”, do send me email.

Jussi Pakkanen

What is worse than having a problem?

The only thing worse than having a problem is having a poor solution to a problem.


Because that prevents a good solution from being worked out. The usual symptom of this is having a complicated and brittle Rube Goldberg machine to do something that really should be much simpler. It’s just that nobody bothers to do the Right Thing, because the solution we have almost kinda, sorta works most of the time, so there’s nothing to worry about, really.

Some examples include the following:

  • X used to come with a configurator application that would examine your hardware and print a conf file, which you could then copy over (or merge with) the existing conf file. Nowadays X does the probing automatically.
  • X clipboard was a complete clusterf*ck, but since middle button paste mostly worked it was not seen as an issue.
  • The world is filled with shell script fragments with the description “I needed this for something long ago, but I don’t remember the reason any more and am afraid to remove it”.
  • Floppies (remember those?) could be ejected without unmounting them causing corruption and other fun.

How can you tell when you have hit one of these issues? One sign is that you get one of the following responses:

  • “Oh, that’s a bit unfortunate. But if you do [complicated series of steps] it should work.”
  • “You have to do X before you do Y. Otherwise it just gets confused.”
  • “It does not do X, but you can do almost the same with [complicated series of steps] though watch out for [long list of exceptions].”
  • “Of course it will fail [silently] if you don’t have X. What else could it do?”
  • “You ran it with incorrect parameters. Just delete all your configuration files [even the hidden ones] and start over.”

If you ever find yourself in the situation of getting this kind of advice, or, even worse, giving it out to other people, please consider spending some effort on fixing the issue properly. You will be loved and adored if you do.

Jussi Pakkanen

You know how we laugh at users of some other OSes for running random binary files they get from the Internet?

Well, we do it as well. And instead of doing it on our personal machines, we do it on the servers that run our most critical infrastructure.

Here is a simple step by step plan that you can use to take over all Linux distributions’ master servers.

  1. Create a free software project. It can be anything at all.
  2. Have it included in the distros you care about.
  3. Create/buy a local exploit trojan.
  4. Create a new minor release of your project.
  5. Put your trojan inside the generated configure script.
  6. Boom! You have now rooted the build machines (with signing keys etc) of every single distro.

Why does this exploit work? Because configure is essentially an uninspectable blob of binary code. No-one is going to audit that code and the default packaging scripts use configure scripts blindly if they exist.

Trojans in configure scripts have been found in the wild.

So not only are the Autotools a horrible build system, they are also a massive security hole. By design.

Post scriptum: A simple fix to this is to always generate the configure script yourself rather than using the one that comes with the tarball. But then you lose the main advantage of Autotools: that you don’t need special software installed on the build machine.

Jussi Pakkanen

If there were no resource files, programming would be easy. You could just directly access anything you need by its variable name. Unfortunately, in the real world you sometimes need to load stuff from files. This seemingly simple task is in fact very complicated.

When confronted with this problem for the first time the developer will usually do something like this:

fp = fopen("datadir/datafile.dat", "r");

Which kinda works, sometimes. Unfortunately this is just wrong on so many levels. It assumes that the program binary is run in the root of your source dir. If it is not, the path “datadir/datafile.dat” is invalid. (You are of course keeping your build directory completely separate from your source dir, right?) After diving into LSB specs and config.h autogeneration, the fearless programmer might come up with something like this:

fp = fopen(INSTALL_PREFIX "/datadir/datafile.dat", "r");

Which works. Sometimes. The main downside is that you always have to run “make install” before running the binary. Otherwise the data files in INSTALL_PREFIX may be stale and cause you endless debugging headaches. It also does not work if the binary is installed to a different directory than the one given in INSTALL_PREFIX. The platforms that can do this are Windows, OSX and, yes, Linux (though, to be honest, no-one really does that).

Usually the next step is to change the binary to take a command line argument specifying where its data files are. Then a wrapper script is created that determines where the binary currently lies, constructs the argument and starts the binary.

This also works. Sometimes. Unfortunately there is no portable scripting language that works on all major platforms so there need to be several scripts. Also, what happens if you want to run the binary under gdb? You can’t run the script under gdb, and the binary itself won’t work without the script. The only choice is to code custom support for gdb in the script itself. Simply invoking gdb reliably is hard. The commands to run are completely different depending on whether you are using Libtool or not and have installed the binary or not. If you want to run it under Valgrind, it needs custom support as well. The wrapper script will balloon into a rat’s nest of hacks and ifs very, very quickly.

Before going further, let’s list all the different requirements for file access. The binary would need to access its own files:

  • in the source tree when the binary is compiled in-source
  • in the source tree when the binary is compiled out-of-source
  • in the install directory when installed
  • in a custom, user specified directory (overriding all other options)
  • without wrapper scripts
  • in Unix, OSX and Windows

There are many ways to do this. A recommended exercise is to try to think up your own solution before going further.

The approach I have used is based on an environment variable, say MYPROG_PREFIX. Then we get something like this in pseudocode:

open_datafile(file_name) {
  if (envvar_is_set("MYPROG_PREFIX"))
    return fopen(envvar_value("MYPROG_PREFIX") + file_name);
  return platform_specific_open(file_name);
}

// One of these is #ifdeffed to platform_specific_open.

open_datafile_unix(file_name) {
  return fopen(INSTALL_PREFIX + file_name);
}

open_datafile_windows(file_name) {
  // Win apps usually lump stuff in one dir and
  // cwd is changed to that on startup.
  // I think. It's been a long time since I did win32.
  return fopen(file_name);
}

open_datafile_osx(file_name) {
  // This is for accessing files in bundles.
  // Unfortunately I don't know how to do that,
  // so this function is empty.
}
During development, the environment variable is set to point to the source dir. This is simple to do in any IDE. Vi users need to do their own thing, but they are used to it by now. ;-) The end user does not have this variable set, so their apps will load from the install directory.

The one remaining issue is Unix where the binary is relocated to somewhere else than the install location. A simple approach that comes to mind is to dynamically query the location of the current binary, and then just do CURRENT_BINARY_DIR + “../share/mystuff/datafile.dat”.

Unfortunately POSIX does not provide a portable way to ask where the currently executing binary is. For added difficulty, suppose that your installed thing is not a binary but a shared library. It may lie in a completely different prefix than the binary that uses it, and thus the app binary’s location is useless. I seem to recall that the Autopackage people had code to work around this, but their website seems to be down ATM so I can’t link to it.

Jussi Pakkanen

The Ubuntu Advantage™

As a touch developer people often ask me what makes our touch stack better than the rest. As exhibit A I present this image of one of our competitor’s products.

This was found in Orlando’s Hard Rock Cafe.

Jussi Pakkanen

Note: nothing written here should be seen as an endorsement or anything by Canonical or any other party. This is just me being a comedian speculating on things.

By now we have seen that in the world of marketing black is white and outside is downside, or something to that effect. Let’s apply our newly found knowledge to a real world issue. If we were to design a new “image” for Ubuntu using the guidelines given, what would it look like?

First we need to determine what Ubuntu is. It is an operating system. Therefore we must not ever mention that fact. Or the fact that it is scalable, has high performance or any other attribute that can be quantified.

Then we need to determine what it is not. Reading through Internet postings we find that due to Ubuntu’s Unix heritage there are problems with non-working hardware, having to use the command line, compiling applications from source to use them and so on. Whether or not these accusations are true is irrelevant. They simply tell us that according to valued Internet posters such as mr Trolly McTrollenstein Ubuntu is user-hostile.

What is the opposite of hostile? There are several choices, but let’s go with cozy.

For a visual look we’re going to use a cheap trick: upturned palms. This is an age-old technique to look sincere as used by used car salesmen, politicians and other people whose job it is to make you trust them even if you really should not. Putting it all together we get something like this.


The Coziest Computer Experience in the World

Now all that is needed is that a few million people keep repeating this mantra consistently to change reality as we know it.

Jussi Pakkanen

If I asked you who defines the reality we live in, you would probably think that it’s a strange question. So let’s examine this with a simple question.

How was this guy commonly referred to?

No, not creepy weirdo. The other one.

That’s right, the King of Pop. But have you ever wondered how he became the King? Did he do battle with other peons of pop to eventually rise up as the ruler of popdom? Is there a section of the UN that governs over the royalty of popular culture (there are at least The Duke, a Princess and a Queen)? Or maybe he was thrown a Sword of Pop by a lady in the lake, thus giving him this prestigious role.

One might wonder about the succession of this Kinghood. Did he get his from the King of Kings, who had died just before his career took off? And now that the King of Pop is dead, who is the next King? Is it this guy:

These were among the questions that Howard Stern thought about a long time ago. He realized that no-one had actually named Michael Jackson the King of Pop, he had just started calling himself that. So he decided to try the same thing just to see if it worked, even though he was just a radio show host (though a pretty successful one at that). So he started calling himself King of All Media. The results were quite interesting.

People started treating him as if he truly were King of All Media. At interviews he was always presented as King of All Media and even regular people commonly referred to him as that. He had not done any media conquests or anything like that. He simply started behaving as if he were the King and people treated him as such. In effect, he had altered reality simply through his will.

This is not an isolated incident. There is also the case of Norton I, the Emperor of the United States. He was a businessman who lost his fortune, went insane and declared himself emperor. He was then treated like an emperor. People wrote him letters pretending to be various heads of state, issued currency in his name and even attended his funeral by the tens of thousands. In his mind he truly was the emperor, simply because he chose to be.

To come back to the original question: reality is defined by people’s views of the world. Those views are not actually based on anything, in the way buildings are based on the ground they rest on. So if you want to change the world, all you have to do is pretend that the change has already happened and behave accordingly. The really scary part is that other people will start believing it (though it’s not in any way guaranteed that more than two people will ever See the Light as You Intended).

This is what advertising is based on: choosing how you want the world to be and then repeating it over and over and over and over again. Eventually reality changes and your message has become fact.

And that is why plants crave Brawndo: it’s got electrolytes.

Jussi Pakkanen

For those with an engineering background, marketing seems somewhat bizarre. A lot about it just does not seem to make any sort of sense. This is commonly known as the rational-view-of-the-world bias. But if you look into it, things become clearer step by step.

Mostly everything follows from Rule Number One of marketing. It goes as follows:

You must emphasize that which your product is not.

Seems quite backwards, doesn’t it? And yet, this is what has been proven to work, time and time again. Let’s look at an example.

One of the main plot devices of the TV show True Blood is that a Japanese company has developed synthetic blood and thus vampires don’t have to feed on humans any more. They named this product Tru Blood.

Why this name? Because that is the one thing the product is not. It is not real, but synthetic.

A more real-world example comes from Hong Kong. They had a problem where people in a certain swamp area kept dying of malaria. This of course made it somewhat hard to get people to move in there. So the people in charge made the only reasonable choice: they renamed the place Happy Valley. Problem solved.

This is one of those things that once you “see” it, it’s everywhere. Here are just some examples.

Apple’s slogan is “Think different” but their products go out of their way to prevent the user from doing anything not officially sanctioned.

Any Hollywood movie that advertises itself as a “hilarious comedy” is usually roughly as fun as dragging steel forks on a chalkboard.

Restaurants and food manufacturers commonly use phrases such as “just like mom used to make” and “delicious home-cooked food”, even though my mother never made any food like that, and I’m fairly sure that chefs don’t live in the backrooms of their restaurants. (And if they do, I really don’t want to eat in those locations.)

Freshly squeezed orange juice isn’t and blueberry muffins aren’t.

Enron’s stationery slogan was “Respect. Integrity. Communication. Excellence.”

The TV show Bullshit! was originally about exposing quacks and hoaxes using science. At some point it became a soapbox for the hosts’ personal libertarian agenda of “everything the government ever does is always wrong (even if it is the exact opposite of what we were talking about last week)”. At the exact same time the show’s opening credits were changed to emphasize science, objectivity, reason, fairness and all the other values the show itself didn’t adhere to any more.

The obvious question that comes from all this is why this works. That will be explained in the next post.

Jussi Pakkanen

Is malloc slow?

We had a discussion some time ago about the speed of malloc and whether you should malloc/free small objects as needed or reuse them with e.g. memory pools. There were strong opinions on either side which made me search the net for some hard numbers on malloc speed.

It turns out there weren’t any. Search engines only threw up tons of discussions about what the speed of malloc would theoretically be.

So I wrote a test program that allocates chunks of different sizes, holds on to them for a while and frees them at random times in a random order. Then I made it multithreaded to add lock contention in the mix. I used 10 threads.
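I haven’t reproduced the original test program here, but a rough sketch of the same idea using POSIX threads might look like this. The function names, slot count and size range are my own choices, not the original benchmark’s:

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

#define SLOTS 1024  /* live allocations each thread holds at any time */

/* Each thread repeatedly picks a random slot, frees whatever lives there
 * and allocates a fresh chunk of random size: chunks are held on to for a
 * while and freed at random times in a random order. */
static void *worker(void *arg) {
    long iters = *(long *)arg;
    void *slot[SLOTS] = {0};
    unsigned seed = (unsigned)(uintptr_t)&iters;  /* per-thread seed */
    for (long i = 0; i < iters; ++i) {
        int s = rand_r(&seed) % SLOTS;
        free(slot[s]);
        slot[s] = malloc((size_t)(1 + rand_r(&seed) % 256));
    }
    for (int s = 0; s < SLOTS; ++s)
        free(slot[s]);
    return NULL;
}

/* Returns malloc/free pairs per second over all threads, or -1 on error. */
static double malloc_pairs_per_second(int nthreads, long iters_per_thread) {
    pthread_t tid[64];
    struct timespec t0, t1;
    if (nthreads < 1 || nthreads > 64)
        return -1.0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < nthreads; ++i)
        pthread_create(&tid[i], NULL, worker, &iters_per_thread);
    for (int i = 0; i < nthreads; ++i)
        pthread_join(tid[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    return secs > 0 ? nthreads * (double)iters_per_thread / secs : -1.0;
}
```

The shared slot array per thread keeps a bounded working set alive, so allocation lifetimes overlap and the allocator’s locks actually get contended.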

On a quad-core laptop with 4GB of memory, glibc can do roughly 1.5 million malloc/free pairs per second.

On a crappy ARM board with a single core glibc can do 300 000 malloc/free pairs per second.

What does this mean in practice? If you are coding any kind of non-super-high-performance app, the only reasons you would care about reducing mallocs are:

  • you do hundreds of malloc/free calls per second for long periods at a time (can lead to severe memory fragmentation)
  • you have absolute latency requirements in the sub-millisecond range (very rare)
  • your app is used actively for hours on end (e.g. Firefox)
  • you know, through measurement, that the mallocs you are removing constitute a notable part of all memory allocations

But, as Knuth says, over 97% of the time malloc is so fast that you don’t have to care.

No, really! You don’t!

Update: The total memory pool size I had was relatively small to reflect the fact that the working set is usually small. I re-ran the test with 10x and 100x pool size. The first was almost identical to the original test, the latter about 10 times slower. That is still ~175 000 allocations per second, which should be plenty fast. I have also uploaded the code here for your enjoyment.

Jussi Pakkanen


What makes computational geometry algorithms special is that they consist only of special cases.

Plane geometry algorithms are the kind that seem extremely obvious but are really hard to implement correctly. Simply checking whether a point lies within a given polygon turns out to be surprisingly hard to handle exhaustively.
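To make the point concrete, here is the textbook crossing-number (ray casting) point-in-polygon test, in the style of W. Randolph Franklin’s well-known pnpoly routine. Even this compact version carries the classic trouble spots: points lying exactly on an edge, or the ray grazing a vertex, can go either way:

```c
/* Shoot a horizontal ray from (px, py) and count how many polygon edges it
 * crosses; an odd count means the point is inside. n is the vertex count,
 * xs/ys the vertex coordinates in order. */
static int point_in_polygon(const double *xs, const double *ys, int n,
                            double px, double py) {
    int inside = 0;
    for (int i = 0, j = n - 1; i < n; j = i++) {
        /* Does edge (j, i) straddle the ray, and is the crossing to the
         * right of the point? */
        if ((ys[i] > py) != (ys[j] > py) &&
            px < (xs[j] - xs[i]) * (py - ys[i]) / (ys[j] - ys[i]) + xs[i])
            inside = !inside;
    }
    return inside;
}
```

Ten lines of code, and still every degenerate input (boundary points, zero-length edges, self-intersecting polygons) needs its own decision. That is computational geometry in a nutshell.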

These sorts of problems turn up all the time when dealing with multi touch input. Let’s start with a simple example: dragging two fingers sideways. What should happen? A drag event, right? How hard can it be?

If both touches are within one window, the case is simple. But what if one touch is over a window and the other is over the desktop? Two one-finger drags would seem reasonable. But what if the desktop drag goes over the window? Should it transform into a two-finger drag? Two one-finger drags? Some hybrid of them?

On second thought, that seems a bit complicated. Let’s look at something simpler: a one-finger drag. That’s simple, right? If you drag from a window out to the desktop, the original window/widget gets drag events as the touch moves. Drag and drop works and everything is peachy.

But what if the touch comes back into the same window, but over a different widget? One that also has a gesture subscription. Who gets the drag events? Let’s assume the user wanted to scroll one viewport up and another one down. That would mean we need to end the first gesture and start a new one when moving back over the window. But if the user was doing drag and drop, we can’t end the gesture, because then we lose the source info.

So there you go, the simplest possible gesture event pops up an undecidable problem. When you deal with the combinatorics of multiple touches, things start really getting hard.


The GNU Autotools are nothing short of an astounding engineering achievement. They can configure, compile and install software on everything from supercomputers all the way down to a SunOS 1.0 machine from 1983, or thereabouts. They can even do shared libraries portably, which is considered to be among the darkest of magicks on many of these platforms.

This technological triumph does not change the fact that using Autotools is awful. The feeling of using Autotools is not totally unlike trying to open a safe by hitting it repeatedly with your head. Here are just some ways they cause pain to developers.

Complexity is standard

If I were to describe Autotools with just one word, it would be this:


It’s not that Autotools is hard; a lot of systems are. It’s not that it has a weird syntax; lots of programs have that too. It’s that every single thing about Autotools seems to be designed to be as indecipherable as possible.

Suppose that a developer new to Autotools opens up a configure.ac for the first time. He might find lines such as these, which I took from a real world project:

AM_CONDITIONAL([HAVE_CHECK],[test "x$have_check" = xyes])

AM_MAINTAINER_MODE

AM_INIT_AUTOMAKE([foreign dist-bzip2])

Several questions immediately come to mind. Why are all function arguments quoted in brackets? What is AM_MAINTAINER_MODE? Why is it needed, since I am not the maintainer of Automake? What is “xyes”? A misspelling of “xeyes” perhaps? Are the bracketed things arrays? Is space the splitting character? Why is the answer no in some locations and yes in others?

The interested reader might dig into these questions and a week or so later have answers. Most people don’t. They just get aggravated, copy existing files and hope that they work. A case in point: Autotools’ documentation states clearly that AM_MAINTAINER_MODE should not be used, yet it is in almost every Autotools project. It survives as a vestigial organ much like the appendix, because no-one wants to understand the system. And, just like the appendix, it sometimes tries to kill its host with a sudden inflammation.

My estimate is that there are less than 20 people in the entire world who truly understand Autotools. That is one hell of a small pool for something as fundamental as the most used build system in the Free software world.

A legacy of legacy

Autotools work by generating a Bourne shell (the one from the seventies) compatible configure script and makefiles (also of the type from the seventies). To do this they use a macro language called M4 (guess which decade this one is from). There is nothing inherently wrong with using tried and tested tools.

The big problem is that the designers did not do proper encapsulation. The source files to Autotools do not have one syntax. They have several, all mixed together. The “xyes” thing mentioned above is in fact a shell script snippet that gets included (eventually) in the main configure script. Make directives can also be added for extra fun.

The end result is that you have code snippets in several different languages mixed together arbitrarily. In addition to being tedious to read, they also make automatic code inspection all but impossible. For example most non-trivial Autoconf snippets give syntax errors in Eclipse’s Autotools editor due to missing and/or extraneous parentheses and so on (to be fair, most of these are bugs in Eclipse, but they are caused by the fact that parsing is so hard). The only way to find errors is to compile and run the code. Debugging the result is even harder.

Since Autotools is intertwingled with these tools and their idiosyncrasies, it can never be fixed, cleaned or substantially improved.

Surprising costs

One of the main goals of the C++ committee has been that you should never have to pay a penalty for features you don’t use. If you don’t use virtual inheritance, your function calls are just as fast as in C. Don’t need RTTI? Just disable it. Don’t use templates? Then you don’t get any extra bloat in your binaries.

Not so with Autotools.

Suppose you develop a program that uses GTK+ and D-bus. That implies a rather modern Linux program that will never run on, say, AIX 4.0. So you would probably want to throw away all the garbage dealing with that platform’s linking peculiarities (and everything else, too) from your build system. But you can’t.

Autotools is designed so that every single portion of it runs according to the lowest possible common denominator in the entire world (except when it doesn’t, we’ll come back to this). This has interesting consequences.


The most common complaint about any piece of software is that it is bloated. For some reason this is never said of Autotools, even though it is one of the most bloated things in existence.

As an example let’s look at utouch-grail. It is a plain C library that detects multitouch gestures. It is a small to medium sized project. Analyzing it with Sloccount reveals that its configure script is three times larger than all C source (including headers) put together.


This is even more astounding when you remember that the configure script is written in a high level scripting language, whereas the library is plain C.

If you look inside the configure script, one of the things you notice quite quickly is that it does not use shell functions. They are all unrolled. This is because the original plain Bourne Shell did not support functions (or maybe it did but they were broken in some versions of Ultrix or whatever). So Autotools will not use them in the code it generates. You pay the price whether you want to or not.


My friend once told me that if you have a multicore machine and update Gentoo on it, a fascinating thing happens. For most packages running configure takes a lot longer than the actual build. The reason being that configure is slow, and, even worse, unparallelizable.

A question of state

Computers are very good at remembering state. Humans are notoriously bad at it. Therefore the basic rule in interaction design is to never have the user remember state that the machine can either remember or deduce by itself. Autotools forces its user to keep all sorts of state needlessly in his head.

When the user has changed the code, he types make to compile it. This usually works. But when the files describing the build system are changed (Which ones? I really don’t know.), just running make will fail. Even worse, it may fail silently, claiming everything is fine but producing garbage.

In these cases the user has to manually run autoreconf, or maybe something else, before make. Why? Why does the developer have to care? Is it really too much for a build dependency tracking system to, you know, actually track dependencies? To notice that a file that other files depend on has changed, and thus deduce the steps that need taking? And take those steps automatically?
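The check being demanded of the user here is nothing more than a timestamp comparison, something the shell itself can do in one line. A sketch, using the usual Autotools file names but purely illustrative logic:

```shell
# If the build description is newer than the generated script,
# the tool should regenerate it instead of silently using
# stale output.
mkdir -p staleness-demo && cd staleness-demo
touch configure            # pretend: generated earlier
sleep 1                    # guarantee a newer timestamp
touch configure.ac         # developer edits the build description
if [ configure.ac -nt configure ]; then
    echo 'stale: would run autoreconf here'
else
    echo 'up to date'
fi
```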

For Autotools the answer is yes. They force the user to keep state in his head needlessly. Since the human mind can only keep about 7 things in short term memory at any one time, this single choice needlessly wastes over 14% of developer brain capacity.

When are binaries not binaries?

When they are shell scripts that invoke magical incantations in hidden directories, of course. This little gem is courtesy of Libtool, which is a part of Autotools.

If your project uses shared libraries, Autotools does not actually build them until after you do “make install”. Instead it creates so-called convenience libraries and, in a step of utmost convenience, hides them from the developer. Since the actual libraries do not exist, binaries in the build tree cannot use them, ergo they are replaced with magical wrapper scripts.

By itself this would not be so bad, but what happens when you want to run the files under gdb or Valgrind? You either always run make install before debugging, or you follow the instructions on this page.

(At least they are honest, seeing that their breakpoint was put on the line with the statement ‘printf (“Welcome to GNU Hell!\n”)’.)
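A toy reconstruction of the pattern (not Libtool’s actual wrapper, which is far hairier) shows why pointing a debugger at the visible file goes wrong:

```shell
# The visible "binary" is a shell script; the real executable
# hides in the .libs directory.
mkdir -p wrapper-demo/.libs && cd wrapper-demo
printf '%s\n' '#!/bin/sh' 'echo "I am the real binary"' > .libs/hello
chmod +x .libs/hello
printf '%s\n' '#!/bin/sh' 'exec "$(dirname "$0")/.libs/hello" "$@"' > hello
chmod +x hello
./hello        # works: the wrapper execs the hidden binary
```

Running gdb ./hello here would have you debugging /bin/sh rather than your program; Libtool’s own answer is to launder the invocation through it, as in libtool --mode=execute gdb ./hello.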

This decision again forces state on the user. How you run gdb, Valgrind or any other inspection tool depends on whether the program you are building uses shared libraries or not. There goes another 14% of your brain. More if your project has several different kinds of binaries.
Consistent lack of consistency

With Autotools you can never depend on anything being the same, so you have to jump through unnecessary hoops all the time. Say you want an automated build service that does builds directly from revision control as well as release builds from tarballs.

To do this you need two different build rules: since the configure script is not in revision control, you need to generate it for the daily builds. But since source tarballs don’t always contain the script that generates it, you can’t always run that step before configure. And indeed you shouldn’t; you’re testing the release, after all.

As an added bonus any patch that changes configure is guaranteed to be non-mergeable with any other patch that does the same. So be sure to specifically tell your diff program to ignore the configure file. But be sure to remember whether you did that or not every single time.
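With GNU diff the exclusion looks like this; the two trees below are invented stand-ins for two checkouts that differ only in the generated file:

```shell
# Two checkouts whose only difference is configure noise.
mkdir -p old new
printf 'generated: variant A\n' > old/configure
printf 'generated: variant B\n' > new/configure
printf 'int x;\n' > old/lib.c
printf 'int x;\n' > new/lib.c
diff -r old new > /dev/null 2>&1 || echo 'differs (configure noise)'
diff -r -x configure old new && echo 'no real differences'
```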

This is just one more way Autotools gives you more state to deal with. These kinds of annoying one-off things keep popping up in places where they really should not. They sap developers’ time and effort constantly.

Poor portability

Autoconf’s main claim to fame is that it is portable. The only requirement it has, as mentioned above, is the userland of SunOS from 1983.

Unfortunately there is a platform that does not provide anything even close to that. It is quite popular in some circles. For example it has over 90% market share in desktop machines (but who uses those, anyway).

You can use Autotools on Windows, but first you need to install either Cygwin or MSYS, and even then you can only use variants of GCC. There is roughly zero support for Visual Studio, which unfortunately is the most popular compiler on that platform.

The end result is that if you want or need to support Windows as a first class platform then Autotools can’t be used. Many projects provide both Autotools and Visual Studio projects, but that means that you have two independent build systems that will go out of sync on a regular basis.

Portability is internal too

Autotools are not forward or backward compatible with themselves. The developers change the API quite often, which means that if you need to support several platforms of different ages, you need to support several versions of Autotools.

This can be done by programming the configure scripts to work differently in different Autotools versions. And who among us doesn’t like a bit of meta-meta-meta-metaprogramming?

Not producing garbage is better than cleaning it

As a final issue I’d like to mention build directories. This is a concept advocating source code directory hygiene. The idea is that you have a source directory, which contains all files that are checked into revision control, and in addition a build directory, into which all files generated during the build go. In this way the source directory is always clean. You can also have several build directories, each using different settings, a different compiler and so on.
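The mechanics are simple; here the configure script is a two-line stub standing in for a real generated one:

```shell
mkdir -p srcdir builddir
printf '%s\n' '#!/bin/sh' 'touch Makefile config.log' \
    > srcdir/configure
chmod +x srcdir/configure
cd builddir
../srcdir/configure    # generated files land in the build directory
cd ..
ls builddir            # Makefile and config.log live here
ls srcdir              # only configure: the source tree stays clean
```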

Autotools do provide this functionality: if you run the configure script from a directory other than the source root, it writes the build files to that directory. But this is pretty much useless, as it only works on full tarballs. Actually, it probably won’t work even then, since most Autotools projects are written so that they only build in the source directory. Probably because the project they were copy-pasted from also did that.

Tarballs are used mostly by packagers and end users. Developers work on revision control checkouts, and as their first step they need to run either an autogen script or autoreconf. These commands will always vomit their files into the source directory. And there are lots of them; just look at almost any project’s revision control ignore list.

Thus we have a really useful feature, which is completely useless to those people who need it the most.

What to use then?

That depends. Believe it or not, there are build systems that are even worse. Actually most of them are.

My personal favorite is CMake. It fixes almost all of the issues listed here. It has a couple of downsides too: its language syntax is weird in places, and the way its state and cached values interact on repeated invocations is non-intuitive. Fortunately you usually don’t have to care about that if you just want to build your stuff.
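For comparison, here is a minimal CMake setup for a single C executable. The file names and project name are invented for this sketch, and the cmake invocation is guarded since the tool (a reasonably recent one, for -S/-B) and a C compiler may not be present:

```shell
# Everything Autotools needs pages of m4 for, in three lines.
printf 'int main(void) { return 0; }\n' > hello.c
cat > CMakeLists.txt <<'EOF'
cmake_minimum_required(VERSION 3.10)
project(hello C)
add_executable(hello hello.c)
EOF
if command -v cmake >/dev/null 2>&1 && command -v cc >/dev/null 2>&1; then
    cmake -S . -B build      # the configure step, out of source
    cmake --build build      # the build step
fi
```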
