EDIT: As several people have pointed out, there is a script to already to this, pull-lp-source. Perfect! I’ve asked numerous people over the last year about whether there was a tool that could do this and nobody mentioned this one (even asked on AskUbuntu last week). So out of all this I end up with a link to a great new tool and got to write some python yesterday. pull-lp-source looks like it will meet all my needs.

During my work on bug triage and trying to become MOTU, I’ve found myself wanting to be able to pull source packages for a specified release, for example, download source for lxc on precise, even if I’m using raring. Although you can do this if you setup apt with all the releases and then use pinning, or doing a setup like this, I wanted an easier way. So I decided to glue together rmadison and dpkg-source and create a tool called “get_source”. This is how it works.

get_source.py -r <release> -p <package>

Pulling the source for bc on oneiric:

get_source.py -r oneiric -p bc

Grabbing lxc on precise:

get_source.py -r precise -p lxc

Seems pretty simple and it is!

The tool relies on outside helpers to do the hard work, namely rmadison and dpkg-source, so you’ll need those installed to use it. Please give it a try and send in feedback and fixes. If you’re a developer you’ll note that I even have unit-ish tests, please add more if you make some fixes for corner-cases.

bzr branch lp:~mfisch/+junk/get_source

How It Works

  1. Run rmadison and build a list of packages + versions per release
  2. Find the release we care about. We now know the package name, version, and release name.
  3. Using some hueristics, download the dsc file.
  4. Read and parse the dsc file to find the filenames for the orig file and diff and/or debdiff
  5. Download the orig and diff/debdiff files
  6. Use dpkg-source -x to extract it

Alternatives and Issues

When I started this, I figured it would be simple, but I was mistaken. There is lots of variation on filenames and locations in the archives, for example:

  1. I had originally planned to just go grab http://url/pool/main/<package first letter>/package/package_version.<extension>, but it’s not quite that simple. First, not all packages use standard names, some have a diff.gz, some a debian.tar.gz. Then some packages use xz and some use gz.  Native packages won’t have a diff at all (I think), and right now I know my code won’t support that.
  2. There’s also the question of package directory. alsa-base for example comes from the directory “alsa-driver”. I plan on grabbing this information from apt-cache show, but even that will not solve the issue if I’m on raring and the package was elsewhere in precise. This is also not yet supported in this version.
  3. Packages like angband have a version of 1:3.0.9-1, and the “1:” portion is not included in the filename. The code now supports this.

I found these cases by making this app work for a package and then randomly trying more and more packages to find and hopefully fix new cases. The worry I have is that there are hundreds more corner-cases that I don’t handle. Given all these issues, I’m still releasing this code for other people to test, but perhaps someone has simpler solutions to the problems above? Even better, maybe someone has already written a better tool, which I’ll gladly use!

Read more