Canonical Voices

What How Bazaar talks about

Tim Penhey (thumper)

Wikkid Wiki

As with any creation being released, I'm writing this with some trepidation. I'd like to announce the first release, 0.1, of Wikkid Wiki.

What is it?

Wikkid is a wiki that uses the Bazaar DVCS as an underlying storage model. Wiki pages are text files in a branch.

Why another wiki, surely there are enough already?

There is the obvious reason, because I felt like it. But this is not the primary reason. Since I started working for Canonical I've come to appreciate the whole culture of free, libre, and open source software more. During the day I work on Launchpad, primarily in the area of integrating the Bazaar DVCS. Launchpad doesn't have a wiki integrated, and it is my plan to see wikkid be the wiki that is integrated.

I have to admit to having very strong opinions myself on what I wanted for wikis in Launchpad. I've tried to encapsulate that in the vision below. Since no one else was looking at it, I took it upon myself. Discussions started last year between a small group of Launchpad developers, but there was no traction. At the start of March I started writing the Bazaar backed wiki. It needed a name though. Thankfully after trying several I got the perfect name from Aaron Bentley - wikkid.

The wikkid vision


  • Any wikkid wiki can be branched locally for off line editing.

  • Any branch can be viewed using wikkid - not limited to branches created through wikkid

  • A local wikkid server can be run using a Bazaar plugin

  • Local commits use the local user's Bazaar identity

  • Wikkid can be configured to operate in a stand alone, public facing mode where it has a database of users

  • Wikkid can be integrated as a library into other python applications

  • Wikkid uses standard wiki markup languages - not inventing its own



What does Wikkid Wiki offer?

Right now, wikkid offers basic page editing, rendering and browsing.

  • ReST is the default wiki format

  • Creole 1.0 is also supported (as long as the first line is "# creole")

  • source files are syntax highlighted using pygments

  • you get to see your gravatar for your local Bazaar identity

  • no page locks are used, but instead a three way merge

  • conflicts due to concurrent editing are shown for the user to resolve
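
The format rule in the list above (ReST by default, Creole when the first line is "# creole") can be sketched in a few lines of Python. This is only an illustration and is not taken from the Wikkid source; the function name guess_format and the returned labels are assumptions.

```python
def guess_format(page_text):
    """Return 'creole' if the first line is the '# creole' marker, else 'rest'.

    Hypothetical helper for illustration; not taken from the Wikkid source.
    """
    lines = page_text.splitlines()
    first_line = lines[0].strip() if lines else ""
    if first_line == "# creole":
        return "creole"
    # ReST is the default format for everything else.
    return "rest"


if __name__ == "__main__":
    print(guess_format("# creole\n= Heading =\n"))
    print(guess_format("Heading\n=======\n"))
```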



Where to from here?

Well here is just the beginning. The TODO file is quite long already, and
that is just a simple brain dump.

Things that I want to have done for 0.2 include:

  • Change the underlying server from twisted.web to WSGI

  • Change the generated URL format

  • Add the stand-alone user database code, along with sessions and logins and email validation

  • Add a view to show changes for a page

  • Allow the reverting of any historical change

  • Daily build of trunk into the wikkid developer's PPA



Ideally for the 0.2 release I'd like to provide everything that is needed for
wikkid to be deployed as a stand-alone, public facing, wiki.

Wow, how can I help?

Wikkid uses Launchpad for collaboration and project tracking -
https://launchpad.net/wikkid.

  • You can get a copy of trunk using 'bzr branch lp:wikkid'

  • File bugs for any issues you find playing with it

  • Join the development and discussion mailing list

  • If you feel so inclined, you could implement a feature or fix a bug, push the branch to Launchpad and propose the merge

  • Come and chat in #wikkid on freenode, nothing fires developers up like having encouraging users

Tim Penhey (thumper)

Launchpad code update

We've been very busy over the last couple of months with lots of changes that most people will never notice.

Reduced latency


Branches pushed to Launchpad are now immediately available over http to anonymous readers of the branch, which includes the loggerhead code browser.

Code review email changes


When proposing a branch for review, the initial emails and subsequent comments will now come in order. Previously, if someone commented before the script that generated the diff was run, the comment would be emailed out first; now it isn't.

Teams requested to review now get email


Everyone in the team that is requested to review will get email now. This is a blessing for those that want it, and almost a curse for those that aren't interested. Launchpad adds a number of email headers to help users with filtering of email. Here is an example from an email I received:


X-Launchpad-Message-Rationale: Reviewer @drizzle-developers
X-Launchpad-Notification-Type: code-review
X-Launchpad-Project: drizzle


Since it was a team that was requested to review, there is the @drizzle-developers added to the X-Launchpad-Message-Rationale. If I was personally asked to review, the header would just say Reviewer.
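
If you want to filter on these headers in your own scripts, a minimal sketch using Python's standard email module might look like the following. The helper name review_rationale is made up for this example, and the body text in the sample message is a placeholder; only the three headers come from the email above.

```python
from email import message_from_string

# The three headers are the ones quoted above; the body is a placeholder.
RAW_EMAIL = """\
X-Launchpad-Message-Rationale: Reviewer @drizzle-developers
X-Launchpad-Notification-Type: code-review
X-Launchpad-Project: drizzle

Body text.
"""


def review_rationale(raw_message):
    """Return (role, team) from the rationale header; team is None if absent."""
    msg = message_from_string(raw_message)
    parts = msg.get("X-Launchpad-Message-Rationale", "").split()
    role = parts[0] if parts else None
    team = None
    for part in parts[1:]:
        if part.startswith("@"):
            team = part[1:]  # drop the leading "@"
    return role, team


if __name__ == "__main__":
    print(review_rationale(RAW_EMAIL))
```

A mail filter could then file anything with a team set into a lower priority folder than mail where you were asked personally.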

Build branch to archive


This was the original name of the feature, but it is more about recipes now. A recipe is a collection of instructions on how to build a source package. We are still testing this internally, but I'm hoping to get this enabled on edge very soon. This will be extended to add daily builds.

What does this really mean?

Let's say you want to have a daily build of a project, like gwibber. You would then create a recipe that uses trunk as a base branch, merge in the packaging info, and say "Please build this every day into my PPA". And Launchpad will.

Tim Penhey (thumper)

Trivial bugs

This is just a quick note really. One thing I've been trying to do more and more is to address simple bugs in a more timely manner.

I use the tag "trivial" to indicate to me that the bug is very simple to fix. By this I mean that I should be able to have the fix and the test all written in under an hour, and normally under 30 minutes.

Personally I'm (hopefully) fixing one trivial bug a day in addition to other work. This way the simple bugs get some attention, and I get the feeling of accomplishing something when other things are in the pipeline that take longer to get completed.

My scheduling of trivial bugs is somewhat arbitrary. Often the most recently commented on trivial bug will get my attention.

Tim Penhey (thumper)

Don't wait for perfection

Way back in July I was thinking of writing a post about the new branch listings in Launchpad. I was working on making branch listings for distributions and distroseries work, for example Package branches for Ubuntu and Packages branches for Ubuntu Lucid. But the code wasn't entirely optimised. Then as things happen, the optimisation got pushed back... and back. And finally when I did get the optimisation in, I didn't feel it was worthy of talking about.

I guess the thing to remember is: don't wait for perfection. Sure it wasn't perfect, but if more people were accessing the pages, the optimisation may have happened sooner.

One thing going on at the moment is more integration of the lazr-js widgets. The main merge proposal page now has an in-page multi-line editor for the commit message. Sure, it needs tweaking, but the main functionality is there. More ajaxy goodness is finding its way into Launchpad.

One of the things that I'm thinking about right now is splitting the concepts of code reviews and merge proposals. At the moment we almost use the term interchangeably, which does cause some confusion. I'd like to have the merge proposal reflect the meta-data and information around the intent to have work from one branch be landed or merged into another branch (normally the trunk branch), and the code review the conversation that goes on around the merge proposal. Merge proposals may have an associated code review, but right now, a code review must be associated with a merge proposal.

Associated with this, I'd like to extract some state information. Currently merge proposals have only one status, which really reflects two things. I'd like to break this out into two: review status; and merge status. Review status would be one of: work in progress; needs review; approved; rejected; superseded; and maybe withdrawn. Merge status would be one of: proposed; merged; queued; and merge failed. Queued relates to the merge queues which are currently partially implemented in the Launchpad codebase, and merge failed is the state that a proposal would be set to when a landing robot like PQM or Tarmac attempts to land the branch but it fails due to either merge conflicts or test failures.
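
To make the split concrete, here is a rough sketch of the two statuses as separate Python enums. None of this is Launchpad code: the class names, the field names and the default values are assumptions made up for the example, and only the status values themselves come from the paragraph above.

```python
from dataclasses import dataclass
from enum import Enum


class ReviewStatus(Enum):
    WORK_IN_PROGRESS = "work in progress"
    NEEDS_REVIEW = "needs review"
    APPROVED = "approved"
    REJECTED = "rejected"
    SUPERSEDED = "superseded"
    WITHDRAWN = "withdrawn"  # listed only as a "maybe" in the text


class MergeStatus(Enum):
    PROPOSED = "proposed"
    MERGED = "merged"
    QUEUED = "queued"
    MERGE_FAILED = "merge failed"


@dataclass
class MergeProposal:
    """Hypothetical record holding the two independent statuses."""
    source_branch: str
    target_branch: str
    # Default values are assumptions for this sketch.
    review_status: ReviewStatus = ReviewStatus.WORK_IN_PROGRESS
    merge_status: MergeStatus = MergeStatus.PROPOSED


def landing_failed(proposal):
    """Record a failed landing attempt without touching the review outcome."""
    proposal.merge_status = MergeStatus.MERGE_FAILED
    return proposal
```

The point of keeping them apart is that a landing robot failing to merge an approved branch changes the merge status only; the review conversation and its outcome stay as they were.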

My goal for the next six months is to write more often, talk about ideas more, and not wait for perfection.

Tim Penhey (thumper)

Breaking up work for review

It was Friday morning after three days of working on one feature. Last thing Thursday I counted the size of the change and it was over 1100 lines and I wasn't quite finished. I found myself in a situation that turns up periodically: I wanted to break up my work into cohesive reviewable chunks. Now it isn't a matter of taking commits x through y as chunk one, and so on, as the size grew organically as I changed what needed to be changed, and wrote what needed to be written without really stopping to think about the size of the change until it was done. However now it was done, I wanted to break it up.

Last time I did this, I used looms, but Aaron told me we could do it easily using his new Bazaar pipeline plugin. So I spent some time talking through with Aaron on how to do it, promising to write it up if it worked well. I must say that it was good. During the process we identified a number of enhancements to the plugin to make it even easier.

I'm going to show the progression we made, along with our thoughts. I have trimmed some of the output when I've decided that it doesn't add value.

The first thing I had to do was to get the pipeline plugin.

$ bzr branch lp:bzr-pipeline ~/.bazaar/plugins/pipeline


Unfortunately this seemed to clash slightly with the QBzr plugin. They were both trying to redefine merge. Personally I don't use QBzr and had probably just installed it to take a look, so I removed that plugin.

Caution: the pipeline plugin relies on switch so works with lightweight checkouts. This is how I work anyway, so I didn't have anything to do here, but if you work differently, YMMV.


The pipeline plugin is designed around having a set of branches (individual pipes) one after the other that form a pipeline, clever eh? When you have the pipeline plugin, any branch is also considered a pipeline of one.

$ bzr show-pipeline
* nice-distribution-code-listing


What I was wanting to do was to break up this work into a number of distinct change sets, each that could be reviewed independently. We decided that the way to do this was to create a pipe before the current one, and bring changes in. This is done with the command add-pipe.

$ bzr add-pipe factory-tweaks --before nice-distribution-code-listing
$ bzr show-pipeline
factory-tweaks
* nice-distribution-code-listing


Right here we decided that there should be an easier way to add a pipe before the current pipe, as right now it needs a pipe name. A bug was filed to track this.

You can see from the show-pipeline command that the new pipe is before the current one. The pipeline plugin adds a number of branch aliases:

  • :first - the first pipe in the pipeline

  • :prev - the pipe before the current pipe

  • :next - the pipe after the current pipe

  • :last - the last pipe in the pipeline



Now to make the switch to the first pipe. Both :prev and :first refer to the same branch here, and I could have used either.

$ bzr switch :prev
... changed files shown
All changes applied successfully.
Now on revision 8747.
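
One way to picture the four aliases is as lookups into an ordered list of pipe names. The sketch below is a toy model in Python, not code from the plugin; the function name resolve_alias is made up, and returning None when there is no previous or next pipe is an assumption (the plugin may well report an error instead).

```python
def resolve_alias(pipeline, current, alias):
    """Map a pipeline alias to a pipe name, or None if there is no such pipe.

    `pipeline` is an ordered list of pipe names and `current` is the name of
    the pipe that is checked out. Illustrative model only, not plugin code.
    """
    index = pipeline.index(current)
    if alias == ":first":
        return pipeline[0]
    if alias == ":last":
        return pipeline[-1]
    if alias == ":prev":
        return pipeline[index - 1] if index > 0 else None
    if alias == ":next":
        return pipeline[index + 1] if index + 1 < len(pipeline) else None
    raise ValueError("unknown alias: %s" % alias)


if __name__ == "__main__":
    pipes = ["factory-tweaks", "nice-distribution-code-listing"]
    current = "nice-distribution-code-listing"
    print(resolve_alias(pipes, current, ":prev"))
    print(resolve_alias(pipes, current, ":first"))
```

With only two pipes and the second one checked out, both lookups land on factory-tweaks, which is why either alias worked for the switch.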


Now this pipe was added from the pipe after it, so it starts off with the same head revision. Not exactly the starting point I wanted, so we replaced the head of this branch with the last revision of the trunk branch that we had merged in.

$ bzr pull --overwrite -r submit:


The submit: alias refers to the submit branch. This is often trunk, and is in my project layout (specified using submit_branch in .bazaar/locations.conf).

Now the lower pipe was a copy of trunk. A good place to start adding changes I think. The next problem was how to get the changes from the following pipe into this one. Our first attempt was to merge in the following branch, shelve what we didn't want, throw away the actual merge, but keep the changed text, and commit.

$ bzr merge :next
$ bzr shelve
$ bzr stat
modified:
lib/lp/testing/factory.py
pending merge tips: (use -v to see all merge revisions)
Tim Penhey 2009-07-02 New view added.
$ bzr revert --forget-merges
$ bzr stat
modified:
lib/lp/testing/factory.py
$ bzr commit -m "More default args to factory methods and whitespace cleanup."


Now this seemed very convoluted. Why merge and then forget the merge? It seemed kinda icky, but it worked. The next thing to do is to merge these changes down the pipeline. This is done through another command pump.

$ bzr pump


This merges and commits the changes down the pipeline. If there are conflicts, it stops and leaves you in the conflicted pipe. This didn't occur here, nor did it occur for any of my other ones. Here you can see the commit message that pump used:

$ bzr switch :next
$ bzr log --line -r -1
8714: Tim Penhey 2009-07-03 [merge] Merged factory-tweaks into nice-distribution-code-listing.
$ bzr switch :prev


Now it was time to add the next pipe.

$ bzr add-pipe code-test-helpers
$ bzr show-pipeline
* factory-tweaks
code-test-helpers
nice-distribution-code-listing
$ bzr switch :next


This time, instead of merging in the changes, we shelved them in. The shelve command can apply changes from arbitrary revisions, and it also knows about files. The change that I wanted in this branch was a single added file, so I could tell shelve about that file.

$ bzr shelve -r branch::next lib/lp/code/tests/helpers.py
Selected changes:
+ lib/lp/code/tests/helpers.py
Changes shelved with id "2".


However the big problem with this is it all looks backwards. We are shelving from the future not the past. This really did my head in. Shelve would say "remove this file?" and by shelving it, it would add it in. It worked but made my head fuzzy. We filed a bug about this too. By adding a better way to take the changes, the command could do the reversal for you and provide you with a nicer question.

More of the same happened for the next few pipes, and I won't bore you with repeated commands.

On the whole, the pipeline plugin worked really well. I was able to break my work up into five hunks which could be reviewed easily. In the end I kept working on the branch that was my original, so all my original history remained intact. It would have been just as easy to add another pipe and take the remaining changes. This would have left me with five branches, each with one commit. This works well for the way we work as we have reviews based on branches. Each pipe could be pushed to Launchpad and a review initiated for it. With some more UI polish, I think pipelines will be even more awesome than I think they are now.

Tim Penhey (thumper)

You're doing it wrong!

Just yesterday I found a missing feature in one of the apps I just started using. My thought processes were something along the lines of “hey, I could add this feature and it would be good”. So I went to the project's website, found their source code repository, and got blown away by the comment that was with it:

Please note that code you get from this repository is not intended for productive use (unless it's tagged as a released version, of course, in which case the usual alpha/beta disclaimers apply ;-)). We like to break our codebase, config files, database schemas and all kinds of stuff. We sometimes commit non-compiling revisions to facilitate collaborative development. Running such an unstable version might trash your settings, your backlog and maybe your computer. You have been warned!


Eh? OK, I get the first sentence. It is even a good disclaimer. Tagged releases are more stable. People regularly commit code that is unpolished. Sometimes even with some known bugs or issues.

The second sentence has me going “NO!?! What are you doing?”

The third sentence just blew my mind. This project is using a DVCS. Not my DVCS of choice, but really that doesn't matter. All DVCSs are made to have good merging and sharing of code between developers. Saying “We sometimes commit non-compiling revisions to facilitate collaborative development” is just a lack of understanding of how to use the tools. You are using a DVCS to facilitate collaborative development! This is centralised version control thinking.

Try this for a code to work by:
Trunk should always at least compile, run, and pass all the tests.


This hasn't stopped me wanting to work on the code, but it has raised my caution levels.

Tim Penhey (thumper)

kiwipycon

Today NZPUG held yet another organisation meeting for the first kiwipycon. Organising conferences takes a lot of effort by many dedicated people. The Christchurch python user group has volunteered to host the first PyCon in New Zealand. Personally I suggest things from time to time, but a big thanks goes out to those guys for the hard work that has gone on even before the call for papers.

The dates have been set as Saturday the 7th and Sunday the 8th of November 2009. A weekend was chosen to allow those working or studying who can't get leave to attend. As I understand it they are still working on pricing. The call for papers will probably go out next month some time.

Should be interesting...

Tim Penhey (thumper)

launchpadlib updates for branches

Just a quick note to let you know of a few changes to launchpadlib for branches. Mainly because I've removed a method that I know someone is using.

You used to be able to get to the branches for a project by saying my_project.branches, but I've removed this. It would have been nicer to deprecate it but we don't have a nice deprecation method right now for launchpadlib, and since it is still in beta, I didn't feel too bad.

The branches of a project was an attribute, now we have a getBranches method. The old attribute would give you all the branches of the project, including the merged and abandoned ones. The method defaults to give you the active branches, and allows you to pass in the statuses that you'd like to get.
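
The behaviour described above can be modelled with a small stand-in class. To be clear, this is not launchpadlib: ProjectModel, the status keyword argument and the status names are all assumptions for the sake of a runnable illustration, so check the real API before relying on any of them.

```python
# Status names below are assumptions for illustration.
ACTIVE_STATUSES = ("Experimental", "Development", "Mature")


class ProjectModel:
    """Stand-in for a project object; this is not the launchpadlib class."""

    def __init__(self, branches):
        # branches: list of (name, status) tuples
        self._branches = list(branches)

    def getBranches(self, status=None):
        """Return branch names, limited to active statuses unless told otherwise."""
        wanted = ACTIVE_STATUSES if status is None else tuple(status)
        return [name for name, state in self._branches if state in wanted]


if __name__ == "__main__":
    project = ProjectModel([
        ("trunk", "Development"),
        ("old-feature", "Merged"),
        ("dead-end", "Abandoned"),
    ])
    print(project.getBranches())
    print(project.getBranches(status=["Merged", "Abandoned"]))
```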

Also with this change you can now get the branches for a project group, or the branches owned by a person using the same getBranches method call.

Project groups also grew the method getMergeProposals in the same way that the method was already available for people and projects.

Please file any bugs on the launchpad-code project on Launchpad.

Tim Penhey (thumper)

lp:mad

One of the things that I have spent quite a lot of time on recently is the code review stuff in Launchpad. Recently, as of the 2.2.2 release, new merge proposals get a review diff created for them automagically. This review diff is based on the changes that have been done in the branch relative to the least common ancestor (LCA) of the target branch. Since the review diff only has changes that have been added, there is no way for this diff to ever have conflicts.

There is another diff that is useful to see however. This is the diff of what changes would happen if the source branch was merged into the target branch right now. Sometimes this might conflict. Sometimes this might be a smaller diff as some other dependent functionality has landed. This diff isn't generated automatically by Launchpad. However, there is something that you can run to add it.

The Merge Analysis Daemon

Alright, it isn't exactly a daemon (yet), but the name was cool.

What this script does, using the launchpadlib API, is get all of the current merge proposals for a project and work out the diff that would result if each were merged right now, which is what we call the preview diff.

What do you need

Firstly you need branches of wadllib, launchpadlib, and mad.

$ bzr branch lp:wadllib
$ bzr branch lp:launchpadlib
$ bzr branch lp:mad

Inside the mad directory, there is the LICENSE file (GPL v3), and the script.

The script has many parameters.

$ ./mad.py --help
Usage: mad.py [options]

Options:
  -h, --help            show this help message and exit
  -v, --verbose         Display extra information
  -q, --quiet           Display less information
  -p PROJECT, --project=PROJECT
                        The name of the Launchpad project.
  -r DIR, --repo=DIR    The location of a local repository to use.
  --dry-run             Don't upload the diffs
  --force               Force an update of the diff.
  --staging             Update the proposals on staging.launchpad.net.
  -c FILE, --credentials=FILE
                        The credentials file. Defaults to ~/.launchpad/mad
  --cachedir=DIR        The location of the cache directory. Defaults to
                        ~/.launchpadlib/cache.
  --no-op               Don't get the proposals for the project.
  --new-credentials     Get a new OAuth token and save in the credentials
                        file.

I have the following in the server's crontab listing:


20 * * * * PYTHONPATH=/home/tim/launchpadlib:/home/tim/wadllib /home/tim/mad/mad.py -p storm -r /home/tim/sandbox/mad-playground -c /home/tim/.launchpad/mad -v >> /home/tim/mad-storm.log 2>&1


Basically this says:

  • At 20 minutes past every hour

  • Run the mad.py script using a PYTHONPATH that knows about wadllib and launchpadlib.

  • Use the credentials file ~/.launchpad/mad

  • Use the repository at ~/sandbox/mad-playground

  • Be verbose

  • Make all output go to a log file.


If the specified repository directory does not exist, a new shared repository with no working trees is created. If there is an existing repository, it will use that.

Each of the source and target branches is pulled into the repository. MAD won't create branches for them, it just grabs all the necessary revisions. MAD then calculates the diff that would result if the source was merged into the target, and sends that to Launchpad to have it annotate the proposals. As an example, see the storm ones.

You will also need to have permission to edit the proposals that you are wanting to update. If you are the person that is running the project, and are in the team that owns the target branches, then you should be able to update them.

There is a --staging option to test the script against what is in staging.

The script also walks you through the necessary OAuth token acquisition the first time you run the script.

Report bugs on Launchpad.

Tim Penhey (thumper)

Shallow branches or history horizons

There is an idea floating around and I'm curious to see if it is an idea that has merit and worth putting effort into. This idea is in the DVCS space and is called "shallow branches" or "history horizons".

The concept itself is pretty simple. When using a DVCS with a project with a long history, each and every user has a copy of this history. Now much of this history may be ancient (for some definition of ancient, 6 months, 6 years, whatever). Most developers will never have a need to go into the ancient history of a project, and so a truncated history is fine as long as the branches that they create are still mergeable with the main repository.

Here's how it could play out:


  • Bob wants to work on the fooix project to fix a minor bug, this is Bob's first look at the fooix source. The fooix project has been around for eons and has a huge history. Bob doesn't care about the history, he just wants to do his simple fix (think a typo).

  • Bob grabs the fooix trunk branch but only gets enough history to create the working files.

  • Bob makes his fix, and publishes his branch for the fooix developers to grab.



The advantage here is that when Bob grabs his branch, he is only getting just enough history to work, and so his resulting repository is smaller and faster.

Commands that worked by inspecting the history would stop at the repository's horizon and say something like "and that's all I've got". Obviously there'd need to be a way to say "go and get me another 4 months of history" or even "ok, now I'm really interested, get me the complete history".

This is conceptually different from a lazy loading or stacked repository as there is an explicit horizon where normal history commands stop.
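
As a thought experiment, here is how a history command might behave at the horizon. This is a toy model in Python and not how any existing DVCS implements it; it assumes a purely linear history, and the function and parameter names are invented for the example.

```python
def log_until_horizon(parents, available, tip):
    """Walk back from `tip`, stopping where the local history ends.

    parents:   dict mapping revision id -> parent revision id (None at the root)
    available: set of revision ids actually stored locally
    Returns (revisions_seen, hit_horizon).
    """
    seen = []
    rev = tip
    while rev is not None:
        if rev not in available:
            # The horizon: "and that's all I've got".
            return seen, True
        seen.append(rev)
        rev = parents[rev]
    return seen, False


if __name__ == "__main__":
    parents = {"r3": "r2", "r2": "r1", "r1": None}
    print(log_until_horizon(parents, {"r3", "r2"}, "r3"))
```

Fetching "another 4 months of history" would then amount to adding more revision ids to the available set and running the same walk again.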

So lazyweb, the question I have is this: "Is this a worthwhile feature in a DVCS tool?"

Tim Penhey (thumper)

Bazaar has the model right

Some people in the GNOME community have suggested that if Bazaar has nice usability, then GNOME can just use Git on the back-end, and Bazaar lovers can just use the Git back-end via Bazaar. It's true that Bazaar could support this — an experimental plug-in exists to do this right now. But this suggestion betrays several wrong assumptions.

People assume Git and Bazaar are the same. They're not. People assume that if Git and Bazaar have technical differences, then Git must have it right.

The problem with these assumptions is that usability begins at the ground level. Bazaar started with a focus on usability. Git began with a focus on speed. The data models of both Bazaar and Git reflect their initial focus. But Bazaar's model can also be fast. In fact, the Bazaar developers are currently optimising a number of key operations for speed.

Data retrieval

Git and Bazaar are both key/value mapping systems. When bytes are needed, they are requested with that key.

The big difference is that Git's keys are also the hashes of the bytes. This is why it's called a content-addressable file system. This allows git to offer a guarantee that if the value hashes to the key, it has not been modified, whether deliberately or by accident. The Bazaar team considered adopting this approach, but decided it was too constricting. Bazaar uses UUIDs instead.
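
The difference between the two keying schemes can be sketched in Python. The blob hashing below follows Git's well-known scheme (SHA-1 over a "blob <size>" header plus the bytes); everything else is a simplified model with invented class names, and in particular the use of uuid4 and a SHA-1 stored beside the value is a stand-in, not Bazaar's actual storage format.

```python
import hashlib
import uuid


def git_blob_key(data):
    """Key a blob the way Git does: SHA-1 over a small header plus the bytes."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


class ContentAddressedStore:
    """The key is derived from the value, so any change produces a new key."""

    def __init__(self):
        self._values = {}

    def put(self, data):
        key = git_blob_key(data)
        self._values[key] = data
        return key

    def get(self, key):
        data = self._values[key]
        if git_blob_key(data) != key:
            raise ValueError("stored bytes do not match their key")
        return data


class UuidKeyedStore:
    """The key is an arbitrary unique id; a hash is kept beside the value."""

    def __init__(self):
        self._values = {}

    def put(self, data):
        key = uuid.uuid4().hex  # simplification: any unique id would do
        self._values[key] = (hashlib.sha1(data).hexdigest(), data)
        return key

    def get(self, key):
        digest, data = self._values[key]
        if hashlib.sha1(data).hexdigest() != digest:
            raise ValueError("stored bytes do not match their recorded hash")
        return data
```

In the first store the same bytes always get the same key, so the key doubles as an integrity check. In the second the key says nothing about the content, which leaves the stored representation (and the hash function) free to change without changing the key.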

Authenticating revisions

For detecting malicious modification of revisions, Git uses its cryptographic hashes.

Bazaar uses revision-signing. All revisions can be PGP-signed. No signed revision can be forged. And the hashed representation can easily be generated and passed around to ensure that exactly the same content is used.

If SHA-1 is broken, both Bazaar and Git will lose their ability to detect malicious modification. But since Bazaar uses UUIDs to identify revisions, users can re-sign their old revisions with whatever method proves to be secure. Changing the hash used by Git would make it incompatible with all existing repositories.

Data Integrity and Serialization formats

Bazaar stores hashes of every value, so it is equally capable of detecting accidental modification. It can be useful to have different representations of a tree in different repositories. For example, when Git lists files, it divides this data by directory. This is a good approach, but not necessarily the best approach. An alternative approach would be to use a radix tree. This would ensure that Git performed quickly even if users put unreasonable numbers of files in a single directory. But because Git's keys are hashes, upgrading Git's format to use radix trees would change the keys, which means that people could not use the commit-id from one repository to refer to the same tree in another form.

Bazaar doesn't assume it has the perfect format. It provides an upgrade path, and doesn't change the commit-id of a revision if you change your format. What's more, Bazaar can even reference data it has never seen. This allows partial imports from other VCSes to be fully compatible with more complete imports. And if a VCS provides UUIDs (content hashes certainly qualify as UUIDs), Bazaar can refer to those UUIDs directly.

File and directory representation

Git refers to files by path. It makes no attempt to track renames in its data store.

Bazaar has an inode abstraction; files and directories both have ids. When a file is renamed, its id stays the same. Bazaar's core code refers to files by their id, so merging a renamed file requires no special effort.
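
A tiny model shows why ids make renames a non-event. This is a sketch in Python with invented names (Inventory, apply_edit, the "foo-c-id" file id) and is far simpler than Bazaar's real inventory code.

```python
class Inventory:
    """Tiny model of a tree that tracks files by id rather than by path."""

    def __init__(self):
        self._paths = {}  # file id -> current path

    def add(self, file_id, path):
        self._paths[file_id] = path

    def rename(self, file_id, new_path):
        # The id stays the same; only the path recorded for it changes.
        self._paths[file_id] = new_path

    def path_of(self, file_id):
        return self._paths[file_id]


def apply_edit(inventory, contents, file_id, new_text):
    """Apply a content change addressed by file id, wherever the file now lives."""
    contents[file_id] = new_text
    return inventory.path_of(file_id)


if __name__ == "__main__":
    inv = Inventory()
    contents = {}
    inv.add("foo-c-id", "foo.c")
    inv.rename("foo-c-id", "bar.c")
    # An edit made elsewhere against the old name still finds the file.
    print(apply_edit(inv, contents, "foo-c-id", '#include "bar.h"\n'))
```

Because the incoming change is addressed to the id, it lands on bar.c without any guessing about which path the other side meant.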

Git's approach means that users are warned not to rename files while changing their content. But when files are renamed, those files that refer to the renamed files must have their contents changed as well. For example, if you rename foo.h and foo.c to bar.h and bar.c, you should update the contents of bar.c, or else you will break the build. With Bazaar, users can do whatever they want, and the VCS just works. While Git must always use heuristics to deduce renames, Bazaar does not have to. Of course, it can if it wants to. This is an example of why it is important to design a model for usability from the beginning.

Bazaar can import rename data losslessly from foreign VCSes. Some other VCSes support file-ids, and Bazaar can reuse those without change. For VCSes that support renames, but not file-ids, Bazaar's representation is also non-lossy. When data imports are deterministic and non-lossy, it's easy to export them back to their source VCS. Bazaar's Subversion integration is a great example of how this can work.

Choose the back-end with the right model

In any situation it makes sense to use a back-end that stores the richer dataset. It makes more sense to have a front end client that doesn't use all the functionality or data representation of the back-end than it does to have a richer client that isn't able to store the required information as the back-end is not able to represent it.

If a single back-end storage is going to be used, it makes more sense to use a Bazaar back-end as Bazaar is able to represent everything that Git does, but the reverse is not true.

Conclusion

The Bazaar developers focused on usability, which requires having a model that supports usability. Bazaar has improved its model to increase the usability of the system. We believe that Bazaar has the right model.

co-written by Aaron Bentley and Tim Penhey

Tim Penhey (thumper)

J5 mentioned in his post his interpretation of the number of users for GIT, Bazaar and Hg (Mercurial). He also finishes with "Converse amongst yourselves".

I guess I should first point out that I am a Bazaar user, and that I work for Canonical. I felt somewhat enraged at the post from J5, and have spent some time trying to work out some response.

John Carr mentioned that 83% of statistics are made up on the spot, and that cannot be more true here. I had been waiting for someone else to post the numbers that they saw at the BOF, but so far I have not seen one.

Here is my take on it.

Yes there were more GIT users than Bazaar users at the BOF, but the numbers were more like 50% of the audience were GIT users, and about 40% were Bazaar users. Someone piped up and said "What about Mercurial?" and so the question was asked, and there were about five or six people. There was an overlap of the GIT and Bazaar groups, and there was by far the larger majority of the audience that had not used any DVCS.

What conclusions can we draw from this? Not much. Many people attending the pre-conference work for larger companies, like Red Hat, Novell, and Nokia, and many of those people work on some hard core linux stuff, many of which have chosen GIT. Many have chosen GIT because that is what the linux kernel is using. Is that a good reason to choose a DVCS? I don't feel that we can really answer that question as I am sure there are strong advocates for both sides.

An interesting question is "Which DVCS is easier for the casual contributor to use?" Surely one of the reasons that a project chooses a DVCS is to allow for more community contributions in an easy to merge way that has a clear contribution history. Bazaar just works. It works for the hard-core developers, but is also easy for those soft-core (?).

From the people I've talked to, and I've tried to talk to many here, the message from those that use Bazaar is that it just works. Bazaar doesn't get in your way of developing the software that you are working on. It is just a tool that works.

One final point. The questions were "Who uses <insert DVCS>?", not "Who likes/loves using <insert DVCS>?".

Tim Penhey (thumper)

Shelving looms

For a feature that I'm currently working on, I decided to try out the loom plugin. Looms have been around for a little while, but I just hadn't gotten around to trying them out.

We have code reviews of all work that is to be merged. Part of this process is to try to limit changes to 800 lines of unified diff. We have found that when the branches have more changes than this the time to review the branch increases non-linearly with the increase in line count. In the past in order to break up a "chunky" branch I would branch from trunk for the first part:

$ bzr cbranch trunk feature-part-1

(I use cbranch as I don't have my working trees in my repository. This is another story to write about. cbranch is found in bzrtools.)

Once this part was complete, I would branch from that for part-2:

$ bzr cbranch feature-part-1 feature-part-2

Complications come in when I want to bring an updated trunk into my branch for part-2, as it makes getting a diff of changes much more difficult as I can no longer generate a diff simply. This problem propagates if I need three or four parts to implement the feature.

Enter looms. Looms provide a new branch format for Bazaar. To convert your branch to the new format, you use the command loomify. You can then create threads of your loom. Each thread is like another branch.

So, the process goes something like this:

$ bzr cbranch trunk my-new-feature
$ cd my-new-feature
$ bzr loomify
$ hack, commit, hack, commit, hack, commit
$ bzr create-thread next-part-of-feature

Creating a thread is like creating a new branch which has the same revisions as the last thread.

$ hack, commit, hack, commit, et al

However, after several hours of hacking away, and several diversions in the code that needed fixing, I checked the size of the branch.

$ bzr diff -r thread:

(The loom plugin adds a revision specifier to easily allow things like this to see all the changes that were introduced by the current thread.)
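As a rough sketch of checking a diff against the 800-line review limit, the following counts the lines of a unified diff held in a string. The names diff_size, over_limit and REVIEW_LIMIT are made up for this illustration and are not part of bzr or the loom plugin, and I'm assuming the limit counts every line of the diff output rather than only added and removed lines.

```python
# Hypothetical helper names; the limit is the 800 lines of unified diff
# mentioned earlier in the post.
REVIEW_LIMIT = 800


def diff_size(diff_text):
    """Return the total number of lines in a unified diff."""
    return len(diff_text.splitlines())


def over_limit(diff_text, limit=REVIEW_LIMIT):
    """True when the diff has more lines than the review limit."""
    return diff_size(diff_text) > limit
```

The text to measure would be whatever bzr diff -r thread: prints for the current thread.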

Oh, damn. The resulting size was about 1100 lines. Now while our 800 line limit isn't set in stone, it is considered a bit rude if you could have broken it up but didn't. Stepping through the diff I identified three distinct chunks of work that could be broken out for review. My question was "now that I have all this in a loom, and I have a vested interest in keeping the loom as the current work is based on an earlier thread, how the hell am I going to break this work up?" Shelve to the rescue. Shelve is also found in bzrtools. I also wanted to have the threads named reasonably, and unfortunately there is no easy way to rename a thread right now. I wanted to have the threads named alpha, bravo and charlie (well, not really, but you get the picture). The first step is to create alpha and get rid of the current thread.

$ bzr create-thread alpha
$ bzr down-thread

create-thread takes you to the new thread too, so I use down-thread to go back to the thread I was working on before.

$ bzr combine-thread

This effectively discards the current thread. The assumption is that the changes from this thread had been merged into the lower thread through merging another branch. This hasn't happened in this case, and discarding is what we want here. So now I'm at the state where I was before except my thread is now called "alpha". Now to break out the changes for alpha.

$ bzr create-thread bravo
$ bzr down-thread

I created a thread bravo. This is also at the state where all three parts are there and working. Next I went back to the "alpha" thread. Now we use shelve with the revision specifier that looms introduce. By default, shelve will just allow you to shelve (or put to one side) the uncommitted changes.

$ bzr shelve -r thread:

Now I get lots of questions. Do I want to shelve each of the chunks? I shelve all changes that are unrelated to the alpha feature. What I'm left with after this command is a working tree as it would be if I had just written the alpha feature on a clean thread. I checked the results with:

$ bzr diff -r thread:

Looks good, so commit.

$ bzr commit -m "Shelved changes unrelated to alpha."

Now for the magic.

$ bzr up-thread

This takes me back to "bravo". However up-thread also merges in the changes from the thread below. Now my tree is showing that all the changes relating to bravo and charlie have been removed. The actual merge magic is done with this command:

$ bzr revert .

Take a special note of the dot. Without the dot the revert would revert the entire merge. I don't want this. I just want to revert the changes to the tree. I need bzr to remember that I have merged the changes from the thread below. This is exactly what the "revert ." does. The changes to the tree are reverted, but the merge isn't. Next you need to commit.

$ bzr commit -m "Merge from alpha while splitting up the changes."

Now I have the alpha thread with just the changes needed to implement alpha, and a bravo thread that appears to introduce the bravo and charlie features. I also have a .shelf directory (created by shelve). Since I have no intention of unshelving these changes (as they are already there), I delete this directory. I'm not sure if this is strictly necessary, but I like to run a clean shop.

To break apart the bravo and charlie features I repeated the process. The end result was three separate threads that each appear to introduce a single feature.

Phew.

One point of caution. Sometimes in the breaking apart, you don't always get a clean break. In these situations you need to keep more than you need (i.e. don't shelve that change), and once the revert and commit is done, then go back to the earlier thread and clean up. If you try to do it earlier, the changes will be thrown away in the revert dot command, and then it just gets messier.
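To keep the order of the steps straight, here is a small sketch that only lists the commands for one round of the split as I ran them above. It does not run bzr, and split_round is a name invented for this illustration. The first flag covers the extra create-thread, down-thread, combine-thread dance I used to get a nicely named bottom thread.

```python
def split_round(lower, upper, first=False):
    """List the bzr commands for one round of the split described above.

    lower is the thread that ends up holding a single feature; upper is
    the new thread that keeps everything else for now.
    """
    steps = []
    if first:
        # Rename trick: create the well-named thread, step back down,
        # then discard the old thread.
        steps += [
            "bzr create-thread %s" % lower,
            "bzr down-thread",
            "bzr combine-thread",
        ]
    steps += [
        "bzr create-thread %s" % upper,
        "bzr down-thread",
        # Interactive: shelve every chunk unrelated to the lower feature.
        "bzr shelve -r thread:",
        'bzr commit -m "Shelved changes unrelated to %s."' % lower,
        # Moves up and merges the lower thread into the upper one.
        "bzr up-thread",
        # Put the tree back but keep the record of the merge.
        "bzr revert .",
        'bzr commit -m "Merge from %s while splitting up the changes."' % lower,
    ]
    return steps
```

Calling split_round("alpha", "bravo", first=True) and then split_round("bravo", "charlie") gives the two rounds from this post.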

All in all I'm really enjoying working with looms. I currently have about eight threads, and will probably need another four or five to finish the feature, but this way it is dead easy to keep the changes small and distinct and simple to review.

Tim Penhey (thumper)

bzr alias

I now officially have some of my code in an open source project where the work was actually done entirely on my own time. Despite being involved with professional software development for almost twenty years, I've normally kept my programming to work hours, or private projects.

This time though, it was different. This time it was truly scratching an itch. I've become somewhat of a Bazaar convert. Bazaar has for a long time allowed you to define command aliases in your bazaar.conf configuration file. I have always used bash aliases for commands that I do really often, so it seemed natural to me to define aliases for bazaar for commands that I used often as well.

Next came the internal conflict. I am an inherently lazy person. This is why I like aliases: less typing. One of the things that bugged me was having to actually edit the configuration file any time I wanted to add or modify an alias. This bugged me. It bugged me so much it caused me to actually do something about it. Luckily for me Bazaar is written primarily in Python, and this just happens to be my current favourite language.

The code is now committed to trunk, and should be available generally in bzr 1.6.


  • bzr alias — lists your current aliases

  • bzr alias commit="commit --strict" — sets the alias for commit

  • bzr alias commit — print out the current alias for commit

  • bzr alias --remove commit — removes the commit alias


Obviously if you use aliases as much as I do, one of the first things you'll do is set the alias for unalias.

bzr alias unalias="alias --remove"

My current aliases (that I'll tell you about):

tim@slacko:~/src/bzr/alias-command$ ./bzr alias
bzr alias cbranch="cbranch --lightweight --hardlink"
bzr alias col="checkout --lightweight"
bzr alias commit="commit --strict"
bzr alias lastdiff="diff -r-2..-1"
bzr alias lastlog="log -r-2..-1"
bzr alias ll="log --line -r-10..-1"
bzr alias my-missing="missing --mine-only"
bzr alias unalias="alias --remove"
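For anyone wondering what an alias amounts to, here is a minimal sketch of the idea: the first word of the command line is swapped for the alias text and the remaining arguments are kept. This is an illustration only and not the code that went into bzr; expand_alias and ALIASES are names made up for the example.

```python
import shlex

# A few of the aliases listed above, as name -> replacement text.
ALIASES = {
    "commit": "commit --strict",
    "lastdiff": "diff -r-2..-1",
    "ll": "log --line -r-10..-1",
    "unalias": "alias --remove",
}


def expand_alias(argv, aliases=ALIASES):
    """Replace the first word of argv with its alias, if it has one.

    The replacement is expanded only once, so an alias such as
    commit="commit --strict" does not loop forever.
    """
    if not argv:
        return []
    replacement = aliases.get(argv[0])
    if replacement is None:
        return list(argv)
    return shlex.split(replacement) + list(argv[1:])
```

With this sketch, expand_alias(["unalias", "commit"]) gives ["alias", "--remove", "commit"], which is the shape of the unalias trick above.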

Tim Penhey (thumper)

Code in Launchpad

Launchpad offers many things to developers, and open source software developers in particular. One of these things is the ability to host Bazaar branches. For those that have looked a little deeper, they will have noticed that there are four types of branches in Launchpad: Hosted; Mirrored; Remote; and Imported. Hmm, this isn't really what I was intending to talk about at all, but I'm going to go with the flow.

Hosted branches are those where Launchpad is the primary public location of the branch. Hosted branches are normally created by pushing a branch directly to Launchpad. Before you do that though, you need to have registered on Launchpad, and supplied an SSH key. This is how Launchpad knows who you are. There are two ways you can push a branch to Launchpad: one is via SFTP; and the other using the Bazaar smart server (bzr+ssh).

As an example I'm going to use my alias-command bzr branch. The complete SFTP location would be sftp://thumper@bazaar.launchpad.net/~thumper/bzr/alias-command, and the smart server one bzr+ssh://thumper@bazaar.launchpad.net/~thumper/bzr/alias-command. These are a bit unwieldy, so we extended the lp type urls for bzr to support writing if the launchpad plug-in knows who you are. In order for you to do this you use the lp-login command. bzr lp-login will tell you the username that is currently set. If you have not done this yet, you'll see a message like "No Launchpad user ID configured." I set mine by saying bzr lp-login thumper. This stores thumper as the launchpad_username in the bazaar.conf file. This also means I can use bzr push lp:~thumper/bzr/alias-command to push to my hosted Launchpad branch.
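To show how the short and long forms line up, here is a sketch that rewrites an lp: location for a personal branch into the long bzr+ssh form given above. It is plain string handling written for this post; I'm not claiming this is how the Launchpad plug-in resolves names, and lp_to_ssh is a made-up function name.

```python
LAUNCHPAD_HOST = "bazaar.launchpad.net"


def lp_to_ssh(lp_url, username):
    """Turn lp:~owner/project/branch into the long bzr+ssh form.

    username plays the role of the name stored by bzr lp-login.
    """
    if not lp_url.startswith("lp:"):
        raise ValueError("not an lp: URL: %r" % lp_url)
    path = lp_url[len("lp:"):].lstrip("/")
    if not path.startswith("~"):
        # Project and series shortcuts need Launchpad itself to resolve.
        raise ValueError("only ~owner/project/branch paths are handled here")
    return "bzr+ssh://%s@%s/%s" % (username, LAUNCHPAD_HOST, path)
```

For my branch, lp_to_ssh("lp:~thumper/bzr/alias-command", "thumper") returns the bzr+ssh location quoted above.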

Mirrored branches allow you to have your branches stored publicly in some location that you control, and you let Launchpad know where this is. Launchpad will then update its copy of your branch every six hours. This is handy if you don't have an SSH key, or you have a slow network connection, or you just like having your branches available on your own server.

Remote branches are a bit different. Remote branches were sort of created out of necessity. Some people were registering mirrored branches with unreachable locations. Some of these were possibly by mistake, but quite a few were obviously inaccessible. But more strange is that those branches were linked to bugs or blueprints. There was obviously a desire to have branch meta-data there, but not actually allow Launchpad to get access to the branches. So we have remote branches. You cannot get a copy of a remote branch from Launchpad as Launchpad does not have a copy of it.

Imported branches are those branches where Launchpad gets the code from either CVS or Subversion, and puts it into a Bazaar branch. I was really wanting to talk about this as I saw two projects recently where we are importing code that I didn't know about. One is my favourite music player, Amarok, and the other was MPlayer. Just out of curiosity I looked at both of these branches on Launchpad. The Amarok one has 12195 revisions as I'm writing this, and the last revision was 11 hours old, and MPlayer had even more revisions, at 26761. However that isn't even the cool bit. What is really nifty is you can go bzr branch lp:amarok or bzr branch lp:mplayer to get the code. Just to check I did just that, and got a copy of the amarok source. It was the first bit of C++ I had looked at in a long time (it used to be all I did).

Anyway, that was what I really wanted to say. Oh yeah, and bzr rocks.

Tim Penhey (thumper)

The Launchpad branch directory service

Recently Bazaar grew a branch directory service. This allows plug-in developers to define custom "protocols" that resolve the branch names into some other branch location.

The Launchpad plug-in defines a protocol "lp". Launchpad uses the other parts of the URL to relate to projects, series or individual's branches. The shortest valid URL for Launchpad is something like lp:do. You can also use an empty authority (or site part of a URL), so lp:///do is exactly the same, just longer. Personally I prefer the one without all the slashes.

The do part of the branch location relates to the GNOME Do project on Launchpad. There is a little magic (a.k.a. configuration) that is needed to make this work. Projects in Launchpad get an initial development focus series created for them. This is intended to relate to the branch of development that is where current or new work goes. In order to have the code available through the Launchpad directory service, the code has to be available through Launchpad as a normal branch.

Once a branch has been either hosted, mirrored or imported for the project, one of the people responsible for the project in Launchpad can relate the branch with the series. Once this is done the branch is easily accessible. People that have permission to make these links will be shown a link on the main code tab for their project (we don't taunt people who can't make the links with an invalid option).

If there is another series, say 1.0 for our project fooix, and we have a branch associated with it, then we could get the 1.0 branch using lp:fooix/1.0. Normal Launchpad branches are also accessible using the lp protocol using lp:~username/project/branch-name.
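The three shapes of lp: location described in this post can be told apart by looking at the path. The sketch below does just that; it is an illustration under my own assumptions (the function name parse_lp and the returned dictionary keys are made up) and says nothing about how Launchpad turns the result into a real branch location.

```python
def parse_lp(location):
    """Split an lp: location into the parts described in this post."""
    if not location.startswith("lp:"):
        raise ValueError("not an lp: location: %r" % location)
    # lp:///do and lp:do mean the same thing, so drop the empty authority.
    path = location[len("lp:"):].lstrip("/")
    parts = path.split("/")
    if parts[0].startswith("~"):
        # lp:~username/project/branch-name
        if len(parts) == 3 and all(parts):
            return {"kind": "branch", "owner": parts[0][1:],
                    "project": parts[1], "branch": parts[2]}
        raise ValueError("expected ~owner/project/branch: %r" % location)
    if len(parts) == 2 and all(parts):
        # lp:fooix/1.0
        return {"kind": "series", "project": parts[0], "series": parts[1]}
    if len(parts) == 1 and parts[0]:
        # lp:do, the development focus of the project
        return {"kind": "project", "project": parts[0]}
    raise ValueError("unrecognised lp: location: %r" % location)
```

Both parse_lp("lp:do") and parse_lp("lp:///do") come back as the same project entry, which matches the point about the empty authority.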

Tim Penhey (thumper)

The December push-up challenge

Yay, finished early. We had a challenge running at Canonical to do 2000 push-ups during December. I've just done my 2000th push-up for the month. It has actually been quite good. I've found that the most important bit is just to do a set regularly, and really try for four or five sets a day.

Also I've finished work for the year. Another yay! It has been a good year, with much good stuff getting done. Even more interesting stuff is due to come out in the first six months of next year.

With another year heading past, the girls are a year older. Jessie is starting school in February next year at the start of term, and Maia is now old enough to go to playcentre by herself (and she has done so at least once).

It seems time to start thinking about new year resolutions...

Tim Penhey (thumper)

Cambridge, MA

Well, my body is now almost completely adjusted to US east coast time. I do find myself waking each night between two and three in the morning. I'm thinking that this might be due to me expecting to hear Jessie or Caitlin waking, but here it is only Joey.

The rest of the Launchpadders arrived tonight. I ended up meeting so many of them it isn't really worth me writing them down for fear of forgetting someone. We have the agile training tomorrow, which should be quite interesting.

Cambridge, or the small bits of it that I have seen, reminds me of London. But perhaps that is just because they are digging up the roads all over the place, and the pavement is made up of concrete slabs and bricks that have been laid for so long that they are no longer flat.

I thought the USA was supposed to be cheap, but so far the meals out in the evening have cost about the same as I'd pay back in NZ (converting the dollars). However, books and DVDs are generally cheaper. I've put through an order to amazon.com and am getting it delivered to the hotel.

Tim Penhey (thumper)

Trivialities

Well the laptop has been running mostly well since I upgraded it. Just a few times when it really had problems. Most of those seem to have been solved by making KDE start a fresh session every time rather than trying to use what I had running last time. It means a bit of manual tweaking each time I log in, but I tend to just leave it on most of the time.

I have Jono and James here this week working with me and we are attacking both features and usability issues with Launchpad and Bazaar. I feel that things are coming along nicely.

I've just joined a squash club, but have yet to organise my first game. I know that I'll be really stiff after the first smack around.

Tim Penhey (thumper)

Going gutsy

I upgraded my desktop machine to Gutsy (Gutsy Gibbon, Ubuntu 7.10) several weeks ago, and it has been going reasonably well, so it is time to upgrade the laptop. The distribution upgrader application happily told me that I needed to download around 1.4 gig of updates that would take around 2 days 7 hours using a 56K modem. I'm happy that I have a decent ADSL connection, although it is still saying that it has got between 4 and 7 hours remaining depending on the current download speed.
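Out of curiosity, the modem estimate can be checked with a little arithmetic. The sketch below assumes that 1.4 gig means 1.4e9 bytes, that a 56K modem moves exactly 56,000 bits per second, and that there is no protocol overhead; these are my assumptions, not anything the upgrader documents, and download_time is a made-up name.

```python
def download_time(size_bytes, bits_per_second):
    """Return (days, hours, minutes) needed to move size_bytes."""
    seconds = size_bytes * 8 / bits_per_second
    minutes = int(seconds // 60)
    days, rem = divmod(minutes, 24 * 60)
    hours, mins = divmod(rem, 60)
    return days, hours, mins


# 1.4 gig taken as 1.4e9 bytes, 56K as 56,000 bits per second.
print(download_time(1.4e9, 56000))  # prints (2, 7, 33)
```

That is 2 days, 7 hours and 33 minutes, which lines up with the "2 days 7 hours" the upgrader reported.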

The official Gutsy release candidate is in about a week or two. If you feel so inclined, you too can upgrade and help report bugs.

I just wish I could figure out how to use my favourite window picker that beryl has that compiz doesn't seem to (or at least I haven't found it yet).
