Canonical Voices

Posts tagged with 'performance'

jameinel

The last of my patches is queued up to land, so I figured I’d post an update about the performance improvements I’ve been working on. I’m also just excited about how well it has all come together.

There were essentially 3 changes that mattered for performance on large trees.

  1. Fixing iter_entries_by_dir() to preload the data in Repository- optimal ordering rather than by-request ordering. In large trees this was causing us to thrash and become pathologically slow. In the 70,000-file test tree, thrashing took about 3 minutes, the preloading version takes about 15s. This affected a lot of our commands, though I guess the next two fixes would actually reduce the number of commands affected by this.
  2. Fixing several code paths to use optimized iter_changes() rather than the generic iter_changes(). The generic path walks both inventories iter_entries_by_dir() and compares them. Our 2a format Repository can do iter_changes without loading the whole tree. (It internally uses a hash_trie to store the inventory, and so nodes with matching sub-trees can be skipped for comparison.) This generally shows up as something that was taking 15s (to load the whole inventory) dropping to <2s for the improved comparison. (bzr revert and bzr pull were both directly impacted here)
  3. Changing WT.set_parent_trees([one_tree]) to update itself using current_basis.iter_changes(one_tree), rather than setting the state from scratch. This basically adds another case where we can avoid reading the whole inventory state again, which is another 15s to <2s sort of change. This only showed up after fixing (2), because once the tree is loaded, the other actions are generally pretty quick. (bzr up, bzr pull)

This is the chart I put together for “whats-new-in-2.4.txt”. bzr-2.3.2 will have fix (1), but not (2) or (3), to give a feel for how much of an impact different fixes have had.

    bzr-2.3.1 bzr-2.3.2 bzr-2.4  action
    3m39s         1m08s   1m03s  bzr co --lightweight
      38s            8s      2s  bzr revert (in a clean tree)
    4m47s         3m56s     15s  bzr merge
    4m45s           20s      3s  bzr pull
    4m58s         3m00s      2s  bzr up
    9m33s           21s     19s  bzr uncommit (including a merge)
    4m44s           17s      2s  bzr uncommit (simple commit)

So yes, some operations that were taking almost 5 minutes have now dropped down to taking <3s.

You won’t see that dramatic of an improvement for smaller trees, though most cases will have a pleasant improvement. Here is a short list for the ‘Launchpad‘ tree (with ~8k items).

    bzr-2.3.1   bzr-2.4     action
    5.3s        5.2s        bzr co --lightweight
    0.9s        0.3s        bzr revert
    1.4s        0.4s        bzr pull
    3.9s        3.7s        bzr uncommit (with merge)
    0.9s        0.3s        bzr uncommit (without merge)

Anyway, I’m quite happy about how much better bzr-2.4 will be in large trees.

update:Add graphs…

Read more
Martin Pool

A nice mail from Martitza this morning:

Hi.  Some of you may recall that I’ve provided measurements of some slow operations in both bzr itself and bzr-explorer based on bzrlib 2.1 and earlier.  I don’t recall doing any testing on bzr 2.2.0 specifically, but now that I’m on bzr 2.2.1 bzr (and bzr-explorer) are noticeably faster and even more pleasant to use now.   So far it seems that each of the add/commit/status operations on sets of 10K+ files (500MB+ branches) takes no more than two or three minutes each.  This is twice as fast as some of my previous measurements on the same or smaller branches.  Also, memory consumption is much reduced, which I suspect helps a lot.  I even have one of these large repos hosted on an old 512MB Pentium 3 machine with no problem today.  This is wonderful!  I hesitate to name the people I know made this possible, because I do not want to leave out the many more people whose contributions I don’t know.  Please accept my thanks to the whole team.


Read more
Martin Pool

Parth announced bzr-grep 0.2.0.  Amongst other things there are performance enhancements such that Eli says:

Thanks, this looks great. I just tried it on Windows XP searching for a fixed string in the Emacs repository — took 28 seconds with a cold cache and only 7.5 with a warm cache, which is impressive.

By contrast, “grep -F -R” (with suitable –exclude patterns, to prevent it from searching binary files and inside .bzr) took about 12.5 seconds with a warm cache.


Read more