Canonical Voices

Posts tagged with 'bug fixes'

Colin Watson

Here’s a summary of what the Launchpad team got up to in August.


  • Webhook support for Git repositories is almost finished, and only needs a bit more web UI work (#1474071)
  • The summary of merge proposal pages now includes a link to the merged revision, if any (#892259)
  • Viewing individual comments on Git-based merge proposals no longer OOPSes (#1485907)

Mail notifications

Our internal stakeholders in Canonical recently asked us to work on improving the ability to filter Launchpad mail using Gmail.  The core of this was the “Include filtering information in email footers” setting that we added recently, but we knew there was some more to do.  Launchpad’s mail notification code includes some of the oldest and least consistent code in our tree, and so improving this has entailed paying off quite a bit of technical debt along the way.

  • Bug notifications and package upload notifications now honour the “Include filtering information in email footers” setting (#1474071)
  • Bug notifications now log an OOPS rather than crashing if the SMTP server rejects an individual message (#314420, #916939)
  • Recipe build notifications now include an X-Launchpad-Archive header (#776160)
  • Question notification rationales are now more consistent, including team annotations for subscribers (#968578)
  • Package upload notifications now include X-Launchpad-Message-Rationale and X-Launchpad-Notification-Type headers, and have more specific footers (#117155, #127917)
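
As a rough illustration of how these headers can drive client-side filtering, here is a minimal sketch using Python's standard `email` package. The header names come from the list above; the message, the header values, and the `launchpad_labels` helper are invented for the example and are not taken from Launchpad itself.

```python
from email import message_from_string

# Hypothetical notification: the header names appear in the list above,
# but the addresses, values and body are made up for illustration.
RAW_MESSAGE = """\
From: noreply@example.org
To: user@example.org
Subject: [example] package upload
X-Launchpad-Message-Rationale: Requester
X-Launchpad-Notification-Type: package-upload

Body text.
"""


def launchpad_labels(raw):
    """Return the X-Launchpad-* headers of a raw message as a dict."""
    msg = message_from_string(raw)
    return {
        name: value
        for name, value in msg.items()
        if name.lower().startswith("x-launchpad-")
    }


labels = launchpad_labels(RAW_MESSAGE)
```

A mail filter could then branch on `labels.get("X-Launchpad-Notification-Type")` rather than pattern-matching on subject lines.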

Package build infrastructure

  • Launchpad now supports building source packages that use Debian’s new build profiles syntax, currently only with no profiles activated
  • Launchpad can now build snap packages (#1476405), with some limitations; this is currently only available to a group of alpha testers, so let us know if you’re interested
  • Builders can now access Launchpad’s Git hosting (HTTPS only) in the same way that they can access its Bazaar hosting
  • All amd64/i386 builds now take place in ScalingStack, and the corresponding bare-metal builders have been detached pending decommissioning; some of the newer of those machines will be used to further expand ScalingStack capacity
  • We have a new ScalingStack region including POWER8-based ppc64el builders, which is currently undergoing production testing; this will replace the existing POWER7-based builders in a few weeks, and also provide virtualised build capacity for ppc64el PPAs
  • We’ve fixed a race condition that sometimes caused a user’s first PPA to be published unsigned for a while (#374395)


  • The project release file upload limit is now 1 GiB rather than 200 MiB (#1479441)
  • We spent some more time supporting translations for the overlay PPA used for current Ubuntu phone images, copying a number of existing translations into place from before the point when they were redirected automatically
  • Your user index page now has a “Change password” link (#1471961)
  • Bug attachments are no longer incorrectly hidden when displaying only some bug comments (#1105543)

Colin Watson

Here’s a summary of what the Launchpad team got up to in July.


  • We fixed a regression in the wrapping layout of side-by-side diffs on (#1436483)
  • Various code pages now have meta tags to redirect “go get” to the appropriate Bazaar or Git URL, allowing the removal of special-casing from the “go” tool (#1465467)
  • Merge proposal diffs including mention of binary patches no longer crash the new-and-improved code review comment mail logic (#1471426), and we fixed some line-counting bugs in that logic as well (#1472045)
  • Links to the Git code browsing interface now use shorter URL forms
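
For readers unfamiliar with the “go get” redirection mentioned above, the sketch below renders a meta tag in the `go-import` form that the go tool documents (import prefix, version control system, repository root). It assumes Launchpad's tags follow that convention; the project name, the URL, and the `go_import_meta` helper are hypothetical.

```python
from html import escape


def go_import_meta(import_prefix, vcs, repo_root):
    """Render a go-import meta tag whose content is "prefix vcs repo-root"."""
    # Only the two systems mentioned in the post are accepted here.
    if vcs not in {"bzr", "git"}:
        raise ValueError(f"unsupported VCS: {vcs}")
    content = f"{import_prefix} {vcs} {repo_root}"
    return f'<meta name="go-import" content="{escape(content, quote=True)}">'


tag = go_import_meta(
    "launchpad.net/example-project",              # hypothetical import path
    "git",
    "https://git.launchpad.net/example-project",  # hypothetical repository URL
)
```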

We’ve also made a fair amount of progress on adding support for triggering webhooks from Launchpad (#342729), which will initially be hooked up for pushes to Git repositories.  The basic code model, webservice API, and job retry logic are all in place now, but we need to sort out a few more things including web UI and locking down the proxy configuration before we make it available for general use.  We’ll post a dedicated article about this once the feature becomes available.

Mail notifications

We posted recently about improved filtering options (#1474071).  In the process of doing so, we cleaned up several older problems with the mails we send:

  • Notifications for a bug’s initial message no longer include a References header, which confuses some versions of some mail clients (#320034)
  • Package upload notifications no longer attempt to transliterate non-ASCII characters in package maintainer names into ASCII equivalents; they now use RFC2047 encoding instead (#362957)
  • Notifications about duplicate bugs now include an X-Launchpad-Bug-Duplicate header (#363995)
  • Package build failure notifications now include a “You are receiving this email because …” rationale (#410893)
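
To show what RFC 2047 encoding of a maintainer name looks like in practice, here is a small sketch using Python's standard library. It is not Launchpad's implementation; the maintainer name, the address, and both helper functions are made up for the example.

```python
from email.header import decode_header
from email.utils import formataddr, parseaddr


def encode_maintainer(name, address):
    """Build an address header value, RFC 2047-encoding a non-ASCII name."""
    return formataddr((name, address), charset="utf-8")


def decode_maintainer(header_value):
    """Recover the original (name, address) pair from a header value."""
    encoded_name, address = parseaddr(header_value)
    parts = []
    for chunk, charset in decode_header(encoded_name):
        # Encoded words come back as bytes; plain ASCII text comes back as str.
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii")
        parts.append(chunk)
    return "".join(parts), address


# Hypothetical maintainer with a non-ASCII character in the name.
encoded = encode_maintainer("Zoë Example", "zoe@example.org")
decoded = decode_maintainer(encoded)
```

The encoded value is pure ASCII, so it is safe in a mail header, while the original spelling of the name is preserved rather than transliterated.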

Package build infrastructure

  • The sbuild upgrade last month introduced some regressions in our handling of package builds that need to wait for dependencies (e.g. #1468755), and it’s taken a few goes to get this right; this is somewhat improved now, and the next builder deployment will fix all the currently-known bugs in this area
  • In the same area, we’ve made some progress on adding minimal support for Debian’s new build profiles syntax, applying fixes to upload processing and dependency-wait analysis, although this should still be considered bleeding-edge and unlikely to work from end to end
  • We’ve been working on adding support for building snap packages (#1476405), but there’s still more to do here; we should be able to make this available to some alpha testers around mid-August


  • We’ve arranged to redirect translations for the overlay PPA used for current Ubuntu phone images to the ubuntu-rtm/15.04 series so that they can be translated effectively (#1463723); we’re still working on copying translations into place from before this fix
  • Projects and project groups no longer have separately-editable “display name” and “title” fields, which were very similar in purpose; they now just have display names (#1853, #4449)
  • Cancelled live file system builds are sorted to the end of the build history, rather than the start (#1424672)

Curtis Hovey

I have been analysing Launchpad’s critical bugs to track the Purple squad’s progress while on Launchpad maintenance duty. In January of 2011, the Cloud Engineering team né Launchpad Engineering team was reorganised into squads, where one or more squads would maintain Launchpad while other squads worked on features. This change also aligned with a newfound effort to enforce the zero-oops policy. The two maintenance squads had more than 332 critical bugs to close before we could consider adding features that the stakeholders and community wanted. By July 2011, the count dropped to its lowest point, 250 known critical bugs. Why did the count stop falling for fifteen months? Why is the count falling again?

Charting and analysing critical bugs

Chart of Launchpad's critical bugs since the formation of Launchpad squads and maintenance duties
The chart above needs some explanation to understand what is happening in Launchpad’s critical bugs over time. (You may want to open the image in a separate window to see everything in detail.) Each iteration is one week. The backlog represents the open critical bugs in Launchpad at the start of the iteration. The future bugs are either bugs that are not discovered, not introduced, or reported and fixed within the iteration. The last group is crucial to understanding the lines plotting the number of bugs fixed and added during the iteration. We strive to close critical bugs immediately. Most critical bugs are reported and fixed in a few days, so most bugs were not open long enough to show up in the backlog. The number of bugs fixed must exceed the number added to make the backlog count fall. You can see that the maintenance squads have always been burning down the critical bugs, but if you are just watching the number of open bugs in Launchpad, you get the sense that the squads are running to just stand still.

I use the lp-milestone YUI widget to chart the bugs and analyse our progress through the critical bugs. It allows me to summarise a set of bugs, or analyse a subset by bug tag.

Launchpad maintenance analysis -- driving critical bugs to zero

Though 22 bugs were fixed this past week, 14 were added, thus the critical count dropped by 8. The last eight iterations are used to calculate the average bugs closed and opened per iteration. The relative velocity (velocity – flux) is used to estimate the remaining number of days to drive the count to zero. When the Purple squad started maintenance on September 10th of 2012, the estimated days of effort was more than 1,200. In just three weeks, the number has fallen dramatically. The principal reason the backlog of critical bugs has fallen is that the Purple squad is now giving those bugs their full attention, but that generalisation is unsatisfactory.
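
The estimate can be sketched roughly as follows. This is not the lp-milestone widget's code: the function name, the sample history, and the backlog figure are hypothetical (only the final week's 22 fixed and 14 added come from the text), and converting iterations to days by multiplying by seven is an assumption based on each iteration being one week.

```python
from statistics import mean

DAYS_PER_ITERATION = 7  # each iteration is one week


def estimate_days_to_zero(backlog, fixed, added, window=8):
    """Estimate the days needed to empty the backlog.

    velocity is the average number of bugs fixed per iteration, and flux
    the average number added, both over the last `window` iterations.
    Returns None when the backlog is not shrinking.
    """
    velocity = mean(fixed[-window:])
    flux = mean(added[-window:])
    relative_velocity = velocity - flux
    if relative_velocity <= 0:
        return None  # fixes are not outpacing new reports
    return backlog / relative_velocity * DAYS_PER_ITERATION


# Hypothetical per-iteration counts, oldest first.
fixed_per_iteration = [12, 15, 10, 14, 18, 20, 19, 22]
added_per_iteration = [11, 13, 12, 12, 15, 14, 13, 14]

# velocity = 16.25, flux = 13, relative velocity = 3.25 bugs per week
days = estimate_days_to_zero(260, fixed_per_iteration, added_per_iteration)
```

Because the relative velocity sits in the denominator, a small change in the weekly averages moves the estimate a long way, which is why the number can fall dramatically within a few iterations.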

Why is the Purple squad so good at closing bugs in the critical backlog?

I do not know the answer to my question. The critical backlog reached its all-time low of 250 bugs with the release of the Purple squad’s maintenance work in July 2011. There was supposition that Purple fixed the easy bugs, or that the fixes did not address the root cause, so another critical bug was opened. I disagree. The squad had no trouble finding easy bugs, and it too would have been fixing secondary bugs if the first fix was incomplete. I can tell you how the squad works on critical bugs, but not why it is successful.

I was surprised to see the Purple squad were still the top critical bug closers when it returned to maintenance after 15 months of feature work. How could that be?  The squad fixed a lot of old timeout and JavaScript bugs in the last few months through systemic changes, enough to significantly affect the statistics. About 600 critical bugs were closed while Purple squad were on feature work. The squad closed 210 of those bugs. 60 were regressions that were fixed within the iteration, so they never showed up in the backlog. 70 critical bugs were fixed because they blocked the feature, and 80 critical bugs were fixed because Purple was the only squad awake when the issue was reported. The 4 other squads fixed an average of 98 bugs each when they were on maintenance. The Purple squad fixed more bugs than maintenance squads on average even when they were not officially doing maintenance work.  The data, charts, and analysis always include the Purple squad.

I suspect the Purple squad has more familiarity with bugs in the critical backlog. They never stopped reading the critical bugs when they were on feature work. They saw opportunities to fix critical bugs while solving feature problems. I know some of the squad members are subscribed to all critical bugs and re-read them often. They triage and re-triage Launchpad bugs. This familiarity means that many bugs are ready to code — they know where the problem is and how to fix it before the work is assigned to them. They fixed many bugs in less than a day, often doing exactly what was suggested in the bug comments.

During the first week of their return to maintenance, about 30 critical bugs were discovered to be dupes of other bugs. Though this change does make the backlog count fall, it also revised all the data, so the chart is not showing these 30 bugs at all now. The decline of backlog bugs does not include dupes. While the squad was familiar enough to find many bugs that they could close in a single day, they were not so familiar as to have known that there were 30 duplicate bugs in the backlog when they started.

Most squads have only one person with DB access, but the Purple squad is blessed with 3 people who can test queries against production-level data. This could be a significant factor. It is nigh impossible to fix a timeout bug without proper database testing. Only 13 of the recent bugs closed were timeouts though. The access also helps plan proper fixes for other bugs, so maybe 20% of the fixed bugs can be attributed to database access.

Maybe the Purple squad are better maintenance engineers than other squads who work on maintenance. For 28 months, I was the leading bug closer working on Launchpad. I closed 3 times more bugs than the average Launchpad engineer. I am not a great engineer though. My “winning” streak came to a close shortly after William Grant started working on Launchpad full time; he soundly trounced me over several months. Then he and I were put on the same squad and asked to fix critical bugs. Purple also had Jon Sackett, who was closing almost twice as many bugs as the other engineers. I don’t think I need to be humble on this matter. To use the vulgar, we rocked! Ian was the odd man on the Purple squad. He was the slowest bug closer, often going beyond our intended scope to fix an issue. Then Purple switched to feature work… Ian leapt to the first rank while the rest of the squad struggled. Ian fixed almost double the number of Disclosure bugs that other squad members did. The leading critical bug closer on the squad at the moment though is Steve Kowalik. This is his first time working on maintenance. His productivity has jumped since transitioning to maintenance.

I can only speculate as to why some engineers are better at maintenance, or can just close more bugs than others. A maintenance engineer must be familiar with the code and the rules that guide it. Feature engineers need to analyse issues and create new rules to guide code. I did not gradually become a leading bug closer; it happened in a single day when I realised while solving one issue that the code I was looking at was flawed, it certainly was causing a bug, I knew how to fix it, and with a few hours of extra effort, I could close two bugs in a single day. Closing bugs has always been easy since that moment.

I believe the Purple squad values certainty over severity and small scope over large scope when choosing which critical backlog bugs to fix. I created several charts that break the critical bugs into smaller categories. I suggested the squad burn down sub-categories of bugs like regressions, or 404s. The squad members are instead fixing bugs from the entire backlog. They are choosing bugs that they are certain they can fix in a few hours.  I think the squad has tacitly agreed to fix bugs that are less than a day of effort. When this group is exhausted, they will fix issues that require days of effort, but also fix as many bugs. The last bugs to be fixed will be those that require many days to fix a single bug. Fixing the bugs with the highest certainty reduced our churn through the critical bugs: there are fewer to triage, to dupe, to get ready to code.

The Purple squad avoids doing feature-level design and effort to fix critical bugs. Feature-level efforts entail more risk, more planning, and much more time. There is often no guarantee, only low certainty, that a feature will fix the issue. A faster change with higher certainty can fix the issue, but leaves cruft in the code that the engineers do not like. Choosing to do feature-level fixes when a more certain fix is available indicates there is tension between the Launchpad users who have a “critical” issue that stops them from using Launchpad, and the engineers who have a “high” issue maintaining mediocre code. I contend it is easier to do feature-level work when you are not interrupted with maintenance issues. When the Purple squad does choose to do feature-level work to fix a critical, they have a list of the bugs they expect to fix, and they cut scope when fixing a single bug delays the fix of the others. The Launchpad Answers email subsystem was re-written when other options were not viable; there were about 20 leading timeouts represented by 5 specific bugs to justify 10 days of effort to fix them.

The Purple squad is not unique

Nothing that I have written explains why the Purple squad are better at closing critical bugs. All squads have roughly the same skills and make decisions like Purple. Maybe the issue is just a matter of degree. If the maintenance squad is not closing enough bugs to burn down the backlog, their time is consumed by triaging and duping new critical bug reports. Familiarity with Launchpad’s 1000’s of bugs is an advantage when triaging bugs and getting a bug ready to code. Being able to test queries yourself on a production-level database takes hours or days off the time needed to fix an issue. Familiarity with the code and the reasoning that guided it increases the certainty of success. The only domain that Purple is not comfortable working with is lp.translations; the squad is comfortable changing 90% of Launchpad’s code. There may be correlation between familiarity with code, and the facts that the squad members participated in the apocalypse that re-organised the code base, and that some have a LoC credit count in the 1000’s.

Martin Pool

We’ve just upgraded Launchpad’s builder machines to Bazaar 2.4. Most importantly, this means that recipe builds of very large trees will work reliably, such as the daily builds of the Linaro ARM-optimized gcc. (This was bug 746822 in Launchpad).

We are going to do some further rollouts over the next week to improve supportability of recipe builds, support building non-native packages, handle multiarch package dependencies, improve the buildd deployment story, etc.

Martin Pool

Continuing on from our earlier work of sending less but better mail and making it faster to import i18n translation templates: Launchpad will no longer send mail when it successfully imports a template. You can see in the web ui when the template was last imported, and you will still get mail if there’s a problem.

I could hardly put it better than Riddell:

Danilo asked for my reasoning. My reasoning is that pointless e-mails are a pain.

Big pile of junk mail from Verizon

(I hope we’ll eventually have a more structured notification model, that will let you choose to see some notifications by mail and others in the web ui. One step at a time.)
