Canonical Voices

What You can't take the sky from me talks about

Posts tagged with 'libreoffice'

bmichaelsen

“I fought the law and the law won”

– Sonny Curtis and the Crickets — prominently covered by the Clash

So in a few minutes, I will be leaving for the Code for Germany meeting at Open Knowledge Lab in Hamburg, but I don't want to show up empty-handed. Earlier I learned about BundesGit, a project that puts all German federal laws in a git repository in easily parsable Markdown. The project was featured prominently, e.g. on Wired and Heise, and got me thinking that having all those laws available at your fingertips would be quite useful for lawyers. So I went and quickly wrote an extension to do just that. When you install the extension:

  • it downloads all the German federal laws from GitHub and indexes them on the next restart of LibreOffice (completely in the background, without annoying the user)
  • that takes about 5 minutes (and it only checks for updates on the next start, so no redownload)
  • once indexed, you can easily insert a part of a law into any text in Writer using the common abbreviations that lawyers use for these:
    • type the abbreviation of the paragraph on an otherwise empty line, e.g. “gg 1” for the first Artikel of the Grundgesetz
    • press Ctrl-Shift-G (G for Git, Gesetz or whatever you intend it to mean)
    • LibreOffice will replace the abbreviation with the part of that law
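
The lookup step described above can be sketched roughly as follows. This is a minimal illustration and not the code of the extension: the regular expression, the function names and the tiny sample index are all assumptions made for this example, and the real extension builds its index from the BundesGit repository.

```python
import re

# Hypothetical in-memory index: (law abbreviation, paragraph) -> text.
# The placeholder text below is NOT the real wording of the law.
SAMPLE_INDEX = {
    ("gg", "1"): "(text of Artikel 1 of the Grundgesetz would go here)",
}

# Assumed input shape: a law abbreviation, whitespace, a paragraph number
# with an optional letter suffix, and nothing else on the line.
ABBREVIATION = re.compile(r"^\s*([A-Za-zÄÖÜäöüß]+)\s+(\d+[a-z]?)\s*$")


def parse_abbreviation(line):
    """Return (law, paragraph) for a line like 'gg 1', or None."""
    match = ABBREVIATION.match(line)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2)


def lookup(line, index):
    """Return the indexed text for the abbreviation, or None if unknown."""
    key = parse_abbreviation(line)
    if key is None:
        return None
    return index.get(key)
```

Returning None for anything that does not look like an abbreviation keeps such a helper out of the way of the user, in the spirit of the extension staying in the background.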
BundesGit for LibreOffice

Now this is still a proof-of-concept:

  • It requires a recent version (1.9 or higher) of git in the path. While that is for example true in the upcoming version of Ubuntu 14.04 LTS, other distributions might still have older versions of git, or — on Windows — none at all: Packing a git binary into the extension is left as an exercise for the reader.
  • I have not checked it to parse all the different laws and find all the paragraphs. It also ignores some non-text content in the repository for now. Patches welcome!
  • While it stays in the background most of the time intentionally to not get into the way of the user, it could use some error reporting or logging, so users are not left in the dark if it fails to work.

On the other hand, the extension is a good example of what you can do with less than 300 lines of Python3 (including tests) in LibreOffice extensions. The code is therefore (hopefully) commented verbosely enough and was uploaded to the sdk-examples repository, where it lives alongside the “LibreOffice does print on Tuesdays” extension, which also serves as an example. Of course, if there are other useful repositories of texts online, it can be quickly adapted to provide those too.

So download BundesGit for LibreOffice and test it on Ubuntu 14.04 LTS (trusty).

 

addendum: This has been featured on golem.de and linux-magazin.de (both in German).

 


bmichaelsen

Document Liberation has been announced today, but a picture says more than a thousand words, so I created one based on the beautiful work of Paulo José. Enjoy!

Document Liberation (CC-by-sa 3.0 Paulo Jose, Bjoern Michaelsen)


bmichaelsen

LibreOffice bugzilla status

I'm kind of over gettin' told to throw my hands up in the air

so there

– Team, Pure Heroine, Lorde

So, somewhere between the LibreOffice 4.2.0 and the 4.1.5 release, bugs.freedesktop.org broke through 25,000 reported bugs. A time to throw the hands up in despair? Not at all, as the following chart shows:

LibreOffice bug states on freedesktop

  • 7% of reports are still unconfirmed or need more information
  • 22% are confirmed and unresolved issues that are not enhancement requests
  • 6.5% are unresolved enhancement requests.

On the other hand:

  • 33% of all reports have been fixed in some way
  • and 30% are invalid or duplicates.

It's interesting to see that by now about a quarter of the confirmed unresolved reports are asking for new features and enhancements. It gets even more encouraging if you take into account that the number of bug reports is at a long-term constant of 20-25 reports per day, while over 40% of the bugs intentionally or collaterally fixed changed their state in the last 12 months. So we are picking up speed in triaging and fixing bugs, while the influx of new reports stays constant.
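
As a quick sanity check of the “quarter” figure, the shares listed above can be combined as follows. This is only a sketch using the rounded percentages from the lists; the dictionary keys are made up for this example.

```python
# Rounded shares of all reports, in percent, as listed above.
shares = {
    "unconfirmed_or_needinfo": 7.0,
    "confirmed_unresolved_bugs": 22.0,
    "unresolved_enhancements": 6.5,
    "fixed": 33.0,
    "invalid_or_duplicate": 30.0,
}

# Enhancement requests as a fraction of all confirmed unresolved reports.
confirmed_unresolved = (
    shares["confirmed_unresolved_bugs"] + shares["unresolved_enhancements"]
)
enhancement_fraction = shares["unresolved_enhancements"] / confirmed_unresolved

# The listed categories do not quite cover 100% because of rounding and
# states not listed in the post.
covered = sum(shares.values())

print(f"enhancement share: {enhancement_fraction:.1%}, covered: {covered}%")
```

With these rounded inputs the enhancement share comes out slightly below one quarter.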

If you are interested, please help QA in all this by writing good bug reports, identifying duplicates, confirming new reports, bibisecting regressions, running and testing daily builds and prereleases, or otherwise helping with the QA Easy Hacks!


bmichaelsen

It’s so hard when it doesn’t come easy

It’s so hard when it doesn’t come fast

– So Hard, Taking The Long Way, Dixie Chicks

So, LibreOffice 4.2 is released, FOSDEM is over, was very nice and I am back home in Hamburg after a week in London. I missed the LibreOffice UX Hackfest for that, which I heard was also awesome. So without further ado, here are the slides from my quick talk at FOSDEM:

(direct link if you are watching this on a planet that does not support embedded speakerdecks: https://speakerdeck.com/sweetshark1/liberated-build-system-mission-accomplished)

and some errata for it: On slide 13 it says “the same file is also hardlinked from workdir/”, which has not been true for quite a while already. LibreOffice keeps around exactly one copy of a library, unlike the confusing three copies that we had in LibreOffice 3.3. This should be a lot less confusing to the curious first time contributor.

Reviewing all these changes in toto, it became clear how much we simplified getting involved with LibreOffice through this. As the lyrics quoted above say: “Back when we started, we didn’t know how hard it was”.

If there is just one number to take away from all these slides, it's that a noop rebuild for LibreOffice on a three year old developer notebook with the distro provided GNU make 3.81 takes just 17 seconds(*). And slide 7 shows some possibilities to speed things up even beyond that. While at current speeds it might not be worth it on Linux, it might be worthwhile for e.g. Windows, which is traditionally rather slow when it comes to file I/O.

On a related note, over time we improved the way new contributors can submit their changes on our instance of gerrit in many ways. Thanks a lot to David, Norbert and Robert for the work on this. One only has to look at one of the daily digests generated from activity on gerrit and imagine we would still get one mail for each change, update and merge to the mailing list for manual patch tracking, as we did in the early days. Thanks a lot also to Mathias Michel for his work on the script!

So if you haven’t done that yet, consider grabbing an EasyHack and get started!

A copy of the original .odp is also available at FOSDEM or on the LibreOffice wiki.

(*) This includes checking 1.3GB of generated c++ dependency files for some >8000 object files, which we simplify to <350MB.


bmichaelsen

Numbers

“Eins, zwei, drei, vier, fuenf, sechs, sieben, acht”

– Nummern, Computerwelt, Kraftwerk

So LibreOffice 4.2.0 release candidate 3 has been tagged yesterday evening. A good time to look back at the cycle and look at some numbers. The number of issues fixed in the 4.2 series are in line with our historic trends:

There is no page for the third release candidate yet, but I assume it to be no exception. Fixing issues is mainly done by development, although QA does the preparation for that by triaging a bug well. But QA also does quite a bit of work before a bug is triaged, and this is not directly locked to changes in code. So I had a look at the numbers simply in the timeframe between the tagging of 4.1.0 rc3 (2013-07-17) and 4.2.0 rc3 (yesterday). In this timeframe, QA did:

  • confirm 3114 bugs (change of ever_confirmed).
  • resolve 3393 bugs (change of resolution and not unresolved now, this includes the bugs fixed by development).

Naturally, these cannot simply be added up: for example, a bug can be confirmed and then be resolved by fixing it. If all of that happens in the timeframe (as it likely will for a relevant bug), it will appear in all the above counts. Meanwhile, in this timeframe 4092 bugs have been filed by end users. Of those new bugs filed, 9.3% were enhancement requests. Since not all resolved bugs need to be confirmed (e.g. invalid bugs), these numbers add up nicely.

Speaking of quality, another thing to look at is regressions. How many of those will be fixed in 4.2 as of now? Here is the rundown:

  • 1 regression introduced in 3.4 or before
  • 2 regressions introduced in 3.5 or before
  • 3 regressions introduced in 3.6 or before
  • 2 regressions introduced in 4.0 or before
  • 8 regressions introduced in 4.1 or before
  • 51 regressions introduced on master or found in betas and release candidates
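
The rundown above can be tallied with a few lines of Python. This is just the arithmetic behind the list; the dictionary labels are made up for this example.

```python
# Regressions fixed in 4.2, keyed by where they were introduced
# (numbers taken from the list above).
released = {"3.4": 1, "3.5": 2, "3.6": 3, "4.0": 2, "4.1": 8}
never_released = 51  # introduced on master or found in betas and RCs

released_total = sum(released.values())
total = released_total + never_released
never_released_share = never_released / total

print(released_total, total, f"{never_released_share:.0%}")
```

So about three out of four of these fixed regressions never reached a released version.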

As you can see, most of the regressions fixed with this have actually never been released. This should be encouraging news to those testing daily builds: If you do that, you will be rewarded with quick bug fixes. Still, only fixing 16 regressions that were visible in previous releases seems a rather low count for a release. Well, this is because this count does not include fixed regressions that are also backported to the updates on the 4.1 stable series. As regressions are usually worth that effort, this is usually done unless it is too risky a change for that. If you look for regressions that were fixed in 4.2 and also backported to 4.1, you as of now get a count of:

  • 230 regressions fixed in 4.2 that were also backported to the 4.1 series

in addition. See this earlier post for more details on how the backporting works and some numbers on it.

Speaking of regressions, we have a pretty unique tool to corner them: bibisect. How well does this work? I have been tracking these in Bugzilla for the last few months. Currently 176 bugs have been bibisected, with the number of unresolved bibisected bugs staying constant in the 60-70 range. That is encouraging, as it means that for each regression bibisected, a developer fixes a bibisected regression. This currently happens at a rate of ~2 bugs per week, which is not too bad, as such regressions might be quite hard corner cases that would be tricky to pin down without bibisect. However, only ~14% of our unresolved regressions are bibisected as of now. Clearly, we can improve that ratio with more bibisecting and get more regressions fixed even quicker.

Ok, admittedly, this was a boring and dry post on bug numbers. What can I do to lighten you up? Here is catcontent, presented in LibreOffice Draw 4.2 running on Ubuntu trusty with the awesome new libreoffice-style-sifr icon theme:

More info about the upcoming 4.2 release can be found in the still evolving release notes and in this nice sneak peek video on 4.2 by Leif Lodahl.

tl;dr: We are doing well, but could use even more people testing daily builds and doing bibisects.

addendum: The LibreOffice 4.2.0 release candidate 3 page is populated — additional 29 bugfixes. And the final release candidate 4 has 12 more.

addendum: Michael wrote a nice wrap-up what happened elsewhere in the (now released) LibreOffice 4.2.0.


bmichaelsen

Sometimes I wonder if the world’s so small,
Can we ever get away from the sprawl?

Sprawl II — Arcade Fire

So these days, most people prefer to use an IDE to navigate their source code. Since the early days of the open sourcing of StarOffice, this has often been greeted with some defensive elitism of the “real programmers” kind. One does not simply load a code base the size of LibreOffice into your wimpy IDE: while it is somehow possible in the end, it's a lot more trouble than it's worth to manually set up e.g. all the include paths to get the fancy stuff like autocompletion. Add to that that e.g. UNO headers are generated during the build, and that headers were distributed over multiple IDE-unfriendly locations, with many headers even available as copies from multiple locations, before we fixed that.

All these things are fixed now. And while LibreOffice still is a huge beast, with our new build system we can get a holistic view of what needs to get built where, how and when. This makes it easy, almost trivial, to generate an IDE project file from the build system. And to prove this point, I did just that for the kdevelop IDE. This isn't limited in principle to this one IDE: in fact, the kdevelop specific part of this is some 150 lines of Python. So no matter what IDE you use (Eclipse, Netbeans, Anjuta, Visual Studio, Code::Blocks or XCode), you should be able to adapt this. In fact, while writing this, I find there is already work going on for XCode. Feel invited to join the party and make LibreOffice trivially buildable in your favourite IDE!

So as announced to the developer list, this allows you to make navigating, editing, building, testing and running LibreOffice much easier, giving you features like:

  • autocompletion
  • building a module from the IDE
  • building all of LibreOffice from the IDE
  • nondebug and debug build configs for the above
  • starting LibreOffice from the IDE
  • running unitchecks, slowchecks and subsequentchecks from the IDE

Don't believe it? Here is a video featuring a stuttering German guy (me) on the audio track showing this:

If you want to show this around on social media, there is also a shorter version featuring the essentials (make sure to link to the HD versions).

A closing note: For a long time, common IDEs embraced and extended the build systems, so once you used an IDE, you could only use this one IDE and no other. In retrospect, this is obviously doing it wrong. With the current approach, we can make LibreOffice easily buildable in any IDE on any platform. A very important fact for a product available on so many platforms.

addendum: As Karl Fogel wrote “LibreOffice is now ridiculously easy to build” before we even had this, it just shows that one can always do better. ;)


bmichaelsen

He asked me if I’d seen a road with so much dust and sand.
And I said, “Listen, I’ve travelled every road in this here land!”
I’ve been everywhere, man.
I’ve been everywhere, man.

I’ve been Everywhere, Johnny Cash

So about a month ago I travelled in one week from Hamburg via Zürich and San Francisco to Oakland and then via San Francisco, Munich and Basel to Freiburg to attend the LibreOffice Hackfest Freiburg 2013 and back to Hamburg. The Freiburg Hackfest is the third and last Hackevent we had in Germany this year (after the Impress Sprint in Dresden and the Hackfest in Hamburg) nicely accompanying the international events like the LibreOffice conference in Milan and our usual presence at FOSDEM.

Bags packed to get back to Europe

I have to admit that I arrived at this event with some travel fatigue and some upcoming Ubuflu, so I was not too productive myself, but it's good to see fixes happening (or at least being prepared) at the event, for example in the KDE integration (Jan-Marek), in Calc (Eilidh), for enabling bitcoin donations (Florian), to mail merge (again Jan-Marek), to Math (Marcos) and for the build system (Michael and David). A big “Thank You” to all the angels of the Chaos Computer Club Freiburg that organized the event. When I learned that I would need to travel to the US right before this, I had some doubts whether it would result in “remote-organization-troubles”, given this was a first time in Freiburg. These doubts were completely unfounded: the support of our hosts was amazing and they seemed to have made a deal with Eris to take revenge for the original snub somewhere else on this weekend. ;)

So, given that I did not do much coding (just some preparation for the KDevelop integration for LibreOffice, more on that later), what can I offer you? Catcontent was not available (no cats at this Hackfest), so I give you the second best thing: the deputy chairman of the board of the Document Foundation patrolling the premises on a skateboard:

skateboard patrol

So, what's next? FOSDEM! We will of course be there again, and back-to-back with the event we will have a user experience Hackfest in Brussels. So come and join us:


bmichaelsen

Gimme Fuel, Gimme Fire

Take the corner, going to crash
Headlights, head on, headlines
Another junkie lives too fast
Yeah, lives way too fast, fast, fast, woh

– Fuel, Reload, Metallica

So, LibreOffice 4.2.0 alpha1 has been tagged upstream a week ago. It is an alpha release, essentially only a tagged snapshot of the LibreOffice master branch and as such might eat your kitten and kill unsuspecting relatives. On the other hand, if you absolutely are of the type that Metallica roars about in the above quote and therefore you are running the development release of Ubuntu (trusty tahr, which will become Ubuntu 14.04 LTS), you can add the LibreOffice prereleases PPA and try it out and report bugs. Of course, you should not use this in a production environment of any kind!

I'm happy to see that this build is available a week earlier than last year, as early testing allows more bugs to be triaged and fixed in time. The more important difference though is that last year, the alpha version was built on the stable and released version of Ubuntu, while this year the version is already built against the early and moving development version of Ubuntu.

LibreOffice 4.2.0 alpha1 and a hint of the new startcenter on Ubuntu Trusty Tahr

Happy testing!


bmichaelsen

We gonna do what they say can’t be done
We’ve got a long way to go and a short time to get there

– East Bound and Down, Jerry Reed

So, the LibreOffice conference in Milan is just past us and it was awesome. If you missed it, Kohei posted a very nice set of pictures from that event. If you are interested in the talks too, you can find both streams and slides for almost all the talks. One other talk from the conference I would like to highlight is Michael Stahl's gbuild talk: it was a long journey from when gbuild was still a pet project of mine, but now that the migration is finished, things are unlocked and we (*) are really reaping what we sowed.

Almost, because while I was able and eager to send the slides for the lightning talks I moderated, I somehow forgot to do so for the slides of my own talk on tb3. They will hopefully end up on the conference site at some point, but for now I uploaded them at speakerdeck (with the odp originals on the wiki and here):

I didn't bring my own camera and thus missed taking pictures during e.g. the lively QA roundtable, but Rob made sure that we got at least some photo on the last day (when many were already on the way home):

some LibreOffice QA contributors: Rob, me and Robinson (left to right)

So in the next days, I will be hopping over the Atlantic for a visit to the west coast, just to return to turn “eastbound and down, loaded up and trucking” to be at Freiburg for the Hackfest again. A big Thank You in advance to Tauon and Florian Effenberger, who took over a lot of my organizer duties on this one due to this tight scheduling. Oh, and of course, I hope to see many of you there!

(*) actually they: By far, the most awesome stuff is now done by others than me


bmichaelsen

I’m easy like Sunday morning
That’s why I’m easy
I’m easy like Sunday morning

– Easy, Faith No More

So, Ubuntu 13.10 (Saucy Salamander) was released into the wild and comes with a fresh LibreOffice version: 4.1.2. Since the last major version of LibreOffice (4.0) was branched off, 11,034 commits by more than 200 different committers were done upstream up to the release that is now in Ubuntu 13.10. (*) The LibreOffice 4.1 features and fixes page gives an overview of what is new with this release: rotating images, embedded fonts and improved interoperability, to name a few.

In the Ubuntu/Debian packaging repository, some 513 commits by 5 authors have been done between the version Ubuntu 13.04 was released with and the just released version. The majority of those commits have been done by Rene Engelhard of Debian. A big “Thank you” for all that work! Now leaving this release behind with a “Girl, I'm leaving you tomorrow” on my mind, I am looking forward to what the name for Ubuntu t-series will be, as there does not seem to be an announcement yet (although there have been eager suggestions), start to brace myself for the early cycle madness again and prepare to make sure that Ubuntu t-series will get the best LibreOffice 4.2.

So much for looking backwards. A lot of people are shy and assume they could never be one of the contributors making a dent in LibreOffice, or even get started. Let me show you how wrong that assumption is:

resolved Easy Hacks over time

This little chart shows Easy Hacks resolved by newcomers to the project. Easy Hacks are tasks that need to be done on LibreOffice and can be done without understanding all of the million lines of code and more than 20 years of history; quite a few do not even require C++ skills. They are specifically selected for that, and if you run into any trouble solving those, you can jump in at #libreoffice-dev to get help. So get yourself a LibreOffice build (here’s a video on how easy that is on Ubuntu, with dubstep soundtrack), find yourself an Easy Hack and get going!

(*) I didn't bother to check for the exact number, because checking for duplicates in email addresses is tiresome.

Note: An earlier version of this post talked about 22,000 commits. That was an error on my part, fiddling with the scripting late at night.


bmichaelsen

LibreOffice Lightning

Go, Greased Lightnin’
You’re burnin’ up the quarter mile

– Greased Lightnin’, Grease

So the conference schedule for the LibreOffice conference in Milan has just been published. The talks, workshops and sessions on the schedule encompass only the so far officially registered sessions. If you have another exciting and urgent topic that you want to share with the others at the conference, you may still get to present a lightning talk in the session on Thursday after lunch!

For that, just send an email to lightningtalks@libreoffice.org right now containing:

  • your name
  • the title of your talk
  • the length/format you want to use:
    • freeform 5 minutes (*) lightning talk
    • Pecha Kucha talk (20 slides, 20 seconds each = 6 minutes 40 seconds)
It's LibreOffice conference time again!

Excited to see you all in Milan next week!

(*) changed from 15 minutes earlier as there has been more demand for 5 minutes freeform than for 15 minutes sessions.


bmichaelsen

Powerplay

I’ve got the power.

– The power, Snap

So, I'm back from vacation. One of the things I did was reorganizing my hardware, and for doing so, I bought a wattmeter to measure what my machines and toys actually consume. A lot of the stuff was what I expected, but there were a few nasty surprises:

(all values in Watt; "-" means not measured)

                     Ideapad S12  Thinkpad W520  Bertha  TV   Pandaboard ES
power supply only    0            0.2            2.5     0.3  0.1
standby              0.3          0.2            2.5     15   3.2
desktop w/o display  13           10             122     130  6.1
with display         18           16             180     100  -
g+/gmail             20           20             212     -    -
compiling            -            90/70/35       417     -    8.1

A few surprising takeaways from this set:

  • The wimpy Ideapad S12 with its Atom CPU eats more power when idling than the Thinkpad W520 with its beefy i7 Quad-Core and 16GB of RAM (13 Watts vs. 10 Watts).
  • My TV doing nothing but waiting for the remote to tell it to turn itself on eats more power than each of my notebooks (15 Watts vs. 10/13 Watts).
  • Just opening Firefox with one tab google plus and one tab google mail eats 4 extra Watts on my notebook and 32 extra Watts on my desktop. It seems all that JavaScript voodoo does not come free at all: ~6 Euros per month when I leave it open on my desktop all the time.
  • Running my desktop (Bertha) as a tinderbox for LibreOffice 24/7 would cost me ~1,000 EUR per annum. Doing it with three of those boxes would be a very expensive and noisy alternative to what others sell as a room heater.
  • My TV eats 30 Watts more when displaying the black screen of a disconnected HDMI signal than with normal TV display. Maybe it's expensive to search for a signal?
  • Compiling LibreOffice without ccache on my Notebook kicks the power consumption to 90 Watts — but only for a few minutes. Then the thermal controls throttle the machine down to 70 or even 35 Watts, which seems all the machine can disperse over sustained periods.
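
The cost figures in the takeaways above can be reproduced roughly like this. The electricity price is not stated in the post; the 0.27 EUR per kWh used here is purely an assumption chosen because it lands near both quoted amounts, and the function name is made up for this sketch.

```python
HOURS_PER_MONTH = 24 * 30
HOURS_PER_YEAR = 24 * 365

# ASSUMPTION: the price per kWh is not given in the post.
ASSUMED_PRICE_EUR_PER_KWH = 0.27


def cost_eur(watts, hours, price_per_kwh=ASSUMED_PRICE_EUR_PER_KWH):
    """Cost of drawing `watts` continuously for `hours`."""
    kwh = watts / 1000.0 * hours
    return kwh * price_per_kwh


# 32 extra Watts for the browser tabs on the desktop, left open all month.
browser_per_month = cost_eur(32, HOURS_PER_MONTH)

# 417 Watts for Bertha compiling around the clock for a year.
tinderbox_per_year = cost_eur(417, HOURS_PER_YEAR)

print(f"browser: {browser_per_month:.2f} EUR/month")
print(f"tinderbox: {tinderbox_per_year:.0f} EUR/year")
```

With that assumed price, the two results come out close to the ~6 Euros per month and ~1,000 EUR per annum mentioned above.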

My electricity is 100% from water power, btw. Admittedly, it's unlikely to come from the Hoover dam, though (Image copyright CC BY 2.0 by Gordon Wrigley)

And then there were these leftover pieces to measure, no surprises there, just a confirmation of my suspicion that the old Asus notebook I run as a home server is eating way too much power:

(all values in Watt)

bits and pieces
mic preamp off                                     1.1
mic preamp on                                      10
hub                                                5
phone                                              4
“home server” (decommissioned Asus Z53 notebook)   30

My tentative conclusions are:

  • replacing my old “home server” with something ARM-based like a Raspberry Pi or a Pandaboard breaks even after one year — I should do that.
  • Even when under load, an ARM-based Pandaboard has a modest power consumption.
  • I will completely turn off my TV on principle as the standby consumption is just pure impudence. As a bonus it prevents my BluRay player from kicking on the 100 Watt TV when I throw in an audio CD (Thanks Panasonic, for providing this excellent and “useful” integration).
  • A cheap Netbook might be less powerful, but it hardly consumes less than a high-end Notebook when idling. You get what you pay for.
  • I bought a cooler for my Notebook, hoping to unlock it from choking itself with thermal restriction. It should be a good idea in general as the logs not only talked about throttling, but also about more scary MCEs.
  • Buying a wattmeter is a good decision, when you run nontrivial amounts of hardware.

Addendum: The 2.5 Watts for Bertha when off may seem bad, but it's not bad at all if you consider that it is running a lights-out management on that.


bmichaelsen

Stop right there I gotta know right now before we go any further

Let me sleep on it and I’ll give you an answer in the morning

– Paradise by the Dashboard Light, Bat out of Hell, Meat Loaf

So, I did some work recently to possibly make our tinderboxes more efficient and scalable, which is a bit ironic as I recently pointed others to Paul Graham's advice to “do things that do not scale”. At LibreOffice we currently have a tinderbox setup that served us as well as it could in the first years: it gave a quick overview of the basic health of the current development branch of LibreOffice. But LibreOffice takes some time to build and test, and with 50-100 commits to master each day it is playing catch-up with a moving target.

And while they did a good job at this, they also have a few distinct weaknesses. For one, if they were unhappy, these tinderboxes would mail everyone who committed on a branch since the last known good build. Since they do not know anything about each other, with a generic breaker each tinderbox would do that on its own. In a tragic imitation of a certain comic, this would result in the incremental Linux tinderbox reporting after 5 minutes that something went wrong, with all the other tinderboxes dribbling in with the same message over time, finalized by the full Windows build tinderbox excitedly reporting to 200 people (as a slow builder would have more commits between builds) that something was amiss, possibly hours after it was fixed again. This resulted in these messages being filtered away by most users and, even worse, in the Windows tinderbox reports being easily ignored as “someone else broke it”, although they should be the most useful of them, as most developers use Linux as their development platform.

So I set out to improve the situation with these initial goals:

  • to make tinderboxes able to coordinate
  • to make it possible to easily collate the information from multiple builders
  • while leaving the control over what is built with the owner of the tinderbox (as most of these boxes are sponsored, we don't want to make them into drones)
  • for slow platforms like Windows or ARM, to enable bisecting a breaker, as the frequency of builds is too low for those in the commit range to feel personally responsible
  • while bisecting a breaker, to also keep an eye on the branch moving forward (as in: don't try to bisect a breaker further when it was fixed in the meantime)

And I am happy to report that I have reached these initial goals with tb3, a tinderbox coordinator written in Python3 that has as many lines of code for unittests as for the product itself. So how is tb3 intended to work?

Leaving control over what is built with the owner of the tinderbox

tb3 is built around the idea that the information about the state of the source is collected and managed by a central “tinderbox coordinator” and one or more tinderboxes go to it to:

  • ask for something to build, giving the coordinator a branch and a platform that they are interested in working for
  • report that they have started to build a certain state and give an estimate on when they will be finished
  • report that they have finished building a certain state and give a result

Note that the first two steps are separate: the tinderbox is essentially just asking for a suggestion on what to build; it's not promising to actually follow these proposals. It can come back and report to be building something completely different(*). Now the proposals the coordinator hands out come with a score. Just looking at a classical tinderbox mode, which will always build the current HEAD of a branch on a specific platform, the score of the highest ranking proposal will be equal to the number of commits since the last finished build. With tb3, a tinderbox can watch multiple branches (e.g. a development branch and a release branch) and commit itself to building the one which saw the most commits since the last finished build. It can also use multipliers and use something like “if there are 10 times as many new commits on the development branch as on the release branch, then build that, otherwise stick to the release branch” or use limits: “I only want to run a build if there are at least 5 new commits”.
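
How a tinderbox owner might pick among scored proposals with multipliers and limits can be sketched as follows. This is not the tb3 API: the `Proposal` class, the `choose` function and their parameters are hypothetical names invented for this illustration, and only the classical HEAD-building mode is modelled.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    branch: str
    score: float  # here: number of commits since the last finished build


def choose(proposals, multipliers=None, min_commits=0):
    """Pick the proposal with the highest weighted score, or None.

    multipliers -- hypothetical per-branch weights (default 1.0)
    min_commits -- skip proposals with fewer new commits than this
    """
    multipliers = multipliers or {}
    best_weighted, best = None, None
    for proposal in proposals:
        if proposal.score < min_commits:
            continue  # limit: not enough new commits to bother building
        weighted = proposal.score * multipliers.get(proposal.branch, 1.0)
        if best_weighted is None or weighted > best_weighted:
            best_weighted, best = weighted, proposal
    return best


# "Build master only if it has more than 10 times as many new commits
# as the release branch": weight the release branch by 10.
weights = {"release": 10.0}
quiet_master = [Proposal("master", 40), Proposal("release", 5)]
busy_master = [Proposal("master", 60), Proposal("release", 5)]
print(choose(quiet_master, weights).branch)  # release (50 beats 40)
print(choose(busy_master, weights).branch)   # master (60 beats 50)
```

Keeping this decision on the tinderbox side matches the idea that the coordinator only suggests and the owner of the box decides.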

Coordinating multiple tinderboxes

So how do we coordinate multiple tinderboxes and ensure that e.g. if someone pushes 9 commits to master, we do not get five Linux tinderboxes to build that last commit and then sprinkle everyone's mailbox over the next hour? Here is where the “coordinator” part truly kicks in. The first tinderbox that asks for something to build will get proposals with scores as shown by the green line in the chart below: the highest score is the “9” of the newest commit, the commit that has the biggest distance from the last build. If the first tinderbox reported to have taken on that proposed build, what would a second tinderbox that also asks to build something see? It makes little sense to give it the same build as the first tinderbox. Optimistically assuming that the first tinderbox will report something back, the best thing this second box can do is build something with the biggest distance to the finished build and to the build running on the first tinderbox. As such, the coordinator will send it scores as denoted by the blue line, and if the tinderbox accepts, it will build commit 5. This is why a third tinderbox asking for something to build while the other two are running will get proposals as per the pink line and thus be suggested to build commit 3.

proposal scores with tinderboxes just started
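
The effect described above (first build the newest commit, then split the largest untested range) can be imitated with a small sketch. This is a simplification and not tb3's actual scoring, which hands out a score per commit; the function name and the tie-breaking towards the upper middle commit are assumptions chosen so that the picks match the 9, 5, 3 sequence from the text.

```python
def propose(head, known):
    """Return the commit index a newly asking tinderbox should build.

    head  -- index of the newest commit on the branch
    known -- set of commit indices already finished or currently running;
             must contain the last finished build (index 0 in the example)
    """
    if head not in known:
        return head  # the newest commit is furthest from anything known
    points = sorted(known)
    best_gap, best_commit = 0, None
    for lower, upper in zip(points, points[1:]):
        gap = upper - lower - 1  # untested commits strictly in between
        if gap > best_gap:
            # pick the upper middle commit of the largest gap
            best_gap, best_commit = gap, lower + (upper - lower + 1) // 2
    return best_commit  # None if nothing is left to build


# Last finished build is commit 0, nine new commits were pushed.
known = {0}
picks = []
for _ in range(3):  # three tinderboxes ask one after another
    commit = propose(9, known)
    picks.append(commit)
    known.add(commit)  # optimistically treat it as running
print(picks)  # [9, 5, 3]
```

A real coordinator additionally has to deal with tinderboxes that never report back.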

Trusting tinderboxes … a bit

Now these tinderboxes “promised” to build some commit. But can we give the tinderbox unconstrained trust? E.g. should we never ever tell any other tinderbox to build this one commit, because some other tinderbox promised to build it? The answer is obviously no: as a tinderbox is a gift, the owner should be allowed to reboot or reassign a tinderbox for other tasks at any time with impunity. This is why the tinderbox gives the coordinator an estimated duration for its build and the tinderbox coordinator “reserves” this commit for that time. As you saw in the last chart, a commit that just had a tinderbox start running on it got a score of zero. As time goes by, the coordinator loses trust in the tinderbox to still report back: the chart below shows the scores given after twice the time the tinderbox gave as an estimate has passed. You see the blue line now scores highest at commit 6, not commit 5, and the pink line scores highest at commit 5, not commit 3. So as the coordinator loses trust in the running tinderboxes to come back, it again proposes to do builds closer to the already scheduled ones.

proposed scores with tinderbox results overdue

Another thing to note is that the highest score is rising: in the first chart, each running tinderbox lowered the highest score by one (green line: highest at 9, blue line: highest at 8, pink line: highest at 7), while after twice the estimated time has passed, the high scores are all around 9 again.
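
One way to picture this fading reservation is a trust factor that stays at full strength for the estimated duration and decays afterwards. The post does not say how tb3 computes this, so the exponential decay, the halving per overdue estimate and both function names below are assumptions chosen only to illustrate the idea.

```python
def trust(elapsed, estimate):
    """How much the coordinator still expects a running build to report back.

    Assumption for illustration: full trust during the estimated duration,
    then the trust halves for every further estimate that passes.
    """
    if elapsed <= estimate:
        return 1.0
    overdue = (elapsed - estimate) / estimate
    return 0.5 ** overdue


def reserved_score(base_score, elapsed, estimate):
    """Score of a commit that a tinderbox has claimed.

    With full trust the commit is fully reserved and scores zero; as the
    trust fades, the commit regains the score it would have unclaimed.
    """
    return base_score * (1.0 - trust(elapsed, estimate))


# A build estimated at 60 minutes, looked at after 30 and after 120 minutes.
print(reserved_score(9, 30, 60), reserved_score(9, 120, 60))
```

With such a factor the reserved commit scores zero at first and slowly becomes attractive again, which matches the direction of the shift between the two charts, though not necessarily their exact values.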

Bisecting a breaker

Should a branch be broken, it usually would be very helpful if the tinderboxes helped with bisecting. This is especially true for slow platforms and builds like Windows, ARM or the document load torturer by Markus. However, we do not want a tinderbox to fixate on that too much, as our branch is a moving target. If there is a build breaker somewhere in a range of 256 commits, we do not want a slow tinderbox to grind away for 8 builds to find the offending one and, while doing that, leave the head of the branch unwatched for a long time. So by default, the bisecting proposals have a high score that is equal to the number of commits still left to bisect. As such, by default, a tinderbox will be told to bisect as long as:

  • the head of the branch is still broken
  • there are more commits in the bisect range than there are new commits on the branch.

proposal scores of commits in a range to bisect
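
The two conditions above, and the "8 builds for 256 commits" figure, fit into a short sketch. The function and parameter names are made up for this illustration and are not taken from tb3.

```python
import math


def should_bisect(head_is_broken, commits_in_bisect_range, new_commits_on_branch):
    """Default rule from the list above: keep bisecting only while the head
    is still broken and the bisect range outweighs the unwatched new commits."""
    return head_is_broken and commits_in_bisect_range > new_commits_on_branch


def builds_to_bisect(commits_in_range):
    """Number of builds a plain binary search needs to pin down one breaker."""
    if commits_in_range <= 1:
        return 0
    return math.ceil(math.log2(commits_in_range))


print(builds_to_bisect(256))
print(should_bisect(True, 256, 9))
print(should_bisect(True, 4, 9))
```

In the last call the bisect range has shrunk to 4 commits while 9 new commits have piled up, so the tinderbox would be sent to the head of the branch instead.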

Otherwise, the tinderbox will be told to build the latest commit, to check whether the branch is still broken or was fixed in the meantime. As such, the coordinator guards against committing tinderboxes to bisect a breaker that was already fixed. Therefore, the coordinator knows a few more states than plain ‘good’ or ‘bad’ for a commit:

  • UNKNOWN — nothing known yet
  • RUNNING — a tinderbox is currently claiming to run this commit
  • GOOD — a tinderbox was happy with it
  • BAD — a tinderbox was unhappy with it
  • ASSUMED_GOOD — not tested, but the previous and the next finished build were good
  • ASSUMED_BAD — not tested, but the previous and the next finished build were bad
  • POSSIBLY_BREAKING — not tested, but the previous finished build was good and the next finished build was bad
  • POSSIBLY_FIXING — not tested, but the previous finished build was bad and the next finished build was good
  • BREAKING — this one was bad, while the previous commit was good

Here is some example output

$ ./tb3-show-history --repo ~/checkouts/core.git --platform linux --branch 65134fb75c3e94b7869fb6d490f88bf4b252760e --history-count 10
65134fb75c3e94b7869fb6d490f88bf4b252760e started on 2013-07-25 17:27:30.383767 with builder ubuntu-tinderbox and finished on 2013-07-25 17:40:41.226494 -- artifacts at 65134fb75c3e94b7869fb6d490f88bf4b252760e-137476605045.out, state: BAD (took 0:13:10.842727)
6100d94078d37cb1413a0e45460cee480ba3e211 started on None with builder None and finished on None -- artifacts at None, state: ASSUMED_BAD
24d46ea66485ff8b5bca49ec587b41547787bf42 started on None with builder None and finished on None -- artifacts at None, state: ASSUMED_BAD
d041980a7aad0e6d111752ca98db42f9853a3c6b started on 2013-07-25 17:40:52.587150 with builder ubuntu-tinderbox and finished on 2013-07-25 17:53:04.204549 -- artifacts at d041980a7aad0e6d111752ca98db42f9853a3c6b-137476685269.out, state: BAD (took 0:12:11.617399)
3b28ec6855e5df0629427752d7dafae1f0a277d4 started on None with builder None and finished on None -- artifacts at None, state: ASSUMED_BAD
cca0b9ae02603ab88ec7d8810aab2a8a1b4efda2 started on 2013-07-25 18:08:01.201013 with builder ubuntu-tinderbox and finished on 2013-07-25 18:20:39.536451 -- artifacts at cca0b9ae02603ab88ec7d8810aab2a8a1b4efda2-137476848124.out, state: BREAKING (took 0:12:38.335438)
767b02bd7614059dd80d0cd1be306d9b63291f31 started on 2013-07-25 17:53:14.745394 with builder ubuntu-tinderbox and finished on 2013-07-25 18:07:42.527839 -- artifacts at 767b02bd7614059dd80d0cd1be306d9b63291f31-137476759480.out, state: GOOD (took 0:14:27.782445)
c852f83bc4d91de51c61ad4be0edf1b848247eaa started on None with builder None and finished on None -- artifacts at None, state: ASSUMED_GOOD
0d874ee2e452ea67c03a27bf1a7f26d0ffc617dc started on None with builder None and finished on None -- artifacts at None, state: ASSUMED_GOOD
ff14c3b595ebe71153f97ebb8871cf024ea76959 started on 2013-07-25 17:12:58.024727 with builder ubuntu-tinderbox and finished on 2013-07-25 17:27:17.439374 -- artifacts at ff14c3b595ebe71153f97ebb8871cf024ea76959-137476517809.out, state: GOOD (took 0:14:19.414647)
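
The derived states in this output follow directly from the finished builds around each commit. The sketch below is a simplified illustration of the state list above, not tb3's code: the function name is invented, RUNNING is left out, and commits are passed oldest first as a plain list.

```python
def derive_states(tested):
    """Fill in derived states for a branch history.

    `tested` lists the commits oldest first; each entry is 'GOOD', 'BAD',
    or None for a commit that no tinderbox has built.
    """
    states = []
    for i, state in enumerate(tested):
        # Nearest finished builds before and after this commit, if any.
        before = next((s for s in reversed(tested[:i]) if s), None)
        after = next((s for s in tested[i + 1:] if s), None)
        if state == 'BAD' and i > 0 and tested[i - 1] == 'GOOD':
            states.append('BREAKING')
        elif state is not None:
            states.append(state)
        elif before is None or after is None:
            states.append('UNKNOWN')
        elif before == after:
            states.append('ASSUMED_' + before)
        elif before == 'GOOD':
            states.append('POSSIBLY_BREAKING')
        else:
            states.append('POSSIBLY_FIXING')
    return states


# The ten commits from the example output, oldest first, with only the
# results of the builds that actually ran.
history = ['GOOD', None, None, 'GOOD', 'BAD', None, 'BAD', None, None, 'BAD']
for state in reversed(derive_states(history)):
    print(state)
```

Printed newest first, this gives the same column of states as the tb3-show-history output above, including the BREAKING commit directly after the last GOOD build.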

Some details and missing bits

The coordinator stores the results in git notes as JSON objects. This has multiple advantages: there is no need for an external database, and the state of the notes is under revision control. It also has one disadvantage: it's not exactly quick. However, the revision control can mostly mitigate that, as e.g. a webfrontend can easily ask “what changed in the state since I last polled you?” and do incremental updates from there.
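
To make the storage idea concrete, here is a minimal sketch of writing and reading a JSON record through `git notes`. The notes ref name and the record fields are hypothetical (the post does not show tb3's actual layout); only the `git notes add` and `git notes show` subcommands are standard git.

```python
import json
import subprocess

# Hypothetical notes ref, one per platform; tb3's real naming is not shown
# in the post.
NOTES_REF = 'example-tb3-linux'


def notes_add_command(commit, record, ref=NOTES_REF):
    """Build the git command that attaches `record` as JSON to `commit`."""
    message = json.dumps(record, sort_keys=True)
    return ['git', 'notes', '--ref', ref, 'add', '-f', '-m', message, commit]


def notes_show_command(commit, ref=NOTES_REF):
    """Build the git command that prints the note attached to `commit`."""
    return ['git', 'notes', '--ref', ref, 'show', commit]


def store(repo, commit, record):
    """Write the record; -f overwrites an earlier note on the same commit."""
    subprocess.run(notes_add_command(commit, record), cwd=repo, check=True)


def load(repo, commit):
    """Read the record back as a Python dict."""
    result = subprocess.run(notes_show_command(commit), cwd=repo, check=True,
                            capture_output=True, text=True)
    return json.loads(result.stdout)
```

Because every `store` creates a commit on the notes ref, the history of that ref doubles as a change log of build states, which is what makes the incremental polling mentioned above possible.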

Which brings me to the missing bits. The first is the stuff that tells the world the state of the repo via a webfrontend, RSS feed, IRC bots or email digests. The second missing bit is some kind of privilege separation between the tinderboxes and the coordinator. tb3 is currently churning away on the Sun Ultra 24 that I donated to the Document Foundation, doing duty as an Ubuntu tinderbox, but coordinator and tinderbox are still running under the same account, even though as separate processes. As setuid for scripts is messy business, I plan to give tb3 a trivial REST-like interface on a non-public HTTP server. In addition to offloading the authentication and authorization problems from tb3 to something that considers them solved, this also makes integration in webfrontends etc. simple (esp. given that all the data is in JSON already anyway).

In the long run, the scoring of tb3 should also make it easier for the buildbots that do duty on gerrit to decide whether they should test build something there or whether their help is more needed for tinderbox duty.

tl;dr

tb3 can:

  • coordinate multiple tinderboxes working on the same build scenario or branch
  • coordinate one tinderbox working on multiple build scenarios or multiple branches
  • make tinderboxes bisect without losing sight of the head of a branch
  • especially help tests and builds that are painfully slow

They can also create builds for bibisect along the way, but that is a story for another day.

(*) This is helpful for some test suites like e.g. subsequentcheck. If you do a build as proposed by the coordinator, you can cheaply report back the result of the build only. And since you can then just run the subsequentcheck test suite on top of the build of that commit (and only on that commit), you can report that you are running these tests and then report the results without ever caring whether the coordinator thinks this commit has a high priority for this.

postscriptum: Yeah, I know, I promised to be on vacation now and not harass you with any posts, but this is a scheduled blogpost and as such does not count.


bmichaelsen

Whoomp! There it is …

– Tag Team

After having uploaded slides already quite some time ago, it's time for an update. So I added slides to the talks I gave at FISL 14 and 29c3 and added some video links to the descriptions for the FISL, 29c3 and LibreOffice conference 2012 talks. Here are all the slides. And with that last long-pending task done, I will bolt out for vacation. Enjoy!


bmichaelsen

Hey ho, let’s go
Hey ho, let’s go
They’re forming in a straight line
They’re going through a tight wind

– Blitzkrieg Bop, Ramones

Italo wrote this nice overview on the history of LibreOffice, giving a somewhat nostalgic view back on the early days and some good statistics on what we achieved since then. So when I posted the postmortem on LibreOffice 3.6 yesterday, there were some questions on the numbers beyond the LibreOffice 3.6 series. Well, without further ado, here they are:(*)

chart: bug fixes in the minor releases of all series

There is some healthy growth in the fixes going into each release, although it slowed down somewhat(**) around the 3.6 series, as the number of bugfixes grew in a way that made it quite some extra work to keep up with their administration purely on a mailing list. Luckily, this is where gerrit came into play: in the 4.0 series most commits (77%) are reviewed on gerrit, which streamlined the work in a way that made the rate of fixes climb again(***), so that the current LibreOffice 4.0.4 has more bugfixes in a minor release than any previous version that early in the cycle.

Note though that these bug fix counts cannot simply be added up, for a multitude of reasons:

  • some bugfixes go into multiple releases (because there is more than one active branch at any given time)
  • some bugfixes do not get accounted for in the bug tracker
  • many (in fact most of the exciting and interesting ones) go into the major series updates and not into a minor bugfix update

So how many bugs did LibreOffice resolve since it started? It's hard to tell, because these issues are not always tracked in one issue tracker(****). However, this table from bugzilla gives a lower bound. As of 2013-07-23, the LibreOffice project resolved 12,596 bug reports on its own issue tracker, roughly half of those fixed by developers, the other half hunted down and triaged by the QA team:

  • 4,389 bug reports were intentionally fixed by developers (resolution: FIXED)
  • 2,123 bug reports were unintentionally fixed by developers and then found fixed by the QA team (resolution: WORKSFORME; sometimes one bug causes multiple symptoms, so a fix for one bug report might also solve another that the developer is not aware of)
  • 3,003 bug reports were identified as a duplicate of an existing report by the QA team (resolution: DUPLICATE)
  • 3,081 bug reports were found to be invalid in some way (resolution: INVALID, NOTABUG, NOTOURBUG, …)

So in summary: since it started and as of 2013-07-23, the LibreOffice project in total fixed at least 6,512 issues and resolved 12,596 bug reports from its own issue tracker.
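
For readers who want to check the arithmetic, the summary figures follow from the four resolution counts listed above. The dictionary keys are just labels chosen for this snippet.

```python
# Resolution counts from the bugzilla table, as of 2013-07-23.
resolved = {
    'FIXED': 4389,
    'WORKSFORME': 2123,
    'DUPLICATE': 3003,
    'INVALID and similar': 3081,
}

# Fixed issues are those fixed on purpose plus those found fixed by QA.
fixed = resolved['FIXED'] + resolved['WORKSFORME']
total = sum(resolved.values())

print(fixed, total, round(100 * fixed / total, 1))
```

The last figure is the share of resolved reports that ended up fixed, which is where the "half fixed, half triaged away" split comes from.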

(*) A note on the minor release bug fix counts: they are just scraped off the ChangeLog pages like https://wiki.documentfoundation.org/Releases/4.0.4/RC1, so esp. for older releases these might still be a bit off.
(**) Note that the 3.6 series was alive 3 months longer than 3.5 (~35% more time), without receiving the same amount of additional fixes.
(***) To the tune of the Ramones quoted above.
(****) For example, at the time of writing, of the 1969 resolved issues filed against LibreOffice on Ubuntu in launchpad, only 256 track bugs at LibreOffice, as only a subset of well-triaged and hard-to-fix issues is upstreamed. So the numbers given above are not conflicting at all with e.g. the estimate of 3,000 bug fixes in LibreOffice 4.1 alone. See the development FAQ for an overview of common issue trackers referenced in commit messages.


bmichaelsen

Dear Prudence, won’t you come out to play
Dear Prudence, greet the brand new day

-- Dear Prudence, White Album, Beatles

Yesterday, I put a build of LibreOffice 3.6.7 in the corresponding Ubuntu PPA. As LibreOffice 3.6.7 is the last minor release update for the 3.6 series, this is a good time to look back and see how our choice of a train release model pays off. For this I had a look at the number of bugfixes and regressions over the series. First let's see if there were any regressions in the minor series, that is, if there is something which worked in 3.6.0 but stopped working in 3.6.7. For LibreOffice 3.6.7 we look at:

  • known issues (that is: they have to be reported e.g. at our Bug Submission Agent)
  • they have to be well-triaged to be a real problem (that is: they can't be in bug status NEEDINFO or UNCONFIRMED, which would mean that they are incomplete)
  • they have to be marked as a regression
  • they have not been there in 3.6.0, otherwise they would not be a minor release regression
  • they are not fixed in one of the 3.6.1 to 3.6.7 minor release updates
  • they have to affect Linux

At the time of publishing 3.6.7, this query results in bugzilla replying with its infamous “Zarro Boogs found”, so there are no known, well-triaged regressions in LibreOffice 3.6.7 vs. LibreOffice 3.6.0 at the time of release on Ubuntu.

Does this mean that LibreOffice will give you a hard guarantee on this? No, the release train model of LibreOffice is more important: It keeps everyone accountable and gives users a reliable date and time for a new release to test and use. So LibreOffice will not “stop the train”, if such a bug candidate is found. If you for example extend the above query to all platforms you will see it return 5 bugs from other platforms. It is unlikely that there are “real” minor release regressions though: those issues are just not confirmed to have been there at 3.6.0 already.

Enough talk about regressions in the 3.6 series, let's have a look at the positive things in minor releases: bug fixes. Despite LibreOffice 3.6.7 on Linux having no known, well-triaged release regressions, it has 547 bugs fixed against 3.6.0. So for Ubuntu users installing 3.6.7 this is the picture:

fixes and regressions in LibreOffice 3.6.7

So much for the quality of minor releases, but what about major releases? Well, for major releases there is one additional important factor: timely reporting. So let's have a look at LibreOffice 4 regressions that have been filed in a timely manner, that is, between the tagging of the first alpha on November 20, 2012 and the release of LibreOffice 4.0.0 on February 7, 2013. Bugzilla shows that currently more than 95% of these are already resolved, even though LibreOffice 4 will still see a few more minor updates.

For regressions reported in the alpha and beta phase, before the tagging of the first release candidate on January 9, 2013, it's even more impressive: all of those regressions are resolved by now.

I expect the statistics for LibreOffice 4.1 to become even more impressive, as the LibreOffice bug triage contest was hugely successful and will surely help fast-track the triage of bugs, getting them ready to be fixed by a developer quickly. Thanks to Joel Madero, Joren De Cuyper and Robinson Tryon for organizing this awesome event, and a big “Thank you!” too to all the volunteers taking part in this.

Which finally brings us to the little tune quoted at the top-right of the post: the earlier you test LibreOffice and report an issue, the better for you and for the product. With LibreOffice 4.1 approaching fast, I invite you to test what will become LibreOffice 4.2 by downloading a daily build, playing with its exciting new features and getting involved on #libreoffice-qa, while listening to Sir Paul doing his magic on the bass over … Sir Paul doing his magic on the drums.


bmichaelsen

With a rebel yell: “more, more, more”
More, more, more.

Rebel Yell, Billy Idol

This weekend the LibreOffice community will meet again in Hamburg for the third Hackfest at this location:

Hackfest Hamburg 2013

Here is how it looked last year:

Hackers at Hackfest Hamburg 2012

Hackers on the last Hamburg Hackfest

Like last year, this year's Hackfest gets kicked off with a meet and greet at the Schachcafe on Friday at 20:00 local time. Think of it like the “beer event” at FOSDEM, which helps everyone warm up for the event, except it will not be February and not freezing cold. Looking at this picture, it likely won't be much like cold FOSDEM at all:

Schachcafe

All details can be found on the Hackfest 2013 wiki page. Thanks to Lanedo for sponsoring this event and also big thanks to Attraktor.org for hosting us again!


bmichaelsen

One and Only

I am the one and only nobody I’d rather be

I am the one and only you can’t take that away from me

– Chesney Hawkes, The One and Only

Just a short note: I have missed the exact date in the release madness for Ubuntu 13.04 Raring, but a few days ago something important silently happened: All supported Ubuntu releases are now shipping with LibreOffice by default, as trusty old Ubuntu 10.04 LTS (Lucid Lynx) reached its end of support for the desktop. So we now have these supported releases:

  • Ubuntu 12.04 LTS (Precise Pangolin) with LibreOffice 3.5
  • Ubuntu 12.10 (Quantal Quetzal) with LibreOffice 3.6
  • Ubuntu 13.04 (Raring Ringtail) with LibreOffice 4.0
  • and upcoming: Ubuntu 13.10 (Saucy Salamander) with LibreOffice 4.1

In addition, the following releases, which are not supported anymore, also shipped with LibreOffice:

  • Ubuntu 11.04 (Natty Narwhal) with LibreOffice 3.3
  • Ubuntu 11.10 (Oneiric Ocelot) with LibreOffice 3.4

Looking back in time at the angst-ridden, not-acting-but-reacting excitement of the early days and comparing it with the way we are really pushing the envelope now, we have come a long way, improving with every step. Well worth a celebration with one of the cheesiest 1990s hits ever. Thanks to everyone who was and is part of this!


bmichaelsen

Sitting on a cornflake, waiting for the van to come.
Corporation t-shirt, stupid bloody Tuesday.
– I am the walrus, The Beatles

Although the “OOo does not print on Tuesdays” OpenOffice.org bug is long fixed, OpenOffice.org never indemnified Tuesday for its loss in reputation. This is unacceptable, and Tuesday's rage at the event has even passed on to its successful successor: LibreOffice.

Now, Tuesday is a weekday and as such, money does not mean that much to it. After some consultation with Tuesday, it was concluded that the only way to indemnify Tuesday was to make the other weekdays suffer the same fate. Since the set of weekdays is luckily limited, this was easily considered technically feasible, even if at some minor inconvenience for the users of LibreOffice.

Toolbar popup of the new feature

To limit the impact for users, instead of silently failing to act when requested to print on a non-Tuesday, for the comfort of the users of LibreOffice a notification was added that explains the situation. Tuesday — after some heated discussion — gave license to this modification.

message on Non-Tuesdays

message on trying to print on non-Tuesdays
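
Since the extension is written in Python, the core of the check can be imagined as something like the following. This is not the extension's source: the function name and the message wording are invented for illustration, and the hook into LibreOffice's print command is left out entirely.

```python
import datetime

TUESDAY = 1  # datetime.date.weekday() counts Monday as 0


def may_print(day):
    """Only Tuesday holds the right to print."""
    return day.weekday() == TUESDAY


def print_request(day):
    """Return the notification a user would get on the given day."""
    if may_print(day):
        return 'printing'
    return 'Sorry, printing is only available on Tuesdays.'


print(print_request(datetime.date(2013, 7, 23)))  # a Tuesday
print(print_request(datetime.date(2013, 7, 24)))  # a Wednesday
```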

It is intended to ship this extension pre-bundled with all LibreOffice releases until Tuesday's rage is soothed. It is hoped this will happen quickly, as both contributors and users of LibreOffice are known for their lack of sympathy for monopolies of any kind, shape or form, explicitly including the exclusive right of a weekday to print. For those of you excited to try out this thrilling new feature right now, the extension is available for download here:

It has to be noted that the extension is written purely in Python and is completely self-contained: it can be treated either as the oxt to be installed or as a zip file containing the source code. Unpack it with the archive program of your choice, modify it to your heart's content and run the script called ‘build’ that you find in there. This will create a new (modified) extension.


bmichaelsen

autopkgtests for adults

“That’s not a knife — THAT’s a knife.”

– Michael J. Crocodile Dundee

I recently worked a bit to see this line showing up in my favorite editor:

ubtree0t-junit-subsequentcheck PASS

LibreOffice has multiple sets of testsuites, and during the build of the package we run them all (although not yet on all platforms). However, LibreOffice depends on ~1/3 of main, so there are a lot of things that might break LibreOffice. A lot of things just break at build time and not at run time, and as we run the tests during the build already, this prevents such a broken package from entering the archive in the first place. That is, unless the breakage is caused by an update of a dependency of LibreOffice, which makes the LibreOffice package in the archive FTBFS (or at least broken) in a sneaky way. That's what happened in e.g. the libjpeg, boost and kdelibs examples above.

But let's keep those aside for now and concentrate on the runtime issues. Running the tests at build time is a good early warning already and prevents some serious breakage from entering the archive. On the other hand, these tests do not run against LibreOffice as we install it in the system from packages: they run against an installation set aside in the build tree. While I cannot come up with an immediate example where LibreOffice was broken when installed from the packages in a way that tests would have detected but missed when run against the in-tree installation, it would still be good to have the additional confidence that:

  • LibreOffice passes the tests as installed on the system
  • LibreOffice is not broken at runtime by some update of a dependency

In short: it's highly desirable to test that LibreOffice still runs and works as the ground below it keeps moving. This is even more important as Ubuntu is considering a move towards a more rolling way of releasing. And we have a means to do that: autopkgtests.

So, what was needed to get this working for LibreOffice?

First, some parts of the testsuites are quite large and, as we run the tests during the build anyway, are already built during the build. Therefore it made sense to package these, which was done very early in the cycle (actually: during UDS).

Second, we need to get LibreOffice to run the tests without trying to build the product. That originally wasn't as easy as it may seem. For one, the LibreOffice build system reasonably expected that you need a product to test it and therefore had dependencies on the product being built. In addition, when I started considering this, we still had a lot of the old build system around, which was a pain to bend to your will. Luckily, these times are over. So, by now a patch changing some ~15 lines gets us what we want.

Third, we need a config_host.mk (the output of ./configure), so that we can run the LibreOffice build. And for that, we unfortunately need the build dependencies (which are generated) of LibreOffice, as otherwise we would not really test what we built. Because of a missing feature in autopkgtests, we cannot reuse the existing dependencies, but have to do manual double bookkeeping there. I'm not thrilled by the prospect of hunting false positives there. Some possible ways out would be:

  • to package the config_host.mk file into the package containing the other testsuite helpers, but that would make that package architecture dependent
  • or to not really specify the dependencies at all and pragmatically and greedily request the restrictions needs-root and breaks-testbed and then — as we are root now — run this before starting the tests:
    apt-get build-dep -y libreoffice

Finally, we should be able to run:

apt-get build-dep libreoffice
apt-get install libreoffice-subsequentcheckbase
apt-get source libreoffice
cd libreoffice-*
./debian/tests/junit-subsequentcheck

and this should run the tests locally and headless. Indeed it does, and the tests happily finish and report success. Great, let's quickly check if it also runs in the ‘official’ VM with:

run-adt-test

Nope, and this is why I chose the Crocodile Dundee quote below the title for this post: the VM fails before it even starts the tests, as it does not even have enough disk space to copy in the LibreOffice source package. This needs to be fixed on the side of the image; there is nothing on the test side that could fix this. But to test if LibreOffice would finish if only the image could handle it, I began cannibalizing the package, removing one after another the directories of the icon themes, translations and external sources, each time getting a bit further: from failing to start to failing when installing the 501 additional packages and so on. With this hollowed-out package, I could verify: yes, the autopkgtest would pass in the image, if only it had enough disk space.

Finally, once this is in the archive (or ppa) you will also be able to run:

apt-get build-dep libreoffice
apt-get install libreoffice-subsequentcheckbase
apt-get source libreoffice
cd libreoffice-*
libreoffice '--accept=pipe,name=blinkenlights;urp'&
./debian/tests/junit-subsequentcheck 'connect:pipe,name=blinkenlights' 1

This will connect to the LibreOffice you started in the second-to-last step (which is not headless, but running in your session) and run the tests against it. The “1” tells it not to use parallelization, but to run just one suite at a time, as otherwise you have a very good chance to lock up or hang your own session, with compiz (or the dash or other components) being mightily confused by all the windows flashing up and closing in quick succession. With “1” you might still get some test failures (mostly from the a11y integration), but at least your session will survive:

ZO RELAXEN UND WATSCHEN DER BLINKENLICHTEN.

Addendum:

Preparing an adt-image with:

./bin/prepare-testbed -r raring amd64 -S12GB

seems to solve the issue. The “df -h” at the start of the test reports some 3GB of free space (with 2.6GB still being needed to create a rw-copy of the source tree after that point). So 12GB is likely roughly the size the images on Jenkins currently need (plus maybe another 1GB of wiggle room).

