# Canonical Voices

Jussi Pakkanen

## Simple thread safety with pimpl’d objects

A use case that pops up every now and then is to have a self-contained object that needs to be accessed from multiple threads. The problem appears when the object, as part of its normal operation, calls its own methods. This leads to tricky locking operations, a need for a recursive mutex, or some other suboptimal workaround.

Another common approach is to use the pimpl idiom, which hides the contents of an object inside a hidden private object. There are ample details on the internet, but the basic setup of a pimpl’d class is the following. First of all we have the class header:

#include <memory>

class Foo {
public:
    Foo();
    ~Foo();
    void func1();
    void func2();

private:
    class Private;
    std::unique_ptr<Private> p;
};

Then in the implementation file you first have the definition of the private class.

class Foo::Private {
public:
    Private();
    void func1() { ... }
    void func2() { ... }

private:
    void privateFunc() { ... }
    int x;
};

Followed by the definition of the main class. The destructor is defined here, where Foo::Private is a complete type, because std::unique_ptr cannot delete an incomplete type.

Foo::Foo() : p(new Private) {
}

Foo::~Foo() = default;

void Foo::func1() {
    p->func1();
}

void Foo::func2() {
    p->func2();
}

That is, Foo only calls the implementation bits in Foo::Private.

The main idea to realize is that Foo::Private can never call functions of Foo. Thus if we can isolate the locking bits inside Foo, the functionality inside Foo::Private becomes automatically thread safe. The way to accomplish this is simple. First you add a (public) std::mutex m to Foo::Private. Then you just change the functions of Foo to look like this:

void Foo::func1() {
    std::lock_guard<std::mutex> guard(p->m);
    p->func1();
}

void Foo::func2() {
    std::lock_guard<std::mutex> guard(p->m);
    p->func2();
}

This accomplishes many things nicely:

• Lock guards make locks impossible to leak, no matter what happens
• Foo::Private can pretend that it is single-threaded which usually makes implementation a lot easier

The main drawback of this approach is that the locking is coarse, which may be a problem when squeezing out ultimate performance. But usually you don’t need that.
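As a rough, self-contained illustration of the pattern, here is a minimal sketch in a single file. The class name Counter, its members, and the helper run_workers are hypothetical names made up for this example; a real project would split the class across a header and an implementation file as described above.

```cpp
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical example class; all names here are illustrative.
class Counter {
public:
    Counter();
    ~Counter();              // defined below, once Private is a complete type
    void add(int n);
    int value();

private:
    class Private;
    std::unique_ptr<Private> p;
};

class Counter::Private {
public:
    std::mutex m;            // locked only by the outer Counter

    void add(int n) {
        for (int i = 0; i < n; ++i) {
            increment();     // calling our own methods needs no extra locking
        }
    }
    int value() const { return x; }

private:
    void increment() { ++x; }
    int x = 0;
};

Counter::Counter() : p(new Private) {}
Counter::~Counter() = default;

void Counter::add(int n) {
    std::lock_guard<std::mutex> guard(p->m);
    p->add(n);
}

int Counter::value() {
    std::lock_guard<std::mutex> guard(p->m);
    return p->value();
}

// Starts `threads` workers that each add `per_thread` to one shared Counter.
int run_workers(int threads, int per_thread) {
    Counter c;
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&c, per_thread] { c.add(per_thread); });
    }
    for (auto &t : pool) {
        t.join();
    }
    return c.value();
}
```

Because every public entry point takes the lock before delegating, Private::add can call increment freely without a recursive mutex.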

pitti

## vim config for Markdown+LaTeX pandoc editing

I have used LaTeX and latex-beamer for pretty much my entire life of document and presentation production, i.e. since about my 9th school grade. I’ve always found the LaTeX syntax a bit clumsy, but with good enough editor shortcuts to insert e.g. \begin{itemize} \item...\end{itemize} with just two keystrokes, it has been good enough for me.

A few months ago a friend of mine pointed out pandoc to me, which is just simply awesome. It can convert between a million document formats, but most importantly it can take Markdown and spit out LaTeX, or directly PDF (through an intermediate step of building a LaTeX document and calling pdftex). It also has a template for beamer. Documents now look so much more readable and are easier to write! And you can always directly write LaTeX commands without any fuss, so that you can use Markdown for the structure/headings/enumerations/etc., and LaTeX for formulas, XYTex and the other goodies. That’s how it always should have been!

So last night I finally sat down and created a vim config for it:

"-- pandoc Markdown+LaTeX -------------------------------------------

function! s:MDSettings()
    noremap <buffer> <Leader>b :! pandoc -t beamer % -o %<.pdf<CR><CR>
    noremap <buffer> <Leader>l :! pandoc -t latex % -o %<.pdf<CR>
    " silence both stdout and stderr of the viewer
    noremap <buffer> <Leader>v :! evince %<.pdf >/dev/null 2>&1 &<CR><CR>

    " adjust syntax highlighting for LaTeX parts
    "   inline formulas:
    syntax region Statement oneline matchgroup=Delimiter start="\$" end="\$"
    "   environments:
    syntax region Statement matchgroup=Delimiter start="\\begin{.*}" end="\\end{.*}" contains=Statement
    "   commands:
    syntax region Statement matchgroup=Delimiter start="{" end="}" contains=Statement
endfunction

autocmd FileType markdown :call <SID>MDSettings()


That gives me “good enough” (with some quirks) highlighting without trying to interpret TeX stuff as Markdown, and shortcuts for calling pandoc and evince. Improvements appreciated!

Dustin Kirkland

## Improving Random Seeds in Ubuntu 14.04 LTS Cloud Instances

Tomorrow, February 19, 2014, I will be giving a presentation to the Capital of Texas chapter of ISSA, which will be the first public presentation of a new security feature that has just landed in Ubuntu Trusty (14.04 LTS) in the last 2 weeks -- doing a better job of seeding the pseudo random number generator in Ubuntu cloud images.  You can view my slides here (PDF), or you can read on below.  Enjoy!

### A: Because entropy is important!

• Choosing hard-to-guess random keys provides the basis for all operating system security and privacy
• SSL keys
• SSH keys
• GPG keys
• TCP sequence numbers
• UUIDs
• dm-crypt keys
• eCryptfs keys
• Entropy is how your computer creates hard-to-guess random keys, and that's essential to the security of all of the above

### A: Hardware, typically.

• Keyboards
• Mice
• Interrupt requests
• HDD seek timing
• Network activity
• Microphones
• Web cams
• Touch interfaces
• WiFi/RF
• TPM chips
• RdRand
• Entropy Keys
• Pricey IBM crypto cards
• Expensive RSA cards
• USB lava lamps
• Geiger Counters
• Seismographs
• Light/temperature sensors
• And so on

### A: Pseudo random number generators are our only viable alternative.

• In Linux, /dev/random and /dev/urandom are interfaces to the kernel’s entropy pool
• Basically, endless streams of pseudo random bytes
• Some utilities and most programming languages implement their own PRNGs
• But they usually seed from /dev/random or /dev/urandom
• Sometimes, virtio-rng is available, for hosts to feed guests entropy
• But not always

### A: Yes, if they are properly seeded.

• See random(4)
• When a Linux system starts up without much operator interaction, the entropy pool may be in a fairly predictable state
• This reduces the actual amount of noise in the entropy pool below the estimate
• In order to counteract this effect, it helps to carry a random seed across shutdowns and boots
• See /etc/init.d/urandom
...dd if=/dev/urandom of=$SAVEDFILE bs=$POOLBYTES count=1 >/dev/null 2>&1 ...

### A: Basically, it’s a small catalyst that primes the PRNG pump.

• Let’s pretend the digits of Pi are our random number generator
• The random seed would be a starting point, or “initialization vector”
• e.g. Pick a number between 1 and 20
• say, 18
• Now start reading random numbers

• Not bad...but if you always pick ‘18’...
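The point about always picking the same starting position can be sketched in a few lines. This is only an illustration: std::mt19937 is a general-purpose generator from the C++ standard library, not the Linux kernel's PRNG and not suitable for cryptography, and the helper name draw is made up for this example.

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Draws `count` values from a Mersenne Twister seeded with `seed`.
// Illustrative only: this generator is not cryptographically secure.
std::vector<std::uint32_t> draw(std::uint32_t seed, int count) {
    std::mt19937 gen(seed);
    std::vector<std::uint32_t> out;
    for (int i = 0; i < count; ++i) {
        out.push_back(static_cast<std::uint32_t>(gen()));
    }
    return out;
}
```

Calling draw(18, 5) twice yields the same five numbers both times, so anyone who can guess the seed can reproduce the whole "random" stream.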

#### XKCD on random numbers

 RFC 1149.5 specifies 4 as the standard IEEE-vetted random number.

### A: Yep, but computers are predictable, especially VMs.

• Computers are inherently deterministic
• And thus, bad at generating randomness
• Real hardware can provide quality entropy
• But virtual machines are basically clones of one another
• i.e., The Cloud
• No keyboard or mouse
• IRQ based hardware is emulated
• Block devices are virtual and cached by hypervisor
• RTC is shared
• The initial random seed is sometimes part of the image, or otherwise chosen from a weak entropy pool

### A: I’m afraid not...

#### Analysis of the LRNG (2006)

• Little prior documentation on Linux’s random number generator
• Random bits are a limited resource
• Very little entropy in embedded environments
• OpenWRT was the case study
• OS start up consists of a sequence of routine, predictable processes
• Very little demonstrable entropy shortly after boot
• http://j.mp/McV2gT

#### Black Hat (2009)

• iSec Partners designed a simple algorithm to attack cloud instance SSH keys
• Picked up by Forbes
• http://j.mp/1hcJMPu

#### Factorable.net (2012)

• Minding Your P’s and Q’s: Detection of Widespread Weak Keys in Network Devices
• Comprehensive, Internet wide scan of public SSH host keys and TLS certificates
• Insecure or poorly seeded RNGs in widespread use
• 5.57% of TLS hosts and 9.60% of SSH hosts share public keys in a vulnerable manner
• They were able to remotely obtain the RSA private keys of 0.50% of TLS hosts and 0.03% of SSH hosts because their public keys shared nontrivial common factors due to poor randomness
• They were able to remotely obtain the DSA private keys for 1.03% of SSH hosts due to repeated signature non-randomness
• http://j.mp/1iPATZx

#### Dual_EC_DRBG Backdoor (2013)

• Dual Elliptic Curve Deterministic Random Bit Generator
• Ratified NIST, ANSI, and ISO standard
• Possible backdoor discovered in 2007
• Bruce Schneier noted that it was “rather obvious”
• Documents leaked by Snowden and published in the New York Times in September 2013 confirm that the NSA deliberately subverted the standard
• http://j.mp/1bJEjrB

### A: For starters, do a better job seeding our PRNGs.

• Securely
• With high quality, unpredictable data
• More sources are better
• As early as possible
• And certainly before generating
• SSH host keys
• SSL certificates
• Or any other critical system DNA
• /etc/init.d/urandom “carries” a random seed across reboots, and ensures that the Linux PRNGs are seeded

### A: Run Ubuntu!

Sorry, shameless plug...

### A: Meet pollinate.

• pollinate is a new security feature that seeds the PRNG
• Introduced in Ubuntu 14.04 LTS cloud images
• Upstart job
• It automatically seeds the Linux PRNG as early as possible, and before SSH keys are generated
• It’s GPLv3 free software
• Simple shell script wrapper around curl
• Fetches random seeds
• From 1 or more entropy servers in a pool
• Writes them into /dev/urandom

### A: Introducing pollen.

• pollen is an entropy-as-a-service implementation
• Works over HTTP and/or HTTPS
• Supports a challenge/response mechanism
• Provides 512 bit (64 byte) random seeds
• It’s AGPL free software
• Implemented in golang
• Less than 50 lines of code
• Fast, efficient, scalable
• Returns the (optional) challenge sha512sum
• And 64 bytes of entropy

pollen.go

### A: Hello, entropy.ubuntu.com.

• Highly available pollen cluster
• TLS/SSL encryption
• Multiple physical servers
• Behind a reverse proxy
• Deployed and scaled with Juju
• Multiple sources of hardware entropy
• High network traffic is always stirring the pot
• AGPL, so source code always available
• Supported by Canonical
• Ubuntu 14.04 LTS cloud instances run pollinate once, at first boot, before generating SSH keys

### A: Then use a different entropy service :-)

• bzr branch lp:pollen
• sudo apt-get install pollen
• juju deploy pollen
• Add your preferred server(s) to your $POOL
• In /etc/default/pollinate
• In your cloud-init user data
• In progress
• In fact, any URL works if you disable the challenge/response with pollinate -n|--no-challenge

### Q: So does this increase the overall entropy on a system?

### A: No, no, no, no, no!

• pollinate seeds your PRNG, securely and properly and as early as possible
• This improves the quality of all random numbers generated thereafter
• pollen provides random seeds over HTTP and/or HTTPS connections
• This information can be fed into your PRNG
• The Linux kernel maintains a very conservative estimate of the number of bits of entropy available, in /proc/sys/kernel/random/entropy_avail
• Note that neither pollen nor pollinate directly affect this quantity estimate!!!

### Q: Why the challenge/response in the protocol?

### A: Think of it like the Heisenberg Uncertainty Principle.

• The pollinate challenge (via an HTTP POST submission) affects the pollen's PRNG state machine
• pollinate can verify the response and ensure that the pollen server at least “did some work”
• From the perspective of the pollen server administrator, all communications are “stirring the pot”
• Numerous concurrent connections ensure a computationally complex and impossible to reproduce entropy state

### Q: What if pollinate gets crappy or compromised or no random seeds?

### A: Functionally, it’s no better or worse than it was without pollinate in the mix.

• In fact, you can dd if=/dev/zero of=/dev/random if you like, without harming your entropy quality
• All writes to the Linux PRNG are whitened with SHA1 and mixed into the entropy pool
• Of course it doesn’t help, but it doesn’t hurt either
• Your overall security is back to the same level it was when your cloud or virtual machine booted at an only slightly random initial state
• Note the permissions on /dev/*random
• crw-rw-rw- 1 root root 1, 8 Feb 10 15:50 /dev/random
• crw-rw-rw- 1 root root 1, 9 Feb 10 15:50 /dev/urandom
• It's a bummer of course, but there's no new compromise

### Q: What about SSL compromises, or CA Man-in-the-Middle attacks?

### A: We are mitigating that by bundling the public certificates in the client.

• The pollinate package ships the public certificate of entropy.ubuntu.com
• /etc/pollinate/entropy.ubuntu.com.pem
• And curl uses this certificate exclusively by default
• If this really is your concern (and perhaps it should be!)
• Add more URLs to the $POOL variable in /etc/default/pollinate
• Put one of those behind your firewall
• You simply need to ensure that at least one of those is outside of the control of your attackers

### A: The usual web server debug info.

• The current timestamp
• The incoming client IP/port
• At entropy.ubuntu.com, the client IP/port is actually filtered out by the load balancer
• The browser user-agent string
• Basically, the exact same information that Chrome/Firefox/Safari sends
• You can override if you like in /etc/default/pollinate
• The challenge/response, and the generated seed are never logged!
Feb 11 20:44:54 x230 2014-02-11T20:44:54-06:00 x230 pollen[28821] Server received challenge from [127.0.0.1:55440, pollinate/4.1-0ubuntu1 curl/7.32.0-1ubuntu1.3 Ubuntu/13.10 GNU/Linux/3.11.0-15-generic/x86_64] at [1392173094634146155]

Feb 11 20:44:54 x230 2014-02-11T20:44:54-06:00 x230 pollen[28821] Server sent response to [127.0.0.1:55440, pollinate/4.1-0ubuntu1 curl/7.32.0-1ubuntu1.3 Ubuntu/13.10 GNU/Linux/3.11.0-15-generic/x86_64] at [1392173094634191843]

### A: Yes, but more feedback is welcome!

• All of the source is available
• Service design and hardware specs are available
• The Ubuntu Security team has reviewed the design and implementation
• All feedback has been incorporated
• At least 3 different Linux security experts outside of Canonical have reviewed the design and/or implementation
• All feedback has been incorporated

Stay safe out there!
:-Dustin

Michael Hall

## Why do you contribute to open source?

It seems a fairly common, straightforward question. You’ve probably been asked it before. We all have reasons why we hack, why we code, why we write or draw. If you ask somebody this question, you’ll hear things like “scratching an itch” or “making something beautiful” or “learning something new”. These are all excellent reasons for creating or improving something. But contributing isn’t just about creating, it’s about giving that creation away. Usually giving it away for free, with no or very few strings attached. When I ask “Why do you contribute to open source”, I’m asking why you give it away.

This question is harder to answer, and the answers are often far more complex than the ones given for why people simply create something. What makes it worthwhile to spend your time, effort, and often money working on something, and then turn around and give it away? People often have different intentions or goals in mind when they contribute, from benevolent giving to a community they care about to personal pride in knowing that something they did is being used in something important or by somebody important. But when you strip away the details of the situation, these all hinge on one thing: Recognition.

If you read books or articles about community, one consistent theme you will find in almost all of them is the importance of recognizing the contributions that people make. In fact, if you look at a wide variety of successful communities, you would find that one common thing they all offer in exchange for contribution is recognition. It is the fuel that communities run on. It’s what connects the contributor to their goal, both selfish and selfless. In fact, with open source, the only way a contribution can actually be stolen is by not allowing that recognition to happen. Even the most permissive licenses require attribution, something that tells everybody who made it.

Now let’s flip that question around:  Why do people contribute to your project? If their contribution hinges on recognition, are you prepared to give it?  I don’t mean your intent, I’ll assume that you want to recognize contributions, I mean do you have the processes and people in place to give it?

We’ve gotten very good at building tools to make contribution easier, faster, and more efficient, often by removing the human bottlenecks from the process. But human recognition is still what matters most. Silently merging someone’s patch or branch, even if their name is in the commit log, isn’t the same as thanking them for it yourself or posting about their contribution on social media. Letting them know you appreciate their work is important; letting other people know you appreciate it is even more important.

If you are the owner or a leader in a project with a community, you need to be aware of how recognition is flowing out just as much as how contributions are flowing in. Too often communities are successful almost by accident, because the people in them are good at making sure contributions are recognized and that people know it, simply because that’s their nature. But it’s just as possible for communities to fail because the personalities involved didn’t have this natural tendency, not because of any lack of appreciation for the contributions, just a quirk of their personality. It doesn’t have to be this way. If we are aware of the importance of recognition in a community, we can be deliberate in our approaches to making sure it flows freely in exchange for contributions.

Joseph Salisbury

## Kernel Team Meeting Minutes – July 22, 2014

### Agenda

20140722 Meeting Agenda

### Release Metrics and Incoming Bugs

Release metrics and incoming bug data can be reviewed at the following link:

http://people.canonical.com/~kernel/reports/kt-meeting.txt

### Status: Utopic Development Kernel

The Utopic kernel has been rebased to v3.16-rc6 and officially uploaded
to the archive. We (as in apw) have also completed a herculean config
and let us know your results.
—–
Important upcoming dates:
Thurs Jul 24 – 14.04.1 (~2 days away)
Thurs Aug 07 – 12.04.5 (~2 weeks away)
Thurs Aug 21 – Utopic Feature Freeze (~4 weeks away)

### Status: CVE’s

The current CVE status can be reviewed at the following link:

http://people.canonical.com/~kernel/cve/pkg/ALL-linux.html

### Status: Stable, Security, and Bugfix Kernel Updates – Trusty/Saucy/Precise/Lucid

Status for the main kernels, until today (Jul. 22):

• Lucid – Released
• Precise – Released
• Saucy – Released
• Trusty – Released

Details of currently open tracking bugs:

• http://people.canonical.com/~kernel/reports/kernel-sru-workflow.html

For SRUs, the SRU report is a good source of information:

• http://people.canonical.com/~kernel/reports/sru-report.html

Schedule:

14.04.1 cycle: 29-Jun through 07-Aug
====================================================================
27-Jun Last day for kernel commits for this cycle
29-Jun – 05-Jul Kernel prep week.
06-Jul – 12-Jul Bug verification & Regression testing.
13-Jul – 19-Jul Regression testing & Release to -updates.
20-Jul – 24-Jul Release prep
24-Jul 14.04.1 Release [1]
07-Aug 12.04.5 Release [2]

cycle: 08-Aug through 29-Aug
====================================================================
08-Aug Last day for kernel commits for this cycle
10-Aug – 16-Aug Kernel prep week.
17-Aug – 23-Aug Bug verification & Regression testing.
24-Aug – 29-Aug Regression testing & Release to -updates.

[1] These will be the very last kernels for lts-backport-quantal, lts-backport-raring,
and lts-backport-saucy.

[2] This will be the lts-backport-trusty kernel as the default in the precise point
release iso.

### Open Discussion or Questions? Raise your hand to be recognized

No open discussions.

Jussi Pakkanen

## The two ways of doing something

There are usually two different ways of doing something. The first is the correct way. The second is the easy way.

As an example of this, let’s look at using the functionality of the C++ standard library. The correct way is to use the fully qualified name, such as std::vector or std::chrono::milliseconds. The easy way is to have using namespace std; and then just use the class names directly.

The first way is the “correct” one, as it prevents symbol clashes, among a bunch of other good reasons. The latter leads to all sorts of problems, and for this reason many style guides prohibit its use.
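To make the symbol clash concrete, here is a small sketch. The global variable count and the function tally are hypothetical names chosen for this example; the clash is with the real standard algorithm std::count.

```cpp
#include <algorithm>
#include <vector>

// A project-level global that happens to share its name with std::count.
int count = 0;

// Fully qualified names keep the two apart. If this file contained
// `using namespace std;` at global scope, the unqualified `count` below
// would be ambiguous between ::count and std::count and fail to compile.
int tally(const std::vector<int> &values, int needle) {
    count += static_cast<int>(std::count(values.begin(), values.end(), needle));
    return count;
}
```

With the qualified spelling, the code compiles no matter which standard headers end up being included.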

But there is a catch. Software is written by humans and humans have a peculiar tendency.

They will always do the easy thing.

There is no possible way for you to prevent them from doing that, apart from standing behind their back and watching every letter they type.

Any sort of system that relies, in any way, on the fact that people will do the right thing rather than the easy thing is doomed to fail from the start. They. Will. Not. Work. And they can’t be made to work. Trying to force it to work leads only to massive shouting and bad blood.

What does this mean to you, the software developer?

It means that the only way your application/library/tool/whatever is going to succeed is that the correct thing to do must also be the simplest thing to do. That is the only way to make people do the right thing consistently.

pitti

## autopkgtest 3.2: CLI cleanup, shell command tests, click improvements

Yesterday’s autopkgtest 3.2 release brings several changes and improvements that developers should be aware of.

## Cleanup of CLI options, and config files

Previous adt-run versions had rather complex, confusing, and rarely (if ever?) used options for filtering binaries and building sources without testing them. All of those (--instantiate, --sources-tests, --sources-no-tests, --built-binaries-filter, --binaries-forbuilds, and --binaries-fortests) are now gone. Now there is only -B/--no-built-binaries left, which disables building/using binaries for the subsequent unbuilt tree or dsc arguments (by default they get built and their binaries used for tests), and I added its opposite --built-binaries for completeness (although you will most probably never need this).

The --help output is now a lot easier to read, both due to the above cleanup, and also because it now shows several paragraphs for each group of related options, and sorts them in descending importance. The manpage got updated accordingly.

Another new feature is that you can now put arbitrary parts of the command line into a file (thanks to porting to Python’s argparse), with one option/argument per line. So you could e.g. create config files for options and runners which you use often:

$ cat adt_sid
--output-dir=/tmp/out
-s
---
schroot
sid

$ adt-run libpng @adt_sid


## Shell command tests

If your test only contains a shell command or two, or you want to re-use an existing upstream test executable and just need to wrap it with some command like dbus-launch or env, you can use the new Test-Command: field instead of Tests: to specify the shell command directly:

Test-Command: xvfb-run -a src/tests/run
Depends: @, xvfb, [...]


This avoids having to write lots of tiny wrappers in debian/tests/. This was already possible for click manifests; this release now also brings it to deb packages.

## Click improvements

It is now very easy to define an autopilot test with extra package dependencies or restrictions, without having to specify the full command, using the new autopilot_module test definition. See /usr/share/doc/autopkgtest/README.click-tests.html for details.

If your test fails and you just want to run your test with additional dependencies or changed restrictions, you can now avoid having to rebuild the .click by pointing --override-control (which previously only worked for deb packages) to the locally modified manifest. You can also (ab)use this to e.g. add the autopilot -v option to autopilot_module.

Unpacking of test dependencies was made more efficient by not downloading Python 2 module packages (which cannot be handled in “unpack into temp dir” mode anyway).

Finally, I made the adb setup script more robust and also faster.

As usual, every change in control formats, CLI etc. has been documented in the manpages and the various READMEs. Enjoy!

Rick Spencer

## The Community Team

So, given Jono’s departure a few weeks back, I bet a lot of folks have been wondering about the Canonical Community Team. For a little background, the community team reports into the Ubuntu Engineering division of Canonical, which means that they all report into me. We have not been idle, and this post is to discuss a bit about the Community Team going forward.

## What has Stayed the Same?

First, we have made some changes to the structure of the community team itself. However, one thing did not change. I kept the community team reporting directly into me, VP of Engineering, Ubuntu. I decided to do this so that there is a direct line to me for any community concerns that have been raised to anyone on the community team.

I had a call with the Community Council a couple of weeks ago to discuss the community team and get feedback about how it is functioning and how things could be improved going forward. I laid out the following for the team.

First, there were three key things that I wanted the Community Team to continue to focus on:
• Continue to create and run innovative programs to facilitate ever more community contributions and growing the community.
• Continue to provide good advice to me and the rest of Canonical regarding how to be the best community members we can be, given our privileged positions of being paid to work within that community.
• Continue to assist with outward communication from Canonical to the community regarding plans, project status, and changes to those.
The Community Council was very engaged in discussing how this all works and should work in the future, as well as other goals and responsibilities for the community team.

## What Has Changed?

In setting up the team, I had some realizations. First, there was no longer just one “Community Manager”. When the project was young and Canonical was small, we had only one, and the team slowly grew. However, the Team is now four people dedicated to just the Community Team, and there are others who spend almost all of their time working on Community Team projects.

Secondly, while individuals on the team had been hired to have specific roles in the community, every one of them had branched out to tackle new challenges as needed.

Thirdly, there is no longer just one “Community Spokesperson”. Everyone in Ubuntu Engineering can and should speak to/for Canonical and to/for the Ubuntu Community in the right contexts.

So, we made some small but, I think, important changes to the Community Team.

First, we created the role Community Team Manager. Notice the important inclusion of the word “Team”. This person’s job is not to “manage the community”, but rather to organize and lead the rest of the community team members. This includes things like project planning, HR responsibilities, strategic planning and everything else entailed in being a good line manager. After a rather competitive interview process, with some strong candidates, one person clearly rose to the top as the best candidate. So, I would like to formally introduce David Planella (lp, g+) as the Community Team Manager!

Second, I changed the other job titles from their rather specific titles to just “Community Manager” in order to reflect the reality that everyone on the community team is responsible for the whole community. So that means Michael Hall (lp, g+), Daniel Holbach (lp, g+), and Nicholas Skaggs (lp, g+) are all now “Community Manager”.

## What's Next?

This is a very strong team, and a really good group of people. I know them each personally, and have a lot of confidence in each of them personally. Combined as a team, they are amazing. I am excited to see what comes next.

In light of these changes, the most common question I get is, “Who do I talk to if I have a question or concern?” The answer to that is “anyone.” It’s understandable if you feel the most comfortable talking to someone on the community team, so please feel free to find David, Michael, Daniel, or Nicholas online and ask your question. There are, of course, other stalwarts like Alan Pope (lp, g+) and Oliver Grawert (lp, g+) who seem to be always online :) By which, I mean to say that while the Community Managers are here to serve the Ubuntu Community, I hope that anyone in Ubuntu Engineering considers their role in the Ubuntu Community to include working with anyone else in the Ubuntu Community :)

Want to talk directly to the community team today? Easy, join their Ubuntu on Air Q&A Session at 15 UTC :)

Finally, please note that I love to be "interrupted" by questions from community members :) The best way to get in touch with me is on freenode, where I go by rickspencer3. Otherwise, I am also on g+, and of course there is this blog :)

niemeyer

## mgo release r2014.07.21, now at gopkg.in

This is a special release of mgo, the Go driver for MongoDB. Besides the several new features, this release marks the change of the Go package import path to gopkg.in, after years using the current one based on a static file that lives at labix.org. Note that the package API is still not changing in any backwards incompatible way, though, so it is safe to replace in-use import paths right away. Instead, the change is being done for a few other reasons:

• gopkg.in is more reliable, offering a geographically distributed replicated deployment with automatic failover
• gopkg.in has a nice landing page with pointers to source code, API documentation, versions available, etc.
• gopkg.in is backed by git and github.com, which is one of the most requested changes over the years it has been hosted with Bazaar at Launchpad

So, from now on the import path to use when using and retrieving the package should be:

Starting with this release, the source code is also maintained and updated only on GitHub at:

The old repository and import path will remain up for the time being, to ensure existing applications continue to work, but it’s already holding out-of-date code.

In terms of changes, the r2014.07.21 release brings the following news:

Socket pool size limit may be changed

A new Session.SetPoolLimit method was added for changing the number of sockets that may be in use for the desired server before the session will block waiting for an available one. The value defaults to 4096 if unset.

Note that the driver actually had the ability to limit concurrent sockets for a long time, and the hardcoded default was already 4096 sockets per server. The reason why this wasn’t exposed via the API is because people commonly delegate to the database driver the management of concurrency for the whole application, and this is a bad design. Instead, this limit must be set to cover any expected workload of the application, and application concurrency should be controlled “at the door”, by properly restricting used resources (goroutines, etc) before they are even created.

This feature was requested by a number of people, with the most notable threads being with Travis Reader, from IronIO, and the feature is finally being added after Tyler Bunnel, from Stretchr.com, asked for the ability to increase that limit. Thanks to both, and everybody else who chimed in as well.

TCP name resolution times out after 10 seconds

Previous releases had no driver-enforced timeout.

Reported by Cailin Nelson, MongoDB.

json.Number is marshalled as a number

The encoding/json package in the standard library allows numbers to be held as a string with the json.Number type so that code can take into account their visual representation for differentiating between integers and floats (a non-standard behavior). The mgo/bson package will now recognize those values properly, and marshal them as BSON int64s or float64s based on the number representation.

New GridFile.Abort method for canceling uploads

When called, GridFile.Abort cancels an upload in progress so that closing the file being written to will drop all uploaded chunks rather than atomically making the file available.

Feature requested by Roger Peppe, Canonical.

GridFile.Close drops chunks on write errors

Previously, a write error would prevent the GridFile file from being created, but would leave already written chunks in the database. Now those chunks are dropped when the file is closed.

Support for PLAIN (LDAP) authentication

The driver can now authenticate against MongoDB servers set up using the PLAIN mechanism, which enables authentication against LDAP servers as documented.

Feature requested by Cailin Nelson, MongoDB.

Preliminary support for Bulk API

The ability to execute certain operations in bulk mode is being added to the 2.6 release of the MongoDB server. This allows queuing up inserts, updates, and removals to be sent in batches rather than one by one.

This release includes an experimental API to support that, including compatibility with previous versions of the MongoDB server for the added features. The functionality is exposed via the Collection.Bulk method, which returns a value with type *mgo.Bulk that contains the following methods at the moment:

• Bulk.Insert – enqueues one or more insert operations
• Bulk.Run – runs all operations in the queue
• Bulk.Unordered – use unordered mode, so later operations may proceed when prior ones fail

Besides preparing for the complete bulk API with more methods, this preliminary change adds support for the ContinueOnError wire protocol flag of the insert operation in a way that will remain compatible with the upcoming bulk operation commands, via the unordered mode.
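The difference between ordered and unordered execution can be pictured with a small queue runner. This is only a conceptual sketch in Python, with the hypothetical name `run_queue`; it does not use or mirror the mgo API.

```python
def run_queue(operations, unordered=False):
    """Apply queued callables and return (completed_count, errors)."""
    completed = 0
    errors = []
    for position, operation in enumerate(operations):
        try:
            operation()
            completed += 1
        except Exception as exc:
            errors.append((position, exc))
            if not unordered:
                break  # ordered mode: stop at the first failure
    return completed, errors
```

In ordered mode a failure in the second of three operations leaves the third one unexecuted; in unordered mode the third still runs and the error is reported at the end.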

The latter feature was requested by Chandra Sekar, and justifies the early release of the API.

Various compatibility improvements in user handling for 2.6

The MongoDB 2.6 release includes a number of changes in user handling functionality. The existing methods have been internally changed to preserve compatibility to the extent possible. Certain features, such as the handling of the User.UserSource field, cannot be preserved and will cause an error to be reported when used against 2.6.

Wake up nonce waiters on socket death

Problem reported and diagnosed by John Morales, MongoDB.

Don’t burn CPU if no masters are found and FailFast is set

Problem also reported by John Morales, MongoDB.

Stop Iter.Next if failure happens at get-more issuing time

Problem reported by Daniel Gottlieb, MongoDB.

Various innocuous race detector reports fixed

Running the test suite with the race detector enabled would raise various issues due to global variable modifications that are only done and only accessible to the test suite itself. These were fixed and the race detector now runs cleanly over the test suite.

Thanks to everybody that contributed to this release.

Michael Hall

## When is a fork not a fork?

Technically a fork is any instance of a codebase being copied and developed independently of its parent.  But when we use the word it usually encompasses far more than that. Usually when we talk about a fork we mean splitting the community around a project, just as much as splitting the code itself. Communities are not like code, however: they don’t always split in consistent or predictable ways. Nor are all forks the same, and both the reasons behind a fork, and the way it is done, will have an effect on whether and how the community around it will split.

There are, by my observation, three different kinds of forks that can be distinguished by their intent and method.  These can be neatly labeled as Convergent, Divergent and Emergent forks.

## Convergent Forks

Most often when we talk about forks in open source, we’re talking about convergent forks. A convergent fork is one that shares the same goals as its parent, seeks to recruit the same developers, and wants to be used by the same users. Convergent forks tend to happen when a significant portion of the parent project’s developers are dissatisfied with the management or processes around the project, but otherwise happy with the direction of its development. The ultimate goal of a convergent fork is to take the place of the parent project.

Because they aim to take the place of the parent project, convergent forks must split the community in order to be successful. The community they need already exists, both the developers and the users, around the parent project, so that is their natural source when starting their own community.

## Divergent Forks

Less common than convergent forks, but still well known by everybody in open source, are the divergent forks.  These forks are made by developers who are not happy with the direction of a project’s development, even if they are generally satisfied with its management.  The purpose of a divergent fork is to create something different from the parent, with different goals and most often different communities as well. Because they are creating a different product, they will usually be targeting a different group of users, one that was not well served by the parent project.  They will, however, quite often target many of the same developers as the parent project, because most of the technology and many of the features will remain the same, as a result of their shared code history.

Divergent forks will usually split a community, but to a much smaller extent than a convergent fork, because they do not aim to replace the parent for the entire community. Instead they often focus more on recruiting those users who were not served well, or not served at all, by the existing project, and will grow a new community largely from sources other than the parent community.

## Emergent Forks

Emergent forks are not technically forks in the code sense, but rather new projects with new code that share the same goals and target the same users as an existing project.  Most of us know these as NIH, or “Not Invented Here”, projects. They come into being on their own, instead of splitting from an existing source, but with the intention of replacing an existing project for all or part of an existing user community. Emergent forks are not the result of dissatisfaction with either the management or direction of an existing project, but most often a dissatisfaction with the technology being used, or fundamental design decisions that can’t be easily undone with the existing code.

Because they share the same goals as an existing project, these forks will usually result in a split of the user community around an existing project, unless they differ enough in features that they can target users not already being served by those projects. However, because they do not share much code or technology with the existing project, they most often grow their own community of developers, rather than splitting them from the existing project as well.

All of these kinds of forks are common enough that we in the open source community can easily name several examples of them. But they are all quite different in important ways. Some, while forks in the literal sense, can almost be considered new projects in a community sense.  Others are not forks of code at all, yet result in splitting an existing community nonetheless. Many of these forks will fail to gain traction, in fact most of them will, but some will succeed and surpass those that came before them. All of them play a role in keeping the wider open source economy flourishing, even though we may not like them when they affect a community we’ve been involved in building.

facundo

## Malena and mate

She loves it! She points at it and says "agua" (water), so that I pour in water from the thermos, and then she drinks, and drinks... sometimes she dribbles a little, which gets complicated because mate stains on clothes are a pain, but oh well...

Nicholas Skaggs

## A new test runner approaches

The problem
How acceptance tests are packaged and run has morphed over time. When autopilot was originally conceived the largest user was the unity project and debian packaging was the norm. Now that autopilot has moved well beyond that simple view to support many types of applications running across different form factors, it was time to address the issue of how to run and package these high-level tests.

While helping develop testsuites for the core apps targeting ubuntu touch, it became increasingly difficult for developers to run their application's testsuites. This gave rise to further integration points inside qtcreator, enhancements to click and its manifest files, and tools like the phablet-tools suite and click-buddy. All of these tools operate well within the confines they are intended, but none truly meets the needs for test provisioning and execution.

A solution?
With these thoughts in mind I opened the floor for discussion a couple of months ago detailing the need for a proper tool that could meet all of my needs, as well as those of the application developer, test author and CI folks. In a nutshell, a workflow to set up a device as well as properly manage dependencies and resolve them was needed.

Autopkg tests all the things
I'm happy to report that as of a couple of weeks ago such a tool now exists in autopkgtest. If the name sounds familiar, that's because it is. Autopkgtest already runs all of our automated testing at the archive level. New package uploads are tested utilizing its toolset.

So what does this mean? Utilizing the format laid out by autopkgtest, you can now run your autopilot testsuite on a phablet device in a sane manner. If you have test dependencies, they can be defined and added to the click manifest as specified. If you don't have any test dependencies, then you can run your testsuite today without any modifications to the click manifest.

Yes, but what does this really mean?
This means you can now run a testsuite with adt-run in a similar manner to how debian packages are tested. The runner will set up the device, copy the tests, resolve any dependencies, run them, and report the results back to you.

Some disclaimers
Support for running tests this way is still new. If you do find a bug, please file it!

To use the tool first install autopkgtest. If you are running trusty, the version in the archive is old. For now download the utopic deb file and install it manually. A proper backport still needs to be done.

Also as of this writing, I must caution you that you may run into this bug. If the application fails to download dependencies (you see 404 errors during setup), update your device to the latest image and try again. Note, the latest devel image might be too old if a new image hasn't been promoted in a long time.

I want to see it!
Go ahead, give it a whirl with the calendar application (or your favorite core app). Plug in a device, then run the following on your pc.

bzr branch lp:ubuntu-calendar-app

Autopkgtest will give you some output along the way about what is happening. The tests will be copied, and since --click= was specified, the runner will use the click from the device, install the click in our temporary environment, and read the click manifest file for dependencies and install those too. Finally, the tests will be executed with the results returned to you.

Please try running your autopilot testsuites this way and give feedback! Feel free to contact me, the upstream authors (thanks Martin Pitt for adding support for this!), or simply file a bug. If you run into trouble, utilize the -d and the --shell switches to get more insight into what is happening while running.

Nicholas Skaggs

## Utopic Test Writing Hackfest

We're having our first hackfest of the utopic cycle this week on Tuesday, July 15th. You can catch us live in a hangout on ubuntuonair.com starting at 1900 UTC. Everything you need to know can be found on the wiki page for the event.

During the hangout, we'll be demonstrating writing a new manual testcase, as well as reviewing writing automated testcases. We'll be answering any questions you have as well about contributing a testcase.

We need your help to write some new testcases! We're targeting both manual and automated testcases, so everyone is welcome to pitch in.

We are looking at writing and finishing some testcases for ubuntu studio and some other flavors. All you need is some basic tester knowledge and the ability to write in English.

If you know python, we are also going to be hacking on the toolkit helper for autopilot for the ubuntu sdk. That's a mouthful! Specifically it's the helpers that we use for writing autopilot tests against ubuntu-sdk applications. All app developers make use of these helpers, and we need more of them to ensure we have good coverage for all components developers use.

Don't worry about getting stuck, we'll be around to help, and there are guides to, well, guide you!

Hope to see everyone there!

Michael Hall

## Content Hub to replace Friends API

As part of the continued development of the Ubuntu platform, the Content Hub has gained the ability to share links (and soon text) as a content type, just as it has been able to share images and other file-based content in the past. This allows applications to more easily, and more consistently, share things to a user’s social media accounts.

## Consolidating APIs

Thanks to the collaborative work going on between the Content Hub and the Ubuntu Webapps developers, it is now possible for remote websites to be packaged with local user scripts that provide deep integration with our platform services. One of the first to take advantage of this is the Facebook webapp, which while displaying remote content via a web browser wrapper, is also a Content Hub importer. This means that when you go to share an image from the Gallery app, the Facebook webapp is displayed as an optional sharing target for that image. If you select it, it will use the Facebook web interface to upload that image to your timeline, without having to go through the separate Friends API.

This work not only brings the social sharing user experience in line with the rest of the system’s content sharing experience, it also provides a much simpler API for application developers to use for accomplishing the same thing. As a result, the Friends API is being deprecated in favor of the new Content Hub functionality.

## What it means for App Devs

Because this is an API change, there are things that you as an app developer need to be aware of. First, though the API is being deprecated immediately, it is not being removed from the device images until after the release of 14.10, which will continue to support the ubuntu-sdk-14.04 framework which included the Friends API. The API will not be included in the final ubuntu-sdk-14.10 framework, or any new 14.10-dev frameworks after -dev2.

After the 14.10 release in October, when device images start to build for utopic+1, the ubuntu-sdk-14.04 framework will no longer be on the images. So if you haven’t updated your Click package by then to use the ubuntu-sdk-14.10 framework, it won’t be available to install on devices with the new image. If you are not using the Friends API, this would simply be a matter of changing your package metadata to the new framework version.  For new apps, it will default to the newer version to begin with, so you shouldn’t have to do anything.

Jane Silber

## “Sometimes the best man for the job isn’t.”

The social and business value of having a diverse workforce is well documented.  Equally well documented is the relative lack of women in technology, and in open source.

At Canonical we are working hard to build a globally diverse workforce. We are well positioned to do so, particularly building on our open source roots, and in areas such as supporting geographic diversity we are quite successful.   However, in terms of gender diversity, women make up only 13% of Canonical and, slightly more encouragingly, 18% of our managers.   It is disappointing to me that despite having one of the most welcoming, collaborative, flexible and meritocratic environments I have known, we still have such a large gender disparity.

As a woman in technology and a CEO, I am aware of the power of positive examples.  While we need to learn from and eliminate the discouragement, barriers and illegal behaviour which continue to haunt women in technology, we should also celebrate the possibilities, highlight the opportunities and help illuminate a path for others to follow.  In that vein, I’d like to introduce you to a few of the amazing women in technical leadership roles in Canonical.

Alexis Bruemmer is the Engineering Manager for Canonical’s Juju team – a team of brilliant engineers working to make cloud orchestration easy, portable and flawless.  Alexis has been working in Linux since her graduation in 2005 and is passionate about open source.  Prior to Canonical, Alexis was at IBM’s Linux Technology Center.  Beyond her work as a professional, she is active in the community promoting STEM outreach as Vice Chair for Saturday Academy and long time member of Society of Women Engineers.

Ara Pulido is the Hardware Certification Manager at Canonical, leading the team that defines and ensures the quality bar for desktops and laptops pre-installed with Ubuntu. She discovered Free Software at college, where she was a founding member of the local LUG back in 2002. She joined Canonical 6 years ago in the Ubuntu Engineering QA team. You can follow her at https://twitter.com/arapulido.

Leann Ogasawara is the Engineering Manager for our Kernel Team, following a series of promotions at Canonical from Kernel QA to Kernel Engineer to overall team manager.  She has been involved in Linux and Open Source for over a decade.  Before coming to Canonical in 2007, Leann was at the Open Source Development Labs.

Pat Gaughen is the Engineering Manager for the fabulous Ubuntu Server and Openstack Development team.  She’s worked in Linux since 1999, and has been in love with Operating System internals for even longer. Prior to Canonical, Pat was at the IBM Linux Technology Center.

Roxanne Fan is the Quality Assurance Manager in our Devices Commercial Engineering team. She has been working in data mining for software quality improvement and automation tool development for the past 12 years. She wrote her Masters thesis on the performance of innovative routing for wireless sensor networks in the Ubuntu system. Before Canonical, she was at Pegatron Corp.

There are of course many reasons why women join and succeed at Canonical – great technology, inspirational colleagues, the opportunity to innovate, and to fundamentally have an impact on people’s mobile and cloud computing experiences.  Some of the less visible yet fundamental characteristics of Canonical which allow women to succeed in leadership positions include:

• A commitment to a respectful, collaborative, meritocratic environment sets the stage. One of the earliest manifestations of this commitment was encoded in the Ubuntu Code of Conduct.  This clear statement of expectations has helped make the Ubuntu community a welcoming place for women, and applies in equal measure to Canonical.
• Our recruitment philosophy of ‘hire only the best people’,  largely unrestricted by geographical boundaries, provides us with the opportunity to grow and support a diverse workforce.   It enables us to consider candidates of varying locations,  economic circumstances, gender, and physical ability.   Like all organisations we want the best person for the role, and leveraging our expertise in distributed, multi-cultural environments allows us to widen our recruiting net significantly.  Across all Canonical companies, our staff is 30% UK, 32% US, and 38% rest of world.  Those percentages are approximately the same when looking at all staff or management/leadership roles, thus providing excellent leadership opportunities in sometimes underserved markets.
• We operate in a largely distributed environment and strive to support both home-based and office-based workers in equal measure.    With 75% of our employees working remotely we have an extremely high trust environment, thereby empowering employees to integrate working life with home life.  This approach has enabled us to retain men and women who otherwise may have left due to family demands.

I find the women above inspiring and am proud to work with them and many others of the same calibre. But we still have a long road to travel for our diversity figures to be where they should be.    As with the root causes of the problem, the solution is multi-faceted and complex.  We know that there is much more we can do to attract and retain greater diversity at Canonical, and are redoubling our efforts to do so.  As a first step, come join us!

Colin Ian King

## a final few more features in stress-ng

While hoping to get a feature complete stress-ng sooner rather than later, I found a few more ways to fiendishly stress a system.

Stress-ng 0.01.22 will be landing soon in Ubuntu 14.10 with three more stress mechanisms:

• CPU affinity stressing; this rapidly changes CPU affinity of the stress processes just to keep the scheduling busy wasting effort.
• Timer stressing using the real-time clock; this allows one to generate a large amount of timer interrupts, so it is a useful interrupt saturation test.
• Directory entry thrashing; this creates and deletes a selectable number of zero length files and hence populates and destroys directory entries.
I have also removed the need to use rand() for random number generation for some of the stress tests and re-used the faster MWC "random" number generator to add in some well known and very simple math operations for CPU stressing.
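
For readers unfamiliar with MWC (multiply-with-carry) generators, here is a minimal Python sketch of Marsaglia's classic two-word variant. The constants and default seeds are the commonly published ones and are an assumption here; the exact generator inside stress-ng may differ.

```python
class MWC:
    """Two-word multiply-with-carry generator producing 32-bit values."""

    def __init__(self, z=362436069, w=521288629):
        # Default seeds are the commonly quoted ones (assumed, not from stress-ng)
        self.z = z
        self.w = w

    def next(self):
        # Each word keeps its value in the low 16 bits and its carry in the high 16
        self.z = (36969 * (self.z & 0xFFFF) + (self.z >> 16)) & 0xFFFFFFFF
        self.w = (18000 * (self.w & 0xFFFF) + (self.w >> 16)) & 0xFFFFFFFF
        return ((self.z << 16) + self.w) & 0xFFFFFFFF
```

Each call costs two multiplications and a few shifts and adds, which is why this kind of generator is attractive inside a tight stress loop.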

Stress-ng now has 15 different simple stress mechanisms that exercise CPU, cache, memory, file system, I/O and CPU schedulers.  I could add more tests, but I think this is a large enough set to allow one to thrash a machine and see how well it performs under pressure.

facundo

## Movies, and more movies

A few free Saturday afternoons, plus a couple of trips, meant I didn't fall behind on movies...

• A good old fashioned orgy: +0. One of those light comedies about friendship and love. Fun; nothing special, but fine.
• Apollo 18: -0. The general idea is interesting, but the attempt to simulate a film out of pieces of "real" footage makes everything feel very forced, especially when the "not real" shows.
• Catch .44: -0. Messy, boring, without a story that is worth it.
• Jack Ryan: Shadow recruit: +0. An action movie, well made, but no more than that.
• Le nom des gens: +0. A charming comedy that raises interesting points about the French... and about love.
• Like crazy: -0. A love story that shows the difficulties of distance. Although it has its moments, overall the movie is very slow and fails to excite.
• Prometheus: +0. It's one more "alien" movie, but well made; I liked it quite a bit, although in the end it's still just that... one more "alien" movie :)
• Sherlock Holmes: A game of shadows: +0. It's losing a bit of its charm (and it's only the second one!), but the good acting and an interesting story save it.
• The darkest hour: -0. A couple of interesting concepts... but it's still a Yankee teen movie.
• The girl with the dragon tattoo: +1. A great movie, very dense in content (it hints that the book is much richer), and it keeps you hooked until the end. Mind you, it's very harsh (that's a warning, it doesn't count against it).
• The grey: +0. Strong, harsh, but fairly conceptual, with ideas I liked. It would be better if it weren't so full of pointless cheap shots (like noises meant to startle you when the only thing happening is an unrelated change of shot).
• The monuments men: +0. A point of view on war as a destroyer of cultures that I hadn't thought about before. Well put together, with decent acting.
• The rum diary: -0. It has an interesting background, but the movie overall is boring and not worth it.
• Underworld: Awakening: -0. Not even more of the same; the earlier "underworld" movies had interesting concepts or stories... this one is a bad mix of resident evil and blade; no more underworld for me, thanks.

Few new ones, though, even with an old movie slipped into the list...

Finally, the count of pending movies by date:

(Jan-2009)   12   1   1
(May-2009)   10   5
(Oct-2009)   15  14
(Mar-2010)   18  18  16   4
(Sep-2010)   18  18  18  18   9   2   1
(Dec-2010)   13  13  12  12  12   5   1
(Apr-2011)   23  23  23  23  23  22  17   4
(Aug-2011)   12  12  11  11  11  11  11  11   4
(Jan-2012)       21  21  18  17  17  17  17  11   3
(Jul-2012)           15  15  15  15  15  15  14  11
(Nov-2012)               12  12  11  11  11  11  11
(Feb-2013)                   19  19  16  15  14  14
(Jun-2013)                       19  18  16  15  15
(Sep-2013)                           18  18  18  18
(Dec-2013)                               14  14  12
(Apr-2014)                                    9   9
(Jul-2014)                                       10
Total:      127 125 117 113 118 121 125 121 110 103

Dustin Kirkland

## Scalable, Parallel Video Transcoding on Ubuntu

Transcoding video is a very resource intensive process.

It can take many minutes to process a small, 30-second clip, or even hours to process a full movie.  There are numerous, excellent, open source video transcoding and processing tools freely available in Ubuntu, including libav-tools, ffmpeg, mencoder, and handbrake.  Surprisingly, however, none of those support parallel computing easily or out of the box.  And disappointingly, I couldn't find any MPI support readily available either.

I happened to have an Orange Box for a few days recently, so I decided to tackle the problem and develop a scalable, parallel video transcoding solution myself.  I'm delighted to share the result with you today!

When it comes to commercial video production, it can take thousands of machines and hundreds of compute hours to render a full movie.  I had the distinct privilege some time ago to visit WETA Digital in Wellington, New Zealand and tour the render farm that processed The Lord of the Rings trilogy, Avatar, and The Hobbit, etc.  And just a few weeks ago, I visited another quite visionary, cloud savvy digital film processing firm in Hollywood, called Digital Film Tree.

While Windows and Mac OS may be the first platforms that come to mind when you think about front end video production, Linux is far more widely used for batch video processing, with Ubuntu in particular being used extensively at both WETA Digital and Digital Film Tree, among others.

While I could have worked with any of a number of tools, I settled on avconv (the successor(?) of ffmpeg), as it was the first one that I got working well on my laptop, before scaling it out to the cluster.

I designed an approach on my whiteboard, in fact quite similar to some work I did parallelizing and scaling the john-the-ripper password quality checker.

At a high level, the algorithm looks like this:
1. Create a shared network filesystem, simultaneously readable and writable by all nodes
2. Have the master node split the work into even sized chunks for each worker
3. Have each worker process their segment of the video, and raise a flag when done
4. Have the master node wait for each of the all-done flags, and then concatenate the result
And that's exactly what I implemented in a new transcode charm and transcode-cluster bundle.  It provides linear scalability and performance improvements as you add additional units to the cluster.  A transcode job that takes 24 minutes on a single node is down to 3 minutes on 8 worker nodes in the Orange Box, using Juju and MAAS against physical hardware nodes.

For the curious, the real magic is in the config-changed hook, which has decent inline documentation.

The trick, for anyone who might make their way into this by way of various StackExchange questions and (incorrect) answers, is in the command that splits up the original video (around line 54):

avconv -ss $start_time -i $filename -t $length -s $size -vcodec libx264 -acodec aac -bsf:v h264_mp4toannexb -f mpegts -strict experimental -y ${filename}.part${current_node}.ts

And the one that puts it back together (around line 72):

avconv -i concat:"$concat" -c copy -bsf:a aac_adtstoasc -y ${filename}_${size}_x264_aac.${format}
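
To make the bookkeeping behind those two commands concrete, here is a small Python sketch of how the per-node `start_time` and `length` values, and the pipe-separated list passed as `$concat`, might be computed. It is an illustration only: the function names, whole-second granularity, and zero-based node numbering are assumptions, and the actual hook in the charm may do this differently.

```python
def plan_segments(duration_seconds, workers):
    """Return one (start_time, length) pair per worker, covering the whole video."""
    if workers < 1:
        raise ValueError("need at least one worker")
    base, extra = divmod(duration_seconds, workers)
    segments = []
    start = 0
    for node in range(workers):
        # The first `extra` workers take one additional second each
        length = base + (1 if node < extra else 0)
        segments.append((start, length))
        start += length
    return segments


def concat_list(filename, workers):
    """Build the 'a|b|c' part list that follows 'concat:' on the command line."""
    parts = ["%s.part%d.ts" % (filename, node) for node in range(workers)]
    return "|".join(parts)
```

For a hypothetical 100-second clip on 8 workers, the first four nodes would get 13 seconds each and the last four 12 seconds each, and the parts would be stitched back together in node order.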

I found this post and this documentation particularly helpful in understanding and solving the problem.

In any case, once deployed, my cluster bundle looks like this.  8 units of transcoders, all connected to a shared filesystem, and performance monitoring too.

I was able to leverage the shared-fs relation provided by the nfs charm, as well as the ganglia charm to monitor the utilization of the cluster.  You can see the spikes in the cpu, disk, and network in the graphs below, during the course of a transcode job.

For my testing, I downloaded the movie Code Rush, freely available under the CC-BY-NC-SA 3.0 license.  If you haven't seen it, it's an excellent documentary about the open source software around Netscape/Mozilla/Firefox and the dotcom bubble of the late 1990s.

Oddly enough, the stock, 746MB high quality MP4 video doesn't play in Firefox, since it's an mpeg4 stream, rather than H264.  Fail.  (Yes, of course I could have used mplayer, vlc, etc., that's not the point ;-)

Perhaps one of the most useful, intriguing features of HTML5 is its support for embedding multimedia, video, and sound into webpages.  HTML5 even supports multiple video formats.  Sounds nice, right?  If it only were that simple...  As it turns out, different browsers have, and lack, support for the different formats.  While there is no one format to rule them all, MP4 is supported by the majority of browsers, including the two that I use (Chromium and Firefox).  This matrix from w3schools.com illustrates the mess.

 http://www.w3schools.com/html/html5_video.asp

The file format, however, is only half of the story.  The audio and video contents within the file also have to be encoded and compressed with very specific codecs, in order to work properly within the browsers.  For MP4, the video has to be encoded with H264, and the audio with AAC.

Among the various brands of phones, webcams, digital cameras, etc., the output format and codecs are seriously all over the map.  If you've ever wondered what's happening, when you upload a video to YouTube or Facebook, and it's a while before it's ready to be viewed, it's being transcoded and scaled in the background.

In any case, I find it quite useful to transcode my videos to MP4/H264/AAC format.  And for that, a scalable, parallel computing approach to video processing would be quite helpful.

During the course of the 3 minute run, I liked watching the avconv log files of all of the nodes, using Byobu and Tmux in a tiled split screen format, like this:

Also, the transcode charm installs an Apache2 webserver on each node, so you can expose the service and point a browser to any of the nodes, where you can find the input, output, and intermediary data files, as well as the logs and DONE flags.

Once the job completes, I can simply click on the output file, Code_Rush.mp4_1280x720_x264_aac.mp4, and see that it's now perfectly viewable in the browser!

In case you're curious, I have verified the same charm with a couple of other OGG, AVI, MPEG, and MOV input files, too.

Beyond transcoding the format and codecs, I have also added configuration support within the charm itself to scale the video frame size, too.  This is useful to take a larger video, and scale it down to a more appropriate size, perhaps for a phone or tablet.  Again, this resource intensive procedure perfectly benefits from additional compute units.

File format, audio/video codec, and frame size changes are hardly the extent of video transcoding workloads.  There are hundreds of options and thousands of combinations, as the manpages of avconv and mencoder attest.  All of my scripts and configurations are free software, open source.  Your contributions and extensions are certainly welcome!

In the meantime, I hope you'll take a look at this charm and consider using it, if you have the need to scale up your own video transcoding ;-)

Cheers,
Dustin

facundo

## Where was God?

There is a short segment from the second season of the brilliant program "Filosofía aquí y ahora", by José Pablo Feinmann, called "¿Dónde estaba Dios?" (Where was God?). You can watch it here, but I'll transcribe the text for you, since it's short...

Within the subject of Auschwitz lies the subject of God.

It is a very, very complex subject, and more than one theologian gets into a very bad mood when God is brought up in connection with Auschwitz, but more than one philosopher has asked: where was God in Auschwitz?

Even Primo Levi, the great Jewish writer, who wrote "If This Is a Man" and "The Drowned and the Saved"... says "Auschwitz exists, God does not exist".

And Karl Löwith, also a great Jewish thinker, a disciple of Heidegger, says "After Auschwitz it is impossible to conceive of a wholly good God".

All of this is very forceful. There is even a tango, a brilliant one, that says "where was God when you left?". In other words, our tango tradition is so exceptional that it can be compared with the great philosophers of the 20th century.

It's the same thing: where was God when you left?

And to close, we have Chino Laborde, a guest on Demoliendo Tangos, performing, fittingly, "Canción desesperada", by Enrique Santos Discépolo.