Canonical Voices

Posts tagged with 'canonical'

Michael Hall

As most of you know by now, Ubuntu 16.04 will be dropping the old Ubuntu Software Center in favor of the newer Gnome Software as the graphical front-end to both the Ubuntu archives and third-party application stores.

Gnome Software

Gnome Software provides a lot of the same enhancements over simple package managers that USC did, and it does this using a new metadata format standard called AppStream. While much of the needed AppStream data can be extracted from the existing packages in the archives, sometimes that’s not sufficient, and that’s when we need people to help fill the gaps.

It turns out that the bulk of the missing or incorrect data is caused by the application icons being used by app packages. While most apps already have an icon, it was never strictly enforced before, and the size and format allowed by the desktop specs were more lenient than what’s needed now.  These lower-resolution icons might have been fine for a menu item, but they don’t work very well for a nice, beautiful app store interface like Gnome Software. And that’s where you can help!

Don’t worry, contributing icons isn’t hard, and it doesn’t require any knowledge of programming or packaging. Best of all, you’ll not only be helping Ubuntu, but you’ll also be contributing to any other distro that uses the AppStream standard. In the steps below I will walk you through the process of finding an app in need, getting the correct icon for it, and contributing it to the upstream project and Ubuntu.

1) Pick an App

Because the AppStream data is being automatically extracted from the contents of existing packages, we are able to tell which apps are in need of new icons, and we’ve generated a list of them, sorted by popularity (based on PopCon stats) so you can prioritize your contributions to where they will help the most users. To start working on one, first click the “Create” link to file a new bug report against the package in Ubuntu. Then replace that link in the wiki with a link to your new bug, and put your name in the “Claimed” column so that others know you’ve already started work on it.

Note that a package can contain multiple .desktop files, each of which has its own icon, and your bug report will be specific to just that one metadata file. You will also need to be a member of the ~ubuntu-etherpad team (or a sub-team like ~ubuntumembers) in order to edit the wiki; you will be asked to verify that membership as part of the login process with Ubuntu SSO.

2) Verify that an AppStream icon is needed

While the extraction process is capable of identifying which packages have a missing or unsupported image, it’s not always smart enough to know which packages should have AppStream data at all. So before you start working on icons, it’s best to first make sure that the metadata file you picked should actually be part of the AppStream index.

Because AppStream was designed to be application-centric, the metadata extraction process only looks at those with Type=Application in their .desktop file. It will also ignore any .desktop files with NoDisplay=True in them. If you find a file in the list that shouldn’t be indexed by AppStream, chances are one or both of these values are set incorrectly. In that case you should change your bug description to state that, rather than attaching an icon to it.
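
If you have the package installed locally, a quick way to check both values is to grep its .desktop file (the file name below is a hypothetical example):

# example-app.desktop is a placeholder; use the .desktop file named in the wiki list
$ grep -E '^(Type|NoDisplay)=' /usr/share/applications/example-app.desktop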

3) Contact Upstream

Since there is nothing Ubuntu-specific about AppStream data or icons, you really should be sending your contribution upstream to the originating project. Not only is this best for Ubuntu (carrying patches wastes resources), it’s just the right thing to do in the open source community. So after you’ve chosen an app to work on and verified that it does in fact need a new icon for AppStream, the very next thing you should do is start talking to the upstream project’s developers.

Start by letting them know that you want to contribute to their project so that it integrates better with AppStream-enabled stores (you can reference these Guidelines if they’re not familiar with it), and open a similar bug report in their bug tracker if one doesn’t exist already. Finally, be sure to include a link to that upstream bug report in the Ubuntu bug you opened previously, so that the Ubuntu developers know the work is also going upstream (your contribution might be rejected otherwise).

4) Find or Create an Icon

Chances are the upstream developers already have an icon that meets the AppStream requirements, so ask them about it before trying to find one on your own. If not, look for existing artwork assets that can be used as a logo, and remember that it needs to be at least 64×64 pixels (this is where SVGs are ideal, as they can be exported at any size). Whatever you use, make sure that it matches the application’s current branding; we’re not out to create a new logo for them, after all. If you do create a new image file, you will need to make it available under the CC-BY-SA license.

While AppStream only requires a 64×64 pixel image, many desktops (including Unity) will benefit from having even higher resolution icons, and it’s always easier to scale them down than up. So if you have the option, try to provide a 256×256 icon image (or again, just an SVG).
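
If the project only provides an SVG, here is a rough sketch of exporting the PNG sizes mentioned above, assuming librsvg's rsvg-convert or Inkscape is installed (icon.svg and the output names are placeholders):

$ rsvg-convert -w 64 -h 64 icon.svg -o icon-64.png
$ rsvg-convert -w 256 -h 256 icon.svg -o icon-256.png
# or, with Inkscape's (pre-1.0) command-line options:
$ inkscape icon.svg --export-png=icon-256.png --export-width=256 --export-height=256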

5) Submit your icon

Now that you’ve found (or created) an appropriate icon, it’s time to get it into both the upstream project and Ubuntu. Because each upstream project will differ in how they want you to do that, you will need to ask them for guidance (and possibly assistance). Just make sure that you update the upstream bug report with your work, so that the Ubuntu developers can see that it’s been done.

Ubuntu 16.04 has already synced with Debian, so it’s too late for these changes in the upstream project to make their way into this release. In order to get them into 16.04, the Ubuntu packages will have to carry a patch until the changes that land upstream have time to make their way into the Ubuntu archives. That’s why it’s so important to get your contribution accepted into the upstream project first: the Ubuntu developers want to know that the patches to their packages will eventually be replaced by the same change coming from upstream.

To submit your image to Ubuntu, all you need to do is attach the image file to the bug report you created way back in step #1.

Then, subscribe the “ubuntu-sponsors” team to the bug; these are the Ubuntu developers who will review and apply your icon to the target package and get it into the Ubuntu archives.

6) Talk about it!

Congratulations, you’ve just made a contribution that is likely to affect millions of people and benefit the entire open source community! That’s something to celebrate, so take to Twitter, Google+, Facebook or your own blog and talk about it. Not only is it good to see people doing these kinds of contributions, it’s also highly motivating to others who might not otherwise get involved. So share your experience, help others who want to do the same, and if you enjoyed it feel free to grab another app from the list and do it again.

Read more
Dustin Kirkland


We at Canonical have conducted a legal review, including discussion with the industry's leading software freedom legal counsel, of the licenses that apply to the Linux kernel and to ZFS.

And in doing so, we have concluded that we are acting within the rights granted by, and in compliance with the terms of, both of those licenses.  Others have independently reached the same conclusion.  Differing opinions exist, but please bear in mind that these are opinions.

While the CDDL and GPLv2 are both "copyleft" licenses, they have different scope.  The CDDL applies to all files under the CDDL, while the GPLv2 applies to derivative works.

The CDDL cannot apply to the Linux kernel because zfs.ko is a self-contained file system module -- the kernel itself is quite obviously not a derivative work of this new file system.

And zfs.ko, as a self-contained file system module, is clearly not a derivative work of the Linux kernel but rather quite obviously a derivative work of OpenZFS and OpenSolaris.  Equivalent exceptions have existed for many years, for various other stand alone, self-contained, non-GPL kernel modules.

Our conclusion is good for Ubuntu users, good for Linux, and good for all of free and open source software.

As we have already reached the conclusion, we are not interested in debating license compatibility, but of course welcome the opportunity to discuss the technology.

Cheers,
Dustin

EDIT: This post was updated to link to the supportive position paper from Eben Moglen of the SFLC, an amicus brief from James Bottomley, as well as the contrarian position from Bradley Kuhn and the SFC.

Read more
Dustin Kirkland



I had the opportunity to speak at Container World 2016 in Santa Clara yesterday.  Thanks in part to the Netflix guys who preceded me, the room was absolutely packed!

You can download a PDF of my slides here, or flip through them embedded below.

I'd really encourage you to try the demo instructions of LXD toward the end!


:-Dustin

Read more
Dustin Kirkland


Ubuntu 16.04 LTS (Xenial) is only a few short weeks away, and with it comes one of the most exciting new features Linux has seen in a very long time...

ZFS -- baked directly into Ubuntu -- supported by Canonical.

What is ZFS?

ZFS is a combination of a volume manager (like LVM) and a filesystem (like ext4, xfs, or btrfs).

ZFS is one of the most beloved features of Solaris, universally coveted by every Linux sysadmin with a Solaris background.  We're delighted to make OpenZFS available on every Ubuntu system.  Ubuntu's reference guide for ZFS can be found here, and these are a few of the killer features:
  • snapshots
  • copy-on-write cloning
  • continuous integrity checking against data corruption
  • automatic repair
  • efficient data compression.
These features truly make ZFS the perfect filesystem for containers.
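
As a rough illustration of the snapshot and clone features (dataset names are placeholders, and this assumes a pool named "lxd" like the one created in the walkthrough below):

$ sudo zfs create lxd/demo                      # create a dataset
$ sudo zfs snapshot lxd/demo@clean              # near-instant, point-in-time snapshot
$ sudo zfs clone lxd/demo@clean lxd/demo-copy   # writable copy-on-write clone
$ sudo zfs list -t all                          # show datasets, snapshots and clones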

What does "support" mean?

  • You'll find zfs.ko automatically built and installed on your Ubuntu systems.  No more DKMS-built modules!
$ locate zfs.ko
/lib/modules/4.4.0-4-generic/kernel/zfs/zfs/zfs.ko
  • You'll see the module loaded automatically if you use it.

$ lsmod | grep zfs
zfs 2801664 11
zunicode 331776 1 zfs
zcommon 57344 1 zfs
znvpair 90112 2 zfs,zcommon
spl 102400 3 zfs,zcommon,znvpair
zavl 16384 1 zfs

  • The user space zfsutils-linux package will be included in Ubuntu Main, with security updates provided by Canonical (as soon as this MIR is completed).
  • As always, industry leading, enterprise class technical support is available from Canonical with Ubuntu Advantage services.

How do I get started?

It's really quite simple!  Here are a few commands to get you up and running with ZFS and LXD in 60 seconds or less.

First, make sure you're running Ubuntu 16.04 (Xenial).

$ head -n1 /etc/issue
Ubuntu Xenial Xerus (development branch) \n \l

Now, let's install lxd and zfsutils-linux, if you haven't already:

$ sudo apt install lxd zfsutils-linux

Next, let's use the interactive lxd init command to set up LXD and ZFS.  In the example below, I'm simply using a sparse, loopback file for the ZFS pool.  For best results (and what I use on my laptop and production servers), use a raw SSD partition or device.

$ sudo lxd init
Name of the storage backend to use (dir or zfs): zfs
Create a new ZFS pool (yes/no)? yes
Name of the new ZFS pool: lxd
Would you like to use an existing block device (yes/no)? no
Size in GB of the new loop device (1GB minimum): 2
Would you like LXD to be available over the network (yes/no)? no
LXD has been successfully configured.

We can check our ZFS pool now:

$ sudo zpool list
NAME SIZE ALLOC FREE EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
lxd 1.98G 450K 1.98G - 0% 0% 1.00x ONLINE -

$ sudo zpool status
pool: lxd
state: ONLINE
scan: none requested
config:

NAME STATE READ WRITE CKSUM
lxd ONLINE 0 0 0
/var/lib/lxd/zfs.img ONLINE 0 0 0
errors: No known data errors

$ lxc config get storage.zfs_pool_name
storage.zfs_pool_name: lxd

Finally, let's import the Ubuntu LXD image, and launch a few containers.  Note how fast containers launch, which is enabled by the ZFS cloning and copy-on-write features:

$ newgrp lxd
$ lxd-images import ubuntu --alias ubuntu
Downloading the GPG key for http://cloud-images.ubuntu.com
Progress: 48 %
Validating the GPG signature of /tmp/tmpa71cw5wl/download.json.asc
Downloading the image.
Image manifest: http://cloud-images.ubuntu.com/server/releases/trusty/release-20160201/ubuntu-14.04-server-cloudimg-amd64.manifest
Image imported as: 54c8caac1f61901ed86c68f24af5f5d3672bdc62c71d04f06df3a59e95684473
Setup alias: ubuntu

$ for i in $(seq 1 5); do lxc launch ubuntu; done
...
$ lxc list
+-------------------------+---------+-------------------+------+-----------+-----------+
| NAME | STATE | IPV4 | IPV6 | EPHEMERAL | SNAPSHOTS |
+-------------------------+---------+-------------------+------+-----------+-----------+
| discordant-loria | RUNNING | 10.0.3.130 (eth0) | | NO | 0 |
+-------------------------+---------+-------------------+------+-----------+-----------+
| fictive-noble | RUNNING | 10.0.3.91 (eth0) | | NO | 0 |
+-------------------------+---------+-------------------+------+-----------+-----------+
| interprotoplasmic-essie | RUNNING | 10.0.3.242 (eth0) | | NO | 0 |
+-------------------------+---------+-------------------+------+-----------+-----------+
| nondamaging-cain | RUNNING | 10.0.3.9 (eth0) | | NO | 0 |
+-------------------------+---------+-------------------+------+-----------+-----------+
| untreasurable-efrain | RUNNING | 10.0.3.89 (eth0) | | NO | 0 |
+-------------------------+---------+-------------------+------+-----------+-----------+

Super easy, right?

Cheers,
:-Dustin

Read more
Dustin Kirkland


There's no shortage of excitement, controversy, and readership, any time you can work "Docker" into a headline these days.  Perhaps a bit like "Donald Trump", but for CIO tech blogs and IT news -- a real hot button.  Hey, look, I even did it myself in the title of this post!

Sometimes an article even starts out about CoreOS, but gets diverted into a discussion about Docker, like this one, where shykes (Docker's founder and CTO) announced that Docker's default image would be moving away from Ubuntu to Alpine Linux.


I have personally been Canonical's business and technical point of contact with Docker Inc since September of 2013, when I co-presented at an OpenStack Meetup in Austin, Texas, with Ben Golub and Nick Stinemates of Docker.  I can tell you that, along with most of the rest of the Docker community, this casual declaration in an unrelated Hacker News thread came as a surprise to nearly all of us!

Docker's default container image is certainly Docker's decision to make.  But it would be prudent to examine a few facts:

(1) Check DockerHub and you may notice that while Busybox (Alpine Linux) has surpassed Ubuntu in the number of downloads (66M to 40M), Ubuntu is still by far the most "popular" by number of "stars" -- likes, favorites, +1's, whatever (3.2K to 499).

(2) Ubuntu's compressed, minimal root tarball is 59 MB, which is what is downloaded over the Internet.  That's different from the 188 MB uncompressed root filesystem, which has been quoted a number of times in the press.

(3) The real magic of Docker is such that you only ever download that base image, one time!  And you only store one copy of the uncompressed root filesystem on your disk! Just once, sudo docker pull ubuntu, on your laptop at home or work, and then launch thousands of images at a coffee shop or airport lounge with its spotty wifi.  Build derivative images, FROM ubuntu, etc. and you only ever store the incremental differences.
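
A rough way to see that layer reuse for yourself (the one-line Dockerfile and image tag below are just illustrative examples):

$ sudo docker pull ubuntu            # downloads and caches the base layers once
$ sudo docker pull ubuntu            # pulling again is a near no-op; layers are already cached
$ printf 'FROM ubuntu\nRUN apt-get update && apt-get install -y curl\n' > Dockerfile
$ sudo docker build -t my-derived-image .    # stores only the new layer's delta
$ sudo docker history my-derived-image       # shows the shared base layers plus your delta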

Actually, I encourage you to test that out yourself...  I just launched a t2.micro -- Amazon's cheapest instance type with the lowest networking bandwidth.  It took 15.938s to sudo apt install docker.io.  And it took 9.230s to sudo docker pull ubuntu.  It takes less time to download Ubuntu than to install Docker!

ubuntu@ip-172-30-0-129:~⟫ time sudo apt install docker.io -y
...
real 0m15.938s
user 0m2.146s
sys 0m0.913s

As compared to...

ubuntu@ip-172-30-0-129:~⟫ time sudo docker pull ubuntu
latest: Pulling from ubuntu
f15ce52fc004: Pull complete
c4fae638e7ce: Pull complete
a4c5be5b6e59: Pull complete
8693db7e8a00: Pull complete
ubuntu:latest: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security.
Digest: sha256:457b05828bdb5dcc044d93d042863fba3f2158ae249a6db5ae3934307c757c54
Status: Downloaded newer image for ubuntu:latest
real 0m9.230s
user 0m0.021s
sys 0m0.016s

Now, sure, it takes even less time than that to download Alpine Linux (0.747s by my test), but again you only ever do that once!  After you have your initial image, launching Docker containers takes the exact same amount of time (0.233s), with identical storage differences.  See:

ubuntu@ip-172-30-0-129:/tmp/docker⟫ time sudo docker run alpine /bin/true
real 0m0.233s
user 0m0.014s
sys 0m0.001s
ubuntu@ip-172-30-0-129:/tmp/docker⟫ time sudo docker run ubuntu /bin/true
real 0m0.234s
user 0m0.012s
sys 0m0.002s

(4) I regularly communicate sincere, warm congratulations to our friends at Docker Inc on its continued growth.  shykes publicly mentioned the hiring of the maintainer of Alpine Linux in that Hacker News post.  As a long-time Linux distro developer myself, I have tons of respect for everyone involved in building a high-quality Linux distribution.  In fact, Canonical employs over 700 people, in 44 countries, working around the clock, all calendar year, to make Ubuntu the world's most popular Linux OS.  Importantly, that includes a dedicated security team that has an outstanding track record over the last 12 years, keeping Ubuntu servers, clouds, desktops, laptops, tablets, and phones up-to-date and protected against the latest security vulnerabilities.  I don't personally know Natanael, but I'm intimately aware of what a spectacular amount of work it is to maintain and secure an OS distribution as it makes its way into enterprise and production deployments.  Good luck!

(5) There are currently 5,854 packages available via apk in Alpine Linux (sudo docker run alpine apk search -v).  There are 8,862 packages in Ubuntu Main (officially supported by Canonical), and 53,150 binary packages across all of Ubuntu Main, Universe, Restricted, and Multiverse, supported by the greater Ubuntu community.  Nearly all 50,000+ packages are updated every 6 months, on time, every time, and we release an LTS version of Ubuntu and the best of open source software in the world every 2 years.  Like clockwork.  Choice.  Velocity.  Stability.  That's what Ubuntu brings.
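
If you would like to reproduce those rough counts yourself (exact numbers will vary by release, architecture, and which repositories are enabled):

$ sudo docker run alpine apk search -v | wc -l   # packages visible to apk in the Alpine image
$ apt-cache search . | wc -l                     # rough count of binary packages visible to this Ubuntu system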

Docker holds a special place in the Ubuntu ecosystem, and Ubuntu has been instrumental in Docker's growth over the last 3 years.  Where we go from here, is largely up to the cross-section of our two vibrant communities.

And so I ask you honestly...what do you want to see?  How would you like to see Docker and Ubuntu operate together?

I'm Canonical's Product Manager for Ubuntu Server, I'm responsible for Canonical's relationship with Docker Inc, and I will read absolutely every comment posted below.

Cheers,
:-Dustin

p.s. I'm speaking at Container Summit in New York City today, and wrote this post from the top of the (inspiring!) One World Observatory at the World Trade Center this morning.  Please come up and talk to me, if you want to share your thoughts (at Container Summit, not the One World Observatory)!


Read more
Dustin Kirkland

People of earth, waving at Saturn, courtesy of NASA.
“It Doesn't Look Like Ubuntu Reached Its Goal Of 200 Million Users This Year”, says Michael Larabel of Phoronix, in a post that it seems he's been itching to post for months.

Why the negativity?!? Are you sure? Did you count all of them?

No one has.

How many people in the world use Ubuntu?

Actually, no one can count all of the Ubuntu users in the world!

Canonical, unlike Apple, Microsoft, Red Hat, or Google, does not require each user to register their installation of Ubuntu.

Of course, you can buy laptops preloaded with Ubuntu from Dell, HP, Lenovo, and Asus.  And there are millions of them out there.  And you can buy servers powered by Ubuntu from IBM, Dell, HP, Cisco, Lenovo, Quanta, and compatible with the OpenCompute Project.

In 2011, hardware sales might have been how Mark Shuttleworth hoped to reach 200M Ubuntu users by 2015.

But in reality, hundreds of millions of PCs, servers, devices, virtual machines, and containers have booted Ubuntu to date!

Let's look at some facts...
  • Docker users have launched Ubuntu images over 35.5 million times.
  • HashiCorp's Vagrant images of Ubuntu 14.04 LTS 64-bit have been downloaded 10 million times.
  • At least 20 million unique instances of Ubuntu launched in public clouds, private clouds, and on bare metal in 2015 alone.
    • That's Ubuntu in clouds like AWS, Microsoft Azure, Google Compute Engine, Rackspace, Oracle Cloud, VMware, and others.
    • And that's Ubuntu in private clouds like OpenStack.
    • And Ubuntu at scale on bare metal with MAAS, often managed with Chef.
  • In fact, over 2 million new Ubuntu cloud instances launched in November 2015.
    • That's 67,000 new Ubuntu cloud instances launched per day.
    • That's 2,800 new Ubuntu cloud instances launched every hour.
    • That's 46 new Ubuntu cloud instances launched every minute.
    • That's nearly one new Ubuntu cloud instance launched every single second of every single day in November 2015.
  • And then there are Ubuntu phones from Meizu.
  • And more Ubuntu phones from BQ.
  • Of course, anyone can install Ubuntu on their Google Nexus tablet or phone.
  • Or buy a converged tablet/desktop preinstalled with Ubuntu from BQ.
  • Oh, and the Tesla entertainment system?  All electric Ubuntu.
  • Google's self-driving cars?  They're self-driven by Ubuntu.
  • George Hotz's home-made self-driving car?  It's a homebrewed Ubuntu autopilot.
  • Snappy Ubuntu downloads and updates for Raspberry Pi's and Beagle Bone Blacks -- the response has been tremendous.  Download numbers are astounding.
  • Drones, robots, network switches, smart devices, the Internet of Things.  More Snappy Ubuntu.
  • How about Walmart?  Everyday low prices.  Everyday Ubuntu.  Lots and lots of Ubuntu.
  • Are you orchestrating containers with Kubernetes or Apache Mesos?  There's plenty of Ubuntu in there.
  • Kicking PaaS with Cloud Foundry?  App instances are Ubuntu LXC containers.  Pivotal has lots of serious users.
  • And Heroku?  You bet your PaaS those hosted application containers are Ubuntu.  Plenty of serious users here too.
  • Tianhe-2, the world's largest supercomputer.  Merely 80,000 Xeons, 1.4 TB of memory, 12.4 PB of disk, all number crunching on Ubuntu.
  • Ever watch a movie on Netflix?  You were served by Ubuntu.
  • Ever hitch a ride with Uber or Lyft?  Your mobile app is talking to Ubuntu servers on the backend.
  • Did you enjoy watching The Hobbit?  Hunger Games?  Avengers?  Avatar?  All rendered on Ubuntu at WETA Digital.  Among many others.
  • Do you use Instagram?  Say cheese!
  • Listen to Spotify?  Music to my ears...
  • Doing a deal on Wall Street?  Ubuntu is serious business for Bloomberg.
  • Paypal, Dropbox, Snapchat, Pinterest, Reddit. Airbnb.  Yep.  More Ubuntu.
  • Wikipedia and Wikimedia, among the busiest sites on the Internet with 8 - 18 billion page views per month, are hosted on Ubuntu.
How many "users" of Ubuntu are there ultimately?  I bet there are over a billion people today, using Ubuntu -- both directly and indirectly.  Without a doubt, there are over a billion people on the planet benefiting from the services, security, and availability of Ubuntu today.
  • More people use Ubuntu than we know.
  • More people use Ubuntu than you know.
  • More people use Ubuntu than they know.
More people use Ubuntu than anyone actually knows.

Because of who we all are.

:-Dustin

Read more
Dustin Kirkland


As always, I enjoyed speaking at the SCALE14x event, especially at the new location in Pasadena, California!

What if you could adapt a package from a newer version of Ubuntu, onto your stable LTS desktop/server?

Or, as a developer, what if you could provide your latest releases to your users running an older LTS version of Ubuntu?

Introducing adapt!

adapt is a lot like apt...  It’s a simple command that installs packages.

But it “adapts” a requested version to run on your current system.

It's a simple command that installs any package from any release of Ubuntu into any version of Ubuntu.

How does adapt work?

Simple… Containers!

More specifically, LXD system containers.

Why containers?

Containers can run anywhere, physical, virtual, desktops, servers, and any CPU architecture.

And containers are light and fast!  Zero latency and no virtualization overhead.

Most importantly, system containers are perfect copies of the released distribution, the operating system itself.

And all of that continuous integration testing we perform on every single Ubuntu release?

We leverage that!
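
Under the hood, the idea is much like what you can already do by hand with LXD; a minimal sketch, assuming a container of a newer release (here named "xenial") already exists, and with "htop" standing in for whatever package you actually want:

lxc start xenial                             # boot the newer-release system container
lxc exec xenial -- apt-get update
lxc exec xenial -- apt-get install -y htop   # placeholder example package
lxc exec xenial -- htop                      # run the newer package, isolated from the host
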
You can download a PDF of the slides for my talk here, or flip through them here:



I hope you enjoy some of the magic that LXD is making possible ;-)

Cheers!
Dustin

Read more
Dustin Kirkland

tl;dr

  • Put /tmp on tmpfs and you'll improve your Linux system's I/O, reduce your carbon foot print and electricity usage, stretch the battery life of your laptop, extend the longevity of your SSDs, and provide stronger security.
  • In fact, we should do that by default on Ubuntu servers and cloud images.
  • Having tested 502 physical and virtual servers in production at Canonical, 96.6% of them could immediately fit all of /tmp in half of the free memory available and 99.2% could fit all of /tmp in (free memory + free swap).

Try /tmp on tmpfs Yourself

$ echo "tmpfs /tmp tmpfs rw,nosuid,nodev" | sudo tee -a /etc/fstab
$ sudo reboot

Background

In April 2009, I proposed putting /tmp on tmpfs (an in memory filesystem) on Ubuntu servers by default -- under certain conditions, like, well, having enough memory. The proposal was "approved", but got hung up for various reasons.  Now, again in 2016, I proposed the same improvement to Ubuntu here in a bug, and there's a lively discussion on the ubuntu-cloud and ubuntu-devel mailing lists.

The benefits of /tmp on tmpfs are:
  • Performance: reads, writes, and seeks are insanely fast in a tmpfs; as fast as accessing RAM
  • Security: data leaks to disk are prevented (especially when swap is disabled), and since /tmp is its own mount point, we should add the nosuid and nodev options (and motivated sysadmins could add noexec, if they desire).
  • Energy efficiency: disk wake-ups are avoided
  • Reliability: fewer NAND writes to SSD disks
In the interest of transparency, I'll summarize the downsides:
  • There's sometimes less space available in memory than in your root filesystem, where /tmp may traditionally reside
  • Writing to tmpfs could evict other information from memory to make space
You can learn more about Linux tmpfs here.
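
If you want to experiment before committing the fstab change shown above, you can mount a tmpfs over /tmp temporarily (note that this hides whatever is currently in /tmp until you unmount or reboot):

$ sudo mount -t tmpfs -o rw,nosuid,nodev tmpfs /tmp
$ mount | grep ' /tmp '    # confirm the mount and its options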

Not Exactly Uncharted Territory...

Fedora proposed and implemented this in Fedora 18 a few years ago, citing that Solaris has been doing this since 1994. I just installed Fedora 23 into a VM and confirmed that /tmp is a tmpfs in the default installation, and ArchLinux does the same. Debian debated doing so, in this thread, which starts with all the reasons not to put /tmp on a tmpfs; do make sure you read the whole thread, though, and digest both the pros and cons, as both are represented throughout the thread.

Full Data Treatment

In the current thread on ubuntu-cloud and ubuntu-devel, I was asked for some "real data"...

In fact, across the many debates for and against this feature in Ubuntu, Debian, Fedora, ArchLinux, and others, there is plenty of supposition, conjecture, guesswork, and presumption.  But seeing as we're talking about data, let's look at some real data!

Here's an analysis of a (non-exhaustive) set of 502 of Canonical's production servers that run Ubuntu.com, Launchpad.net, and hundreds of related services, including OpenStack, dozens of websites, code hosting, databases, and more. The servers sampled are slightly biased toward physical machines over virtual machines, but both are present in the survey, and a wide variety of uptimes is represented, from less than a day to 1306 days (with live-patched kernels, of course).  Note that this is not an exhaustive survey of all servers at Canonical.

I humbly invite further study and analysis of the raw, tab-separated data, which you can find at:
The column headers are:
  • Column 1: The host names have been anonymized to sequential index numbers
  • Column 2: `du -s /tmp` disk usage of /tmp as of 2016-01-17 (ie, this is one snapshot in time)
  • Column 3-8: The output of the `free` command, memory in KB for each server
  • Column 9-11: The output of the `free` command, swap in KB for each server
  • Column 12: The number of inodes in /tmp
I have imported it into a Google Spreadsheet to do some data treatment. You're welcome to do the same, or use the spreadsheet of your choice.

For the numbers below, 1 MB = 1000 KB, and 1 GB = 1000 MB, per Wikipedia. (Let's argue MB and MiB elsewhere, shall we?)  The mean is the arithmetic average.  The median is the middle value in a sorted list of numbers.  The mode is the number that occurs most often.  If you're confused, this article might help.  All calculations are accurate to at least 2 significant digits.

Statistical summary of /tmp usage:

  • Max: 101 GB
  • Min: 4.0 KB
  • Mean: 453 MB
  • Median: 16 KB
  • Mode: 4.0 KB
Looking at all 502 servers, there are two extreme outliers in terms of /tmp usage. One server has 101 GB of data in /tmp, and the other has 42 GB. The latter is a very noisy django.log. There are 4 more servers using between 10 GB and 12 GB of /tmp. The remaining 496 servers surveyed (98.8%) are using less than 4.8 GB of /tmp. In fact, 483 of the servers surveyed (96.2%) use less than 1 GB of /tmp. 454 of the servers surveyed (90.4%) use less than 100 MB of /tmp. 414 of the servers surveyed (82.5%) use less than 10 MB of /tmp. And actually, 370 of the servers surveyed (73.7%) -- the overwhelming majority -- use less than 1 MB of /tmp.

Statistical summary of total memory available:

  • Max: 255 GB
  • Min: 1.0 GB
  • Mean: 24 GB
  • Median: 10.2 GB
  • Mode: 4.1 GB
All of the machines surveyed (100%) have at least 1 GB of RAM.  495 of the machines surveyed (98.6%) have at least 2GB of RAM.   437 of the machines surveyed (87%) have at least 4 GB of RAM.   255 of the machines surveyed (50.8%) have at least 10GB of RAM.    157 of the machines surveyed (31.3%) have more than 24 GB of RAM.  74 of the machines surveyed (14.7%) have at least 64 GB of RAM.

Statistical summary of total swap available:

  • Max: 201 GB
  • Min: 0.0 KB
  • Mean: 13 GB
  • Median: 6.3 GB
  • Mode: 2.96 GB
485 of the machines surveyed (96.6%) have at least some swap enabled, while 17 of the machines surveyed (3.4%) have zero swap configured. One of these swap-less machines is using 415 MB of /tmp; that machine happens to have 32 GB of RAM. All of the rest of the swap-less machines are using between 4 KB and 52 KB (inconsequential) /tmp, and have between 2 GB and 28 GB of RAM.  5 machines (1.0%) have over 100 GB of swap space.

Statistical summary of swap usage:

  • Max: 19 GB
  • Min: 0.0 KB
  • Mean: 657 MB
  • Median: 18 MB
  • Mode: 0.0 KB
476 of the machines surveyed (94.8%) are using less than 4 GB of swap. 463 of the machines surveyed (92.2%) are using less than 1 GB of swap. And 366 of the machines surveyed (72.9%) are using less than 100 MB of swap.  There are 18 "swappy" machines (3.6%), using 10 GB or more swap.

Modeling /tmp on tmpfs usage

Next, I took the total memory (RAM) in each machine, divided it in half (the default allocation for /tmp on tmpfs), and subtracted the total /tmp usage on each system, to determine whether all of that system's /tmp could actually fit into its tmpfs using free memory alone (i.e., without swap and without evicting anything from memory).

485 of the machines surveyed (96.6%) could store all of their /tmp in a tmpfs, in free memory alone -- i.e. without evicting anything from cache.

Now, if we take each machine, and sum each system's "Free memory" and "Free swap", and check its /tmp usage, we'll see that 498 of the systems surveyed (99.2%) could store the entire contents of /tmp in tmpfs free memory + swap available. The remaining 4 are our extreme outliers identified earlier, with /tmp usages of [101 GB, 42 GB, 13 GB, 10 GB].
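
For the curious, the half-of-RAM check is easy to reproduce against the raw data with something like the following (the file name and column positions are assumptions based on the column list above: $2 is /tmp usage in KB, $3 is total memory in KB):

$ awk -F'\t' '$2 <= $3 / 2 { fits++ } END { printf "%d of %d servers fit /tmp in half of RAM\n", fits, NR }' tmp-survey.tsv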

Performance of tmpfs versus ext4-on-SSD

Finally, let's look at some raw (albeit rough) read and write performance numbers, using a simple dd model.

My /tmp is on a tmpfs:
kirkland@x250:/tmp⟫ df -h .
Filesystem Size Used Avail Use% Mounted on
tmpfs 7.7G 2.6M 7.7G 1% /tmp

Let's write 2 GB of data:
kirkland@x250:/tmp⟫ dd if=/dev/zero of=/tmp/zero bs=2G count=1
0+1 records in
0+1 records out
2147479552 bytes (2.1 GB) copied, 1.56469 s, 1.4 GB/s

And let's write it completely synchronously:
kirkland@x250:/tmp⟫ dd if=/dev/zero of=./zero bs=2G count=1 oflag=dsync
0+1 records in
0+1 records out
2147479552 bytes (2.1 GB) copied, 2.47235 s, 869 MB/s

Let's try the same thing to my Intel SSD:
kirkland@x250:/local⟫ df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/dm-0 217G 106G 100G 52% /

And write 2 GB of data:
kirkland@x250:/local⟫ dd if=/dev/zero of=./zero bs=2G count=1
0+1 records in
0+1 records out
2147479552 bytes (2.1 GB) copied, 7.52918 s, 285 MB/s

And let's redo it completely synchronously:
kirkland@x250:/local⟫ dd if=/dev/zero of=./zero bs=2G count=1 oflag=dsync
0+1 records in
0+1 records out
2147479552 bytes (2.1 GB) copied, 11.9599 s, 180 MB/s

Let's go back and read the tmpfs data:
kirkland@x250:~⟫ dd if=/tmp/zero of=/dev/null bs=2G count=1
0+1 records in
0+1 records out
2147479552 bytes (2.1 GB) copied, 1.94799 s, 1.1 GB/s

And let's read the SSD data:
kirkland@x250:~⟫ dd if=/local/zero of=/dev/null bs=2G count=1
0+1 records in
0+1 records out
2147479552 bytes (2.1 GB) copied, 2.55302 s, 841 MB/s

Now, let's create 10,000 small files (1 KB) in tmpfs:
kirkland@x250:/tmp/foo⟫ time for i in $(seq 1 10000); do dd if=/dev/zero of=$i bs=1K count=1 oflag=dsync ; done
real 0m15.518s
user 0m1.592s
sys 0m7.596s

And let's do the same on the SSD:
kirkland@x250:/local/foo⟫ time for i in $(seq 1 10000); do dd if=/dev/zero of=$i bs=1K count=1 oflag=dsync ; done
real 0m26.713s
user 0m2.928s
sys 0m7.540s

For better or worse, I don't have any spinning disks, so I couldn't repeat the tests there.

So on these rudimentary read/write tests via dd, I got 869 MB/s - 1.4 GB/s write to tmpfs and 1.1 GB/s read from tmps, and 180 MB/s - 285 MB/s write to SSD and 841 MB/s read from SSD.

Surely there are more scientific ways of measuring I/O to tmpfs and physical storage, but I'm confident that, by any measure, you'll find tmpfs extremely fast when tested against even the fastest disks and filesystems.

Summary

  • /tmp usage
    • 98.8% of the servers surveyed use less than 4.8 GB of /tmp
    • 96.2% use less than 1.0 GB of /tmp
    • 73.7% use less than 1.0 MB of /tmp
    • The mean/median/mode are [453 MB / 16 KB / 4 KB]
  • Total memory available
    • 98.6% of the servers surveyed have at least 2.0 GB of RAM
    • 88.0% have at least 4.0 GB of RAM
    • 57.4% have at least 8.0 GB of RAM
    • The mean/median/mode are [24 GB / 10 GB / 4 GB]
  • Swap available
    • 96.6% of the servers surveyed have some swap space available
    • The mean/median/mode are [13 GB / 6.3 GB / 3 GB]
  • Swap used
    • 94.8% of the servers surveyed are using less than 4 GB of swap
    • 92.2% are using less than 1 GB of swap
    • 72.9% are using less than 100 MB of swap
    • The mean/median/mode are [657 MB / 18 MB / 0 KB]
  • Modeling /tmp on tmpfs
    • 96.6% of the machines surveyed could store all of the data they currently have stored in /tmp, in free memory alone, without evicting anything from cache
    • 99.2% of the machines surveyed could store all of the data they currently have stored in /tmp in free memory + free swap
    • 4 of the 502 machines surveyed (0.8%) would need special handling, reconfiguration, or more swap

Conclusion


  • Can /tmp be mounted as a tmpfs always, everywhere?
    • No, we did identify a few systems (4 out of 502 surveyed, 0.8% of total) consuming inordinately large amounts of data in /tmp (101 GB, 42 GB), and with insufficient available memory and/or swap.
    • But those were very much the exception, not the rule.  In fact, 96.6% of the systems surveyed could fit all of /tmp in half of the freely available memory in the system.
  • Is this the first time anyone has suggested or tried this as a Linux/UNIX system default?
    • Not even remotely.  Solaris has used tmpfs for /tmp for 22 years, and Fedora and ArchLinux for at least the last 4 years.
  • Is tmpfs really that much faster, more efficient, more secure?
    • Damn skippy.  Try it yourself!
:-Dustin

Read more
Louis

gtimelog is a great tool to keep track of the time you spend on your work. To me, its major advantage is that it logs the time at the end of a task and not at the beginning.

The format of an entry should be: {Category} : {Task}

It has a useful reporting feature that will craft an email with the details of your tasks and the time spent on each one.  When used with the format described above, the report is grouped by {Category}, with a summary of the categories and the total time spent on each.

The GUI is very simple:

(Screenshot: the gtimelog main window)

There is one line at the bottom that lets you enter the task you have just completed.  Tasks are logged in a flat text file, so it is very simple to use.

Command Line Interface (CLI) helper: tl.py

Since I spend most of my time using the CLI in a shell environment, I thought it would be useful to log my gtimelog entries with a one-line command.  So I wrote a simple Python script called tl.py that adds one entry to the gtimelog log file with the correct timestamp format.

You can download the script here: https://github.com/karibou/tl

The other motivation for this script is to keep the same spelling for each repetitive task, so the accounting remains consistent.  It also avoids having to retype the same information over and over.

So when I start a new day, I simply let gtimelog know that a new day is starting by typing:

$ tl.py new

This will add the new : Arrived entry in gtimelog.
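
Conceptually, all the script does is append a correctly timestamped line to the log file; a rough shell equivalent of the command above (the real tl.py is written in Python, the file path is the Ubuntu default mentioned later in this post, and the entry text follows the example above):

$ echo "$(date '+%Y-%m-%d %H:%M'): new : Arrived" >> ~/.local/share/gtimelog/timelog.txt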

The categories that I use most frequently are hard-coded in the script:

 'ua': 'L3 / L3 support', 
 'up': 'Upstream', 
 'ubu': 'Upstream Ubuntu', 
 'meet': 'Internal meetings', 
 'pers': 'Personal management', 
 'skill': 'Skills building', 
 'know': 'Knowledge transfer', 
 'team': 'Team support', 
 'comm': 'Community Involvment',

You can modify those to your own needs by editing the script.

To list the available categories, just use ? as the argument:

$ tl.py ?
Categories : ['comm', 'know', 'meet', 'pers', 'skill', 'team', 'ua', 'ubu', 'up']

When I enter a new task, I simply use the syntax:

$ tl.py ubu : LP1104303 : gtimelog missing icon

ubu corresponds to the category of the task (Upstream Ubuntu) and whatever comes after the colon is the task.

The next time I want to report activity against this task, I simply run the script with only the category, and it will list all existing tasks for that category:

$ tl.py ubu
1) LP1104303 : gtimelog missing icon
2) sosreport CVE-2015-7529
3) nut merge
4) LP: #1318111 - repetitive crashkernel=
5) liblinear library transition
6) makedumpfile packaging
7) LP1506550 - quassel sound bug
8) Xenial upgrade
9) sosreport autopkgtests for SRU
10) LP: #1496317 - kdump OOM killer failure
Select task (0 to exit,<CR> to continue):

Entering the task index number will add a new entry in gtimelog. If more than 10 tasks are available, enter <CR> to continue the listing of the tasks.

Using tl.py from multiple systems

It may be useful to be able to log your activity from multiple systems.  For instance, you may be working on your laptop while travelling and on a desktop at home.  One way to achieve this is to use a cloud storage service like Dropbox and share the text file that holds your gtimelog tasks.

On Ubuntu, the file is located at $HOME/.local/share/gtimelog/timelog.txt

If you replace this file with a symbolic link to a file stored in the cloud-shared directory, you can access the same file from different systems.

The Dropbox example

I already have my Dropbox files shared on my laptop, so there is a ~/Dropbox directory available.  To host my gtimelog file, I create a gtimelog directory and move the timelog.txt file there (keeping a backup):

$ mkdir ~/Dropbox/gtimelog
$ cp ~/.local/share/gtimelog/timelog.txt ~/.local/share/gtimelog/timelog.bak
$ mv ~/.local/share/gtimelog/timelog.txt ~/Dropbox/gtimelog
$ ln -s ~/Dropbox/gtimelog/timelog.txt ~/.local/share/gtimelog/timelog.txt

By using such a configuration on each system where you want to enter gtimelog tasks, you will be able to share your gtimelog data in an easy manner.

Using tl.py on Ubuntu Server

In my specific case, I needed to go one step further: my second system is a server which does not have a graphical interface. Using this[1] blog entry, I was able to set up my server to receive only the ./gtimelog/timelog.txt file, by doing a selective synchronization of the gtimelog directory.  The URL for the dropbox_temp file in the blog post is no longer valid, but you can download it from github[2]. It only needs to be renamed to dropbox_temp.

[1] Simplify your server using Dropbox, Selective Sync

[2] dropbox_temp script

Read more
Michael Hall

I’ve just published the most recent Community Donations Report highlighting where donations made to the Ubuntu community have been used by members of that community to promote and improve Ubuntu. In this report I’ve included links to write-ups detailing how those funds were put to use.

Over the past two years these donations have allowed Ubuntu Members to travel to and speak at events, host local events, accelerate development and testing, and much more. Thank you to all who have donated to this fund. You can help us do more by donating to the fund and helping spread the word about it, both to potential donors and potential beneficiaries.

Read more
Dustin Kirkland


Picture yourself containers on a server
With systemd trees and spawned tty's
Somebody calls you, you answer quite quickly
A world with the density so high

    - Sgt. Graber's LXD Smarts Club Band

Last week, we proudly released Ubuntu 15.10 (Wily) -- the final developer snapshot of the Ubuntu Server before we focus the majority of our attention on quality, testing, performance, documentation, and stability for the Ubuntu 16.04 LTS cycle in the next 6 months.

Notably, LXD has been promoted to the Ubuntu Main archive, now commercially supported by Canonical.  That has enabled us to install LXD by default on all Ubuntu Servers, from 15.10 forward.
Join us for an interactive, live webinar on November 12th at 5pm BST/12pm EST led by James Page, where he will demonstrate LXD as the fastest hypervisor in OpenStack!
That means that every Ubuntu server -- Intel, AMD, ARM, POWER, and even Virtual Machines in the cloud -- is now a full machine container hypervisor, capable of hosting hundreds of machine containers, right out of the box!

LXD in the Sky with Diamonds!  Well, LXD is in the Cloud with Diamond level support from Canonical, anyway.  You can even test it in your web browser here.

The development tree of Xenial (Ubuntu 16.04 LTS) has already inherited this behavior, and we will celebrate this feature broadly through our use of LXD containers in Juju, MAAS, and the reference platform of Ubuntu OpenStack, as well as the new nova-lxd hypervisor in the OpenStack Autopilot within Landscape.

While the young and the restless are already running Wily Ubuntu 15.10, the bold and the beautiful are still bound to their Trusty Ubuntu 14.04 LTS servers.

At Canonical, we understand both motivations, and this is why we have backported LXD to the Trusty archives, for safe, simple consumption and testing of this new generation of machine containers there, on your stable LTS.

Installing LXD on Trusty simply requires enabling the trusty-backports pocket, and installing the lxd package from there, with these 3 little commands:

sudo sed -i -e "/trusty-backports/ s/^# //" /etc/apt/sources.list
sudo apt-get update; sudo apt-get dist-upgrade -y
sudo apt-get -t trusty-backports install lxd

In minutes, you can launch your first LXD containers.  First, inherit your new group permissions, so you can execute the lxc command as your non-root user.  Then, import some images, and launch a new container named lovely-rita.  Shell into that container, and examine the process tree, install some packages, check the disk and memory and cpu available.  Finally, exit when you're done, and optionally delete the container.

newgrp lxd
lxd-images import ubuntu --alias ubuntu
lxc launch ubuntu lovely-rita
lxc list
lxc exec lovely-rita bash
ps -ef
apt-get update
df -h
free
cat /proc/cpuinfo
exit
lxc delete lovely-rita

I was able to run over 600 containers simultaneously on my Thinkpad (x250, 16GB of RAM), and over 60 containers on an m1.small in Amazon (1.6GB of RAM).

We're very interested in your feedback, as LXD is one of the most important features of the Ubuntu 16.04 LTS.  You can learn more about LXD, view the source code, file bugs, discuss on the mailing list, and peruse the Linux Containers upstream projects.

With a little help from my friends!
:-Dustin

Read more
Michael Hall

With the release of the Wily Werewolf (Ubuntu 15.10) we have entered into the Xenial Xerus (to be Ubuntu 16.04) development cycle. This will be another big milestone for Ubuntu, not just because it will be another LTS, but because it will be the last LTS before we achieve convergence. What we do here will not only be supported for the next 5 years, it will set the stage for everything to come over that time as we bring the desktop, phone and internet-of-things together into a single comprehensive, cohesive platform.

To help get us there, we have a track dedicated to Convergence at this week’s Ubuntu Online Summit where we will be discussing plans for desktops, phones, IoT and how they are going to come together.

Tuesday

We’ll start the convergence track at 1600 UTC with the Ubuntu Desktop team talking about the QA (Quality Assurance) plans for the next LTS desktop, which will provide another 5 years of support for Ubuntu users. We’ll end the day at 1900 UTC with the Kubuntu team, who are planning for their 16.04 (Xenial Xerus) release.

Wednesday

The second day kicks off at 1400 UTC with plans for what version of the Qt toolkit will ship in 16.04, something that now affects both the KDE and Unity 8 flavors of Ubuntu. That will be followed by development planning for the next Unity 7 desktop version of Ubuntu at 1500, and a talk on how legacy apps (.deb and X11 based) might be supported in the new Snappy versions of Ubuntu. We will end the day with a presentation by the Unity 8 developers at 1800 about how you can get started working on and contributing to the next generation desktop interface for Ubuntu.

Thursday

The third and last day of the Online Summit will begin with a live Questions and Answers session at 1400 UTC about the Convergence plans in general with the project and engineering managers who are driving it forward. At 1500 we’ll take a look at how those plans are being realized in some of the apps already being developed for use on Ubuntu phones and desktops. Then at 1600 UTC members of the design team will be talking to independent app developers about how to design their app with convergence in mind. We will then end the convergence track with a summary from KDE developers on the state and direction of their converged UI, Plasma Mobile.

Plenaries

Outside of the Convergence track, you’ll want to watch Mark Shuttleworth’s opening keynote at 1400 UTC on Tuesday, and Canonical CEO Jane Silber’s live Q&A session at 1700 UTC on Wednesday.

Read more
Hardik Dalwadi

I recently wrote a blog post about assembling the Ubuntu Orange Matchbox (an Ubuntu-branded Pibow for Raspberry Pi 2 & PiGlow) and demonstrating Snappy Ubuntu Core with it.

Now, with the Make-Me-Glow LP Project, we have built an application for controlling the PiGlow from an Ubuntu Phone. The PiGlow is a small add-on board for the Raspberry Pi that provides 18 individually controllable LEDs. Victor Tuson Palau recently released glowapi for Snappy Ubuntu Core, which allows us to control the PiGlow over HTTP.

We decided to build a quick Ubuntu Phone application to control the PiGlow during the Ubuntu Hackathon India, using glowapi for Snappy Ubuntu Core. As a result, we came up with the Make-Me-Glow LP Project. Big thanks!

 

What It Does:

It allows you to control any LED on the PiGlow from an Ubuntu Phone. You have to download the “Make Me Glow” application from the Ubuntu Phone Store. For example, if you want to turn the orange LED on leg 1 on or off with a specific intensity, the Make Me Glow UI will let you do exactly that. Just make sure that you enter the correct IP address of your Ubuntu Orange Matchbox (the Ubuntu-branded Pibow for Raspberry Pi 2 & PiGlow). We are assuming that you have already installed glowapi for Snappy Ubuntu Core on your Ubuntu Orange Matchbox.

(Screenshots: installing Make Me Glow from the Ubuntu Store on an Ubuntu Phone, launching it, and giving input to control the PiGlow LEDs)

How It Works:

As I said earlier, we are using glowapi for Snappy Ubuntu Core, which operates the PiGlow in response to HTTP POST requests.  The Ubuntu Phone application, Make Me Glow, issues POST requests over HTTP according to the user's configuration, and the PiGlow operates accordingly. We wrote a simple QML function for this, which requests the URL using POST over HTTP according to the user's input. It is very easy to develop an Ubuntu Phone application using the Ubuntu SDK. Big thanks to XiaoGuo Liu, without whom it would not have been possible to execute this project.

function request(url) {
    // Issue an asynchronous HTTP POST to the given glowapi URL with an empty body
    var xhr = new XMLHttpRequest();
    xhr.open('POST', url, true);
    xhr.send('');
}

Future Roadmap:

We are working on a few animations. In fact, I have already made a proof of concept with bash scripting, such as fan and fade-out animations. If you want to contribute, please visit the Make-Me-Glow LP Project.

Conclusion:

The goal of this demonstration is to show the immense possibilities of Snappy Ubuntu Core for controlling devices remotely within the Ubuntu ecosystem. If you are planning to dive into IoT-based solutions, Snappy Ubuntu Core is a great start. After this demonstration, I am planning to hook my home lights up to a device running Snappy Ubuntu Core and control them from an Ubuntu Phone.  Stay tuned…

Read more
Hardik Dalwadi

In this edition I will demonstrate how to assemble your own Ubuntu Orange Matchbox: an Ubuntu-branded Pibow for the Raspberry Pi 2 with PiGlow, as shown below.

Ubuntu Orange Matchbox: Ubuntu Branded Pibow for Raspberry Pi 2 & PiGlow.

In this post I will demonstrate how to assemble your own Ubuntu Orange Matchbox (an Ubuntu-branded Pibow for the Raspberry Pi 2 with PiGlow) and flash it with the latest Snappy Ubuntu Core, which has support for the Raspberry Pi 2’s GPIO and I2C. I will also manage Snap packages from the Ubuntu Phone browser and the Snappy Scope (beta), and share tips and tricks for enabling WiFi on Snappy Ubuntu Core. At the end, I will demonstrate Snappy Ubuntu Core + Raspberry Pi 2 + PiGlow in action: the PiGlow will blink its LEDs according to the CPU utilization of Snappy Ubuntu Core.

We have divided this post into four parts:

  • Make In India: Ubuntu Orange Matchbox: Ubuntu Branded Pibow for Raspberry Pi 2 & PiGlow
  • Install Snappy Ubuntu Core @ Ubuntu Orange Matchbox: Ubuntu Branded Pibow for Raspberry Pi 2 & PiGlow
  • Tips & Tricks to enable WiFi @ Snappy Ubuntu Core
  • CPU Resource Utilization Demonstration @Snappy Ubuntu Core

Make In India: Ubuntu Orange Matchbox: Ubuntu Branded Pibow for Raspberry Pi 2 & PiGlow:

Here is the list of ingredients I used to cook my own Ubuntu Orange Matchbox. I have also shared links below to where you can procure these items in India.

  1. Raspberry Pi 2 with USB WiFi Dongle  & 5V, 2A USB Power Adapter
  2. Tangerine Pibow for Raspberry Pi 2
  3. PiGlow, a small add-on board for the Raspberry Pi that provides 18 individually controllable LEDs.
  4. Mixed Ubuntu Stickers

[1] http://www.amazon.in/Raspberry-Pi-Starter-Kit-Basic/dp/B00U7KAG98/

[2] https://shop.pimoroni.com/products/raspberry-pi-2-pibow

[3] https://shop.pimoroni.com/products/piglow

[4] http://shop.canonical.com/product_info.php?products_id=718

 

 

Assemble the Raspberry Pi 2 with the Tangerine Pibow.  Remove the white laminated plastic sheet from the Pibow’s closing cover.  This will give you a transparent Pibow for your Raspberry Pi 2.

 

Now it’s time for the Ubuntu branding. Just stick the transparent Ubuntu logo sticker on the transparent Pibow closing cover. You can also install the PiGlow before closing the cover.  If you know a laser engraving service provider for metal or plastic, consult them and get your Ubuntu branding engraved on the transparent Pibow closing cover. But my trick will give you an Ubuntu orange logo 😉

 

Final Version:

 

Install Snappy Ubuntu Core @ Ubuntu Orange Matchbox: Ubuntu Branded Pibow for Raspberry Pi 2 & PiGlow:

Recently, Snappy Ubuntu Core got a new image for the Raspberry Pi 2 with the latest updates and upgrades. Please follow the page Getting Started with Snappy Ubuntu Core for Raspberry Pi 2; it has the latest information and the procedure for playing with Snappy Ubuntu Core on the Raspberry Pi 2. You can even build your own image.

Here is a short summary. Get a 4 GB Class 10 microSDHC card, then format it and flash it with the latest Snappy Ubuntu Core image:

# Note: replace /dev/sdX with the device name of your SD card (e.g. /dev/mmcblk0, /dev/sdg1 ...)
wget http://people.canonical.com/~platform/snappy/raspberrypi2/ubuntu-15.04-snappy-armhf-rpi2.img.xz
xzcat ubuntu-15.04-snappy-armhf-rpi2.img.xz | sudo dd of=/dev/sdX bs=32M
sync

Now, boot your Ubuntu Orange Matchbox with Snappy Ubuntu Core. The default username/password is ubuntu/ubuntu. You can manage snaps from the Snappy Ubuntu Core web store through webdm, by pointing your browser to the Ubuntu Orange Matchbox’s IP address on port 4200, or to webdm.local. Make sure that it is connected to the LAN.

Tips & Tricks to enable WiFi @ Snappy Ubuntu Core:

Snappy Ubuntu Core has built-in support for the Ethernet port, so connecting your Raspberry Pi 2 to the LAN will get it an IP address from your router. Since this is a headless device, you can always SSH in from your host machine if you do not want to connect it to a separate monitor.

It does not have support for WiFi out of the box. Here are the steps to enable WiFi and add support for a USB WiFi adapter with the Ralink RA5370 chipset. I got this adapter with my Raspberry Pi 2 kit; please check the chipset of your own USB WiFi adapter, as you may need different firmware.

Please follow this solution: Enable WiFi with Snappy Ubuntu Core. You may need to get the package below instead of wpasupplicant_0.7.3-6ubuntu2.3_armhf.deb:

wget -c http://ports.ubuntu.com/pool/main/w/wpasupplicant/wpasupplicant_0.7.3-6ubuntu2.4_armhf.deb

I would also recommend installing nano, to make editing the WiFi SSID configuration easier in the future. Get it from here:

http://ports.ubuntu.com/pool/main/n/nano/nano-udeb_2.2.6-3_armhf.udeb

 

CPU Resource Utilization Demonstration @ Snappy Ubuntu Core:

Now it’s time to demonstrate CPU utilization through the PiGlow, the small add-on for the Raspberry Pi that provides 18 individually controllable LEDs. We will feed it the CPU utilization in such a way that it glows more when utilization is higher. As I said earlier, we recently got GPIO and I2C support on Snappy Ubuntu Core, which makes this possible, and there is a snap package for it called PiGlow Top.

Access your Snappy Ubuntu Core system from webdm, or from the CLI over SSH, and install the PiGlow Top snap from the Snappy Ubuntu Core web store. After installation, we need to grant it access to the I2C hardware; do this over SSH:

sudo snappy hw-assign piglowtop.kyrofa /dev/i2c-1
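If you are curious how a tool like PiGlow Top can measure CPU load, here is a small illustrative sketch (not the actual snap code, which I have not inspected) that samples /proc/stat twice and computes the busy fraction; mapping that value onto the 18 LEDs over the /dev/i2c-1 device assigned above is left out.

#include <chrono>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Read the aggregate "cpu" line from /proc/stat: total and idle jiffies.
static void read_cpu_times(long long& total, long long& idle)
{
    std::ifstream stat("/proc/stat");
    std::string line;
    std::getline(stat, line);                 // first line: aggregate CPU stats
    std::istringstream in(line);
    std::string label;
    in >> label;                              // the literal token "cpu"
    std::vector<long long> fields;
    long long value = 0;
    while (in >> value)
        fields.push_back(value);              // user nice system idle iowait ...
    total = 0;
    for (long long f : fields)
        total += f;
    idle = fields.size() > 3 ? fields[3] : 0; // 4th field is idle time
}

int main()
{
    long long t1, i1, t2, i2;
    read_cpu_times(t1, i1);
    std::this_thread::sleep_for(std::chrono::seconds(1));
    read_cpu_times(t2, i2);
    double busy = 1.0 - double(i2 - i1) / double(t2 - t1);
    std::cout << "CPU utilization: " << busy * 100 << "%\n";
    // A PiGlow-driving loop would light more LEDs (or brighter ones)
    // as `busy` goes up.
}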

Now it’s time for the demonstration. Grab your Ubuntu Phone, access the Snappy Ubuntu Core webdm from its browser, and do something to increase CPU utilization, for example installing or removing a snap from the Snappy Ubuntu Core web store. I have prepared a small video demonstration for your reference.

 


Read more
Dustin Kirkland


I delivered a presentation and an exciting live demo in San Francisco this week at the Container Summit (organized by Joyent).

It was professionally recorded by the A/V crew at the conference.  The live demo begins at the 25:21 mark.


You can also find the slide deck embedded below and download the PDFs from here.


Cheers,
:-Dustin

Read more
Michael Hall

Photo from Aaron Honeycutt

Nicholas Skaggs presenting at UbuCon@FOSSETCON 2014

Thanks to the generous organizers of FOSSETCON who have given us a room at their venue, we will be having another UbuCon in Orlando this fall!

FOSSETCON 2015 will be held at the Hilton Orlando Lake Buena Vista, from November 19th through the 21st. This year they’ve been able to get Richard Stallman to attend and give a keynote, so it’s certainly an event worth attending for anybody who’s interested in free and open source software.

UbuCon itself will be held all day on the 19th in its own dedicated room at the venue. We are currently recruiting presenters to talk to attendees about some aspect of Ubuntu, from the cloud to mobile, community involvement and of course the desktop. If you have a fun or interesting topic that you want to share, please send your proposal to me at mhall119@ubuntu.com

Read more
Daniel Holbach

Tomorrow is a special anniversary: on 2005-09-05 I joined Canonical – that’s right, it’s going to be a decade.

A lot of what we as the Ubuntu community experienced and went through I wrote up some time ago, and it’s well documented in blog posts, articles, LoCo event reports and pictures from Ubuntu Allhands events, so don’t expect any of that here.

For me personally it’s been a ride like nothing I could have expected. A decade in a single company doesn’t seem to happen very often these days, and I would also never have dreamed of what we are delivering to the world today. I’m happy and proud to have been part of it all.

I still remember the days when I joined. I had just finished my studies, and working next to people who could all easily be described as wunderkinder gave me quite a healthy case of impostor syndrome. It’s easy to underestimate how much I learned here – not just technically or in terms of other abilities, but also as a person. I got to work on things I never imagined I could do, and I am happy I was involved in so many different projects.

One thing made this whole ride even more special: the people. I made lots of friends along the way – that’s one of the primary reasons I still feel like I work at a very very special place.

Big hugs everyone and thanks for accompanying me this far! :-)

Read more
beuno

Now that we've open sourced the code for Ubuntu One filesync, I thought I'd highlight some of the interesting challenges we had while building and scaling the service to several million users.

The teams that built the service were roughly split into two: the foundations team, who was responsible for the lowest levels of the service (storage and retrieval of files, data model, client and server protocol for syncing) and the web team, focused on user-visible services (website to manage files, photos, music streaming, contacts and Android/iOS equivalent clients).
I joined the web team early on and stayed with it until we shut the service down, so that's where a lot of my stories will be focused.

Today I'm going to focus on the challenge we faced when launching the Photos and Music streaming services. Given that by the time we launched them we had a few years of experience serving files at scale, our challenge turned out to be presenting and manipulating each user's metadata quickly, and being able to show that data in appealing ways (showing music by artist or genre, and searching, for example). Photos were a similar story: people tended to have many thousands of photos and songs, and we needed to extract metadata, parse it, store it and then be able to present it back to users quickly in different ways. Easy, right? It is, until a certain scale  :)
Our architecture for storing metadata at the time was about 8 PostgreSQL master databases across which we sharded the metadata (essentially, your metadata lived on a different DB server depending on your user id), plus at least one read-only slave per shard. These were really beefy servers with a truckload of CPUs, more than 128GB of RAM and very fast disks (when reading this, remember this was 2009-2013, hardware specs seem tiny as time goes by!).  However, no matter how big these DB servers got, given how busy they were and how much metadata was stored (for years we didn't delete any metadata, so for every change to every file we duplicated the metadata), after a certain point we couldn't get a simple listing of a user's photos or songs (essentially, some of their files filtered by mimetype) in a reasonable time-frame (less than 5 seconds). As the service grew we added caches, indexes, optimized queries and code paths, but we quickly hit a performance wall that left us no choice but a much-feared major architectural change. I say much feared, because major architectural changes come with a lot of risk to running services that have low tolerance for outages or data loss; whenever you change something that's already running in a significant way, you're basically throwing out most of your previous optimizations. On top of that, as users we expect things to be fast and take it for granted. A 5-person team spending 6 months to make things work the way you expect them to isn't really something you can brag about in the middle of a race with many other companies to capture a growing market.
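As an aside, the shard routing itself can be as simple as the sketch below; the real Ubuntu One mapping is not described in this post, so treat the modulo scheme as purely illustrative.

#include <cstdint>

// Purely illustrative: route a user to one of the 8 master shards.
constexpr int kNumShards = 8;

int shard_for(std::uint64_t user_id)
{
    return static_cast<int>(user_id % kNumShards);
}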
In the time since we had started the project, NoSQL had taken off and matured enough to be a viable alternative to SQL, and it seemed to fit many of our use cases much better (webscale!). After some research and prototyping, we decided to generate pre-computed views of each user's data in a NoSQL DB (Cassandra), and we decided to do that by extending our existing architecture instead of revamping it completely. Given our code was pretty well built into proper layers of responsibility, we hooked an async process up to the lowest layer of our code – database transactions – that would send messages to a queue whenever new data was written or modified. This meant essentially duplicating the metadata we stored for each user, but trading storage for computing is usually a good trade-off to make, both in cost and performance. So now we had a firehose queue of every change that went on in the system, and we could build a separate piece of infrastructure whose focus would only be to provide per-user metadata *fast* for any type of file, so we could build interesting and flexible user interfaces for people to consume their own content. The stated internal goals were: 1) fast responses (under 1 second), 2) less than 10 seconds between user action and UI update, and 3) complete isolation from the existing infrastructure.
Here's a rough diagram of how the information flowed through the system:

U1 Diagram

It's a little bit scary when looking at it like that, but in essence it was pretty simple: write each relevant change that happened in the system to a temporary table in PG in the same transaction that it's written to the permanent table. That way you get, for free, transactional guarantees that you won't lose any data on that layer, and you use PG's built-in cache that keeps recently added records cheaply accessible.
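To make that pattern concrete, here is a hedged sketch using libpqxx; the table and column names are invented for illustration, but the point is that the permanent row and the transaction-log row are committed together or not at all.

#include <pqxx/pqxx>
#include <string>

void record_file_change(pqxx::connection& conn,
                        long long user_id,
                        std::string const& node_id,
                        std::string const& mimetype)
{
    pqxx::work txn(conn);

    // The permanent metadata write...
    txn.exec("UPDATE filenode SET mimetype = " + txn.quote(mimetype) +
             " WHERE user_id = " + txn.quote(user_id) +
             " AND id = " + txn.quote(node_id));

    // ...and the same change appended to the temporary txlog table, in the
    // same transaction, for the workers to ship to RabbitMQ later on.
    txn.exec("INSERT INTO txlog (user_id, node_id, mimetype) VALUES (" +
             txn.quote(user_id) + ", " + txn.quote(node_id) + ", " +
             txn.quote(mimetype) + ")");

    txn.commit();   // both rows become visible atomically
}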
Then we built a bunch of workers that looked through those rows, parsed them, sent them to a persistent queue in RabbitMQ and, once they got confirmation a row was queued, deleted it from the temporary PG table.
Following that, we took advantage of Rabbit's queue exchange features to build different types of workers that process the data differently depending on what it is (music was stored differently than photos, for example).
Once we completed all of this, accessing someone's photos was a quick and predictable read operation that would give us all their data back in an easy-to-parse format that would fit in memory. Eventually we moved all the metadata accessed from the website and REST APIs to these new pre-computed views, and the result was a significant reduction in load on the main DB servers, while now getting predictable sub-second request times for all types of metadata in a horizontally scalable system (just add more workers and Cassandra nodes).

All in all, it took about 6 months end-to-end, which included a prototype phase that used memcache as a key/value store.

You can see the code that wrote and read from the temporary PG table if you branch the code and look under: src/backends/txlog/
The worker code, as well as the web UI, is not yet available, but it will be once we finish cleaning it up. I decided to write this up and publish it now because I believe the value is more in the architecture than in the code itself   :)

Read more
Michi

Over the past few months, James Henstridge, Xavi Garcia Mena, and I have implemented a fast and scalable thumbnailing service for Ubuntu and Ubuntu Touch. This post explains how we did it, and how we achieved our performance and reliability goals.

Introduction

On a phone as well as the desktop, applications need to display image thumbnails for various media, such as photos, songs, and videos. Creating thumbnails for such media is CPU-intensive and can be costly in bandwidth if images are retrieved over the network. In addition, different types of media require the use of different APIs that are non-trivial to learn. It makes sense to provide thumbnail creation as a platform API that hides this complexity from application developers and, to improve performance, to cache thumbnails on disk.

This article explains the requirements we had and how we implemented a thumbnailer service that is extremely fast and scalable, and robust in the face of power loss or crashes.

Requirements

We had a number of requirements we wanted to meet in our implementation.

  • Robustness
    In the event of a crash, the implementation must guarantee the integrity of on-disk data structures. This is particularly important on a phone, where we cannot expect the user to perform manual recovery (such as cleaning up damaged files). Because batteries can run out at any time, integrity must be guaranteed even in the face of power loss.
  • Scalability
    It is common for people to store many thousands of songs and photos on a device, so the cache must scale to at least tens of thousands of records. Thumbnails can range in size from a few kilobytes to well over a megabyte (for “thumbnails” at full-screen resolution), so the cache must deal efficiently with large records.
  • Re-usability
    Persistent and reliable on-disk storage of arbitrary records (ranging in size from a few bytes to potentially megabytes) is a common application requirement, so we did not want to create a cache implementation that is specific to thumbnails. Instead, the disk cache is provided as a stand-alone C++ API that can be used for any number of other purposes, such as a browser or HTTP cache, or to build an object file cache similar to ccache.
  • High performance
    The performance of the thumbnailer directly affects the user experience: it is not nice for the customer to look at “please wait a while” icons in, say, an image gallery while thumbnails are being loaded one by one. We therefore had to have a high-performance implementation that delivers cached thumbnails quickly (on the order of a millisecond per thumbnail on an Arm CPU). An efficient implementation also helps to conserve battery life.
  • Location independence and extensibility
    Canonical runs an image server at dash.ubuntu.com that provides album and artist artwork for many musicians and bands. Images from this server are used to display artwork in the music player for media that contains ID3 tags, but does not embed artwork in the media file. The thumbnailer must work with embedded images as well as remote images, and it must be possible to extend it to new media types without unduly disturbing the existing code.
  • Low bandwidth consumption
    Mobile phones typically come with data caps, so the cache has to be frugal with network bandwidth.
  • Concurrency and isolation
    The implementation has to allow concurrent access by multiple applications, as well as concurrent access from a single implementation. Besides needing to be thread-safe, this means that a request for a thumbnail that is slow (such as downloading an image over the network) must not delay other requests.
  • Fault tolerance
    Mobile devices lose network access without warning, and users can add corrupt media files to their device. The implementation must be resilient to partial failures, such as incomplete network replies, dropped connections, and bad image data. Moreover, the recovery strategy for such failures must conserve battery and avoid repeated futile attempts to create thumbnails from media that cannot be retrieved or contains malformed data.
  • Security
    The implementation must ensure that applications cannot see (or, worse, overwrite) each other’s thumbnails or coerce the thumbnailer into delivering images from files that an application is not allowed to read.
  • Asynchronous API
    The customers of the thumbnailer are applications that are written in QML or Qt, which cannot block in the UI thread. The thumbnailer therefore must provide a non-blocking API. Moreover, the application developer should be able to get the best possible performance without having to use threads. Instead, concurrency must be internal to the implementation (which is able to put threads to use intelligently where they make sense), instead of the application throwing threads at the problem in the hope that it might make things faster when, in fact, it might just add overhead.
  • Monitoring
    The effectiveness of a cache cannot be assessed without statistics to show hit and miss rates, evictions, and other basic performance data, so it must provide a way to extract this information.
  • Error reporting
    When something goes wrong with a system service, typically the only way to learn about the problem is to look at log messages. In case of a failure, the implementation must leave enough footprints behind to allow someone to diagnose a failure after the fact with some chance of success.
  • Backward compatibility
    This project was a rewrite of an earlier implementation. Rather than delivering a “big bang” piece of software and potentially upsetting existing clients, we incrementally changed the implementation such that existing applications continued to work. (The only pre-existing interface was a QML interface that required no change.)

System architecture

Here is a high-level overview of the main system components.

A Fast Thumbnailer for Ubuntu

External API

To the outside world, the thumbnailer provides two APIs.

One API is a QML plugin that registers itself as an image provider for QQuickAsyncImageProvider. This allows the caller to pass a URI that encodes a query for a local or remote thumbnail at a particular size; if the URI matches the registered provider, QML transfers control to the entry points in our plugin.

The second API is a Qt API that provides three methods:

QSharedPointer<Request> getThumbnail(QString const& filePath,
                                     QSize const& requestedSize);
QSharedPointer<Request> getAlbumArt(QString const& artist,
                                    QString const& album,
                                    QSize const& requestedSize);
QSharedPointer<Request> getArtistArt(QString const& artist,
                                     QString const& album,
                                     QSize const& requestedSize);

The getThumbnail() method extracts thumbnails from local media files, whereas getAlbumArt() and getArtistArt() retrieve artwork from the remote image server. The returned Request object provides a finished signal, and methods to test for success or failure of the request and to extract a thumbnail as a QImage. The request also provides a waitForFinished() method, so the API can be used synchronously.
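Here is a hedged usage sketch of that Qt API. Only getThumbnail(), the finished signal and waitForFinished() are named above; the entry-point class name “Thumbnailer”, the header, and the Request accessors isValid() and image() are assumptions for illustration.

#include <QImage>
#include <QSize>
#include <QString>
// (include the thumbnailer Qt header here; its name is not given in the post)

QImage coverFor(Thumbnailer& thumbnailer, QString const& path)
{
    auto request = thumbnailer.getThumbnail(path, QSize(256, 256));
    request->waitForFinished();           // synchronous for brevity; a real
                                          // app connects to finished() instead
    if (!request->isValid())              // assumed success/failure check
        return QImage();                  // caller shows a fallback icon
    return request->image();              // assumed QImage accessor
}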

Thumbnails are delivered to the caller in the size they are requested, subject to a (configurable) 1920-pixel limit. As an escape hatch, requests with width and height of zero deliver artwork at its original size, even if it exceeds the 1920-pixel limit. The scaling algorithm preserves the original aspect ratio and never scales up from the original, so the returned thumbnails may be smaller than their requested size.

DBus service

The thumbnailer is implemented as a DBus service with two interfaces. The first interface provides the server-side implementation of the three methods of the external API; the second interface is an administrative interface that can deliver statistics, clear the internal disk caches, and shut down the service. A simple tool, thumbnailer-admin, allows both interfaces to be called from the command line.

To conserve resources, the service is started on demand by DBus and shuts down after 30 seconds of idle time.
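The post does not show the shutdown code, but this kind of idle-exit behaviour is commonly implemented with a Qt pattern like the sketch below: a single-shot timer is re-armed on every incoming request, and if it ever fires, the process exits and DBus activation restarts the service on the next call.

#include <QCoreApplication>
#include <QObject>
#include <QTimer>

class IdleExit
{
public:
    explicit IdleExit(int idle_ms = 30000)    // 30 seconds of idle time
    {
        timer_.setSingleShot(true);
        timer_.setInterval(idle_ms);
        QObject::connect(&timer_, &QTimer::timeout,
                         [] { QCoreApplication::quit(); });
        timer_.start();
    }

    // Call at the start of every DBus method invocation.
    void requestReceived() { timer_.start(); }   // restart the countdown

private:
    QTimer timer_;
};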

Image extraction

Image extraction uses an abstract base class. This interface is independent of media location and type. The actual image extraction is performed by derived implementations that download images from the remote server, extract them from local image files, or extract them from local streaming media files. This keeps knowledge of image location and encoding out of the main caching and error handling logic, and allows us to support new media types (whether local or remote) by simply adding extra derived implementations.

Image extraction is asynchronous, with currently three implementations:

  • Image downloader
    To retrieve artwork from the remote image server, the service talks to an abstract base class with asynchronous download_album() and download_artist() methods. This allows multiple downloads to run concurrently and makes it easy to add new local or remote image providers without disturbing the code for existing ones. A class derived from that abstract base implements a REST API with QNetworkAccessManager to retrieve images from dash.ubuntu.com.
  • Photo extractor
    The photo extractor is responsible for delivering images from local image files, such as JPEG or PNG files. It simply delegates that work to the image converter and scaler.
  • Audio and video thumbnail extractor
    To extract thumbnails from audio and video files, we use GStreamer. Due to reliability problems with some codecs that can hang or crash, we delegate the task to a separate vs-thumb executable. This shields the service from failures and also allows us to run several GStreamer pipelines concurrently without a crash of one pipeline affecting the others.

Image converter and scaler

We use a simple Image class with a synchronous interface to convert and scale different image formats to JPEG. The implementation uses Gdk-Pixbuf, which can handle many different input formats and is very efficient.

For JPEG source images, the code checks for the presence of EXIF data using libexif and, if it contains a thumbnail that is at least as large as the requested size, scales the thumbnail from the EXIF data. (For images taken with the camera on a Nexus 4, the original image size is 3264×1836, with an embedded EXIF thumbnail of 512×288. Scaling from the EXIF thumbnail is around one hundred times faster than scaling from the full-size image.)
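For readers unfamiliar with libexif, the fast path looks roughly like the sketch below; the real code is more involved (it checks the thumbnail size against the requested size and handles orientation and errors), so treat this as an illustration only.

#include <libexif/exif-data.h>
#include <gdk-pixbuf/gdk-pixbuf.h>

// Return the embedded EXIF thumbnail as a pixbuf, or nullptr if the file
// has none; the caller then falls back to decoding the full-size image.
GdkPixbuf* load_exif_thumbnail(const char* path)
{
    ExifData* ed = exif_data_new_from_file(path);
    if (!ed || !ed->data || ed->size == 0)
    {
        if (ed)
            exif_data_unref(ed);
        return nullptr;
    }

    GdkPixbufLoader* loader = gdk_pixbuf_loader_new();
    GdkPixbuf* thumb = nullptr;
    gboolean ok = gdk_pixbuf_loader_write(loader, ed->data, ed->size, nullptr);
    ok = gdk_pixbuf_loader_close(loader, nullptr) && ok;
    if (ok)
    {
        thumb = gdk_pixbuf_loader_get_pixbuf(loader);
        if (thumb)
            g_object_ref(thumb);   // keep it alive past the loader
    }
    g_object_unref(loader);
    exif_data_unref(ed);
    return thumb;
}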

Disk cache

The thumbnailer service optimizes performance and conserves bandwidth and battery by adopting a layered caching strategy.

Two-level caching with failure lookup

Internally, the service uses three separate on-disk caches:

  • Full-size cache
    This cache stores images that are expensive to retrieve (images that are remote or are embedded in audio and video files) at original resolution (scaled down to a 1920-pixel bounding box if the original image is larger). The default size of this cache is 50 MB, which is sufficient to hold around 400 images at 1920×1080 resolution. Images are stored in JPEG format (at a 90% quality setting).
  • Thumbnail cache
    This cache stores thumbnails at the size that was requested by the caller, such as 512×288. The default size of this cache is 100 MB, which is sufficient to store around 11,000 thumbnails at 512×288, or around 25,000 thumbnails at 256×144.
  • Failure cache
    The failure cache stores the keys for images that could not be extracted because of a failure. For remote images, this means that the server returned an authoritative answer “no such image exists”, or that we encountered an unexpected (non-authoritative) failure, such as the server not responding or a DNS lookup timing out. For local images, it means either that the image data could not be processed because it is damaged, or that an audio file does not contain embedded artwork.

The full-size cache exists because it is likely that an application will request thumbnails at different sizes for the same image. For example, when scrolling through a list of songs that shows a small thumbnail of the album cover beside each song, the user is likely to select one of the songs to play, at which point the media player will display the same cover in a larger size. By keeping full-size images in a separate (smallish) cache, we avoid performing an expensive extraction or download a second time. Instead, we create additional thumbnails by scaling them from the full-size cache (which uses an LRU eviction policy).

The thumbnail cache stores thumbnails that were previously retrieved, also using LRU eviction. Thumbnails are stored as JPEG at the default quality setting of 75%, at the actual size that was requested by the caller. Storing JPEG images (rather than, say, PNG) saves space and increases cache effectiveness. (The minimal quality loss from compression is irrelevant for thumbnails). Because we store thumbnails at the size they are actually needed, we may have several thumbnails for the same image in the cache (each thumbnail at a different size). But applications typically ask for thumbnails in only a small number of sizes, and ask for different sizes for the same image only rarely. So, the slight increase in disk space is minor and amply repaid by applications not having to scale thumbnails after they receive them from the cache, which saves battery and achieves better performance overall.

Finally, the failure cache is used to stop futile attempts to repeatedly extract a thumbnail when we know that the attempt will fail. It uses LRU eviction with an expiry time for each entry.

Cache lookup algorithm

When asked for a thumbnail at a particular size, the lookup and thumbnail generation proceed as follows:

  1. Check if a thumbnail exists in the requested size in the thumbnail cache. If so, return it.
  2. Check if a full-size image for the thumbnail exists in the full-size cache. If so, scale the new thumbnail from the full-size image, add the thumbnail to the thumbnail cache, and return it.
  3. Check if there is an entry for the thumbnail in the failure cache. If so, return an error.
  4. Attempt to download or extract the original image for the thumbnail. If the attempt fails, add an entry to the failure cache and return an error.
  5. If the original image was delivered by the remote server or was extracted from local media, add it to the full-size cache.
  6. Scale the thumbnail to the desired size, add it to the thumbnail cache, and return it.

Note that these steps represent only the logical flow of control for a particular thumbnail. The implementation executes these steps concurrently for different thumbnails.
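To make the control flow easier to follow, here is a compact sketch of those six steps. Every type and helper in it (the cache accessors, extract_or_download(), scale()) is a placeholder for illustration, not the service's real code.

#include <optional>

struct Image { /* pixel data */ };
struct Original { Image image; bool expensive; };   // remote or embedded source?

// Placeholder cache and helper interfaces.
std::optional<Image> thumb_cache_get(int key, int size);
void thumb_cache_put(int key, int size, Image const& img);
std::optional<Image> full_cache_get(int key);
void full_cache_put(int key, Image const& img);
bool failure_cache_contains(int key);
void failure_cache_put(int key);
std::optional<Original> extract_or_download(int key);
Image scale(Image const& img, int size);

std::optional<Image> lookup(int key, int size)
{
    if (auto hit = thumb_cache_get(key, size))     // 1. thumbnail cache
        return hit;
    if (auto full = full_cache_get(key))           // 2. full-size cache
    {
        Image t = scale(*full, size);
        thumb_cache_put(key, size, t);
        return t;
    }
    if (failure_cache_contains(key))               // 3. failure cache
        return std::nullopt;
    auto orig = extract_or_download(key);          // 4. fetch the original
    if (!orig)
    {
        failure_cache_put(key);
        return std::nullopt;
    }
    if (orig->expensive)                           // 5. remote or embedded image
        full_cache_put(key, orig->image);
    Image t = scale(orig->image, size);            // 6. scale, cache, return
    thumb_cache_put(key, size, t);
    return t;
}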

Designing for performance

Apart from fast on-disk caches (see below), the thumbnailer must make efficient use of I/O bandwidth and threads. This not only means making things fast, but also not wasting resources such as threads, memory, network connections, or file descriptors. Provided that enough requests are made to keep the service busy, we do not want it to ever wait for a download or image extraction to complete while there is something else that could be done in the mean time, and we want it to keep all CPU cores busy. In addition, requests that are slow (because they require a download or a CPU-intensive image extraction) must not block requests that are queued up behind them if those requests would result in cache hits that could be returned immediately.

To achieve a high degree of concurrency without blocking on long-running operations while holding precious resources, the thumbnailer uses a three-phase lookup algorithm:

  1. In phase 1, we look at the caches to determine if we have a hit or an authoritative miss. Phase 1 is very fast. (It takes around a millisecond to return a thumbnail from the cache on a Nexus 4.) However, cache lookup can briefly stall on disk I/O or require a lot of CPU to extract and scale an image. To get good performance, phase 1 requests are passed to a thread pool with as many threads as there are CPU cores. This allows the maximum number of lookups to proceed concurrently.
  2. Phase 2 is initiated if phase 1 determines that a thumbnail requires download or extraction, either of which can take on the order of seconds. (In case of extraction from local media, the task is CPU intensive; in case of download, most of the time is spent waiting for the reply from the server.) This phase is scheduled asynchronously from an event loop. This minimizes task switching and allows large numbers of requests to be queued while only using a few bytes for each request that is waiting in the queue.
  3. Phase 3 is really a repeat of phase 1: if phase 2 produces a thumbnail, it adds it to the cache; if phase 2 does not produce a thumbnail, it creates an entry in the failure cache. By simply repeating phase 1, the lookup then results in either a thumbnail or an error.

If phase 2 determines that a download or extraction is required, that work is performed concurrently: the service schedules several downloads and extractions in parallel. By default, it will run up to two concurrent downloads, and as many concurrent GStreamer pipelines as there are CPUs. This ensures that we use all of the available CPU cores. Moreover, download and extraction run concurrently with lookups for phase 1 and 3. This means that, even if a cache lookup briefly stalls on I/O, there is a good chance that another thread can make use of the CPU.

Because slow operations do not block lookup, this also ensures that a slow request does not stall requests for thumbnails that are already in the cache. In other words, it does not matter how many slow requests are in progress: requests that can be completed quickly are indeed completed quickly, regardless of what is going on elsewhere.

Overall, this strategy works very well. For example, with sufficient workload, the service achieves around 750% CPU utilization on an 8-core desktop machine, while still delivering cache hits almost instantaneously. (On a Nexus 4, cache hits take a little over 1 ms while concurrent extractions or downloads are in progress.)

A re-usable persistent cache for C++

The three internal caches are implemented by a small and flexible C++ API. This API is available as a separate reusable PersistentStringCache component (see persistent-cache-cpp) that provides a persistent store of arbitrary key–value pairs. Keys and values can be binary, and entries can be large. (Megabyte-sized values do not present a problem.)

The implementation uses leveldb, which provides a very fast NoSQL database that scales to multi-gigabyte sizes and provides integrity guarantees. In particular, if the calling process crashes, all inserts that completed at the API level will be intact after a restart. (In case of a power failure or kernel crash, a few buffered inserts can be lost, but the integrity of the database is still guaranteed.)

To use a cache, the caller instantiates it with a path name, a maximum size, and an eviction policy. The eviction policy can be set to either strict LRU (least-recently-used) or LRU with an expiry time. Once a cache reaches its maximum size, expired entries (if any) are evicted first and, if that does not free enough space for a new entry, entries are discarded in least-recently-used order until enough room is available to insert a new record. (In all other respects, expired entries behave like entries that were never added.)

A simple get/put API allows records to be retrieved and added, for example:

// Complete program (add the PersistentStringCache header from persistent-cache-cpp).
#include <iostream>
#include <string>

using namespace std;

int main()
{
    auto c = core::PersistentStringCache::open(
        "my_cache", 100 * 1024 * 1024, core::CacheDiscardPolicy::lru_only);
    // Look for an entry and add it if there is a cache miss.
    string key = "Bjarne";
    auto value = c->get(key);
    if (value) {
        cout << key << ": " << *value << endl;
    } else {
        value = "C++ inventor";  // Provide a value for the key.
        c->put(key, *value);     // Insert it.
    }
}

Running this program prints nothing on the first run, and “Bjarne: C++ inventor” on all subsequent runs.

The API also allows application-specific metadata to be added to records, provides detailed statistics, supports dynamic resizing of caches, and offers a simple adapter template that makes it easy to store complex user-defined types without the need to clutter the code with explicit serialization and deserialization calls. (In a pinch, if iteration is not needed, the cache can be used as a persistent map by setting an impossibly large cache size, in which case no records are ever evicted.)

Performance

Our benchmarks indicate good performance. (Figures are for an Intel Ivy Bridge i7-3770k 3.5 GHz machine with a 256 GB SSD.) Our test uses 60-byte string keys. Values are binary blobs filled with random data (so they are not compressible), 20 kB in size with a standard deviation of 7,000, so the majority of values are 13–27 kB in size. The cache size is 100 MB, so it contains around 5,000 records.

Filling the cache with 100 MB of records takes around 2.8 seconds. Thereafter, the benchmark does a random lookup with an 80% hit probability. In case of a cache miss, it inserts a new random record, evicting old records in LRU order to make room for the new one. For 100,000 iterations, the cache returns around 4,800 “thumbnails” per second, with an aggregate read/write throughput of around 93 MB/sec. At 90% hit rate, we see a 50% performance increase at around 7,100 records/sec. (Writes are expensive once the cache is full due to the need to evict entries, which requires updating the main cache table as well as an index.)

Repeating the test with a 1 GB cache produces identical timings so (within limits) performance remains constant for large databases.

Overall, performance is restricted largely by the bandwidth to disk. With a 7,200 rpm disk, we measured around one third of the performance we achieved with an SSD.

Recovering from errors

The overall design of the thumbnailer delivers good performance when things work. However, our implementation has to deal with the unexpected, such as network requests that do not return responses, GStreamer pipelines that crash, request overload, and so on. What follows is a partial list of steps we took to ensure that things behave sensibly, particularly on a battery-powered device.

Retry strategy

The failure cache provides an effective way to stop the service from endlessly trying to create thumbnails that, in an earlier attempt, returned an error.

For remote images, we know that, if the server has (authoritatively) told us that it has no artwork for a particular artist or album, it is unlikely that artwork will appear any time soon. However, the server may be updated with more artwork periodically. To deal with this, we add an expiry time of one week to the entries in the failure cache. That way, we do not try to retrieve the same image again until at least one week has passed (and only if we receive a request for a thumbnail for that image again later).

As opposed to authoritative answers from the image server (“I do not have artwork for this artist.”), we can also encounter transient failures. For example, the server may currently be down, or there may be some other network-related issue. In this case, we remember the time of the failure and do not try to contact the remote server again for two hours. This conserves bandwidth and battery power.

The device may also be disconnected from the network, in which case any attempt to retrieve a remote image is doomed. Our implementation returns failure immediately on a cache miss for a remote image if no network is present or the device is in flight mode. (We do not add an entry to the failure cache in this case).

For local files, we know that, if an attempt to get a thumbnail for a particular file has failed, future attempts will fail as well. This means that the only way for the problem to get fixed is by modifying or replacing the actual media file. To deal with this, we add the inode number, modification time, and inode modification time to the key for local images. If a user replaces, say, a music file with a new one that contains artwork, we automatically pick up the new version of the file because its key has changed; the old version will eventually fall out of the cache.
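A sketch of such a key, built with stat(2); the real key also incorporates other request parameters, so this is just the gist of the invalidation scheme.

#include <sys/stat.h>
#include <sstream>
#include <string>

// Combine path, inode number, mtime and ctime into a cache key, so that
// replacing the file automatically invalidates old cache entries.
std::string local_cache_key(std::string const& path)
{
    struct stat st;
    if (stat(path.c_str(), &st) != 0)
        return "";                      // caller treats this as a failure

    std::ostringstream key;
    key << path << '\0'
        << st.st_ino << '\0'            // inode number
        << st.st_mtime << '\0'          // data modification time
        << st.st_ctime;                 // inode (status) change time
    return key.str();
}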

Download and extraction failures

We monitor downloads and extractions for timely completion. (Timeouts for downloads and extractions can be configured separately.) If the server does not respond within 10 seconds, we abandon the attempt and treat it as a transient network error. Similarly, the vs-thumb processes that extract images from audio and video files can hang. We monitor these processes and kill them if they do not produce a result within 10 seconds.

Database corruption

Assuming an error-free implementation of leveldb, database corruption is impossible. However, in practice, an errant command could scribble over the database files. If leveldb detects that the database is corrupted, the recovery strategy is simple: we delete the on-disk cache and start again from scratch. Because the cache contents are ephemeral anyway, this is fine (other than slower operation until the working set of thumbnails makes it into the cache again).

Dealing with backlog

The asynchronous API provided by the service allows an application to submit an unlimited number of requests. Lots of requests happen if, for example, the user has inserted a flash card with thousands of photos into the device and then requests a gallery view for the collection. If the service’s client-side API blindly forwards requests via DBus, this causes a problem because DBus terminates the connection once there are more than around 400 outstanding requests.

To deal with this, we limit the number of outstanding requests to 20 and send another request via DBus only when an earlier request completes. Additional requests are queued in memory. Because this happens on the client side, the number of outstanding requests is limited only by the amount of memory that is available to the client.
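A minimal sketch of such a client-side throttle is shown below; it is not the client library's real code, and a real implementation also needs to be thread-safe or confined to the Qt event loop.

#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

class RequestLimiter
{
public:
    // `send` performs the actual DBus call for one request.
    void submit(std::function<void()> send)
    {
        if (in_flight_ < kMaxInFlight)
        {
            ++in_flight_;
            send();
        }
        else
        {
            pending_.push(std::move(send));
        }
    }

    // Call when a DBus request finishes (successfully or not).
    void completed()
    {
        --in_flight_;
        if (!pending_.empty())
        {
            auto next = std::move(pending_.front());
            pending_.pop();
            ++in_flight_;
            next();
        }
    }

private:
    static constexpr std::size_t kMaxInFlight = 20;
    std::size_t in_flight_ = 0;
    std::queue<std::function<void()>> pending_;
};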

A related problem arises if a client submits many requests for a thumbnail for the same image. This happens when, for example, the user looks at a list of tracks: tracks that belong to the same album have the same artwork. If artwork needs to be retrieved from the remote server, naively forwarding cache misses for each thumbnail to the server would end up re-downloading the same image several times.

We deal with this by maintaining an in-memory map of all remote download requests that are currently in progress. If phase 1 reports a cache miss, before initiating a download, we add the key for the remote image to the map and remove it again once the download completes. If more requests for the same image encounter a cache miss while the download for the original request is still in progress, the key for the in-progress download is still in the map, and we hold additional requests for the same image until the download completes. We then schedule the held requests as usual and create their thumbnails from the image that was cached by the first request.
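The coalescing map can be sketched like this (again, placeholder code rather than the service's real implementation): requests that miss the cache while a download for the same key is in flight are parked and replayed when that download finishes.

#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

class DownloadCoalescer
{
public:
    // Returns true if the caller should start the download; false if one is
    // already in progress and the callback was queued behind it.
    bool start_or_wait(std::string const& key, std::function<void()> on_done)
    {
        auto it = in_progress_.find(key);
        if (it != in_progress_.end())
        {
            it->second.push_back(std::move(on_done));
            return false;
        }
        in_progress_[key].push_back(std::move(on_done));
        return true;
    }

    // Called when the download for `key` completes; replays the held
    // requests, which now find the freshly cached full-size image.
    void finished(std::string const& key)
    {
        auto waiters = std::move(in_progress_[key]);
        in_progress_.erase(key);
        for (auto& cb : waiters)
            cb();
    }

private:
    std::unordered_map<std::string,
                       std::vector<std::function<void()>>> in_progress_;
};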

Security

The thumbnailer runs with normal user privileges. We use AppArmor’s aa_query_label() function to verify that the calling client has read access to a file it wants a thumbnail for. This prevents one application from accessing thumbnails produced by a different application, unless both applications can read the original file. In addition, we place the entire service under an AppArmor profile to ensure that it can write only to its own cache directory.

Conclusion

Overall, we are very pleased with the design and performance of the thumbnailer. Each component has a clearly defined role with a clean interface, which made it easy for us to experiment and to refine the design as we went along. The design is extensible, so we can support additional media types or remote data sources without disturbing the existing code.

We used threads sparingly and only where we saw worthwhile concurrency opportunities. Using asynchronous interfaces for long-running operations kept resource usage to a minimum and allowed us to take advantage of I/O interleaving. In turn, this extracts the best possible performance from the hardware.

The thumbnailer now runs on Ubuntu Touch and is used by the gallery, camera, and music apps, as well as for all scopes that display media thumbnails.


Read more
Michi

Over the past few months, James Henstridge, Xavi Garcia Mena, and I have implemented a fast and scalable thumbnailing service for Ubuntu and Ubuntu Touch. This post explains how we did it, and how we achieved our performance and reliability goals.

Introduction

On a phone as well as the desktop, applications need to display image thumbnails for various media, such as photos, songs, and videos. Creating thumbnails for such media is CPU-intensive and can be costly in bandwidth if images are retrieved over the network. In addition, different types of media require the use of different APIs that are non-trivial to learn. It makes sense to provide thumbnail creation as a platform API that hides this complexity from application developers and, to improve performance, to cache thumbnails on disk.

This article explains the requirements we had and how we implemented a thumbnailer service that is extremely fast and scalable, and robust in the face of power loss or crashes.

Requirements

We had a number of requirements we wanted to meet in our implementation.

  • Robustness
    In the event of a crash, the implementation must guarantee the integrity of on-disk data structures. This is particularly important on a phone, where we cannot expect the user to perform manual recovery (such as cleaning up damaged files). Because batteries can run out at any time, integrity must be guaranteed even in the face of power loss.
  • Scalability
    It is common for people to store many thousands of songs and photos on a device, so the cache must scale to at least tens of thousands of records. Thumbnails can range in size from a few kilobytes to well over a megabyte (for “thumbnails” at full-screen resolution), so the cache must deal efficiently with large records.
  • Re-usability
    Persistent and reliable on-disk storage of arbitrary records (ranging in size from a few bytes to potentially megabytes) is a common application requirement, so we did not want to create a cache implementation that is specific to thumbnails. Instead, the disk cache is provided as a stand-alone C++ API that can be used for any number of other purposes, such as a browser or HTTP cache, or to build an object file cache similar to ccache.
  • High performance
    The performance of the thumbnailer directly affects the user experience: it is not nice for the customer to look at “please wait a while” icons in, say, an image gallery while thumbnails are being loaded one by one. We therefore had to have a high-performance implementation that delivers cached thumbnails quickly (on the order of a millisecond per thumbnail on an Arm CPU). An efficient implementation also helps to conserve battery life.
  • Location independence and extensibility
    Canonical runs an image server at dash.ubuntu.com that provides album and artist artwork for many musicians and bands. Images from this server are used to display artwork in the music player for media that contains ID3 tags, but does not embed artwork in the media file. The thumbnailer must work with embedded images as well as remote images, and it must be possible to extend it for new types of media without unduly disturbing the existing code.
  • Low bandwidth consumption
    Mobile phones typically come with data caps, so the cache has to be frugal with network bandwidth.
  • Concurrency and isolation
    The implementation has to allow concurrent access by multiple applications, as well as concurrent access from a single implementation. Besides needing to be thread-safe, this means that a request for a thumbnail that is slow (such as downloading an image over the network) must not delay other requests.
  • Fault tolerance
    Mobile devices lose network access without warning, and users can add corrupt media files to their device. The implementation must be resilient to partial failures, such as incomplete network replies, dropped connections, and bad image data. Moreover, the recovery strategy for such failures must conserve battery and avoid repeated futile attempts to create thumbnails from media that cannot be retrieved or contains malformed data.
  • Security
    The implementation must ensure that applications cannot see (or, worse, overwrite) each other’s thumbnails or coerce the thumbnailer into delivering images from files that an application is not allowed to read.
  • Asynchronous API
    The customers of the thumbnailer are applications that are written in QML or Qt, which cannot block in the UI thread. The thumbnailer therefore must provide a non-blocking API. Moreover, the application developer should be able to get the best possible performance without having to use threads. Instead, concurrency must be internal to the implementation (which is able to put threads to use intelligently where they make sense), instead of the application throwing threads at the problem in the hope that it might make things faster when, in fact, it might just add overhead.
  • Monitoring
    The effectiveness of a cache cannot be assessed without statistics to show hit and miss rates, evictions, and other basic performance data, so it must provide a way to extract this information.
  • Error reporting
    When something goes wrong with a system service, typically the only way to learn about the problem is to look at log messages. In case of a failure, the implementation must leave enough footprints behind to allow someone to diagnose a failure after the fact with some chance of success.
  • Backward compatibility
    This project was a rewrite of an earlier implementation. Rather than delivering a “big bang” piece of software and potentially upsetting existing clients, we incrementally changed the implementation such that existing applications continued to work. (The only pre-existing interface was a QML interface that required no change.)

System architecture

Here is a high-level overview of the main system components.

A Fast Thumbnailer for UbuntuExternal API

To the outside world, the thumbnailer provides two APIs.

One API is a QML plugin that registers itself as an image provider for QQuickAsyncImageProvider. This allows the caller to to pass a URI that encodes a query for a local or remote thumbnail at a particular size; if the URI matches the registered provider, QML transfers control to the entry points in our plugin.

The second API is a Qt API that provides three methods:

QSharedPointer<Request> getThumbnail(QString const& filePath,
                                     QSize const& requestedSize);
QSharedPointer<Request> getAlbumArt(QString const& artist,
                                    QString const& album,
                                    QSize const& requestedSize);
QSharedPointer<Request> getArtistArt(QString const& artist,
                                     QString const& album,
                                     QSize const& requestedSize);

The getThumbnail() method extracts thumbnails from local media files, whereas getAlbumArt() and getArtistArt() retrieve artwork from the remote image server. The returned Request object provides a finished signal, and methods to test for success or failure of the request and to extract a thumbnail as a QImage. The request also provides a waitForFinished() method, so the API can be used synchronously.

Thumbnails are delivered to the caller in the size they are requested, subject to a (configurable) 1920-pixel limit. As an escape hatch, requests with width and height of zero deliver artwork at its original size, even if it exceeds the 1920-pixel limit. The scaling algorithm preserves the original aspect ratio and never scales up from the original, so the returned thumbnails may be smaller than their requested size.

DBus service

The thumbnailer is implemented as a DBus service with two interfaces. The first interface provides the server-side implementation of the three methods of the external API; the second interface is an administrative interface that can deliver statistics, clear the internal disk caches, and shut down the service. A simple tool, thumbnailer-admin, allows both interfaces to be called from the command line.

To conserve resources, the service is started on demand by DBus and shuts down after 30 seconds of idle time.

Image extraction

Image extraction uses an abstract base class. This interface is independent of media location and type. The actual image extraction is performed by derived implementations that download images from the remote server, extract them from local image files, or extract them from local streaming media files. This keeps knowledge of image location and encoding out of the main caching and error handling logic, and allows us to support new media types (whether local or remote) by simply adding extra derived implementations.

Image extraction is asynchronous, with currently three implementations:

  • Image downloader
    To retrieve artwork from the remote image server, the service talks to an abstract base class with asynchronous download_album() and download_artist() methods. This allows multiple downloads to run concurrently and makes it easy to add new local or remote image providers without disturbing the code for existing ones. A class derived from that abstract base implements a REST API with QNetworkAccessManager to retrieve images from dash.ubuntu.com.
  • Photo extractor
    The photo extractor is responsible for delivering images from local image files, such as JPEG or PNG files. It simply delegates that work to the image converter and scaler.
  • Audio and video thumbnail extractor
    To extract thumbnails from audio and video files, we use GStreamer. Due to reliability problems with some codecs that can hang or crash, we delegate the task to a separate vs-thumb executable. This shields the service from failures and also allows us to run several GStreamer pipelines concurrently without a crash of one pipeline affecting the others.

Image converter and scaler

We use a simple Image class with a synchronous interface to convert and scale different image formats to JPEG. The implementation uses Gdk-Pixbuf, which can handle many different input formats and is very efficient.

For JPEG source images, the code checks for the presence of EXIF data using libexif and, if it contains a thumbnail that is at least as large as the requested size, scales the thumbnail from the EXIF data. (For images taken with the camera on a Nexus 4, the original image size is 3264×1836, with an embedded EXIF thumbnail of 512×288. Scaling from the EXIF thumbnail is around one hundred times faster than scaling from the full-size image.)

Disk cache

The thumbnailer service optimizes performance and conserves bandwidth and battery by adopting a layered caching strategy.

Two-level caching with failure lookup

Internally, the service uses three separate on-disk caches:

  • Full-size cache
    This cache stores images that are expensive to retrieve (images that are remote or are embedded in audio and video files) at original resolution (scaled down to a 1920-pixel bounding box if the original image is larger). The default size of this cache is 50 MB, which is sufficient to hold around 400 images at 1920×1080 resolution. Images are stored in JPEG format (at a 90% quality setting).
  • Thumbnail cache
    This cache stores thumbnails at the size that was requested by the caller, such as 512×288. The default size of this cache is 100 MB, which is sufficient to store around 11,000 thumbnails at 512×288, or around 25,000 thumbnails at 256×144.
  • Failure cache
    The failure cache stores the keys for images that could not be extracted because of a failure. For remote images, this means that the server returned an authoritative answer “no such image exists”, or that we encountered an unexpected (non-authoritative) failure, such as the server not responding or a DNS lookup timing out. For local images, it means either that the image data could not be processed because it is damaged, or that an audio file does not contain embedded artwork.

The full-size cache exists because it is likely that an application will request thumbnails at different sizes for the same image. For example, when scrolling through a list of songs that shows a small thumbnail of the album cover beside each song, the user is likely to select one of the songs to play, at which point the media player will display the same cover in a larger size. By keeping full-size images in a separate (smallish) cache, we avoid performing an expensive extraction or download a second time. Instead, we create additional thumbnails by scaling them from the full-size cache (which uses an LRU eviction policy).

The thumbnail cache stores thumbnails that were previously retrieved, also using LRU eviction. Thumbnails are stored as JPEG at the default quality setting of 75%, at the actual size that was requested by the caller. Storing JPEG images (rather than, say, PNG) saves space and increases cache effectiveness. (The minimal quality loss from compression is irrelevant for thumbnails). Because we store thumbnails at the size they are actually needed, we may have several thumbnails for the same image in the cache (each thumbnail at a different size). But applications typically ask for thumbnails in only a small number of sizes, and ask for different sizes for the same image only rarely. So, the slight increase in disk space is minor and amply repaid by applications not having to scale thumbnails after they receive them from the cache, which saves battery and achieves better performance overall.

Finally, the failure cache is used to stop futile attempts to repeatedly extract a thumbnail when we know that the attempt will fail. It uses LRU eviction with an expiry time for each entry.

Cache lookup algorithm

When asked for a thumbnail at a particular size, the lookup and thumbnail generation proceed as follows:

  1. Check if a thumbnail exists in the requested size in the thumbnail cache. If so, return it.
  2. Check if a full-size image for the thumbnail exists in the full-size cache. If so, scale the new thumbnail from the full-size image, add the thumbnail to the thumbnail cache, and return it.
  3. Check if there is an entry for the thumbnail in the failure cache. If so, return an error.
  4. Attempt to download or extract the original image for the thumbnail. If the attempt fails, add an entry to the failure cache and return an error.
  5. If the original image was delivered by the remote server or was extracted locally from streaming media, add it to the full-size cache.
  6. Scale the thumbnail to the desired size, add it to the thumbnail cache, and return it.

Note that these steps represent only the logical flow of control for a particular thumbnail. The implementation executes these steps concurrently for different thumbnails.

Designing for performance

Apart from fast on-disk caches (see below), the thumbnailer must make efficient use of I/O bandwidth and threads. This not only means making things fast, but also to not unnecessarily waste resources such as threads, memory, network connections, or file descriptors. Provided that enough requests are made to keep the service busy, we do not want it to ever wait for a download or image extraction to complete while there is something else that could be done in the mean time, and we want it to keep all CPU cores busy. In addition, requests that are slow (because they require a download or a CPU-intensive image extraction) must not block requests that are queued up behind them if those requests would result in cache hits that could be returned immediately.

To achieve a high degree of concurrency without blocking on long-running operations while holding precious resources, the thumbnailer uses a three-phase lookup algorithm:

  1. In phase 1, we look at the caches to determine if we have a hit or an authoritative miss. Phase 1 is very fast. (It takes around a millisecond to return a thumbnail from the cache on a Nexus 4.) However, cache lookup can briefly stall on disk I/O or require a lot of CPU to extract and scale an image. To get good performance, phase 1 requests are passed to a thread pool with as many threads as there are CPU cores. This allows the maximum number of lookups to proceed concurrently.
  2. Phase 2 is initiated if phase 1 determines that a thumbnail requires download or extraction, either of which can take on the order of seconds. (In case of extraction from local media, the task is CPU intensive; in case of download, most of the time is spent waiting for the reply from the server.) This phase is scheduled asynchronously from an event loop. This minimizes task switching and allows large numbers of requests to be queued while only using a few bytes for each request that is waiting in the queue.
  3. Phase 3 is really a repeat of phase 1: if phase 2 produces a thumbnail, it adds it to the cache; if phase 2 does not produce a thumbnail, it creates an entry in the failure cache. By simply repeating phase 1, the lookup then results in either a thumbnail or an error.

If phase 2 determines that a download or extraction is required, that work is performed concurrently: the service schedules several downloads and extractions in parallel. By default, it will run up to two concurrent downloads, and as many concurrent GStreamer pipelines as there are CPUs. This ensures that we use all of the available CPU cores. Moreover, download and extraction run concurrently with lookups for phase 1 and 3. This means that, even if a cache lookup briefly stalls on I/O, there is a good chance that another thread can make use of the CPU.

Because slow operations do not block lookup, this also ensures that a slow request does not stall requests for thumbnails that are already in the cache. In other words, it does not matter how many slow requests are in progress: requests that can be completed quickly are indeed completed quickly, regardless of what is going on elsewhere.

Overall, this strategy works very well. For example, with sufficient workload, the service achieves around 750% CPU utilization on an 8-core desktop machine, while still delivering cache hits almost instantaneously. (On a Nexus 4, cache hits take a little over 1 ms while concurrent extractions or downloads are in progress.)

A re-usable persistent cache for C++

The three internal caches are implemented by a small and flexible C++ API. This API is available as a separate reusable PersistentStringCache component (see persistent-cache-cpp) that provides a persistent store of arbitrary key–value pairs. Keys and values can be binary, and entries can be large. (Megabyte-sized values do not present a problem.)

The implementation uses leveldb, which provides a very fast NoSQL database that scales to multi-gigabyte sizes and provides integrity guarantees. In particular, if the calling process crashes, all inserts that completed at the API level will be intact after a restart. (In case of a power failure or kernel crash, a few buffered inserts can be lost, but the integrity of the database is still guaranteed.)

To use a cache, the caller instantiates it with a path name, a maximum size, and an eviction policy. The eviction policy can be set to either strict LRU (least-recently-used) or LRU with an expiry time. Once a cache reaches its maximum size, expired entries (if any) are evicted first and, if that does not free enough space for a new entry, entries are discarded in least-recently-used order until enough room is available to insert a new record. (In all other respects, expired entries behave like entries that were never added.)

A simple get/put API allows records to be retrieved and added, for example:

#include <core/persistent_string_cache.h>  // header shipped by persistent-cache-cpp

#include <iostream>
#include <string>

using namespace std;

int main()
{
    auto c = core::PersistentStringCache::open(
        "my_cache", 100 * 1024 * 1024, core::CacheDiscardPolicy::lru_only);
    // Look for an entry and add it if there is a cache miss.
    string key = "Bjarne";
    auto value = c->get(key);
    if (value) {
        cout << key << ": " << *value << endl;
    } else {
        value = "C++ inventor";  // Provide a value for the key.
        c->put(key, *value);     // Insert it.
    }
}

Running this program prints nothing on the first run, and “Bjarne: C++ inventor” on all subsequent runs.

The API also allows application-specific metadata to be added to records, provides detailed statistics, supports dynamic resizing of caches, and offers a simple adapter template that makes it easy to store complex user-defined types without the need to clutter the code with explicit serialization and deserialization calls. (In a pinch, if iteration is not needed, the cache can be used as a persistent map by setting an impossibly large cache size, in which case no records are ever evicted.)
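The adapter template itself is part of persistent-cache-cpp, so it is not reproduced here. To illustrate the idea, here is a hand-rolled equivalent built on the string-based get/put shown earlier; the Track type and its textual encoding are made up for this example:

#include <iostream>
#include <sstream>
#include <string>

// A hypothetical user-defined record type.
struct Track {
    std::string title;
    int duration_secs;
};

// Hand-rolled serialization; the library's adapter template hides this boilerplate.
std::string encode(Track const& t)
{
    std::ostringstream os;
    os << t.duration_secs << '\n' << t.title;
    return os.str();
}

Track decode(std::string const& s)
{
    std::istringstream is(s);
    Track t;
    is >> t.duration_secs;
    is.ignore(1);              // skip the newline separator
    std::getline(is, t.title);
    return t;
}

// With a PersistentStringCache c, a typed record could then be stored and
// retrieved as:
//
//   c->put("track-42", encode(Track{"Blue Train", 643}));
//   auto v = c->get("track-42");
//   if (v) { Track t = decode(*v); }

int main()
{
    Track t{"Blue Train", 643};
    Track copy = decode(encode(t));
    std::cout << copy.title << ": " << copy.duration_secs << " s" << std::endl;
}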

Performance

Our benchmarks indicate good performance. (Figures are for an Intel Ivy Bridge i7-3770k 3.5 GHz machine with a 256 GB SSD.) Our test uses 60-byte string keys. Values are binary blobs filled with random data (so they are not compressible), 20 kB in size on average with a standard deviation of 7,000 bytes, so the majority of values are 13–27 kB in size. The cache size is 100 MB, so it contains around 5,000 records.

Filling the cache with 100 MB of records takes around 2.8 seconds. Thereafter, the benchmark does a random lookup with an 80% hit probability. In case of a cache miss, it inserts a new random record, evicting old records in LRU order to make room for the new one. For 100,000 iterations, the cache returns around 4,800 “thumbnails” per second, with an aggregate read/write throughput of around 93 MB/sec. At 90% hit rate, we see a 50% performance increase at around 7,100 records/sec. (Writes are expensive once the cache is full due to the need to evict entries, which requires updating the main cache table as well as an index.)

Repeating the test with a 1 GB cache produces identical timings, so (within limits) performance remains constant for large databases.

Overall, performance is limited largely by disk bandwidth. With a 7,200 rpm disk, we measured around one third of the throughput we achieved with an SSD.

Recovering from errors

The overall design of the thumbnailer delivers good performance when things work. However, our implementation has to deal with the unexpected, such as network requests that do not return responses, GStreamer pipelines that crash, request overload, and so on. What follows is a partial list of steps we took to ensure that things behave sensibly, particularly on a battery-powered device.

Retry strategy

The failure cache provides an effective way to stop the service from endlessly trying to create thumbnails that, in an earlier attempt, returned an error.

For remote images, we know that, if the server has (authoritatively) told us that it has no artwork for a particular artist or album, it is unlikely that artwork will appear any time soon. However, the server may periodically be updated with more artwork. To deal with this, we add an expiry time of one week to entries in the failure cache. That way, we do not try to retrieve the same image again until at least one week has passed, and then only if we receive another request for a thumbnail of that image.

Besides authoritative answers from the image server (“I do not have artwork for this artist.”), we can also encounter transient failures. For example, the server may be down at the moment, or there may be some other network-related issue. In this case, we remember the time of the failure and do not try to contact the remote server again for two hours. This conserves bandwidth and battery power.

The device may also be disconnected from the network, in which case any attempt to retrieve a remote image is doomed. Our implementation returns failure immediately on a cache miss for a remote image if no network is present or the device is in flight mode. (We do not add an entry to the failure cache in this case.)
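Putting these retry rules together, the decision of whether to contact the server again boils down to comparing the age of the last recorded failure against a backoff that depends on the kind of failure. The sketch below uses an in-memory map for clarity; the real service keeps this information in its persistent failure cache:

#include <chrono>
#include <iostream>
#include <map>
#include <string>

using Clock = std::chrono::system_clock;

struct Failure {
    bool authoritative;      // server said it has no artwork for this key
    Clock::time_point when;  // time of the last failed attempt
};

std::map<std::string, Failure> failures;  // stand-in for the persistent failure cache

// Returns true if we should try to contact the server again for this key.
bool should_retry(std::string const& key, Clock::time_point now)
{
    auto it = failures.find(key);
    if (it == failures.end())
        return true;  // no recorded failure
    auto backoff = it->second.authoritative
        ? std::chrono::hours(24 * 7)  // authoritative "no artwork": wait one week
        : std::chrono::hours(2);      // transient failure: wait two hours
    return now - it->second.when >= backoff;
}

int main()
{
    failures["artist/album"] = Failure{true, Clock::now()};
    std::cout << std::boolalpha << should_retry("artist/album", Clock::now()) << std::endl;  // false
}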

For local files, we know that, if an attempt to get a thumbnail for a particular file has failed, future attempts will fail as well. This means that the only way for the problem to get fixed is by modifying or replacing the actual media file. To deal with this, we add the inode number, modification time, and inode modification time to the key for local images. If a user replaces, say, a music file with a new one that contains artwork, we automatically pick up the new version of the file because its key has changed; the old version will eventually fall out of the cache.
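A key with these properties can be built from the result of stat(). The exact key format used by the service is not shown here; the following is just an illustration of the idea:

#include <sys/stat.h>

#include <iostream>
#include <sstream>
#include <string>

// Build a cache key that changes whenever the file is modified or replaced.
std::string local_key(std::string const& path)
{
    struct stat st;
    if (stat(path.c_str(), &st) != 0)
        return "";                      // caller treats this as an error
    std::ostringstream key;
    key << path << ':' << st.st_ino     // inode number
        << ':' << st.st_mtime           // modification time
        << ':' << st.st_ctime;          // inode (status change) time
    return key.str();
}

int main()
{
    std::cout << local_key("/etc/hostname") << std::endl;
}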

Download and extraction failures

We monitor downloads and extractions for timely completion. (Timeouts for downloads and extractions can be configured separately.) If the server does not respond within 10 seconds, we abandon the attempt and treat it as a transient network error. Similarly, the vs-thumb processes that extract images from audio and video files can hang. We monitor these processes and kill them if they do not produce a result within 10 seconds.
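A watchdog of this kind can be as simple as forking the helper, polling for its exit, and sending it a signal once the deadline passes. The sketch below is illustrative only; the service's own process management is more elaborate:

#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <chrono>
#include <thread>

// Run a command and kill it if it does not finish within the timeout.
// Returns true if the child exited by itself, false if it had to be killed.
bool run_with_timeout(char* const argv[], std::chrono::seconds timeout)
{
    pid_t pid = fork();
    if (pid < 0)
        return false;    // fork failed
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);      // exec failed
    }
    auto deadline = std::chrono::steady_clock::now() + timeout;
    int status;
    while (std::chrono::steady_clock::now() < deadline) {
        if (waitpid(pid, &status, WNOHANG) == pid)
            return true;  // child finished in time
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }
    kill(pid, SIGKILL);        // child hung: kill it
    waitpid(pid, &status, 0);  // reap it
    return false;
}

int main()
{
    char* argv[] = {const_cast<char*>("sleep"), const_cast<char*>("30"), nullptr};
    return run_with_timeout(argv, std::chrono::seconds(2)) ? 0 : 1;
}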

Database corruption

Assuming an error-free implementation of leveldb, database corruption is impossible. However, in practice, an errant command could scribble over the database files. If leveldb detects that the database is corrupted, the recovery strategy is simple: we delete the on-disk cache and start again from scratch. Because the cache contents are ephemeral anyway, this is fine (other than slower operation until the working set of thumbnails makes it into the cache again).

Dealing with backlog

The asynchronous API provided by the service allows an application to submit an unlimited number of requests. A large number of requests can arrive if, for example, the user has inserted a flash card with thousands of photos into the device and then opens a gallery view of the collection. If the service’s client-side API blindly forwarded every request via DBus, this would cause a problem because DBus terminates the connection once there are more than around 400 outstanding requests.

To deal with this, we limit the number of outstanding requests to 20 and send another request via DBus only when an earlier request completes. Additional requests are queued in memory. Because this happens on the client side, the number of outstanding requests is limited only by the amount of memory that is available to the client.
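The client-side queue can be thought of as a sliding window over pending requests: forward a request immediately while fewer than 20 are outstanding, otherwise park it until a reply arrives. The sketch below shows the idea with a plain callable standing in for the actual DBus call:

#include <functional>
#include <iostream>
#include <queue>
#include <utility>

// Forward at most max_outstanding requests at a time; queue the rest in memory.
class RequestWindow {
public:
    explicit RequestWindow(int max_outstanding)
        : max_outstanding_(max_outstanding), outstanding_(0) {}

    // Called when the application submits a request.
    void submit(std::function<void()> send)
    {
        if (outstanding_ < max_outstanding_) {
            ++outstanding_;
            send();  // forward immediately (over DBus in the real client)
        } else {
            waiting_.push(std::move(send));  // park it until a reply arrives
        }
    }

    // Called when a reply for an earlier request arrives.
    void completed()
    {
        --outstanding_;
        if (!waiting_.empty()) {
            auto next = std::move(waiting_.front());
            waiting_.pop();
            ++outstanding_;
            next();  // forward the next queued request
        }
    }

private:
    int max_outstanding_;
    int outstanding_;
    std::queue<std::function<void()>> waiting_;
};

int main()
{
    RequestWindow window(20);
    for (int i = 0; i < 25; ++i)
        window.submit([i] { std::cout << "request " << i << " sent" << std::endl; });
    window.completed();  // a reply arrives, so queued request 20 is sent
}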

A related problem arises if a client submits many requests for a thumbnail for the same image. This happens when, for example, the user looks at a list of tracks: tracks that belong to the same album have the same artwork. If artwork needs to be retrieved from the remote server, naively forwarding cache misses for each thumbnail to the server would end up re-downloading the same image several times.

We deal with this by maintaining an in-memory map of all remote downloads that are currently in progress. If phase 1 reports a cache miss, we add the key for the remote image to the map before initiating the download, and we remove it again once the download completes. If further requests for the same image miss the cache while that download is still in progress, they find the key in the map and are held until the download completes. We then schedule the held requests as usual, and their thumbnails are created from the image that the first request added to the cache.
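The same idea can be sketched with a map of in-flight downloads keyed by image and a shared future that all waiters attach to. This blocking version is only an illustration; the real service parks the held requests in its event loop instead of blocking threads:

#include <future>
#include <iostream>
#include <map>
#include <mutex>
#include <string>

std::mutex m;
std::map<std::string, std::shared_future<std::string>> in_progress;  // in-flight downloads

// Stand-in for the real download of remote artwork.
std::string simulated_download(std::string const& key)
{
    return "artwork-for-" + key;
}

std::string fetch(std::string const& key)
{
    std::shared_future<std::string> f;
    {
        std::lock_guard<std::mutex> lock(m);
        auto it = in_progress.find(key);
        if (it != in_progress.end()) {
            f = it->second;  // a download for this key is already running: share it
        } else {
            f = std::async(std::launch::async, simulated_download, key).share();
            in_progress[key] = f;  // record the in-flight download
        }
    }
    std::string result = f.get();  // every waiter gets the result of the single download
    {
        std::lock_guard<std::mutex> lock(m);
        in_progress.erase(key);    // erasing an already-removed key is harmless
    }
    return result;
}

int main()
{
    std::cout << fetch("album-art/blue-train") << std::endl;
}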

Security

The thumbnailer runs with normal user privileges. We use AppArmor’s aa_query_label() function to verify that the calling client has read access to a file it wants a thumbnail for. This prevents one application from accessing thumbnails produced by a different application, unless both applications can read the original file. In addition, we place the entire service under an AppArmor profile to ensure that it can write only to its own cache directory.

Conclusion

Overall, we are very pleased with the design and performance of the thumbnailer. Each component has a clearly defined role with a clean interface, which made it easy for us to experiment and to refine the design as we went along. The design is extensible, so we can support additional media types or remote data sources without disturbing the existing code.

We used threads sparingly and only where we saw worthwhile concurrency opportunities. Using asynchronous interfaces for long-running operations kept resource usage to a minimum and allowed us to take advantage of I/O interleaving. In turn, this extracts the best possible performance from the hardware.

The thumbnailer now runs on Ubuntu Touch and is used by the gallery, camera, and music apps, as well as for all scopes that display media thumbnails.

