Canonical Voices


Hello MAASTers

MAAS 2.4.1 has now been released and it is a bug fix release. Please see more details in [1].


Read more

A short lived ride After some time on Kubuntu on this new laptop, I just re-discovered that I did not want to live in the Plasma world anymore. While I do value all the work the team behind it does, the user interface is just not for me as it feels rather busy to my liking. In that aforementioned post I wrote about running the Ubuntu Report Tool on this system, it is not part of the Kubuntu install or first boot experience but you can install it by running apt install ubuntu-report followed by running ubuntu-report to actually create the report and if you want, send it too.

Read more
Christian Brauner

Unprivileged File Capabilities

alt text


File capabilities (fcaps) are capabilities associated with - well - files, usually a binary. They can be used to temporarily elevate privileges for unprivileged users in order to accomplish a privileged task. This allows various tools to drop the dangerous setuid (suid) or setgid (sgid) bits in favor of fcaps.

While fcaps are supported since Linux 2.6.24 they could only be set in the initial user namespace. If they would have been allowed to be set by root in a non-initial user namespace then any unprivileged user on the host would have been able to map their own uid to root in a new user namespace, set fcaps that would grant more privileges to them, and then execute the binary with elevated privileged on the host. This also means that until recently it was not safe to use fcaps in unprivileged containers, i.e. containers using user namespaces. The good news is that starting with Linux kernel version 4.14 it is possible to set fcaps in user namespaces.

Kernel Patchset

The patchset to enable this has been contributed by Serge Hallyn, a co-maintainer and core developer of the LXD and LXC projects:

commit 8db6c34f1dbc8e06aa016a9b829b06902c3e1340
Author: Serge E. Hallyn <>
Date:   Mon May 8 13:11:56 2017 -0500

    Introduce v3 namespaced file capabilities

LXD Now Preserves File Capabilities In User Namespaces

In parallel to the kernel patchset we have now enabled LXD to preserve fcaps in user namespaces. This means if your kernel supports namespaced fcaps LXD will preserve them whenever unprivileged containers are created, or when their idmapping is changed. No matter if you go from privileged to unpriviliged or the other way around. Your filesystem capabilities will be right there with you. In other news, there is now little to no use for the suid and sgid bits even in unprivileged containers.

This is something that the Linux Containers Project has wanted for a long time and we are happy that we are the first runtime to fully support this feature.

If all of the above either makes no sense to you or you’re asking yourself what is so great about this because some distros have been using fcaps for a long time don’t worry we’ll try to shed some light on all of this.

The dark ages: suid and sgid binaries

Not too long ago the suid and sgid bits were the only well-known mechanism to temporarily grant elevated privileges to unprivileged users executing a binary. Once some or all of the following binaries where suid or sgid binaries on most distros:

  • ping
  • newgidmap
  • newuidmap
  • mtr-packet

The binary that most developers will have already used is the ping binary. It’s convenient to just check whether a connection to the internet has been established successfully by pinging a random website. It’s such a common tool that most people don’t even think about it needing any sort of privilege. In fact it does require privileges. ping wants to open sockets of type SOCK_RAW but the kernel prevents unprivileged users from using sockets of type SOCK_RAW because it would allow them to e.g. send ICMP packages directly. But ping seems like a binary that is useful to unprivileged users as well as safe. Short of a better mechanism the most obvious choice is to have it be owned by uid 0 and set the suid bit.

> perms /bin/ping
-rwsr-xr-x 4755 /bin/ping

You can see the little s in the permissions. This indicates that this version of ping has the suid bit set. Hence, if called it will run as uid 0 independent of the uid of the caller. In short, if my user has uid 1000 and calls the ping binary ping will still run with uid 0.

While the suid mechanism gets the job done it is also wildly inappropriate. ping does need elevated privileges in one specific area. But by setting the suid bit and having ping be owned by uid 0 we’re granting it all kinds of privileges, in fact all privileges. If there ever is a major security sensitive bug in a suid binary it is trivial for anyone to exploit the fact that it runs as uid 0.

Of course, the kernel has all kinds of security mechanisms to deflate the impact of the suid and sgid bits. If you strace an suid binary the suid bit will be stripped, there are complex rules regarding execve()ing a binary that has the suid bit set, and the suid bit is also dropped when the owner of the binary in question changes, i.e. when you call chown() on it. Still these are all migitations for something that is inherently dangerous because it grants too much for too little gain. It’s like someone asking for a little sugar and you handing out the key to your house. To quote Eric:

Frankly being able to raise the priveleges of an existing process is such a dangerous mechanism and so limiting on system design that I wish someone would care, and remove all suid, sgid, and capabilities use from a distro. It is hard to count how many neat new features have been shelved because of the requirement to support suid root executables.

Capabilities and File Capabilities

This is where capabilities come into play 1. Capabilities start from the idea that the root privilege could as well be split into subsets of privileges. Whenever something requests to perform an operation that requires privileges it doesn’t have we can grant it a very specific subset instead of all privileges at once 2. For example, the ping binary would only need the CAP_NET_RAW capability because it is the capability that regulates whether a process can open SOCK_RAW sockets.

Capabilities are associated with processes and files. Granted, Linux capabilities are not the cleanest or easiest concept to grasp. But I’ll try to shed some light. In essence, capabilities can be present in four different types of sets. The kernel performs checks against a process by looking at its effective capability set, i.e. the capabilities the process has at the time of trying to perform the operation. The rest of the capability sets are (glossing over details now for the sake of brevity) basically used for calculating each other including the effective capability set. There are permitted capabilities, i.e. the capabilities a process is allowed to raise in the effective set, inheritable capabilities, i.e. capabilities that should be (but are only under certain restricted conditions) preserved across an execve(), and ambient capabilities that are there to fix the shortcomings of inheritable capabilities, i.e. they are there to allow unprivileged processes to preserve capabilities across an execve() call. 3 Last but not least we have file capabilities, i.e. capabilities that are attached to a file. When such a file is execve()ed the associated fcaps are taken into account when calculating the permissions after the execve().

Extended attributes and File Capabilities

The part most users are confused about is how capabilities get associated with files. This is where extended attributes (xattr) come into play. xattrs are <key>:<value> pairs that can be associated with files. They are stored on-disk as part of the metadata of a file. The <key> of an xattr will always be a string identifying the attribute in question whereas the <value> can be arbitrary data, i.e. it can be another string or binary data. Note that it is not guaranteed nor required by the kernel that a filesystem supports xattrs. While the virtual filesystem (vfs) will handle all core permission checks, i.e. it will verify that the caller is allowed to set the requested xattr but the actual operation of writing out the xattr on disk will be left to the filesystem. Without going into the specifics the callchain currently is:

SYSCALL_DEFINE5(setxattr, const char __user *, pathname,
                const char __user *, name, const void __user *, value,
                size_t, size, int, flags)
-> static int path_setxattr(const char __user *pathname,
                            const char __user *name, const void __user *value,
                            size_t size, int flags, unsigned int lookup_flags)
   -> static long setxattr(struct dentry *d, const char __user *name,
                           const void __user *value, size_t size, int flags)
      -> int vfs_setxattr(struct dentry *dentry, const char *name,
                          const void *value, size_t size, int flags)
         -> int __vfs_setxattr_noperm(struct dentry *dentry, const char *name,
                                      const void *value, size_t size, int flags)

and finally __vfs_setxattr_noperm() will call

int __vfs_setxattr(struct dentry *dentry, struct inode *inode, const char *name,
                   const void *value, size_t size, int flags)
        const struct xattr_handler *handler;

        handler = xattr_resolve_name(inode, &name);
        if (IS_ERR(handler))
                return PTR_ERR(handler);
        if (!handler->set)
                return -EOPNOTSUPP;
        if (size == 0)
                value = "";  /* empty EA, do not remove */
        return handler->set(handler, dentry, inode, name, value, size, flags);

The __vfs_setxattr() function will then call xattr_resolve_name() which will find and return the appropriate handler for the xattr in the list struct xattr_handler of the corresponding filesystem. If the filesystem has a handler for the xattr in question it will return it and the attribute will be set and if not EOPNOTSUPP will be surfaced to the caller.

For this article we will only focus on the permission checks that the vfs performs not on the filesystem specifics. An important thing to note is that different xattrs are subject to different permission checks by the vfs. First, the vfs regulates what types of xattrs are supported in the first place. If you look at the xattr.h header you will find all supported xattr namespaces. An xattr namespace is essentially nothing but a prefix like security.. Let’s look at a few examples from the xattr.h header:

#define XATTR_SECURITY_PREFIX "security."

#define XATTR_SYSTEM_PREFIX "system."

#define XATTR_TRUSTED_PREFIX "trusted."

#define XATTR_USER_PREFIX "user."

Based on the detected prefix the vfs will decide what permission checks to perform. For example, the user. namespace is not subject to very strict permission checks since it exists to allow users to store arbitrary information. However, some xattrs are subject to very strict permission checks since they allow to change privileges. For example, this affects the security. namespace. In fact, the xattr.h header even exposes a specific capability suffix to use with the security. namespace:

#define XATTR_CAPS_SUFFIX "capability"

As you might have figured out file capabilities are associated with the security.capability xattr.

In contrast to other xattrs the value associated with the security.capability xattr key is not a string but binary data. The actual implementation is a C struct that contains bitmasks of capability flags. To actually set file capabilities userspace would usually use the libcap library because the low-level bits of the implementation are not very easy to use. Let’s say a user wanted to associate the CAP_NET_RAW capability with the ping binary on a system that only supports non-namespaced file capabilities. Then this is the minimum that you would need to do in order to set CAP_NET_RAW in the effective and permitted set of the file:

 * Do not simply copy this code. For the sake of brevity I e.g. omitted
 * handling the necessary endianess translation. (Not to speak of the apparent
 * ugliness and missing documentation of my sloppy macros.)

struct vfs_cap_data xattr = {0};

#define raise_cap_permitted(x, cap_data)[(x)>>5].permitted   |= (1<<((x)&31))
#define raise_cap_inheritable(x, cap_data)[(x)>>5].inheritable |= (1<<((x)&31))

raise_cap_permitted(CAP_NET_RAW, xattr);

setxattr("/bin/ping", "security.capability", &xattr, sizeof(xattr), 0);

After having done this we can look at the ping binary and use the getcap binary to check whether we successfully set the CAP_NET_RAW capability on the ping binary. Here’s a little demo:


Setting Unprivileged File Capabilities

On kernels that support namespaced file capabilities the straightforward way to set a file capability is to attach to the user namespace in question as root and then simply perform the above operations. The kernel will then transparently handle the translation between a non-namespaced and a namespaced capability by recording the rootid from the kernel’s perspective (the kuid).

However, it is also possible to set file capabilities in lieu of another user namespace. In order to do this the code above needs to be changed slightly:

 * Do not simply copy this code. For the sake of brevity I e.g. omitted
 * handling the necessary endianess translation. (Not to speak of the apparent
 * ugliness and missing documentation of my sloppy macros.)

struct vfs_ns_cap_data ns_xattr = {0};

#define raise_cap_permitted(x, cap_data)[(x)>>5].permitted   |= (1<<((x)&31))
#define raise_cap_inheritable(x, cap_data)[(x)>>5].inheritable |= (1<<((x)&31))

raise_cap_permitted(CAP_NET_RAW, ns_xattr);
ns_xattr.rootid = 1000000;

setxattr("/bin/ping", "security.capability", &ns_xattr, sizeof(ns_xattr), 0);

As you can see the struct we use has changed. Instead of using struct vfs_cap_data we are now using struct vfs_ns_cap_data which has gained an additional field rootid. In our example we are setting the rootid to 1000000 which in my example is the rootid of uid 0 in the container’s user namespace as seen from the host. Additionally, we set the magic_etc bit for the fcap version that the vfs is expected to support to VFS_CAP_REVISION_3.


As you can see from the asciicast we can’t execute the ping binary as an unprivileged user on the host since the fcaps is namespaced and associated with uid 1000000. But if we copy that binary to a container where this uid is mapped to uid 0 we can now call ping as an unprivileged user.

So let’s look at an actual unprivileged container and let’s set the CAP_NET_RAW capability on the ping binary in there:


Some Implementation Details

As you have seen above a new struct vfs_ns_cap_data has been added to the kernel:

 * same as vfs_cap_data but with a rootid at the end
struct vfs_ns_cap_data {
        __le32 magic_etc;
        struct {
                __le32 permitted;    /* Little endian */
                __le32 inheritable;  /* Little endian */
        } data[VFS_CAP_U32];
        __le32 rootid;

In the end this struct is what the kernel expects to be passed and which it will use to calculate fcaps. The location of the permitted and inheritable set in struct vfs_ns_cap_data are obvious but the effective set seems to be missing. Whether or not effective caps are set on the file is determined by raising the VFS_CAP_FLAGS_EFFECTIVE bit in the magic_etc mask. The magic_etc member is also used to tell the kernel which fcaps version the vfs is expected to support. The kernel will verify that either XATTR_CAPS_SZ_2 or XATTR_CAPS_SZ_3 are passed as size and are correctly paired with the VFS_CAP_REVISION_2 and VFS_CAP_REVISION_3 flag. If XATTR_CAPS_SZ_2 is set then the kernel will not try to look for a rootid field in the struct it received, i.e. even if you pass a struct vfs_ns_cap_data with a rootid but set XATTR_CAPS_SZ_2 as size parameter and VFS_CAP_REVISION_2 in magic_etc the kernel will be able to ignore the rootid field and instead use the rootid of the current user namespace. This allows the kernel to transparently translate from VFS_CAP_REVISION_2 to VFS_CAP_REVISION_3 fcaps. The main translation mechanism can be found in cap_convert_nscap() and rootid_from_xattr():

* User requested a write of security.capability.  If needed, update the
* xattr to change from v2 to v3, or to fixup the v3 rootid.
* If all is ok, we return the new size, on error return < 0.
int cap_convert_nscap(struct dentry *dentry, void **ivalue, size_t size)
        struct vfs_ns_cap_data *nscap;
        uid_t nsrootid;
        const struct vfs_cap_data *cap = *ivalue;
        __u32 magic, nsmagic;
        struct inode *inode = d_backing_inode(dentry);
        struct user_namespace *task_ns = current_user_ns(),
                *fs_ns = inode->i_sb->s_user_ns;
        kuid_t rootid;
        size_t newsize;

        if (!*ivalue)
                return -EINVAL;
        if (!validheader(size, cap))
                return -EINVAL;
        if (!capable_wrt_inode_uidgid(inode, CAP_SETFCAP))
                return -EPERM;
        if (size == XATTR_CAPS_SZ_2)
                if (ns_capable(inode->i_sb->s_user_ns, CAP_SETFCAP))
                        /* user is privileged, just write the v2 */
                        return size;

        rootid = rootid_from_xattr(*ivalue, size, task_ns);
        if (!uid_valid(rootid))
                return -EINVAL;

        nsrootid = from_kuid(fs_ns, rootid);
        if (nsrootid == -1)
                return -EINVAL;

        newsize = sizeof(struct vfs_ns_cap_data);
        nscap = kmalloc(newsize, GFP_ATOMIC);
        if (!nscap)
                return -ENOMEM;
        nscap->rootid = cpu_to_le32(nsrootid);
        nsmagic = VFS_CAP_REVISION_3;
        magic = le32_to_cpu(cap->magic_etc);
        if (magic & VFS_CAP_FLAGS_EFFECTIVE)
                nsmagic |= VFS_CAP_FLAGS_EFFECTIVE;
        nscap->magic_etc = cpu_to_le32(nsmagic);
        memcpy(&nscap->data, &cap->data, sizeof(__le32) * 2 * VFS_CAP_U32);

        *ivalue = nscap;
        return newsize;


Having fcaps available in user namespaces just makes the argument to always use unprivileged containers even stronger. The Linux Containers Project is also working on a bunch of other kernel- and userspace features to improve unprivileged containers even more. Stay tuned! :)


  1. While capabilities provide a better mechanism to temporarily and selectively grant privileges to unprivileged processes they are by no means inherently safe. Setting fcaps should still be done rarely. If privilege escalation happens via suid or sgid bits or fcaps doesn’t matter in the end: it’s still a privilege escalation. 

  2. Exactly how to split up the root privilege and how exactly privileges should be implemented (e.g. should they be attached to file descriptors, should they be attached to inodes, etc.) is a good argument to have. For the sake of this article we will skip this discussion and assume the Linux implementation of POSIX capabilities. 

  3. If people are super keen and request this I can make a longer post how exactly they all relate to each other and possibly look at some of the implementation details too. 

Read more

Prologue After a week away from my computer I want to organize my thoughts on the progress made towards build VMs by providing this write up since that forum post can be a bit overwhelming if you are casually wanting to keep up to date. The reasons for this feature work to exist, for those not up to speed, is that we want to have a very consistent build environment for which anyone building a project can have an expectable outcome of a working snap (or non working one if it really doesn’t).

Read more

Hello MAASters!

I’m happy to announce that the current MAAS development release (2.5.0 alpha 1) is now officially available in PPA for early testers.
What’s new?
Most notable MAAS 2.5.0 alpha 1 changes include:
  • Proxying the communication through rack controllers
  • HA improvements for better Rack-to-Region communication and discovery
  • Adding new machines with IPMI credentials or non-PXE IP address
  • Commissioning during enlistment
For more details, please refer to the release notes available in discourse [1].
Where to get it?
MAAS 2.5.0a1 is currently available for Ubuntu Bionic in ppa:maas/next.
sudo add-apt-repository ppa:maas/next
sudo apt-get update
sudo apt-get install maas

Read more

New Laptop

Triggers Recently, as of last week, I decided to purchase a new laptop to replace my Microsoft Surface Pro 4 with which I was having a bittersweet relationship. The Surface Pro 4 is really nice hardware, I originally got it to get a head start and collaborate on the convergence story with Unity 8 on the desktop, but as is of folk knowledge now, some strategic choices were made.

Read more
Colin Ian King

The low-latency kernel offering with Ubuntu provides a kernel tuned for low-latency environments using low-latency kernel configuration options.  The x86 kernels by default run with the Intel-Pstate CPU scheduler set to run with the powersave scaling governor biased towards power efficiency.

While power efficiency is fine for most use-cases, it can introduce latencies due to the fact that the CPU can be running at a low frequency to save power and also switching from a deep C state when idle to a higher C state when servicing an event can also increase on latencies.

In a somewhat contrived experiment, I rigged up an i7-3770 to collect latency timings of clock_nanosleep() wake-ups with timer event coalescing disabled (timer_slack set to zero) over 60 seconds across a range of CPU scheduler and governor settings on a 4.15 low-latency kernel.  This can be achieved using stress-ng, for example:

 sudo stress-ng --cyclic 1 --cyclic-dist 100 –cyclic-sleep=10000 --cpu 1 -l 0 -v \
--cyclic-policy rr --cyclic-method clock_ns --cpu 0 -t 60 --timer-slack 0

..the above runs a cyclic measurement collecting latency counts in 100ns buckets with a clock_nanosecond wakeup interval of 10,000 nanoseconds with zero % load CPU stressor and timer slack set to 0 nanoseconds.  This dumps latency distribution stats that can be plotted to see where the modal latency points occur and the latency characteristics of the CPU scheduler.

I also used powerstat to measure the power consumed by the CPU package over a 60 second interval.  Measurements for the Intel-Pstate CPU scheduler [performance, powersave] and the ACPI CPU scheduler (intel_pstate=disabled) [performance, powersave, conservative and ondemand] were taken for 1,000,000 down to 10,000 nanosecond timer delays.

1,000,000 nanosecond timer delays (1 millisecond)

Strangely the powersave Intel-Pstate is using the most power (not what I expected).

The ACPI CPU scheduler in performance mode has the best latency distribution followed by the Intel-Pstate CPU scheduler also in performance mode.

100,000 nanosecond timer delays (100 microseconds)

Note that Intel-Pstate performance consumes the most power...
...and also has the most responsive low-latency distribution.

10,000 nanosecond timer delays (10 microseconds)

In this scenario, the ACPI CPU scheduler in performance mode was consuming the most power and had the best latency distribution.

It is clear that the best latency responses occur when a CPU scheduler is running in performance mode and this consumes a little more power than other CPU scheduler modes.  However, it is not clear which CPU scheduler (Intel-Pstate or ACPI) is best in specific use-cases.

The conclusion is rather obvious;  but needs to be stated.  For best low-latency response, set the CPU governor to the performance mode at the cost of higher power consumption.  Depending on the use-case, the extra power cost is probably worth the improved latency response.

As mentioned earlier, this is a somewhat contrived experiment, only one CPU was being exercised with a predictable timer wakeup.  A more interesting test would be with data handling, such as incoming packet handling over ethernet at different rates; I will probably experiment with that if and when I get more time.  Since this was a synthetic test using stress-ng, it does not represent real world low-latency scenarios, however, it may be worth exploring CPU scheduler settings to tune a low-latency configuration rather than relying on the default CPU scheduler setting.

Read more
Colin Watson

Here’s a brief changelog for this month.


  • Handle Bugzilla.time() changes in Bugzilla 5.1.1 (#1774838)
  • Cope with the comment author field being renamed to creator in recent Bugzilla versions (#1774838)

Build farm

  • Set the hostname and FQDN of LXD containers to match the host system, though with an IP address pointing to the container (#1747015)
  • If the extra build arguments include fast_cleanup: True, then skip the final cleanup steps of the build; this can be used when building in a VM that is guaranteed to be torn down after the build
  • Allow checking out a git tag rather than a branch (#1687078, forum post)
  • Add a local unauthenticated proxy on port 8222, which proxies through to the remote authenticated proxy; this should allow running a wider range of network clients, since some of them apparently don’t support authenticated proxies very well (#1690834, #1753340, forum post)
  • Run tar with correct working directory when building source tarballs for snaps


  • Port the loggerhead (Bazaar code browser) integration to gunicorn, allowing it to be used as an internal API as well
  • Optimise BuildableDistroSeries.findSeries (#1778732)
  • Proxy loggerhead branch diffs through the webapp, allowing AJAX MP revision diffs to work for private branches (#904070)


  • Convert most code to use explicit proxy configuration settings rather than picking up a proxy from the environment, making the effective production settings easier to understand


  • Fix crash while adding an ssh key with unknown type (#1777507)


  • Improve documentation of what deactivating an account does (#993153)

Read more
Patricia Domingues

Hello world!

Welcome to Canonical Voices. This is your first post. Edit or delete it, then start blogging!

Read more
Colin Watson

Here’s a brief changelog for this month.

Build farm

  • Send fast_cleanup: True to virtualised builds, since they can safely skip the final cleanup steps


  • Add spam controls to code review comments (#1680746)
  • Only consider the most recent nine successful builds when estimating recipe build durations (#1770121)
  • Make updated code import emails more informative


  • Upgrade to Twisted 17.9.0
  • Get the test suite passing on Ubuntu 18.04 LTS
  • Allow admins to configure users such that unsigned email from them will be rejected, as a spam defence (#1714967)


  • Prune old snap files that have been uploaded to the store; this cleaned up about 5TB of librarian space
  • Make the snap store client cope with a few more edge cases (#1766911)
  • Allow branches in snap store channel names (#1754405)

Soyuz (package management)

  • Add DistroArchSeries.setChrootFromBuild, allowing setting a chroot from a file produced by a live filesystem build
  • Disambiguate URLs to source package files in the face of filename clashes in imported archives
  • Optimise SourcePackagePublishingHistory:+listing-archive-extra (#1769979)


  • Disable purchasing of new commercial subscriptions; existing customers have been contacted, and people with questions about this can contact Canonical support
  • Various minor revisions to the terms of service from Canonical’s legal department, and a clearer data privacy policy

Read more

Hello MAASters!

I’m happy to announce that MAAS 2.4.0 (final) is now available!
This new MAAS release introduces a set of exciting features and improvements that improve performance, stability and usability of MAAS.
MAAS 2.4.0 will be immediately available in the PPA, but it is in the process of being SRU’d into Ubuntu Bionic.
PPA’s Availability
MAAS 2.4.0 is currently available for Ubuntu Bionic in ppa:maas/stable for the coming week.
sudo add-apt-repository ppa:maas/stable
sudo apt-get update
sudo apt-get install maas
What’s new?
Most notable MAAS 2.4.0 changes include:
  • Performance improvements across the backend & UI.
  • KVM pod support for storage pools (over API).
  • DNS UI to manage resource records.
  • Audit Logging
  • Machine locking
  • Expanded commissioning script support for firmware upgrades & HBA changes.
  • NTP services now provided with Chrony.
For the full list of features & changes, please refer to the release notes:

Read more
Colin Watson

Once again it’s been a while since we posted a general update, so here’s a changelog-style summary of what we’ve been up to.  As usual, this changelog preserves a reasonable amount of technical detail, but I’ve omitted changes that were purely internal refactoring with no externally-visible effects.


  • Hide questions on inactive projects from the results of non-pillar-specific searches


  • Optimise the main query on Person:+upcomingwork (#1692120)
  • Apply the spec privacy check on Person:+upcomingwork only to relevant specs (#1696519)
  • Move base clauses for specification searches into a CTE to avoid slow sequential scans


  • Switch to HTTPS for CVE references
  • Fix various failures to sync from Red Hat’s Bugzilla instance (#1678486)

Build farm

  • Send the necessary set of archive signing keys to builders (#1626739)
  • Hide the virt/nonvirt queue portlets on BuilderSet:+index if they’d be empty
  • Add a feature flag which can be used to prevent dispatching any build under a given minimum score
  • Write files fetched from builders to a temporary name, and only rename them into place on success
  • Emit the build URL at the start of build logs


  • Fix crash when scanning a Git-based MP when we need to link a new RevisionAuthor to an existing Person (#1693543)
  • Add source ref name to breadcrumbs for Git-based MPs; this gets the ref name into the page title, which makes it easier to find Git-based MPs in browser history
  • Allow registry experts to delete recipes
  • Explicitly mark the local apt archive for recipe builds as trusted (#1701826)
  • Set +code as the default view on the code layer for (Person)DistributionSourcePackage
  • Improve handling of branches with various kinds of partial data
  • Add and export BranchMergeProposal.scheduleDiffUpdates (#483945)
  • Move “Updating repository…” notice above the list of branches so that it’s harder to miss (#1745161)
  • Upgrade to Pygments 2.2.0, including better formatting of *.md files (#1740903)
  • Sort cancelled-before-starting recipe builds to the end of the build history (#746140)
  • Clean up the {Branch,GitRef}:+register-merge UI slightly
  • Optimise merge detection when the branch has no landing candidates


  • Use correct method separator in Allow headers (#1717682)
  • Optimise lp_sitecustomize so that bin/py starts up more quickly
  • Add a utility to make it easier to run Launchpad code inside lxc exec
  • Convert lp-source-dependencies to git
  • Remove the post-webservice-GET commit
  • Convert build system to virtualenv and pip, unblocking many upgrades of dependencies
  • Use eslint to lint JavaScript files
  • Tidy up various minor problems in the top-level Makefile (#483782)
  • Offering ECDSA or Ed25519 SSH keys to Launchpad SSH servers no longer causes a hang, although it still isn’t possible to use them for authentication (#830679)
  • Reject SSH public keys that Twisted can’t load (#230144)
  • Backport GPGME file descriptor handling improvements to fix timeouts importing GPG keys (#1753019)
  • Improve OOPSes for jobs
  • Switch the site-wide search to Bing Custom Search, since Google Site Search has been discontinued
  • Don’t send email to direct recipients without active accounts


  • Fix the privacy banner on PersonProduct pages
  • Show GPG fingerprints rather than collidable short key IDs (#1576142)
  • Fix PersonSet.getPrecachedPersonsFromIDs to handle teams with mailing lists
  • Optimise front page, mainly by gathering more statistics periodically rather than on the fly
  • Construct public keyserver links using HTTPS without an explicit port (#1739110)
  • Fall back to emailing the team owner if the team has no admins (#1270141)


  • Log some useful information from authorising macaroons while uploading snaps to the store, to make it easier to diagnose problems
  • Extract more useful error messages when snap store operations fail (#1650461, #1687068)
  • Send mail rather than OOPSing if refreshing snap store upload macaroons fails (#1668368)
  • Automatically retry snap store upload attempts that return 502 or 503
  • Initialise git submodules in snap builds (#1694413)
  • Make SnapStoreUploadJob retries go via celery and be much more responsive (#1689282)
  • Run snap builds in LXD containers, allowing them to install snaps as build-dependencies
  • Allow setting Snap.git_path directly on the webservice
  • Batch snap listing views (#1722562)
  • Fix AJAX update of snap builds table to handle all build statuses
  • Set SNAPCRAFT_BUILD_INFO=1 to tell snapcraft to generate a manifest
  • Only emit snap:build:0.1 webhooks from SnapBuild.updateStatus if the status has changed
  • Expose extended error messages (with external link) for snap build jobs (#1729580)
  • Begin work on allowing snap builds to install snapcraft as a snap; this can currently be set up via the API, and work is in progress to add UI and to migrate to this as the default (#1737994)
  • Add an admin option to disable external network access for snap builds
  • Export ISnapSet.findByOwner on the webservice
  • Prefer Snap.store_name over for the “name” argument dispatched to snap builds
  • Pass build URL to snapcraft using SNAPCRAFT_IMAGE_INFO
  • Add an option to build source tarballs for snaps (#1763639)

Soyuz (package management)

  • Stop SourcePackagePublishingHistory.getPublishedBinaries materialising rows outside the current batch; this fixes webservice timeouts for sources with large numbers of binaries (#1695113)
  • Implement proxying of PackageUpload binary files via the webapp, since DistroSeries:+queue now assumes that that works (#1697680)
  • Truncate signing key common-names to 64 characters (#1608615)
  • Allow setting a relative build score on live filesystems (#1452543)
  • Add signing support for vmlinux for use on ppc64el Opal (and compatible) firmware
  • Run live filesystem builds in LXD containers, allowing them to install snaps as build-dependencies
  • Accept a “debug” entry in live filesystem build metadata, which enables detailed live-build debugging
  • Accept and ignore options (e.g. [trusted=yes]) in sources.list lines passed via external_dependencies
  • Send proper email notifications about most failures to parse the .changes file (#499438)
  • Ensure that PPA .htpasswd salts are drawn from the correct alphabet (#1722209)
  • Improve DpkgArchitectureCache‘s timeline handling, and speed it up a bit in some cases (#1062638)
  • Support passing a snap channel into a live filesystem build through the environment
  • Add support for passing apt proxies to live-build
  • Allow anonymous launchpad.View on IDistributionSourcePackage
  • Handle queries starting with “ppa:” when searching the PPA vocabulary
  • Make PackageTranslationsUploadJob download librarian files to disk rather than memory
  • Send email notifications when an upload is signed with an expired key
  • Add Release, Release.gpg, and InRelease to by-hash directories
  • After publishing a custom file, mark its target suite as dirty so that it will be published (#1509026)


  • Fix text_to_html to not parse HTML as a C format string
  • Fall back to the package name from AC_INIT when expanding $(PACKAGE) in translation configuration files if no other definition can be found


  • Show a search icon for pickers where possible rather than “Choose…”

Read more
James Henstridge

While working on a feature for snapd, we had a need to perform a “secure bind mount”. In this context, “secure” meant:

  1. The source and/or target of the mount is owned by a less privileged user.
  2. User processes will continue to run while we’re performing the mount (so solutions that involve suspending all user processes are out).
  3. While we can’t prevent the user from moving the mount point, they should not be able to trick us into mounting to locations they don’t control (e.g. by replacing the path with a symbolic link).

The main problem is that the mount system call uses string path names to identify the mount source and target. While we can perform checks on the paths before the mounts, we have no way to guarantee that the paths don’t point to another location when we move on to the mount() system call: a classic time of check to time of use race condition.

One suggestion was to modify the kernel to add a MS_NOFOLLOW flag to prevent symbolic link attacks. This turns out to be harder than it would appear, since the kernel is documented as ignoring any flags other than MS_BIND and MS_REC when performing a bind mount. So even if a patched kernel also recognised the MS_NOFOLLOW, there would be no way to distinguish its behaviour from an unpatched kernel. Fixing this properly would probably require a new system call, which is a rabbit hole I don’t want to dive down.

So what can we do using the tools the kernel gives us? The common way to reuse a reference to a file between system calls is the file descriptor. We can securely open a file descriptor for a path using the following algorithm:

  1. Break the path into segments, and check that none are empty, ".", or "..".
  2. Open the root directory with open("/", O_PATH|O_DIRECTORY).
  3. Open the first segment with openat(parent_fd, "segment", O_PATH|O_NOFOLLOW|O_DIRECTORY).
  4. Repeat for each of the remaining file descriptors, closing parent descriptors as needed.

Now we just need to find a way to use these file descriptors with the mount system call. I came up with two strategies to achieve this.

Use the current working directory

The first idea I tried was to make use of the fact that the mount system call accepts relative paths. We can use the fchdir system call to change to a directory identified by a file descriptor, and then refer to it as ".". Putting those together, we can perform a secure bind mount as a multi step process:

  1. fchdir to the mount source directory.
  2. Perform a bind mount from "." to a private stash directory.
  3. fchdir to the mount target directory.
  4. Perform a bind mount from the private stash directory to ".".
  5. Unmount the private stash directory.

While this works, it has a few downsides. It requires a third intermediate location to stash the mount. It could interfere with anything else that relies on the working directory. It also only works for directory bind mounts, since you can’t fchdir to a regular file.

Faced with these downsides, I started thinking about whether there was any simpler options available.

Use magic /proc symbolic links

For every open file descriptor in a process, there is a corresponding file in /proc/self/fd/. These files appear to be symbolic links that point at the file associated with the descriptor. So what if we pass these /proc/self/fd/NNN paths to the mount system call?

The obvious question is that if these paths are symbolic links, is this any different than passing the real paths directly? It turns out that there is a difference, because the kernel does not resolve these symbolic links in the standard fashion. Rather than recursively resolving link targets, the kernel short circuits the process and uses the path structure associated with the file descriptor. So it will follow file moves and can even refer to paths in other mount namespaces. Furthermore the kernel keeps track of deleted paths, so we will get a clean error if the user deletes a directory after we’ve opened it.

So in pseudo-code, the secure mount operation becomes:

source_fd = secure_open("/path/to/source", O_PATH);
target_fd = secure_open("/path/to/target", O_PATH);
mount("/proc/self/fd/${source_fd}", "/proc/self/fd/${target_fd}",

This resolves all the problems I had with the first solution: it doesn’t alter the working directory, it can do file bind mounts, and doesn’t require a protected third location.

The only downside I’ve encountered is that if I wanted to flip the bind mount to read-only, I couldn’t use "/proc/self/fd/${target_fd}" to refer to the mount point. My best guess is that it continues to refer to the directory shadowed by the mount point, rather than the mount point itself. So it is necessary to re-open the mount point, which is another opportunity for shenanigans. One possibility would be to read /proc/self/fdinfo/NNN to determine the mount ID associated with the file descriptor and then double check it against /proc/self/mountinfo.

Read more

Hello MAASters!

I’m happy to announce that MAAS 2.4.0 beta 2 is now released and is available for Ubuntu Bionic.
MAAS Availability
MAAS 2.4.0 beta 2 is currently available in Bionic’s Archive or in the following PPA:

MAAS 2.4.0 (beta2)

New Features & Improvements

MAAS Internals optimisation

Continuing with MAAS’ internal surgery, a few more improvements have been made:

  • Backend improvements

  • Improve the image download process, to ensure rack controllers immediately start image download after the region has finished downloading images.

  • Reduce the service monitor interval to 30 seconds. The monitor tracks the status of the various services provided alongside MAAS (DNS, NTP, Proxy).

  • UI Performance optimizations for machines, pods, and zones, including better filtering of node types.

KVM pod improvements

Continuing with the improvements for KVM pods, beta 2 adds the ability to:

  • Define a default storage pool

This feature allows users to select the default storage pool to use when composing machines, in case multiple pools have been defined. Otherwise, MAAS will pick the storage pool automatically depending which pool has the most available space.

  • API – Allow allocating machines with different storage pools

Allows users to request a machine with multiple storage devices from different storage pools. This feature uses storage tags to automatically map a storage pool in libvirt with a storage tag in MAAS.

UI Improvements

  • Remove remaining YUI in favor of AngularJS.

As of beta 2, MAAS has now fully dropped the use of YUI for the Web Interface. The last section using YUI was the Settings page and the login page. Both sections have now been transitioned to use AngularJS instead.

  • Re-organize Settings page

The MAAS settings  have now been reorganized into multiple tabs.

Minor improvements

  • API for default DNS domain selection

Adds the ability to define the default DNS domain. This is currently only available via the API.

  • Vanilla framework upgrade

We would like to thank the Ubuntu web team for their hard work upgrading MAAS to the latest version of the Vanilla framework. MAAS is looking better and more consistent every day!

Bug fixes

Please refer to the following for all 37 bug fixes in this release, which address issues with MAAS across the board:


Read more
Colin Watson


Mohamed Alaa reported that Launchpad’s Bing site search implementation had a cross-site-scripting vulnerability.  This was introduced on 2018-03-29, and fixed on 2018-04-10.  We have not found any evidence of this bug being actively exploited by attackers; the rest of this post is an explanation of the problem for the sake of transparency.


Some time ago, Google announced that they would be discontinuing their Google Site Search product on 2018-04-01.  Since this served as part of the backend for Launchpad’s site search feature (“Search Launchpad” on the front page), we began to look around for a replacement.  We eventually settled on Bing Custom Search, implemented appropriate support in Launchpad, and switched over to it on 2018-03-29.

Unfortunately, we missed one detail.  Google Site Search’s XML API returns excerpts of search results as pre-escaped HTML, using <b> tags to indicate where search terms match.  This makes complete sense given its embedding in XML; it’s hard to see how that API could do otherwise.  The Launchpad integration code accordingly uses TAL code along these lines, using the structure keyword to explicitly indicate that the excerpts in question do not require HTML-escaping (like most good web frameworks, TAL’s default is to escape all variable content, so successful XSS attacks on Launchpad have historically been rare):

<div class="summary" tal:content="structure page/summary" />

However, Bing Custom Search’s JSON API returns excerpts of search results without any HTML escaping.  Again, in the context of the API in question, this makes complete sense as a default behaviour (though a textFormat=HTML switch is available to change this); but, in the absence of appropriate handling, this meant that those excerpts were passed through to the TAL code above without escaping.  As a result, if you could craft search terms that match a portion of an existing page on Launchpad that shows scripting tags (such as a bug about an XSS vulnerability in another piece of software hosted on Launchpad), and convince other people to follow a suitable search link, then you could cause that code to be executed in other users’ browsers.

The fix was, of course, to simply escape the data returned by Bing Custom Search.  Thanks to Mohamed Alaa for their disclosure.

Read more

Hello MAASters!

I’m happy to announce that MAAS 2.4.0 beta 1 and python-libmaas 0.6.0 have now been released and are available for Ubuntu Bionic.
MAAS Availability
MAAS 2.4.0 beta 1 is currently available in Bionic -proposed waiting to be published into Ubuntu, or in the following PPA:

MAAS 2.4.0 (beta1)

Important announcements

Debian package maas-dns no longer needed

The Debian package ‘maas-dns’ has now been made a transitional package. This package provided some post-installation configuration to prepare bind to be managed by MAAS, but it required maas-region-api to be installed first.

In order to streamline the installation and make it easier for users to install HA environments, the configuration of bind has now been integrated to the ‘maas-region-api’ package itself, which and we have made ‘maas-dns’ a dummy transitional package that can now be removed.

New Features & Improvements

MAAS Internals optimization

Major internal surgery to MAAS 2.4 continues improve various areas not visible to the user. These updates will advance the overall performance of MAAS in larger environments. These improvements include:

  • Database query optimizations

Further reductions in the number of database queries, significantly cutting the queries made by the boot source cache image import process from over 100 to just under 5.

  • UI optimizations

MAAS is being optimized to reduce the amount of data using the websocket API to render the UI. This is targeted at only processing data only for viewable information, improving various legacy areas. Currently, the work done for this release includes:

  • Only load historic script results (e.g. old commissioning/testing results) when requested / accessed by the user, instead of always making them available over the websocket.

  • Only load node objects in listing pages when the specific object type is requested. For instance, only load machines when accessing the machines tab instead of also loading devices and controllers.

  • Change the UI mechanism to only request OS Information only on initial page load rather than every 10 seconds.

KVM pod improvements

Continuing with the improvements from alpha 2, this new release provides more updates to KVM pods:

  • Added overcommit ratios for CPU and memory.

When composing or allocating machines, previous versions of MAAS would allow the user to request as many resources as the user wanted regardless of the available resources. This created issues when dynamically allocating machines as it could allow users to create an infinite number of machines even when the physical host was already over committed. Adding this feature allows administrators to control the amount of resources they want to over commit.

  • Added ability to filter which pods or pod types to avoid when allocating machines

Provides users with the ability to select which pods or pod types not to allocate resources from. This makes it particularly useful when dynamically allocating machines when MAAS has a large number of pods.

DNS UI Improvements

MAAS 2.0 introduced the ability to manage DNS, not only to allow the creation of new domains, but also to the creation of resources records such as A, AAA, CNAME, etc. However, most of this functionality has only been available over the API, as the UI only allowed to add and remove new domains.

As of 2.4, MAAS now adds the ability to manage not only DNS domains but also the following resource records:

  • Added ability to edit domains (e.g. TTL, name, authoritative).

  • Added ability to create and delete resource records (A, AAA, CNAME, TXT, etc).

  • Added ability to edit resource records.

Navigation UI improvements

MAAS 2.4 beta 1 is changing the top-level navigation:

  • Rename ‘Zones’ for ‘AZs’

  • Add ‘Machines, Devices, Controllers’ to the top-level navigation instead of ‘Hardware’.

Minor improvements

A few notable improvements being made available in MAAS 2.4 include:

  • Add ability to force the boot type for IPMI machines.

Hardware manufactures have been upgrading their BMC firmware versions to be more compliant with the Intel IPMI 2.0 spec. Unfortunately, the IPMI 2.0 spec has made changes that provide a non-backward compatible user experience. For example, if the administrator configures their machine to always PXE boot over EFI, and the user executed an IPMI command without specifying the boot type, the machine would use the value of the configured BIOS. However, with  these new changes, the user is required to always specify a boot type, avoiding a fallback to the BIOS.

As such, MAAS now allows the selection of a boot type (auto, legacy, efi) to force the machine to always PXE with the desired type (on the next boot only) .

  • Add ability, via the API, to skip the BMC configuration on commissioning.

Provides an API option to skip the BMC auto configuration during commissioning for IPMI systems. This option helps admins keep credentials provided over the API when adding new nodes.

Bug fixes

Please refer to the following for all 32 bug fixes in this release.


Read more


Después de casi un año, con Nico liberamos una nueva versión de fades.

¿Qué hay de nuevo en esta release?

  • Revisar si todo lo pedido está realmente disponible en PyPI antes de comenzar a instalarlo
  • Ignora dependencias duplicadas
  • Varias mejoras y correcciones en los mensajes que fades muestra en modo verbose
  • Prohibimos el mal uso de fades: instalarlo en legacy Python y ejecutarlo desde adentro de otro virtualenv
  • Un montón de mejoras relacionadas al proyecto en sí (pero no directamente visibles para el usuario final) y algunas pequeñas otras correcciones


Loguito de fades


infoauth es un un pequeño pero práctico módulo de Python y script para grabar/cargar tokens a/desde disco.

Esto es lo que hace:

  • graba tokens en un archivo en disco, pickleado y zippeado
  • cambia el archivo a sólo lectura, y sólo legible por vos
  • carga los tokens de ese archivo en disco

En qué casos este módulo es útil? Digamos que tenés un script o programa que necesita algunos tokens secretos (autenticación de mail, tokens de Twitter, la info para conectarse a una base de datos, etc...), pero no querés incluir estos tokens en el código, porque el mismo es público, entonces con este módulo harías:

    tokens = infoauth.load(os.path.expanduser("~/.my-tokens"))

Fijate que el archivo va a quedar legible sólo por vos y no en el directorio del proyecto (así no tenés el riesgo de compartirlo por accidente).

CUIDADO: infoauth NO protege tus secretos con una clave o algo así, este módulo NO asegura tus secretos de ninguna manera. Sí, los tokens están enmarañados (porque se picklean y comprimen) y otra gente quizás no pueda accederlos fácilmente (legible sólo por vos), pero no hay más protección que esa. Usalo bajo tu propio riesgo.

Entonces, ¿cómo usarlo desde un programa en Python? Es fácil, para cargar la data:

    import infoauth
    auth = infoauth.load(os.path.expanduser("~/.my-mail-auth"))
    # ...
    mail.auth(auth['user'], auth['password'])

Para grabarla:

    import infoauth
    secrets = {'some-stuff': 'foo', 'code': 67}
    infoauth.dump(secrets, os.path.expanduser("~/.secrets"))

Fijate que como grabar los tokens es algo que normalmente se hace una sola vez, seguro es más práctico hacerlo desde la linea de comandos, como se muestra a continuación...

Por eso, ¿cómo usarlo desde la linea de comandos? Para mostrar la info:

    $ infoauth show ~/.my-mail-auth
    password: ...
    user: ...

Y para grabar un archivo con los datos::

    $ infoauth create ~/.secrets some-stuff=foo code=67

Fijate que al crear el archivo desde la linea de comandos tenemos la limitación de que todos los valores almacenados van a ser cadenas de texto; si querés grabar otros tipos de datos, como enteros, listas, o lo que quieras, tendrías que usar la forma programática que se muestra arriba.

Esta es la página del proyecto, y claro que está en PyPI así que se puede usar sin problema desde fades (guiño, guiño).

Read more
Marcus Haslam

Introducing the new Snapcraft branding

Early developments of the the Snapcraft brand mark

If you’re a regular visitor to or any of its associated sites, you will have noticed a change recently to the logo and overall branding which has been in development over the past few months. We have developed a stand-alone brand for Snapcraft, the command line tool for writing and publishing software as a snap. One of the challenges we faced was how to create a brand for Snapcraft that stands out in its own right yet fitted in with the existing Ubuntu brand and be a part of the extended family. To achieve this, we took reference from the Suru visual language. The Suru philosophy stems from our brand values alluding to Japanese culture. Working with paper metaphors we were inspired by origami as a solid and tangible foundation.

We identified the key attributes of:

Simple – to package leveraging your existing tools

Integrated – easily with build and CL infrastructure

Effortless – roll back versions

Easy – integrate with build and CI infrastructures

We worked with these whilst keeping with the overall Ubuntu brand values of:





We started exploring the Origami language and felt the idea of a bird fitted well with the attributes. We worked at simplifying the design until we had it at it’s most minimal we could whilst retaining the elegance of a bird in flight.

A new colour scheme was developed to have a clean fresh look while sitting comfortably with the primary and secondary Ubuntu colour palette. We used the Ubuntu font for consistency with the overall parent brand, and in this instance we used the thin weight along side the light to create a dynamic word mark reflecting the values of Freedom, Precision, Collaboration and Reliable.


Colour Palette


We have a number of different Lock-ups we can use for different circumstances




Read more
Colin Ian King

Kernel Commits with "Fixes" tag

Over the past 5 years there has been a steady increase in the number of kernel bug fix commits that use the "Fixes" tag.  Kernel developers use this annotation on a commit to reference an older commit that originally introduced the bug, which is obviously very useful for bug tracking purposes. What is interesting is that there has been a steady take-up of developers using this annotation:

With the 4.15 release, 1859 of the 16223 commits (11.5%) were tagged as "Fixes", so that's a fair amount of work going into bug fixing.  I suspect there are more commits that are bug fixes, but aren't using the "Fixes" tag, so it's hard to tell for certain how many commits are fixes without doing deeper analysis.  Probably over time this tag will be widely adopted for all bug fixes and the trend line will level out and we will have a better idea of the proportion of commits per release that are just devoted to fixing issues.  Let's see how this looks in another 5 years time,  I'll keep you posted!

Read more
Christian Brauner

History Of Linux Containers By Serge Hallyn

alt text

Serge Hallyn recently wrote a post outlining the actual history of containers on Linux. Worth a read!


Read more