Canonical Voices


Hello MAASters!

I’m happy to announce that MAAS 2.4.0 (final) is now available!
This new MAAS release introduces a set of exciting features and improvements that improve performance, stability and usability of MAAS.
MAAS 2.4.0 will be immediately available in the PPA, but it is in the process of being SRU’d into Ubuntu Bionic.
PPA’s Availability
MAAS 2.4.0 is currently available for Ubuntu Bionic in ppa:maas/stable for the coming week.
sudo add-apt-repository ppa:maas/stable
sudo apt-get update
sudo apt-get install maas
What’s new?
Most notable MAAS 2.4.0 changes include:
  • Performance improvements across the backend & UI.
  • KVM pod support for storage pools (over API).
  • DNS UI to manage resource records.
  • Audit Logging
  • Machine locking
  • Expanded commissioning script support for firmware upgrades & HBA changes.
  • NTP services now provided with Chrony.
For the full list of features & changes, please refer to the release notes:

Read more
Colin Watson

Once again it’s been a while since we posted a general update, so here’s a changelog-style summary of what we’ve been up to.  As usual, this changelog preserves a reasonable amount of technical detail, but I’ve omitted changes that were purely internal refactoring with no externally-visible effects.


  • Hide questions on inactive projects from the results of non-pillar-specific searches


  • Optimise the main query on Person:+upcomingwork (#1692120)
  • Apply the spec privacy check on Person:+upcomingwork only to relevant specs (#1696519)
  • Move base clauses for specification searches into a CTE to avoid slow sequential scans


  • Switch to HTTPS for CVE references
  • Fix various failures to sync from Red Hat’s Bugzilla instance (#1678486)

Build farm

  • Send the necessary set of archive signing keys to builders (#1626739)
  • Hide the virt/nonvirt queue portlets on BuilderSet:+index if they’d be empty
  • Add a feature flag which can be used to prevent dispatching any build under a given minimum score
  • Write files fetched from builders to a temporary name, and only rename them into place on success
  • Emit the build URL at the start of build logs


  • Fix crash when scanning a Git-based MP when we need to link a new RevisionAuthor to an existing Person (#1693543)
  • Add source ref name to breadcrumbs for Git-based MPs; this gets the ref name into the page title, which makes it easier to find Git-based MPs in browser history
  • Allow registry experts to delete recipes
  • Explicitly mark the local apt archive for recipe builds as trusted (#1701826)
  • Set +code as the default view on the code layer for (Person)DistributionSourcePackage
  • Improve handling of branches with various kinds of partial data
  • Add and export BranchMergeProposal.scheduleDiffUpdates (#483945)
  • Move “Updating repository…” notice above the list of branches so that it’s harder to miss (#1745161)
  • Upgrade to Pygments 2.2.0, including better formatting of *.md files (#1740903)
  • Sort cancelled-before-starting recipe builds to the end of the build history (#746140)
  • Clean up the {Branch,GitRef}:+register-merge UI slightly
  • Optimise merge detection when the branch has no landing candidates


  • Use correct method separator in Allow headers (#1717682)
  • Optimise lp_sitecustomize so that bin/py starts up more quickly
  • Add a utility to make it easier to run Launchpad code inside lxc exec
  • Convert lp-source-dependencies to git
  • Remove the post-webservice-GET commit
  • Convert build system to virtualenv and pip, unblocking many upgrades of dependencies
  • Use eslint to lint JavaScript files
  • Tidy up various minor problems in the top-level Makefile (#483782)
  • Offering ECDSA or Ed25519 SSH keys to Launchpad SSH servers no longer causes a hang, although it still isn’t possible to use them for authentication (#830679)
  • Reject SSH public keys that Twisted can’t load (#230144)
  • Backport GPGME file descriptor handling improvements to fix timeouts importing GPG keys (#1753019)
  • Improve OOPSes for jobs
  • Switch the site-wide search to Bing Custom Search, since Google Site Search has been discontinued
  • Don’t send email to direct recipients without active accounts


  • Fix the privacy banner on PersonProduct pages
  • Show GPG fingerprints rather than collidable short key IDs (#1576142)
  • Fix PersonSet.getPrecachedPersonsFromIDs to handle teams with mailing lists
  • Optimise front page, mainly by gathering more statistics periodically rather than on the fly
  • Construct public keyserver links using HTTPS without an explicit port (#1739110)
  • Fall back to emailing the team owner if the team has no admins (#1270141)


  • Log some useful information from authorising macaroons while uploading snaps to the store, to make it easier to diagnose problems
  • Extract more useful error messages when snap store operations fail (#1650461, #1687068)
  • Send mail rather than OOPSing if refreshing snap store upload macaroons fails (#1668368)
  • Automatically retry snap store upload attempts that return 502 or 503
  • Initialise git submodules in snap builds (#1694413)
  • Make SnapStoreUploadJob retries go via celery and be much more responsive (#1689282)
  • Run snap builds in LXD containers, allowing them to install snaps as build-dependencies
  • Allow setting Snap.git_path directly on the webservice
  • Batch snap listing views (#1722562)
  • Fix AJAX update of snap builds table to handle all build statuses
  • Set SNAPCRAFT_BUILD_INFO=1 to tell snapcraft to generate a manifest
  • Only emit snap:build:0.1 webhooks from SnapBuild.updateStatus if the status has changed
  • Expose extended error messages (with external link) for snap build jobs (#1729580)
  • Begin work on allowing snap builds to install snapcraft as a snap; this can currently be set up via the API, and work is in progress to add UI and to migrate to this as the default (#1737994)
  • Add an admin option to disable external network access for snap builds
  • Export ISnapSet.findByOwner on the webservice
  • Prefer Snap.store_name over for the “name” argument dispatched to snap builds
  • Pass build URL to snapcraft using SNAPCRAFT_IMAGE_INFO
  • Add an option to build source tarballs for snaps (#1763639)

Soyuz (package management)

  • Stop SourcePackagePublishingHistory.getPublishedBinaries materialising rows outside the current batch; this fixes webservice timeouts for sources with large numbers of binaries (#1695113)
  • Implement proxying of PackageUpload binary files via the webapp, since DistroSeries:+queue now assumes that that works (#1697680)
  • Truncate signing key common-names to 64 characters (#1608615)
  • Allow setting a relative build score on live filesystems (#1452543)
  • Add signing support for vmlinux for use on ppc64el Opal (and compatible) firmware
  • Run live filesystem builds in LXD containers, allowing them to install snaps as build-dependencies
  • Accept a “debug” entry in live filesystem build metadata, which enables detailed live-build debugging
  • Accept and ignore options (e.g. [trusted=yes]) in sources.list lines passed via external_dependencies
  • Send proper email notifications about most failures to parse the .changes file (#499438)
  • Ensure that PPA .htpasswd salts are drawn from the correct alphabet (#1722209)
  • Improve DpkgArchitectureCache‘s timeline handling, and speed it up a bit in some cases (#1062638)
  • Support passing a snap channel into a live filesystem build through the environment
  • Add support for passing apt proxies to live-build
  • Allow anonymous launchpad.View on IDistributionSourcePackage
  • Handle queries starting with “ppa:” when searching the PPA vocabulary
  • Make PackageTranslationsUploadJob download librarian files to disk rather than memory
  • Send email notifications when an upload is signed with an expired key
  • Add Release, Release.gpg, and InRelease to by-hash directories
  • After publishing a custom file, mark its target suite as dirty so that it will be published (#1509026)


  • Fix text_to_html to not parse HTML as a C format string
  • Fall back to the package name from AC_INIT when expanding $(PACKAGE) in translation configuration files if no other definition can be found


  • Show a search icon for pickers where possible rather than “Choose…”

Read more
James Henstridge

While working on a feature for snapd, we had a need to perform a “secure bind mount”. In this context, “secure” meant:

  1. The source and/or target of the mount is owned by a less privileged user.
  2. User processes will continue to run while we’re performing the mount (so solutions that involve suspending all user processes are out).
  3. While we can’t prevent the user from moving the mount point, they should not be able to trick us into mounting to locations they don’t control (e.g. by replacing the path with a symbolic link).

The main problem is that the mount system call uses string path names to identify the mount source and target. While we can perform checks on the paths before the mounts, we have no way to guarantee that the paths don’t point to another location when we move on to the mount() system call: a classic time of check to time of use race condition.

One suggestion was to modify the kernel to add a MS_NOFOLLOW flag to prevent symbolic link attacks. This turns out to be harder than it would appear, since the kernel is documented as ignoring any flags other than MS_BIND and MS_REC when performing a bind mount. So even if a patched kernel also recognised the MS_NOFOLLOW, there would be no way to distinguish its behaviour from an unpatched kernel. Fixing this properly would probably require a new system call, which is a rabbit hole I don’t want to dive down.

So what can we do using the tools the kernel gives us? The common way to reuse a reference to a file between system calls is the file descriptor. We can securely open a file descriptor for a path using the following algorithm:

  1. Break the path into segments, and check that none are empty, ".", or "..".
  2. Open the root directory with open("/", O_PATH|O_DIRECTORY).
  3. Open the first segment with openat(parent_fd, "segment", O_PATH|O_NOFOLLOW|O_DIRECTORY).
  4. Repeat for each of the remaining file descriptors, closing parent descriptors as needed.

Now we just need to find a way to use these file descriptors with the mount system call. I came up with two strategies to achieve this.

Use the current working directory

The first idea I tried was to make use of the fact that the mount system call accepts relative paths. We can use the fchdir system call to change to a directory identified by a file descriptor, and then refer to it as ".". Putting those together, we can perform a secure bind mount as a multi step process:

  1. fchdir to the mount source directory.
  2. Perform a bind mount from "." to a private stash directory.
  3. fchdir to the mount target directory.
  4. Perform a bind mount from the private stash directory to ".".
  5. Unmount the private stash directory.

While this works, it has a few downsides. It requires a third intermediate location to stash the mount. It could interfere with anything else that relies on the working directory. It also only works for directory bind mounts, since you can’t fchdir to a regular file.

Faced with these downsides, I started thinking about whether there was any simpler options available.

Use magic /proc symbolic links

For every open file descriptor in a process, there is a corresponding file in /proc/self/fd/. These files appear to be symbolic links that point at the file associated with the descriptor. So what if we pass these /proc/self/fd/NNN paths to the mount system call?

The obvious question is that if these paths are symbolic links, is this any different than passing the real paths directly? It turns out that there is a difference, because the kernel does not resolve these symbolic links in the standard fashion. Rather than recursively resolving link targets, the kernel short circuits the process and uses the path structure associated with the file descriptor. So it will follow file moves and can even refer to paths in other mount namespaces. Furthermore the kernel keeps track of deleted paths, so we will get a clean error if the user deletes a directory after we’ve opened it.

So in pseudo-code, the secure mount operation becomes:

source_fd = secure_open("/path/to/source", O_PATH);
target_fd = secure_open("/path/to/target", O_PATH);
mount("/proc/self/fd/${source_fd}", "/proc/self/fd/${target_fd}",

This resolves all the problems I had with the first solution: it doesn’t alter the working directory, it can do file bind mounts, and doesn’t require a protected third location.

The only downside I’ve encountered is that if I wanted to flip the bind mount to read-only, I couldn’t use "/proc/self/fd/${target_fd}" to refer to the mount point. My best guess is that it continues to refer to the directory shadowed by the mount point, rather than the mount point itself. So it is necessary to re-open the mount point, which is another opportunity for shenanigans. One possibility would be to read /proc/self/fdinfo/NNN to determine the mount ID associated with the file descriptor and then double check it against /proc/self/mountinfo.

Read more

Hello MAASters!

I’m happy to announce that MAAS 2.4.0 beta 2 is now released and is available for Ubuntu Bionic.
MAAS Availability
MAAS 2.4.0 beta 2 is currently available in Bionic’s Archive or in the following PPA:

MAAS 2.4.0 (beta2)

New Features & Improvements

MAAS Internals optimisation

Continuing with MAAS’ internal surgery, a few more improvements have been made:

  • Backend improvements

  • Improve the image download process, to ensure rack controllers immediately start image download after the region has finished downloading images.

  • Reduce the service monitor interval to 30 seconds. The monitor tracks the status of the various services provided alongside MAAS (DNS, NTP, Proxy).

  • UI Performance optimizations for machines, pods, and zones, including better filtering of node types.

KVM pod improvements

Continuing with the improvements for KVM pods, beta 2 adds the ability to:

  • Define a default storage pool

This feature allows users to select the default storage pool to use when composing machines, in case multiple pools have been defined. Otherwise, MAAS will pick the storage pool automatically depending which pool has the most available space.

  • API – Allow allocating machines with different storage pools

Allows users to request a machine with multiple storage devices from different storage pools. This feature uses storage tags to automatically map a storage pool in libvirt with a storage tag in MAAS.

UI Improvements

  • Remove remaining YUI in favor of AngularJS.

As of beta 2, MAAS has now fully dropped the use of YUI for the Web Interface. The last section using YUI was the Settings page and the login page. Both sections have now been transitioned to use AngularJS instead.

  • Re-organize Settings page

The MAAS settings  have now been reorganized into multiple tabs.

Minor improvements

  • API for default DNS domain selection

Adds the ability to define the default DNS domain. This is currently only available via the API.

  • Vanilla framework upgrade

We would like to thank the Ubuntu web team for their hard work upgrading MAAS to the latest version of the Vanilla framework. MAAS is looking better and more consistent every day!

Bug fixes

Please refer to the following for all 37 bug fixes in this release, which address issues with MAAS across the board:


Read more
Colin Watson


Mohamed Alaa reported that Launchpad’s Bing site search implementation had a cross-site-scripting vulnerability.  This was introduced on 2018-03-29, and fixed on 2018-04-10.  We have not found any evidence of this bug being actively exploited by attackers; the rest of this post is an explanation of the problem for the sake of transparency.


Some time ago, Google announced that they would be discontinuing their Google Site Search product on 2018-04-01.  Since this served as part of the backend for Launchpad’s site search feature (“Search Launchpad” on the front page), we began to look around for a replacement.  We eventually settled on Bing Custom Search, implemented appropriate support in Launchpad, and switched over to it on 2018-03-29.

Unfortunately, we missed one detail.  Google Site Search’s XML API returns excerpts of search results as pre-escaped HTML, using <b> tags to indicate where search terms match.  This makes complete sense given its embedding in XML; it’s hard to see how that API could do otherwise.  The Launchpad integration code accordingly uses TAL code along these lines, using the structure keyword to explicitly indicate that the excerpts in question do not require HTML-escaping (like most good web frameworks, TAL’s default is to escape all variable content, so successful XSS attacks on Launchpad have historically been rare):

<div class="summary" tal:content="structure page/summary" />

However, Bing Custom Search’s JSON API returns excerpts of search results without any HTML escaping.  Again, in the context of the API in question, this makes complete sense as a default behaviour (though a textFormat=HTML switch is available to change this); but, in the absence of appropriate handling, this meant that those excerpts were passed through to the TAL code above without escaping.  As a result, if you could craft search terms that match a portion of an existing page on Launchpad that shows scripting tags (such as a bug about an XSS vulnerability in another piece of software hosted on Launchpad), and convince other people to follow a suitable search link, then you could cause that code to be executed in other users’ browsers.

The fix was, of course, to simply escape the data returned by Bing Custom Search.  Thanks to Mohamed Alaa for their disclosure.

Read more

Hello MAASters!

I’m happy to announce that MAAS 2.4.0 beta 1 and python-libmaas 0.6.0 have now been released and are available for Ubuntu Bionic.
MAAS Availability
MAAS 2.4.0 beta 1 is currently available in Bionic -proposed waiting to be published into Ubuntu, or in the following PPA:

MAAS 2.4.0 (beta1)

Important announcements

Debian package maas-dns no longer needed

The Debian package ‘maas-dns’ has now been made a transitional package. This package provided some post-installation configuration to prepare bind to be managed by MAAS, but it required maas-region-api to be installed first.

In order to streamline the installation and make it easier for users to install HA environments, the configuration of bind has now been integrated to the ‘maas-region-api’ package itself, which and we have made ‘maas-dns’ a dummy transitional package that can now be removed.

New Features & Improvements

MAAS Internals optimization

Major internal surgery to MAAS 2.4 continues improve various areas not visible to the user. These updates will advance the overall performance of MAAS in larger environments. These improvements include:

  • Database query optimizations

Further reductions in the number of database queries, significantly cutting the queries made by the boot source cache image import process from over 100 to just under 5.

  • UI optimizations

MAAS is being optimized to reduce the amount of data using the websocket API to render the UI. This is targeted at only processing data only for viewable information, improving various legacy areas. Currently, the work done for this release includes:

  • Only load historic script results (e.g. old commissioning/testing results) when requested / accessed by the user, instead of always making them available over the websocket.

  • Only load node objects in listing pages when the specific object type is requested. For instance, only load machines when accessing the machines tab instead of also loading devices and controllers.

  • Change the UI mechanism to only request OS Information only on initial page load rather than every 10 seconds.

KVM pod improvements

Continuing with the improvements from alpha 2, this new release provides more updates to KVM pods:

  • Added overcommit ratios for CPU and memory.

When composing or allocating machines, previous versions of MAAS would allow the user to request as many resources as the user wanted regardless of the available resources. This created issues when dynamically allocating machines as it could allow users to create an infinite number of machines even when the physical host was already over committed. Adding this feature allows administrators to control the amount of resources they want to over commit.

  • Added ability to filter which pods or pod types to avoid when allocating machines

Provides users with the ability to select which pods or pod types not to allocate resources from. This makes it particularly useful when dynamically allocating machines when MAAS has a large number of pods.

DNS UI Improvements

MAAS 2.0 introduced the ability to manage DNS, not only to allow the creation of new domains, but also to the creation of resources records such as A, AAA, CNAME, etc. However, most of this functionality has only been available over the API, as the UI only allowed to add and remove new domains.

As of 2.4, MAAS now adds the ability to manage not only DNS domains but also the following resource records:

  • Added ability to edit domains (e.g. TTL, name, authoritative).

  • Added ability to create and delete resource records (A, AAA, CNAME, TXT, etc).

  • Added ability to edit resource records.

Navigation UI improvements

MAAS 2.4 beta 1 is changing the top-level navigation:

  • Rename ‘Zones’ for ‘AZs’

  • Add ‘Machines, Devices, Controllers’ to the top-level navigation instead of ‘Hardware’.

Minor improvements

A few notable improvements being made available in MAAS 2.4 include:

  • Add ability to force the boot type for IPMI machines.

Hardware manufactures have been upgrading their BMC firmware versions to be more compliant with the Intel IPMI 2.0 spec. Unfortunately, the IPMI 2.0 spec has made changes that provide a non-backward compatible user experience. For example, if the administrator configures their machine to always PXE boot over EFI, and the user executed an IPMI command without specifying the boot type, the machine would use the value of the configured BIOS. However, with  these new changes, the user is required to always specify a boot type, avoiding a fallback to the BIOS.

As such, MAAS now allows the selection of a boot type (auto, legacy, efi) to force the machine to always PXE with the desired type (on the next boot only) .

  • Add ability, via the API, to skip the BMC configuration on commissioning.

Provides an API option to skip the BMC auto configuration during commissioning for IPMI systems. This option helps admins keep credentials provided over the API when adding new nodes.

Bug fixes

Please refer to the following for all 32 bug fixes in this release.


Read more


Después de casi un año, con Nico liberamos una nueva versión de fades.

¿Qué hay de nuevo en esta release?

  • Revisar si todo lo pedido está realmente disponible en PyPI antes de comenzar a instalarlo
  • Ignora dependencias duplicadas
  • Varias mejoras y correcciones en los mensajes que fades muestra en modo verbose
  • Prohibimos el mal uso de fades: instalarlo en legacy Python y ejecutarlo desde adentro de otro virtualenv
  • Un montón de mejoras relacionadas al proyecto en sí (pero no directamente visibles para el usuario final) y algunas pequeñas otras correcciones


Loguito de fades


infoauth es un un pequeño pero práctico módulo de Python y script para grabar/cargar tokens a/desde disco.

Esto es lo que hace:

  • graba tokens en un archivo en disco, pickleado y zippeado
  • cambia el archivo a sólo lectura, y sólo legible por vos
  • carga los tokens de ese archivo en disco

En qué casos este módulo es útil? Digamos que tenés un script o programa que necesita algunos tokens secretos (autenticación de mail, tokens de Twitter, la info para conectarse a una base de datos, etc...), pero no querés incluir estos tokens en el código, porque el mismo es público, entonces con este módulo harías:

    tokens = infoauth.load(os.path.expanduser("~/.my-tokens"))

Fijate que el archivo va a quedar legible sólo por vos y no en el directorio del proyecto (así no tenés el riesgo de compartirlo por accidente).

CUIDADO: infoauth NO protege tus secretos con una clave o algo así, este módulo NO asegura tus secretos de ninguna manera. Sí, los tokens están enmarañados (porque se picklean y comprimen) y otra gente quizás no pueda accederlos fácilmente (legible sólo por vos), pero no hay más protección que esa. Usalo bajo tu propio riesgo.

Entonces, ¿cómo usarlo desde un programa en Python? Es fácil, para cargar la data:

    import infoauth
    auth = infoauth.load(os.path.expanduser("~/.my-mail-auth"))
    # ...
    mail.auth(auth['user'], auth['password'])

Para grabarla:

    import infoauth
    secrets = {'some-stuff': 'foo', 'code': 67}
    infoauth.dump(secrets, os.path.expanduser("~/.secrets"))

Fijate que como grabar los tokens es algo que normalmente se hace una sola vez, seguro es más práctico hacerlo desde la linea de comandos, como se muestra a continuación...

Por eso, ¿cómo usarlo desde la linea de comandos? Para mostrar la info:

    $ infoauth show ~/.my-mail-auth
    password: ...
    user: ...

Y para grabar un archivo con los datos::

    $ infoauth create ~/.secrets some-stuff=foo code=67

Fijate que al crear el archivo desde la linea de comandos tenemos la limitación de que todos los valores almacenados van a ser cadenas de texto; si querés grabar otros tipos de datos, como enteros, listas, o lo que quieras, tendrías que usar la forma programática que se muestra arriba.

Esta es la página del proyecto, y claro que está en PyPI así que se puede usar sin problema desde fades (guiño, guiño).

Read more
Marcus Haslam

Introducing the new Snapcraft branding

Early developments of the the Snapcraft brand mark

If you’re a regular visitor to or any of its associated sites, you will have noticed a change recently to the logo and overall branding which has been in development over the past few months. We have developed a stand-alone brand for Snapcraft, the command line tool for writing and publishing software as a snap. One of the challenges we faced was how to create a brand for Snapcraft that stands out in its own right yet fitted in with the existing Ubuntu brand and be a part of the extended family. To achieve this, we took reference from the Suru visual language. The Suru philosophy stems from our brand values alluding to Japanese culture. Working with paper metaphors we were inspired by origami as a solid and tangible foundation.

We identified the key attributes of:

Simple – to package leveraging your existing tools

Integrated – easily with build and CL infrastructure

Effortless – roll back versions

Easy – integrate with build and CI infrastructures

We worked with these whilst keeping with the overall Ubuntu brand values of:





We started exploring the Origami language and felt the idea of a bird fitted well with the attributes. We worked at simplifying the design until we had it at it’s most minimal we could whilst retaining the elegance of a bird in flight.

A new colour scheme was developed to have a clean fresh look while sitting comfortably with the primary and secondary Ubuntu colour palette. We used the Ubuntu font for consistency with the overall parent brand, and in this instance we used the thin weight along side the light to create a dynamic word mark reflecting the values of Freedom, Precision, Collaboration and Reliable.


Colour Palette


We have a number of different Lock-ups we can use for different circumstances




Read more
Colin Ian King

Kernel Commits with "Fixes" tag

Over the past 5 years there has been a steady increase in the number of kernel bug fix commits that use the "Fixes" tag.  Kernel developers use this annotation on a commit to reference an older commit that originally introduced the bug, which is obviously very useful for bug tracking purposes. What is interesting is that there has been a steady take-up of developers using this annotation:

With the 4.15 release, 1859 of the 16223 commits (11.5%) were tagged as "Fixes", so that's a fair amount of work going into bug fixing.  I suspect there are more commits that are bug fixes, but aren't using the "Fixes" tag, so it's hard to tell for certain how many commits are fixes without doing deeper analysis.  Probably over time this tag will be widely adopted for all bug fixes and the trend line will level out and we will have a better idea of the proportion of commits per release that are just devoted to fixing issues.  Let's see how this looks in another 5 years time,  I'll keep you posted!

Read more
Christian Brauner

History Of Linux Containers By Serge Hallyn

alt text

Serge Hallyn recently wrote a post outlining the actual history of containers on Linux. Worth a read!


Read more

De trabajo en Hungría

Estuve una semana en Budapest, trabajando en un sprint con otros compañeros de equipo y de otros equipos, en general.

El viaje largo, pero sin sorpresas... sólo el detalle que me perdieron la valija en el viaje de ida :(. Cuando fui a hacer el reclamo, se fijaron y la ubicaron en Frankfurt (donde era la escala) y me dijeron que llegaba esa noche. Incluso me dieron un pelpa para que el hotel pueda recibir la valija por mí. Obviamente, cuando hice el checkin les comenté la situación. A las diez de la noche golpearon la puerta de la habitación y era alguien del hotel con mi valija \o/.

Así y todo tuve que salir vestido como venía (pantalón finito y zapatillas náuticas) a pasear durante la tarde... y me cagué de frío, aunque tenía polar y campera. Salimos a pegar una vuelta con Naty, Matías y Guillo, y caminamos un par de horas a la tardecita, antes de que caiga el sol, porque luego teníamos el coctel de bienvenida de la empresa. Aunque era "de día", estaba muy nublado, y eso, hacía mucho frío...

El auto tapado de hielo, ese frío hacía

El agua se congelaba a mitad del chorro (?)

En general no paseé demasiado, porque los días eran grises y fríos, y cuando terminábamos el día laboral (entre las 17:30 y las 18) ya era de noche. Excepto el viernes, que terminamos a las 16hs, y encima salió el sol. Y el sábado, claro, que salí a pegar una vuelta durante la mañana y mediodía. A diferencia de los primeros días, ya teníamos como 12°C, estábamos como queríamos (?)

Los primeros días nos poníamos toda la ropa que teníamos

Nerdeando en un bar de cerveza artesanal; indoor las temperaturas eran otras, claro

El sábado hice paseo por la zona del Danubio, subí un montecito donde estaba la Estatua de la Libertad (levantada originalmente en 1947, en recuerdo a la liberación soviética de Hungría durante la Segunda Guerra Mundial, finalizando la ocupación nazi), fui al mercado central de la ciudad, y caminé bastante para un lado y para el otro.

La gente, en general, educada. La mayoría no sabe inglés, incluso en zonas turísticas y en lugares como para comer o comprar cosas "de turistas", así que a veces uno vuelve a la típica charla de señas y sonidos varios. O se termina hablando en italiano, como nos pasó en una heladería :p.

Estatua de la Libertad

Claro, los otros días también estuvimos caminando por acá y por allá, pero en general de noche y con todos los negocios (excepto los relacionados a comer y beber) cerrados... Budapest realmente es una ciudad distinta antes y después de las 18 horas (porque a las seis cierran la mayoría de los negocios, y ya es de noche...).

Pero bueno, eso obviamente no impidió que saliéramos a comer, y yo me dediqué a los gulashs. El gulash, originario de Hungría, justamente, es simplemente un estofado de carne, y de ahí salen muchas variantes... con papa, sin papa, con spätzl chicos, grandes o directamente sin nada de eso, con cebolla o sin, etc... siempre con carne, cocida varias horas, apenas picante (por eso se lo acompaña con alguna salsita para apicantarlo, como hacemos nosotros con el locro), y MUY RICO.

A varios gulashs, le entré

Todas las fotos, acá.

Read more

Tenemos algunos eventos de Python Argentina en el primer semestre del año. Vayamos cronológicamente.

El miércoles 4 de Abril, 19hs,  hacemos un meetup de Python Argentina en Devecoop. Tendremos un par de charlas técnicas, y quizás hagamos un "Consultorio Python", veremos cómo lo armamos. Si no están anotados en el meetup de Python Buenos Aires, regístrense así reciben las noticias de estos meetups y se anotan fácil y etc.

Del 28 de Abril al 1 de Mayo tenemos una nueva edición del PyCamp, nuevamente en Baradero. Ya está abierta la inscripción, considerando incluso descuentos para socios de Python Argentina. Pueden buscar toda la info en la página del evento.

También tenemos dos PyDays, ambos en Mayo. Sí, dos, por ahora... habrán más en el año. El primero en La Plata, todavía pendiente de definir cual sábado, y el segundo en Corrientes, el sábado 19. Ya les traeré más noticias cuando nos acerquemos a las fechas.

Por otro lado, estamos armando la PyCon Argentina de este año en Buenos Aires. Fecha todavía no tenemos (estamos buscando el lugar, todavía) pero será Septiembre, Octubre o Noviembre.

Todo esto es en parte gracias a la Asociación Civil de Python Argentina, que nos permite tener un marco estructural con el cual movernos y que no nos venga la AFIP a meter en cana por mover guita por ahí... así que si querés apoyarnos en general considerá hacerte socia/o vos o ayudanos pasando este folleto en donde trabajes para que esa empresa/cooperativa lo considere también. Gracias!

Read more

Hello MAASters!

I’m happy to announce that MAAS 2.4.0 alpha 2 has now been released and is available for Ubuntu Bionic.
MAAS Availability
MAAS 2.4.0 alpha 1 is available in the Bionic -proposed archive or in the following PPA:

MAAS 2.4.0 (alpha2)

Important announcements

NTP services now provided by Chrony

Starting with 2.4 Alpha 2, and in common with changes being made to Ubuntu Server, MAAS replaces ‘ntpd’ with Chrony for the NTP protocol. MAAS will handle the upgrade process and automatically resume NTP service operation.

Vanilla CSS Framework Transition

MAAS 2.4 is undergoing a Vanilla CSS framework transition to a new version of vanilla, which will bring a fresher look to the MAAS UI. This framework transition is currently work in progress and not all of the UI have been fully updated. Please expect to see some inconsistencies in this new release.

New Features & Improvements

NTP services now provided by Chrony.

Starting from MAAS 2.4alpha2, chrony is now the default NTP service, replacing ntpd. This work has been done to align with the Ubuntu Server and Security team to support chrony instead of ntpd. MAAS will continue to provide services exactly the same way and users will not be affected by the changes, handling the upgrade process transparently. This means that:

  • MAAS will configure chrony as peers on all Region Controllers
  • MAAS will configure chrony as a client of peers for all Rack Controllers
  • Machines will use the Rack Controllers as they do today

MAAS Internals optimization

MAAS 2.4 is currently undergoing major surgery to improve various areas of operation that are not visible to the user. These updates will improve the overall performance of MAAS in larger environments. These improvements include:

  • AsyncIO based event loop
    • MAAS has an event loop which performs various internal actions. In older versions of MAAS, the event loop was managed by the default twisted event loop. MAAS now uses an asyncio based event loop, driven by uvloop, which is targeted at improving internal performance.

  • Improved daemon management
    • MAAS has changed the way daemons are run to allow users to see both ‘regiond’ and ‘rackd’ as processes in the process list.
    • As part of these changes, regiond workers are now managed by a master regiond process. In older versions of MAAS each worker was directly run by systemd. The master process is now in charge of ensuring workers are running at all times, and re-spawning new workers in case of failures. This also allows users to see the worker hierarchy in the process list.
  • Ability to increase the number of regiond workers
    • Following the improved way MAAS daemons are run, further internal changes have been made to allow the number of regiond workers to be increased automatically. This allows MAAS to scale to handle more internal operations in larger environments.
    • While this capability is already available, it is not yet available by default. It will become available in the following milestone release.
  • Database query optimizations
    • In the process of inspecting the internal operations of MAAS, it was discovered that multiple unnecessary database queries are performed for various operations. Optimising these requires internal improvements to reduce the footprint of these operations. Some areas that have been addressed in this release include:
      • When saving node objects (e.g. making any update of a machine, device, rack controller, etc), MAAS validated changes across various fields. This required an increased number of queries for fields, even when they were not being updated. MAAS now tracks specific fields that change and only performs queries for those fields.
        • Example: To update a power state, MAAS would perform 11 queries. After these improvements, , only 1 query is now performed.
      • On every transaction, MAAS performed 2 queries to update the timestamp. This has now been consolidated into a single query per transaction.
    • These changes  greatly improves MAAS performance and database utilisation in larger environments. More improvements will continue to be made as we continue to examine various areas in MAAS.
  • UI optimisations
    • MAAS is now being optimised to reduce the amount of data loaded in the websocket API to render the UI. This is targeted at only processing data for viewable information, improving various legacy areas. Currently, the work done in this area includes:
      • Script results are only loaded for viewable nodes in the machine listing page, reducing the overall amount of data loaded.
      • The node object is updated in the websocket only when something has changed in the database, reducing the data transferred to the clients as well as the amount of internal queries.

Audit logging

Continuing with the audit logging improvements, alpha2 now adds audit logging for all user actions that affect Hardware Testing & Commissioning.

KVM pod improvements

MAAS’ KVM pods was initially developed as a feature to help developers quickly iterate and test new functionality while developing MAAS. This, however, because a feature that allow not only developers, but also administrators to make better use of resources across their datacenter. Since the feature was initially create for developers, some features were lacking. As such, in 2.4 we are improving the usability of KVM pods:

  • Pod AZ’s.
    MAAS now allows setting the physical zone for the pod. This helps administrators by conceptually placing their KVM pods in a AZ, which enables them to request/allocate machines on demand based on its AZ. All VM’s created from a pod will inherit the AZ.

  • Pod tagging
    MAAS now adds the ability to set tags for a pod. This allows administrators to use tags to allow/prevent the creation of a VM inside the pod using tags. For example, if the administrator would like a machine with a ‘tag’ named ‘virtual’, MAAS will filter all physical machines and only consider other VM’s or a KVM pod for machine allocation.

Bug fixes

Please refer to the following for all bug fixes in this release.

Read more

In chapter 5 of his mind-blowing “The Road to Reality”, Penrose devotes a section to complex powers, that is, to the solutions to

$$w^z~~~\text{with}~~~w,z \in \mathbb{C}$$

In this post I develop a bit more what he exposes and explore what the solutions look like with the help of some simple Python scripts. The scripts can be found in this github repo, and all the figures in this post can be replicated by running

git clone
cd exponential-spiral; ./

The scripts make use of numpy and matplotlib, so make sure those are installed before running them.

Now, let’s develop the math behind this. The values for \(w^z\) can be found by using the exponential function as

$$w^z=e^{z\log{w}}=e^{z~\text{Log}~w}e^{2\pi nzi}$$

In this equation, “log” is the complex natural logarithm multi-valued function, while “Log” is one of its branches, concretely the principal value, whose imaginary part lies in the interval \((−\pi, \pi]\). In the equation we reflect the fact that \(\log{w}=\text{Log}~w + 2\pi ni\) with \(n \in \mathbb{Z}\). This shows the remarkable fact that, in the general case, we have infinite solutions for the equation. For the rest of the discussion we will separate \(w^z\) as follows:

$$w^z=e^{z~\text{Log}~w}e^{2\pi nzi}=C \cdot F_n$$

with constant \(C=e^{z~\text{Log}~w}\) and the rest being the sequence \(F_n=e^{2\pi nzi}\). Being \(C\) a complex constant that multiplies \(F_n\), the only influence it has is to rotate and scale equally all solutions. Noticeably, \(w\) appears only in this constant, which shows us that the \(z\) values are what is really determinant for the number and general shape of the solutions. Therefore, we will concentrate in analyzing the behavior of \(F_n\), by seeing what solutions we can find when we restrict \(z\) to different domains.

Starting by restricting \(z\) to integers (\(z \in \mathbb{Z}\)), it is easy to see that there is only one resulting solution in this case, as the factor \(F_n=e^{2\pi nzi}=1\) in this case (it just rotates the solution \(2\pi\) radians an integer number of times, leaving it unmodified). As expected, a complex number to an integer power has only one solution.

If we let \(z\) be a rational number (\(z=p/q\), being \(p\) and \(q\) integers chosen so we have the canonical form), we obtain

$$F_n=e^{2\pi\frac{pn}{q} i}$$

which makes the sequence \(F_n\) periodic with period \(q\), that is, there are \(q\) solutions for the equation. So we have two solutions for \(w^{1/2}\), three for \(w^{1/3}\), etc., as expected as that is the number of solutions for square roots, cube roots and so on. The values will be the vertex of a regular polygon in the complex plane. For instance, in figure 1 the solutions for \(2^{1/5}\) are displayed.

Figure 1

Fig. 1: The five solutions to \(2^{1/5}\)

If \(z\) is real, \(e^{2\pi nzi}\) is not periodic anymore has infinite solutions in the unit circle, and therefore \(w^z\) has infinite values that lie on a circle of radius \(|C|\).

In the more general case, \(z \in \mathbb{C}\), that is, \(z=a+bi\) being \(a\) and \(b\) real numbers, and we have

$$F_n=e^{-2\pi bn}e^{2\pi ani}.$$

There is now a scaling factor, \(e^{-2\pi bn}\) that makes the module of the solutions vary with \(n\), scattering them across the complex plane, while \(e^{2\pi ani}\) rotates them as \(n\) changes. The result is an infinite number of solutions for \(w^z\) that lie in an equiangular spiral in the complex plane. The spiral can be seen if we change the domain of \(F\) to \(\mathbb{R}\), this is

$$F(t)=e^{-2\pi bt}e^{2\pi ati}~~~\text{with}~~~t \in \mathbb{R}.$$

In figure 2 we can see one example which shows some solutions to \(2^{0.4-0.1i}\), plus the spiral that passes over them.

Fig. 2: Roots for \(2^{0.4-0.1i}\)
Fig. 2: Roots and spiral for \(2^{0.4-0.1i}\)

In fact, in Penrose’s book it is stated that these values are found in the intersection of two equiangular spirals, although he leaves finding them as an exercise for the reader (problem 5.9).

Let’s see then if we can find more spirals that cross these points. We are searching for functions that have the same value as \(F(t)\) when \(t\) is an integer. We can easily verify that the family of functions

$$F_k'(t)=F(t)e^{2\pi kti}~~~\text{with}~~~k \in \mathbb{Z}$$

are compatible with this restriction, as \(e^{2\pi kti}=1\) in that case (integer \(t\)). Figures 3 and 4 represent again some solutions to \(2^{0.4-0.1i}\), \(F(t)\) (which is the same as the spiral for \(k=0\)), plus the spirals for \(k=-1\) and \(k=1\) respectively. We can see there that the solutions lie in the intersection of two spirals indeed.

Fig. 3
Fig. 3: Roots for \(2^{0.4-0.1i}\) plus spirals for k=0 and k=-1

Fig. 4
Fig. 4: Roots for \(2^{0.4-0.1i}\) plus spirals for k=0 and k=1

If we superpose these 3 spirals, the ones for \(k=1\) and \(k=-1\) cross also in places different to the complex powers, as can be seen in figure 5. But, if we choose two consecutive numbers for \(k\), the two spirals will cross only in the solutions to \(w^z\). See, for instance, figure 6 where the spirals for \(k=\{-2,-1\}\) are plotted. We see that any pair of such spirals fulfills Penrose’s description.

Fig. 5
Fig. 5: Roots for \(2^{0.4-0.1i}\) plus spirals for k=-1,0,1

Fig. 6
Fig. 6: Roots for \(2^{0.4-0.1i}\) plus spirals for k=-1,-2

In general, the number of places at which two spirals cross depends on the difference between their \(k\)-number. If we have, say, \(F_k’\) and \(F_l’\) with \(k>l\), they will cross when


That is, they will cross when \(t\) is an integer (at the solutions to \(w^z\)) and also at \(k-l-1\) points between consecutive solutions.

Let’s see now another interesting special case: when \(z=bi\), that is, it is pure imaginary. In this case, \(e^{2\pi ati}\) is \(1\), and there is no turn in the complex plane when \(t\) grows. We end up with the spiral \(F(t)\) degenerating to a half-line that starts at the origin (which is reached when \(t=\infty\) if \(b>0\)). This can be appreciated in figure 7, where the line and the spirals for \(k=-1\) and \(k=1\) are plotted for \(20^{0.1i}\). The two spirals are mirrored around the half-line.

Fig. 7
Fig. 7: Roots for \(10^{0.1i}\), \(F(t)\), and spirals for k=-1,1

Digging more into this case, it turns out that a pure imaginary number to a pure imaginary power can produce a real result. For instance, for \(i^{0.1i}\), we see in figure 8 that the roots are in the half-positive real line.

Fig. 8
Fig. 8: Roots for \(i^{0.1i}\), \(F(t)\), and spirals for k=-1,1

That something like this can produce real numbers is a curiosity that has intrigued historically mathematicians (\(i^i\) has real values too!). And with this I finish the post. It is really amusing to start playing with the values of \(w\) and \(z\), if you want to do so you can use the python scripts I pointed to in the beginning of the post. I hope you enjoyed the post as much as I did writing it.

Read more
Christian Brauner

Mutexes And fork()ing In Shared Libraries

alt text


In this short - let’s call it “semi-informative rant” - I’m going to be looking at mutexes and fork() in shared libraries with threaded users. I’m going to leave out other locking primitives including semaphores and file locks which would deserve posts of their own.

The Stuff You Came Here For

A mutex is simply put one of the many synchronization primitives to protect a range of code usually referred to as “critical section” from concurrent operations. Reasons for using them are many. Examples include:

  • avoiding data corruption through multiple writers changing the same data structure at the same time
  • preventing readers from retrieving inconsistent data because a writer is changing the data structure at the same time
  • .
  • .
  • .

In its essence it is actually a pretty easy concept once you think about it. You want ownership of a resource, you want that ownership to be exclusive, you want that ownership to be limited from t_1 to t_n where you yield it. In the language of C and the pthread implementation this can be expressed in code e.g. as:

static pthread_mutex_t thread_mutex = PTHREAD_MUTEX_INITIALIZER;

static int some_function(/* parameters of relevance*/)
        int ret;

        ret = pthread_mutex_lock(&thread_mutex);
        if (ret != 0) {
                /* handle error */

        /* critical section */

        ret = pthread_mutex_unlock(&thread_mutex);
        if (ret != 0) {
                /* handle error */

Using concepts like mutexes in a shared library is always a tricky thing. What I mean by that is that if you can avoid them avoid them. For a start, mutexes usually come with a performance impact. The size of the impact varies with a couple of different parameters e.g. how long the critical section is. Depending on what you are doing these performance impacts might or might not matter to you or not even register as significant. So the performance impact argument is a difficult one to make. Usually programmers with a decent understanding of locking can find ways to minimize the impact of mutexes by toying with the layout and structure of critical sections ranging from choosing the right data structures to simply moving code out of critical sections.

There are better arguments to be made against casually using mutexes though. One is closely coupled to what type of program you’re writing. If you’re like me coming from the background of a low-level C shared library like LXC you will at some point find yourself thinking about the question whether there’s any possibility that you might be used in threaded contexts. If you can confidently answer this question with “no” you can likely stop caring and move on. If you can’t then you should think really really hard in order to avoid mutexes. The problem is a classical one and I’m not going to do a deep dive as this has been done before all over the web. What I’m alluding to is of course the mess that is fork()ing in threads. Most shared libraries that do anything interesting will likely want to fork() of helper tasks in API functions. In threaded contexts this quickly becomes a source of undefined behavior. The way fork()ing in threads works is that only the thread that called fork() gets duplicated in the child, the others are terminated. Given that fork() duplicates memory state, locking etc. all of which is shared amongst threads you quickly run into deadlocks whereby mutexes that were held in other threads are never unlocked. But it can also cause nasty undefined behavior whereby file pointers via e.g. fopen() although set up so as to be unique to each thread get corrupted due to inconsistent locking caused by e.g. dynamically allocating memory via malloc() or friends in the child process because behind the scenes in a lot of libcs mutexes are used when allocating memory.

The possibilities for bugs are endless. Another good example is the use of the exit() function to terminate child processes. The exit() function is not thread-safe since it runs standard and user-registered exit handlers from a shared resource. This is a common source of process corruption. The lesson here is of course to always use _exit() instead of exit(). The former is thread-safe and doesn’t run exit handlers. But that presupposes that you don’t care about exit handlers.

A lot of these bugs are hard to understand, debug, and - to be honest - even to explain given that they are a mixture of undefined behavior and legal thread and fork() semantics.

Running Handlers At fork()

Of course, these problems were realized early on and one way to address those is to register handlers that would be called at each fork(). In the pthread slang the name of the function to register such handlers is appropriately “pthread_atfork()”. In the case of mutexes this means you would register three handlers that would be called at different times at fork(). One right before the fork() - prepare handler - to e.g. unlock any implicitly held mutexes. One to be called after fork() processing has finished in the child - child handler - and one called after fork() processing in the parent finishes - parent handler. In the pthread implementation and for a shared library this would likely look something like this:

void process_lock(void)
        int ret;

	ret = pthread_mutex_lock(&thread_mutex);
        if (ret != 0)

void process_unlock(void)
        int ret;

	ret = pthread_mutex_unlock(&thread_mutex);
        if (ret != 0)

__attribute__((constructor)) static void __register_atfork_handlers(void)
        /* Acquire lock right before fork() processing to avoid undefined
         * behavior by unlocking an unlocked mutex. Then release mutex in child
         * and parent.
        pthread_atfork(process_lock, process_unlock, process_unlock);

While this sounds like a reasonable approach it has various and serious drawbacks:

  1. These atfork handlers come with a cost that - again depending on your program - you maybe would like to avoid.
  2. They don’t allow you to explicitly hold a lock when fork()ing in the same task depending on what handlers you are registering.

    This is straightforward. Let’s reason about the following code sequence for a minute ignoring whether holding the mutex would make sense that way:

             int ret, status;
             pid_t pid;
             pid = fork();
             if (pid < 0)
                     return -1;
             if (pid == 0) {
                     /* critical section */
             ret = waitpid(pid, &status, 0);
             if (ret < 0) {
                     if (errno == EINTR)
                             goto again;
                     return -1;
             if (ret != pid)
                     goto again;
             if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
                     return -1;
             return 0;

    No let’s add the logic caused by pthread_atfork() in there (The mutex annotation is slightly misleading but should make things a little easier to follow):

             int ret, status;
             pid_t pid;
             process_lock(); /* <mutex 1> (explicitly acquired) */
             process_lock(); /* <mutex 2> (implicitly acquired by prepare atfork handler) */
             pid = fork();
             if (pid < 0)
                     return -1;
             if (pid == 0) {
                     /* <mutex 1> held (transparently held) */
                     /* <mutex 2> held (opaquely held) */
                     /* critical section */
                     process_unlock(); /* <mutex 1> (explicitly released) */
                     process_unlock(); /* <mutex 2> (implicitly released by child atfork handler) */
             process_unlock(); /* mutex_2 (implicitly released by parent atfork handler) */
             process_unlock(); /* mutex_1 (explicitly released) */
             ret = waitpid(pid, &status, 0);
             if (ret < 0) {
                     if (errno == EINTR)
                             goto again;
                     return -1;
             if (ret != pid)
                     goto again;
             if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
                     return -1;
             return 0;

    That doesn’t look crazy at a first glance. But let’s explicitly look at the problem:

     int ret, status;
     pid_t pid;
     process_lock(); /* <mutex 1> (explicitly acquired) */
     process_lock(); /* **DEADLOCK** <mutex 2> (implicitly acquired by prepare atfork handler) */
     pid = fork();
     if (pid < 0)
             return -1;
  3. They aren’t run when you use clone() (which obviously is a big deal for a container API like LXC). So scenarios like the following are worrying:
     /* premise: some other thread holds a mutex */
     pid_t pid;
     void *stack = alloca(/* standard page size */);
     /* Here atfork prepare handler needs to be run but won't. */
     pid = clone(foo, stack + /* standard page size */, SIGCHLD, NULL);
     if (pid < 0)
             return -1;

    The point about clone() is interestingly annoying. Since clone() is Linux specific there’s no POSIX standard that gives you a guarantee that atfork handlers are run or that they are not run. That’s up to the implementation (read “libc in question”). Currently glibc doesn’t run atfork handlers but if my fellow maintainers and I where to build consensus that it would be a great idea to change it in the next release then we would be free to do so (Don’t worry, we won’t.). So to make sure that no atfork handlers are run you need to go directly through the syscall() helper that all libcs should provide. This should give you a strong enough guarantee. That is of course an excellent solution if you don’t care about atfork handlers. However, when you do care about them you better not use clone().

  4. Running a subset of already registered atfork handlers is a royal pain.

    This relates back to the earlier point about e.g. wanting to explicitly hold a lock in a task while fork()ing. In this case you might want to exclude the handler right before the fork() that locks the mutex. If you need to do this then you’re going to have to venture into the dark dark land of function interposition. Something which is really ugly. It’s like asking how to make Horcruxes or - excuse the pun - fork()cruxes. Sure, you’ll eventually trick some low-level person into explaining it to you because it’s just such a weird and exotic thing to know or care about but that explanation will ultimately end with phrases such as “That’s all theoretical, right?” or “You’re not going to do this, right?” or - the most helpful one (honestly) - “The probability that something’s wrong with your programm’s design is higher than the probability that you really need interposition wrappers.”. In this specific case interposing pthread_atfork() would probably involve using pthread_once() calling dlsym(RTLD_NEXT, "pthread_atfork") and recording the function pointer in a global variable. Additionally, you likely want to start maintaining a jump table (essentially an array of function pointers) and register a callback wrapper around the jump table entries. You can then go on to call the callback in pthread_atfork() with different indices into the jump table. If you’re super ambitious (read “insane”) you could then have a different set of callbacks for each fork() in your program. Also, I just told you how to make a fork()crux. Let me tell you while I did this for “fun” once there’s a limit to how dirty you can feel without hating yourself. Also, this is all theoretical, right?

The list could go on and be even more detailed but the gist is: if there’s a chance that your shared library is called in threaded contexts try to come up with a design that lets you avoid mutexes and atfork handlers. On the road to LXC 3.0 we’ve recently managed to kick out all mutexes and atfork handlers of which there were very few already. This has greatly improved our confidence in threaded use cases. This is especially important since we have API consumers that call LXC from inherently threaded contexts such as the Go runtime. LXD obviously is the prime example but also the general go-lxc bindings are threaded API consumers. To be fair, we’ve never had issues before as mutexes were extremely rare in the first place but one should always remember that no locking is the best locking. :)


  • Coming back once more to the point about running atfork handlers. Atfork handlers are of course an implementation detail in the pthread and POSIX world. They are by no means a conceptual necessity when it comes to mutexes. But some standard is better than no standard when it comes to system’s design. Any decent libc implementation supporting pthread will very likely also support atfork handlers (even Bionic has gained atfork support along the way). But this immediately raises another problem as it requires programming languages on POSIX systems to go through the system’s libc when doing a fork(). If they don’t then atfork handlers won’t be run even if you call fork(). One prime example is Go. The syscall and sys/unix packages will not go through the system’s libc. They will directly do the corresponding syscall. So atfork handlers are not available when fork()ing in Go. Now, Go is a little special as it doesn’t support fork() properly in the first place because of all the reasons (and more) I outlined above.
  • Solaris:

    Let’s talk about Solaris for a minute. Before I said

    The way fork()ing in threads works is that only the thread that called fork() gets duplicated in the child, the others are terminated.

    That is an implementation detail of the pthread world. There are other implementations that don’t terminate all other threads but the calling thread. One example are Solaris Threads (or as I like to call it sthread). Actually, - hold on to your seats - sthreads support both semantics. Specifically, the sthread implementation used to have fork1() and fork() where fork1() would only duplicate the fork1()ing thread and fork() would duplicate all threads. The fork() behavior was obviously dependent on whether you linked with -lpthread or -lthread on Solaris which of course was a massive source of confusion. (Changing the behavior of functions depening on linker flags seeems like a good way into anarchy.) So Solaris started enforcing pthread semantics for fork() for both -lthread and -lpthread and added forkall() to support duplicating all threads.


Read more
Christian Brauner

alt text

Hey everyone,

This is another update about the development of LXC 3.0.

A few days ago the pam module has been moved out of the LXCFS tree and into the LXC tree. This means LXC 3.0 will be shipping with included. The pam module has been placed under the flags --enable-pam and --disable-pam. By default is disabled. Distros that are currently shipping through LXCFS should adapt their packaging accordingly and pass --enable-pam during the configure stage of LXC.

What’s That Pam Module Again?

Let’s take short detour (“short” cough cough). LXC has supported fully unprivileged containers since 2013 when user namespace support was merged into the kernel. (/me tips hat to Serge Hallyn and Eric Biedermann). Fully unprivileged containers are containers using user namespaces and idmappings which are run by normal (non-root) users. But let’s not talk about this let’s show it. The first asciicast shows a fully unprivileged system container running with a rather complex idmapping in a new user namespace:


The second asciicast shows a fully unprivileged application container running without a mapping for root inside the container. In fact, it runs with just a single idmap that maps my own host uid 1000 and host gid 1000 to container uid 1000 and container gid 1000. Something which I can do without requiring any privilege at all. We’ve been doing this a long time at LXC:


As you can see no non-standard privileges are used when setting up and running such containers. In fact, you could remove even the standard privileges all unprivileged users have available through standard system tools like newuidmap and newgidmap to setup idmappings (This is what you see in the second asciicast.). But this comes at a price, namely that cgroup management is not available for fully unprivileged containers. But we at LXC want you to be able to restrict the containers your run in the same way that the system administrator wants to restrict unprivileged users themselves. This is just good practice to prevent excessive resource consumption. What this means is that you should be free to delegate resources that you have been given by the system administrator to containers. This e.g. allows you to limit the cpu usage of the container, or the number of processes it is allowed to spawn, or the memory it is allowed to consume. But unprivileged cgroup management is not easily possible with most init system. That’s why the LXC team came up with a long time ago to make things easier. In essence, the pam module takes care of placing unprivileged users into writable cgroups at login. The cgroups that are supposed to be writable can be specified in the corresponding pam configuration file for your distro (probably something under /etc/pam.d). For example, if you wanted your user to be placed into a writable cgroup for all enabled cgroup hierarchies you could specify all:

session	optional -c all

If you only want your user to be placed into writable cgroups for the freezer, memory, unified and the named systemd hierarchy you would specify:

session	optional -c freezer,memory,name=systemd,unified

This would lead to create the common cgroup user and also create a cgroup just for my own user in there. For example, my user is called chb. This would cause to create the /sys/fs/cgroup/freezer/user/chb/0 inside the freezer hierarchy. If finds that your init system has already placed your users inside a session specific cgroup it will be smart enough to detect it and re-use that cgroup. This is e.g. the case for the named systemd cgroup hierarchy.

> cat /proc/self/cgroup


Read more
Christian Brauner

alt text

Hey everyone,

This is another update about the development of LXC 3.0.

We are currently in the process of moving various parts of LXC out of the main LXC repository and into separate repositories.

Splitting Out The Language Bindings For Lua And Python 3

The lua language bindings will be moved into the new lua-lxc repository and the Python 3 bindings to the new python3-lxc repository. This is in line with other language bindings like Python 2 (see python2-lxc) that were always kept out of tree.

Splitting Out The Legacy Template Build System

A big portion of the LXC templates will be moved to the new lxc-templates repository. LXC used to maintain simple shell scripts to build container images from for a lot of distributions including CentOS, Fedora, ArchLinux, Ubuntu, Debian and a lot of others. While the shell scripts worked well for a long time they suffered from the problem that they were often different in terms of coding style, the arguments that they expected to be passed, and the features they supported. A lot of things these shells scripts did when creating an image is not needed any more. For example, most distros nowadays provide a custom cloud image suitable for containers and virtual machines or at least provide their own tooling to build clean new images from scratch. Another problem we saw was that security and maintenance for the scripts was not sufficient. This is why we decided to come up with a simple yet elegant replacement for the template system that would still allow users to build custom LXC and LXD container images for the distro of their choice. So the templates will be replaced by distrobuilder as the preferred way to build LXC and LXD images locally. distrobuilder is a project my colleague Thomas is currently working on. It aims to be a very simple Go project focussed on letting you easily build full system container images by either using the official cloud image if one is provided by the distro or by using the respective distro’s recommended tooling (e.g. debootstrap for Debian or pacman for ArchLinux). It aims to be declarative, using the same set of options for all distributions while having extensive validation code to ensure everything that’s downloaded is properly validated.

After this cleanup only four POSIX shell compliant templates will remain in the main LXC repository:

  • busybox

This is a very minimal template which can be used to setup a busybox container. As long as the busybox binary is found you can always built yourself a very minimal privileged or unprivileged system or application container image; no networking or any other dependencies required. All you need to do is:

lxc-create c3 -t busybox


  • download

This template lets you download pre-built images from our image servers. This is likely what currently most users are using to create unprivileged containers.

  • local

This is a new template which consumes standard LXC and LXD system container images. A container can be created with:

lxc-create c1 -t local -- --metadata /path/to/meta.tar.xz --fstree /path/to/rootfs.tar.xz

where the --metadata flag needs to point to a file containing the metadata for the container. This is simply the standard meta.tar.xz file that comes with any pre-built LXC container image. The --fstree flag needs to point to a filesystem tree. Creating a container is then just:


  • oci

This is the template which can be used to download and run OCI containers. Using it is as simple as:

lxc-create c2 -t oci -- --url docker://alpine

Here’s another asciicast:


Read more

Durante el veranito nos tomamos unos días con la familia para pasear por distintos lados.

Sí, hace un montón. Es que nos tardamos en procesar las fotos... no escala; para las próximas vacaciones largas, creo que voy a poner a la familia a revisar fotos durante las vacaciones mismas, sino llegamos y tenemos 1238941 fotos, y revisando un cachito durante la cena en los sucesivos días, no terminamos más... al punto incluso que todavía no están listas! Pero bueno, ya no espero más, saco este post, las fotos vendrán después...


Entre Navidad y Año Nuevo nos fuimos en carpa a Mar Azul. Apuntábamos al camping al que habíamos ido cuando Felipe era pequeño, pero ese ya no existe más y el único que quedó es el Camping de Los Ingenieros, que estuvo bastante bien (aunque el baño que teníamos cerca nunca tuvo agua caliente, y en general le faltan mesas).

La pasamos muy bien, aunque hizo mucho calor y en algunos momentos era agobiante (apenas sobrevivíamos bajo unas sombritas poco densas). Claro, está el mar para meterse, pero hay toda una franja horaria en la que no íbamos a llevar a los chicos a la playa para que se calcinen bajo el sol.

Preparándonos para ir a la playa

Almorzando en el camping

Obviamente fuimos en carpa, y nos dimos cuenta que ya es un poco chica para los cuatro. Eso, sumado a que se quebró una vara de la estructura y rasgó el techo, nos estaría forzando a comprar una carpa nueva más grande, tenemos que ver qué hay...

La playa de Mar Azul es muy linda, bastante ancha, y aunque en algunos momentos había bastante gente, nunca estuvimos amuchados (pero a veces se complicaba armar una cancha de tejo grande...).

Atardecer en la playa, el día que llegamos

Jugando al tejo con Felipe

Al Este de las Sierras

En Enero nos fuimos a Córdoba un par de semanas (bien en contra mano de "findes de semana" o "quincenas", para no desquiciarnos al viajar). Estuvimos unos días en El Espinillo, y después cruzamos las sierras y nos quedamos otros días en La Población.

Un minuto de tranquilidad :)

El primer lugar, en El Espinillo, era en un complejo de tres cabañas con una pileta compartida. La pileta estuvo muy bien, ya que esos días hizo mucho calor entonces los chicos se metieron varias veces y la disfrutaron un montón. Con Moni no nos metimos tanto pero aprovechamos para tomar sol. Incluso, como a veces la gente de las otras cabañas se iban temprano, tuve un par de días en los que pude tomar sol en bolas lo más tranquilo.

Además de la pileta, también hicimos "agua" en varios ríos cercanos (Del Espinillo, Del Medio, y Los Reartes)... algunos con más sombra, otros con más pastito alrededor, algunos con el piso de piedra, en otros teníamos playita de arena... muchas combinaciones, lo importante era que nos podíamos meter un rato al agua, tomar sol, disfrutar el verde, pasarla bien.

Los chicos disfrutando la pileta a full

En un arroyo de por ahíEn un arroyo de por ahí

Uno de los días fuimos a visitar la Villa General Belgrano. Estuvo bueno porque visitamos el Museo Politemático Castillo Romano y paseamos un rato, pero el restaurant al qué íbamos siempre ya no es lo que era (la cerveza artesanal como aguada, la comida meh, la atención dejó que desear) y en general es una ciudad "demasiado turística" a la que no llegamos a disfrutar.

Un paseo que sí nos encantó de esos días fue el del Parque Recreativo de La Serranita, un lugar bárbaro con mil juegos y actividades. Hicimos minigolf, arquería, futpool, jugamos al tejo, carrera de bolitas, nos tiramos por un tobogán gigante, recorrimos un laberinto, hicimos actividades de treparse y equilibrio, adivinanzas y un montón de actividades más... estuvimos como seis horas ahí adentro (sin contar el almuerzo), súper recomendado para ir con la familia!

Practicando el mini-golf

Felipe atacando un helado con ganas

Otra visita que disfrutamos fue a Nico y Jesi en La Quintana, donde pasamos una tarde bárbara charlando y tomando mate. Estuvo complicado llegar porque ya en viaje nos agarró una lluvia torrencial, y cuando llegamos a La Quintana los caminos estaban complicados... no sólo por el barro, sino también porque la lluvia había sido tanta que cada calle era un pequeño arroyo.

Encima tuve la suerte de presenciar (y casi participar!) en la instalación del primer prototipo funcional del LibreRouter en su antena, :D

Visitando La Quintana

En el monumento a Mercedes Sosa

Las Sierras, lado Oeste

A mitad de las vacaciones cambiamos de lugar. Hicimos un pequeño viaje cruzando las sierras y nos fuimos hasta el segundo lugar que teníamos alquilado.

Era una casita mediana, súper equipada, en el medio de un inmenso terreno, cerca de una casa bien grande que era donde vivía la dueña. El paisaje era muy lindo, la verdad, todo muy cuidado, verde por doquier, hermoso.

Los peques, contemplando el paisaje

Foto grupal en una de las construcciones de la Villa

Los días de la segunda fase estuvieron en general más frescos, lo que nos llevó a realizar menos actividades "acuáticas", pero no por eso aprovechamos menos las vacaciones: además de un par de veces en la pileta, y algunas en ríos/arroyos, "pancheamos" a full y paseamos algo.

A nivel de chapoteo, nos fuimos una vez a un Balneario en Paso de las Tropas, cerca de Nono, con la idea de almorzar, pasar la tarde, y luego apuntar para el Museo Rocsen, pero se nos fueron pasando las horas y después ya estábamos muy cansados, así que nos volvimos, dejamos el museo para la próxima. También encontramos una zona linda para aprovechar un arroyo cerca de donde nos hospedábamos (recorrimos un caminito con el auto hasta que no se pudo más, y luego un tramo a pie siguiendo el arroyo, hasta llegar a una linda zona para pasar la tarde). La verdad es que la primera vez nos quedamos en un punto con una pequeña cascadita, pero la segunda vez como cuando llegamos ya había gente, seguimos caminando un poco más y encontramos un lugar mejor :)

En un arroyito con unas cascadas hermosas

Preparando el almuerzo al costado del agua

Con respecto a los paseos, fuimos varias veces a la Plaza de San Javier, unos kilómetros al norte. No sólo para las compras de supermercado y eso, sino porque había feria artesanal todos los días, entonces visitamos varias veces, e incluso caimos de casualidad en una obra de títeres que estuvo muy buena. Encima era una zona donde había varios lugares lindos para comer, e incluso un pequeño parque cervecero/heladería, al cual fuimos más de una vez :)

También hicimos varios kilómetros para el sur y visitamos la fábrica de aceite de oliva Sierra Pura. Nos encantó. Te reciben, te muestran los árboles contándote las diferencias entre los tipos de olivo que hay, luego te muestran las máquinas con las que producen, y te explican todo el proceso, contestándote las preguntas que se te ocurran. Como no hay turnos ni horarios, llegás, y al rato ya te hacen el paseo, sean 3, 7, o 15 personas. Al final, hay una degustación de todos los aceites que hacen (tres blends, tres varietales, y muchos saborizados), y obvio uno puede comprar ahí con algún descuento :)

Malena, disfrutando del aire libre

En fin. Vacaciones, paseo, descanso. Ahora de nuevo en la jungla, desde hace rato...

Read more
Colin Ian King

Linux Kernel Module Growth

The Linux kernel grows at an amazing pace, each kernel release adds more functionality, more drivers and hence more kernel modules.  I recently wondered what the trend was for kernel module growth per release, so I performed module builds on kernels v2.6.24 through to v4.16-rc2 for x86-64 to get a better idea of growth rates: one can see, the rate of growth is relatively linear with about 89 modules being added to each kernel release, which is not surprising as the size of the kernel is growing at a fairly linear rate too.  It is interesting to see that the number of modules has easily more than tripled in the 10 years between v2.6.24 and v4.16-rc2,  with a rate of about 470 new modules per year. At this rate, Linux will see the 10,000th module land in around the year 2025.

Read more
Dustin Kirkland

One of the many excellent suggestions from last year's HackerNews thread, Ask HN: What do you want to see in Ubuntu 17.10?, was to refresh the Ubuntu server's command line installer:

We're pleased to introduce this new installer, which will be the default Server installer for 18.04 LTS, and solicit your feedback.

Follow the instructions below, to download the current daily image, and install it into a KVM.  Alternatively, you could write it to a flash drive and install a physical machine, or try it in your virtual machine of your choice (VMware, VirtualBox, etc.).

$ wget
$ qemu-img create -f raw target.img 10G
$ kvm -m 1024 -boot d -cdrom bionic-live-server-amd64.iso -hda target.img
$ kvm -m 1024 target.img

For those too busy to try it themselves at the moment, I've taken a series of screenshots below, for your review.

Finally, you can provide feedback, bugs, patches, and feature requests against the Subiquity project in Launchpad:


Read more