Canonical Voices

I’ve been working on a D-Bus service to replace some of the management guts of my project for a while now. We started out with a simple service, but some of our management processes take a long time to run, causing timeout errors when clients call the corresponding methods. I needed a way to run these tasks in the background and report status to any interested clients. I’d like to outline my approach to making this possible. This will be a multi-part blog series starting from the bottom: a very simple, synchronous D-Bus service. By the end of the series, we’ll have a small codebase with asynchronous tasks which D-Bus clients can interact with (input and output).

All of this code is written with Python 3.5 on Ubuntu 17.04 (beta), is MIT licensed, and can be found on GitHub: https://github.com/larryprice/python-dbus-blog-series/tree/part1.

What is D-Bus?

From Wikipedia:

In computing, D-Bus or DBus (for “Desktop Bus”), a software bus, is an inter-process communication (IPC) and remote procedure call (RPC) mechanism that allows communication between multiple computer programs (that is, processes) concurrently running on the same machine.

D-Bus allows different processes to communicate indirectly through a known interface. The bus can be system-wide or user-specific (session-based). A D-Bus service publishes a set of objects, each exposing methods which D-Bus clients can consume. It’s at the heart of much Linux desktop software, allowing processes to communicate with one another without forcing direct dependencies.
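If you want a feel for what’s already on your session bus, a quick way to poke at it from Python is to list the registered bus names. Here’s a minimal sketch using dbus-python (the same library used throughout this series); it skips unique connection names purely for readability.

#!/usr/bin/env python3
# List the well-known names currently registered on the session bus.
import dbus

bus = dbus.SessionBus()
for name in bus.list_names():
    # Names starting with ":" are unique connection names; the rest are
    # well-known service names such as "org.freedesktop.Notifications".
    if not name.startswith(":"):
        print(name)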

A synchronous service

Let’s start by building the base of a simple, synchronous service. We’re going to initialize a loop as a context to run our service within, claim a unique name for our service on the session bus, and then start the loop.

service
#!/usr/bin/env python3

import dbus, dbus.service, dbus.exceptions
import sys

from dbus.mainloop.glib import DBusGMainLoop
from gi.repository import GLib

# Initialize a main loop
DBusGMainLoop(set_as_default=True)
loop = GLib.MainLoop()

# Declare a name where our service can be reached
try:
    bus_name = dbus.service.BusName("com.larry-price.test",
                                    bus=dbus.SessionBus(),
                                    do_not_queue=True)
except dbus.exceptions.NameExistsException:
    print("service is already running")
    sys.exit(1)

# Run the loop
try:
    loop.run()
except KeyboardInterrupt:
    print("keyboard interrupt received")
except Exception as e:
    print("Unexpected exception occurred: '{}'".format(str(e)))
finally:
    loop.quit()

Make this binary executable (chmod +x service) and run it. Your service should run indefinitely and do… nothing. Although we’ve already written a lot of code, we haven’t added any objects or methods which can be accessed on our service. Let’s fix that.

dbustest/random_data.py
import dbus.service
import random

class RandomData(dbus.service.Object):
    def __init__(self, bus_name):
        super().__init__(bus_name, "/com/larry_price/test/RandomData")
        random.seed()

    @dbus.service.method("com.larry_price.test.RandomData",
                         in_signature='i', out_signature='s')
    def quick(self, bits=8):
        return str(random.getrandbits(bits))

We’ve defined a D-Bus object RandomData which can be accessed using the object path /com/larry_price/test/RandomData; this slash-separated style is the standard format for object paths. RandomData implements an interface called com.larry_price.test.RandomData with a single method quick, declared with the @dbus.service.method decorator. quick takes a single parameter, bits, which must be an integer as designated by the in_signature argument of the decorator, and it returns a string as specified by the out_signature argument. All quick does is return a random number as a string given a number of bits. It’s simple and it’s fast.
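To illustrate how the signature strings map to Python parameters, here’s a hypothetical second method (not part of the project code) that could live inside the same RandomData class; 's' denotes a D-Bus string and 'i' a 32-bit integer.

    @dbus.service.method("com.larry_price.test.RandomData",
                         in_signature='si', out_signature='s')
    def repeat(self, text, times):
        # 's' maps to a D-Bus string and 'i' to a 32-bit signed integer;
        # the return value must be a string to satisfy out_signature='s'.
        return text * times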

Now that we have an object, we need to declare an instance of that object in our service to attach it properly. Let’s assume that random_data.py is in a directory dbustest with an empty __init__.py, and our service binary is still sitting in the root directory. Just before we start the loop in the service binary, we can add the following code:

service
# ...
# Run the loop
try:
    # Create our initial objects
    from dbustest.random_data import RandomData
    RandomData(bus_name)

    loop.run()
# ...

We don’t need to do anything with the object we’ve initialized; creating it is enough to attach it to our D-Bus service and prevent it from being garbage collected until the service exits. We pass in bus_name so that RandomData will connect to the right bus name.

A synchronous client

Now that we have an object with an available method on our service, you’re probably interested in calling that method. You can do this on the command line with something like dbus-send, or you could find the service using a GUI tool such as d-feet and call the method directly. But eventually we’ll want to do this from a custom program, so let’s build a very small one to get started.

client
#!/usr/bin/env python3

# Take in a single optional integral argument
import sys
bits = 16
if len(sys.argv) == 2:
    try:
        bits = int(sys.argv[1])
    except ValueError:
        print("input argument must be integer")
        sys.exit(1)

# Create a reference to the RandomData object on the session bus
import dbus, dbus.exceptions
try:
    bus = dbus.SessionBus()
    random_data = bus.get_object("com.larry-price.test", "/com/larry_price/test/RandomData")
except dbus.exceptions.DBusException as e:
    print("Failed to initialize D-Bus object: '%s'" % str(e))
    sys.exit(2)

# Call the quick method with the given number of bits
print("Your random number is: %s" % random_data.quick(bits))

A large chunk of this code is parsing an input argument as an integer. By default, client will request a 16-bit random number unless it gets a number as input from the command line. Next we spin up a reference to the session bus and attempt to find our RandomData object on the bus using our known service name and object path. Once that’s initialized, we can directly call the quick method over the bus with the specified number of bits and print the result.
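One note on the proxy object: bus.get_object returns a generic proxy, and the call to quick is dispatched dynamically. If an object ever exposes more than one interface, you can be explicit about which one you mean by wrapping the proxy with dbus.Interface. A small sketch of the same call made that way:

#!/usr/bin/env python3
# Sketch: the same call as in client, made through an explicit interface wrapper.
import dbus

bus = dbus.SessionBus()
proxy = bus.get_object("com.larry-price.test", "/com/larry_price/test/RandomData")
random_data = dbus.Interface(proxy, dbus_interface="com.larry_price.test.RandomData")
print("Your random number is: %s" % random_data.quick(16))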

Make this binary executable also. If you try to run client without running service, you should see an error message explaining that the com.larry-price.test D-Bus service is not running (which would be true). Start service, and then run client with a few different input options and observe the results:

$ ./service & # to kill service later, be sure to note the pid here!
$ ./client
Your random number is: 41744
$ ./client 100
Your random number is: 401996322348922753881103222071
$ ./client 4
Your random number is: 14
$ ./client "new donk city"
input argument must be integer

That’s all there is to it. A simple, synchronous server and client. The server and client do not directly depend on each other but are able to communicate unidirectionally through simple method calls.

Next time

Next time, I’ll go into detail on how we can create an asynchronous service and client, and hopefully utilize signals to add a new direction to our communication.

Again, all the code can be found on Github: https://github.com/larryprice/python-dbus-blog-series/tree/part1.

Read more
Inayaili de León Persson

Last month the web team ran its first design sprint as outlined in The Sprint Book, by Google Ventures’ Jake Knapp. Some of us had read the book recently and really wanted to give the method a try, following the book to the letter.

In this post I will outline what we’ve learned from our pilot design sprint, what went well, what could have gone better, and what happened during the five sprint days. I won’t go into too much detail explaining what each step of the design sprint consists of — for that you have the book. If you don’t have that kind of time, but would still like to know what I’m talking about, here’s an 8-minute video that explains the concept:

 

Before the sprint

One of the first things you need to do when running a design sprint is to agree on a challenge you’d like to tackle. Luckily, we had a big challenge that we wanted to solve: ubuntu.com’s navigation system.

 

ubuntu.com’s different levels of navigation: global nav, main nav, second and third level nav

 

Assigning roles

If you’ve decided to run a design sprint, you’ve also probably decided who will be the Facilitator. If you haven’t, you should, as this person will have work to do before the sprint starts. In our case, I was the Facilitator.

My first Facilitator task was to make sure we knew who was going to be the Decider at our sprint.

We also agreed on who was going to participate, and booked one of our meeting rooms for the whole week plus an extra one for testing on Friday.

My suggestion for anyone running a sprint for the first time is to also name an Assistant. There is so much work to do before and during the sprint, that it will make the Facilitator’s life a lot easier. Even though we didn’t officially name anyone, Greg was effectively helping to plan the sprint too.

Evangelising the sprint

In the week that preceded the sprint, I had a few conversations with other team members who told me the sprint sounded really great and they were going to ‘pop in’ whenever they could throughout the week. I had to explain that, sadly, this wasn’t going to be possible.

If you need to do the same, explain why it’s important that the participants commit to the entire week, focusing on the importance of continuity and of accumulated knowledge that the sprint’s team will gather throughout the week. Similarly, be pleasant but firm when participants tell you they will have to ‘pop out’ throughout the week to attend to other matters — only the Decider should be allowed to do this, and even so, there should be a deputy Decider in the room at all times.

Logistics

Before the sprint, you also need to make sure that you have all the supplies you need. I tried as much as possible to follow the suggestions for materials outlined in the book, and I even got a Time Timer. In retrospect, it would have been fine for the Facilitator to just keep time on a phone, or a less expensive gadget if you really want to be strict with the no-phones-in-the-room policy.

Even though the book says you should start recruiting participants for the Friday testing during the sprint, we started a week before that. Greg took over that side of the preparation, sending prompts on social media and mailing lists for people to sign up. When participants didn’t materialise in this manner, Greg sent a call for participants to the mailing list of the office building we work at, which worked wonders for us.

Know your stuff

Assuming you have read the book before your sprint: if it’s your first sprint, I recommend re-reading the chapter for the following day the evening before, and taking notes.

I printed out the checklists provided on the book’s website and wrote down my notes for the following day, so everything would be in one place.

 

Facilitator checklists with handwritten notes

 

I also watched the official video for the day (which you can get emailed to you by the Sprint Bot the evening before), and read all the comments in the Q&A discussions linked to from the emails. These questions and comments from other people who have run sprints were incredibly useful throughout the week.

 

Sprint Bot email for the first day of the sprint

 

Does this sound like a lot of work? It was. I think if/when we do another sprint the time spent preparing will probably be reduced by at least 50%. The uncertainty of doing something as involved as this for the first time made it more stressful than preparing for a normal workshop, but it’s important to spend the time doing it so that things run smoothly during the sprint week.

Day 1

The morning of the sprint I got in with plenty of time to spare to set up the room for the kick-off at 10am.

I bought lots of healthy snacks (which were promptly frowned on by the team, who were hoping for sweeter treats); brought a jug of water and cups, and all the supplies to the room; cleared the whiteboards; and set up the chairs.

What follows are some of the outcomes, questions and other observations from our five days.

Morning

In the morning of day 1 you define a long term goal for your project, list the ways in which the project could fail in question format, and draw a flowchart, or map, of how customers interact with your product.

  • Starting the map was a little bit tricky, as it wasn’t clear how the map should look when there is more than one type of customer, each of whom might have different outcomes
  • In the book there are no examples with more than one type of customer, which meant we had to read and re-read that part of the book until we decided how to proceed, as we have several customer types to cater for
  • Moments like these can take the team’s confidence in the process away, which is why it’s important for the Facilitator to read everything carefully more than once, and ideally not to be the only person to do so
  • We did the morning exercises much faster than prescribed, but the same didn’t happen in the afternoon!

 

Discussing the target for the sprint in front of the journey map

 

Afternoon

In the afternoon experts from the sprint and guests come into the room and you ask them lots of questions about your product and how things work. Throughout the interviews the team is taking notes in the “How Might We” format (for example, “How might we reduce the amount of copy?”). By the end of the interviews, you group the notes into themes, vote on the ones you find most useful or interesting, move the most voted notes onto their right place within your customer map and pick a target in the map as the focus for the rest of the sprint.

  • If you have time, explain how “How Might We” notes work before the lunch break, so you save that time for interviews in the afternoon
  • Each expert interview should last for about 15-30 minutes, which didn’t feel like long enough to get all the valuable knowledge from our experts — we had to interrupt them somewhat abruptly to make sure the interviews didn’t run over. Next time it might be easier to have a list of questions we want to cover before the interviews start
  • Choreographing the expert interviews was a bit tricky as we weren’t sure how long each would take. If possible, tell people you’ll call them a couple of minutes before you need them rather than set a fixed time — we had to send people back a few times because we weren’t yet finished asking all our questions of the previous person!
  • It took us a little longer than expected to organise the notes, but in the end, the most voted notes did cluster around the key section of the map, as predicted in the book!

 

Some of the How Might We notes on the wall after the expert interviews

 

Other thoughts on day 1

  • Sprint participants might cancel at the last minute. If this happens, ask yourself whether they could still appear as experts on Monday afternoon. If not, it’s probably better to write them off the sprint completely
  • There was a lot of checking the book as the day went by, to confirm we were doing the right thing
  • We wondered if this comes up in design sprints frequently: what if the problem you set out to solve pre-sprint doesn’t match the target area of the map at the end of day 1? In our case, we had planned to focus on navigation but the target area was focused on how users learn more about the products/services we offer

A full day of thinking about the problem and mapping it doesn’t come naturally, but it was certainly useful. We conduct frequent user research and usability testing, and are used to watching interviews and analysing findings; nevertheless, the expert interviews and listening to different perspectives from within the company were very interesting and gave us a different type of insight that we could build upon during the sprint.

Day 2

By the start of day 2, it felt like we had been in the sprint for a lot longer than just one day — we had accomplished a lot on Monday!

Morning

The morning of day 2 is spent doing “Lightning Demos” after a quick 20 minutes of research. These can be anything that might be interesting, from competitor products to previous internal attempts at solving the sprint challenge. Before lunch, the team decides who will sketch what in the afternoon: whether everyone will sketch the same thing or different parts of the map.

  • We thought the “Lightning Demos” format was a great way to do demos — it was fast and captured the most important things quickly
  • Deciding who would sketch what wasn’t as straightforward as we might have thought. We decided that everyone should do a journey through our cloud offerings so we’d get different ideas on Wednesday, knowing there was the risk of not everything being covered in the sketches
  • Before we started sketching, we made a list of sections/pages that should be covered in the storyboards
  • As on day 1, we did the morning exercises faster than prescribed; we were finished by 12:30, with a 30-minute break from 11 to 11:30

 

Our sketches from the lightning demos

 

Afternoon

In the afternoon, you take a few minutes to walk around the sprint room and take down notes of anything that might be useful for the sketching. You then sketch, starting with quick ideas and moving onto a more detailed sketch. You don’t look at the final sketches until Wednesday morning.

  • We spent the first few minutes of the afternoon looking at the current list of participants for the Friday testing to decide which products to focus on in our sketches, as our options were many
  • We had a little bit of trouble with the “Crazy 8s” exercise, where you’re supposed to sketch 8 variations of one idea in 8 minutes. It wasn’t clear what we had to do so we re-read that part a few times. This is probably the point of the exercise: to remove you from your comfort zone, make you think of alternative solutions and get your creative muscles warmed up
  • We had to look at the examples of detailed sketches in the book to have a better idea of what was expected from our sketches
  • It took us a while to get started sketching but after a few minutes everyone seemed to be confidently and quietly sketching away
  • With complicated product offerings there’s the instinct to want to have access to devices to check product names, features, etc – I assumed this was not allowed but some people were sneakily checking their laptops!
  • Naming your sketch wasn’t as easy as it sounded
  • Contrary to what we expected, the afternoon sketching exercises took longer than the morning’s: at 5pm some people were still sketching

 

Everyone sketching in silence on Tuesday afternoon

 

Tuesday was lots of fun. Starting the day with the demos, without much discussion on the validity of the ideas, creates a positive mood in the team. Sketching in a very structured manner removes some of the fear of the blank page, as you build up from loose ideas to a very well-defined sketch. The silent sketching was also great as it meant we had some quiet time to pause and think a solution through, giving the people who tend to be more quiet an opportunity to have their ideas heard on par with everyone else.

Day 3

No-one had seen the sketches done on Tuesday, so the build-up to the unveiling on day 3 was more exciting than for the usual design review!

Morning

On the Wednesday morning, you decide which sketch (or sketches) you will prototype. You stick the sketches on the wall and review them in silence, discuss each sketch briefly and each person votes on their favourite. After this, the Decider casts three votes, which may or may not follow the votes of the rest of the team. Whatever the Decider votes on will be prototyped. Before lunch, you decide whether you will need to create one or more prototypes, depending on whether the Decider’s (or Deciders’) votes fit together or not.

  • We had 6 sketches to review
  • Although the book wasn’t clear as to when the guest Decider should participate, we invited ours from 10am to 11.30am as it seemed that he should participate in the entire morning review process — this worked out well
  • During the speed critique people started debating the validity or feasibility of solutions, which was expected but meant some work for the Facilitator to steer the conversation back on track
  • The morning exercises put everyone in a positive mood, it was an interesting way to review and select ideas
  • Narrating the sketches was harder than it might seem at first, and narrating your own sketch isn’t much easier either!
  • It was interesting to see that many of the sketches included similar solutions — there were definite patterns that emerged
  • Even though I emphasised that the book recommends more than one prototype, the team wasn’t keen on it and the focus of the pre-lunch discussion was mostly on how to merge all the voted solutions into one prototype
  • As on all other days, and because we decided on an all-in-one prototype, we finished the morning exercises by noon

 

The team reviewing the sketches in silence on Wednesday morning

 

Afternoon

In the afternoon of day 3, you sketch a storyboard of the prototype together, starting one or two steps before the customer encounters your prototype. You should move the existing sketches into the frames of the storyboard when possible, and add only enough detail that will make it easy to build the prototype the following day.

  • Using masking tape was easier than drawing lines for the storyboard frames
  • It was too easy to come up with new ideas while we were drawing the storyboard and it was tricky to tell people that we couldn’t change the plan at this point
  • It was hard to decide the level of detail we needed to discuss and add to the storyboard. We finished the first iteration of the storyboard a few minutes before 3pm. Our first instinct was to start making more detailed wireframes with the remaining time, but we decided to take a break for coffee and come back to see where we needed more detail in the storyboard instead
  • It was useful to keep asking the team what else we needed to define as we drew the storyboard before we started building the prototype the following day
  • Because we read out the different roles in preparation for Thursday, we ended up assigning roles straight away

 

Discussing what to add to our storyboard

 

Other thoughts on day 3

  • One sprint participant couldn’t attend on Tuesday, but was back on Wednesday, which wasn’t ideal but didn’t impact negatively
  • While setting up for the third day, I wasn’t sure if the ideas from the “Lightning Demos” could be erased from the whiteboard, so I took a photo of them and erased them as, even with the luxury of massive whiteboards, we wouldn’t have had space for the storyboard later on!

By the end of Wednesday we were past the halfway mark of the sprint, and the excitement in anticipation for the Friday tests was palpable. We had some time left before the clock hit 5 and wondered if we should start building the prototype straight away, but decided against it — we needed a good night’s sleep to be ready for day 4.

Day 4

Thursday is all about prototyping. You need to choose which tools you will use, prioritising speed over perfection, and you also need to assign different roles for the team so everyone knows what they need to do throughout the day. The interviewer should write the interview script for Friday’s tests.

  • For the prototype building day, we assigned: two writers, one interviewer, one stitcher, two makers and one asset collector
  • We decided to build the pages we needed with HTML and CSS (instead of using a tool like Keynote or InVision) as we could build upon our existing CSS framework
  • Early in the afternoon we were on track, but we were soon delayed by a wifi outage which lasted for almost 1.5 hours
  • It’s important to keep communication flowing throughout the day to make sure all the assets and content that are needed are created or collected in time for the stitcher to start stitching
  • We were finished by 7pm — if you don’t count the wifi outage, we probably would have been finished by 6pm. The extra hour could have been curtailed if there had been just a little bit more detail in the storyboard page wireframes and in the content delivered to the stitcher, and fewer last minute tiny changes, but all-in-all we did pretty well!

 

Joana and Greg, the maker and asset collector, working on the prototype

 

Other thoughts on day 4

  • We had our sprint in our office, so it would have been possible for us to ask for help from people outside of the sprint, but we didn’t know whether this was “allowed”
  • We could have assigned more work to the asset collector: the makers and the stitcher were looking for assets themselves as they created the different components and pages rather than delegating the search to the asset collector, which is how we normally work
  • The makers were finished with their tasks more quickly than expected — not having to go through multiple rounds of reviews that sometimes can take weeks makes things much faster!

By the end of Thursday there was no denying we were tired, but happy about what we had accomplished in such a small amount of time: we had a fully working prototype and five participants lined up for Friday testing. We couldn’t wait for the next day!

Day 5

We were all really excited about the Friday testing. We managed to confirm all five participants for the day, and had an excellent interviewer and solid prototype. As the Facilitator, I was also happy to have a day where I didn’t have a lot to do, for a change!

Thoughts and notes on day 5

On Friday, you test your prototype with five users, taking notes throughout. At the end of the day, you identify patterns within the notes and based on these you decide which should be the next steps for your project.

  • We’re lucky to work in a building with lots of companies who employ our target audience, but we wonder how difficult it would have been to find and book the right participants within just 4 days if we needed different types of users or were based somewhere else
  • We filled up an entire whiteboard with notes from the first interview and had to go get extra boards during the break
  • Throughout the day, we removed duplicate notes from the boards to make them easier to scan
  • Some participants don’t talk a lot naturally and need a lot of constant reminding to think out loud
  • We had the benefit of having an excellent researcher in our team who already knows and does everything the book recommends doing. It might have been harder for someone with less research experience to make sure the interviews were unbiased and ran smoothly
  • At the end of the interviews, after listing the patterns we found, we weren’t sure whether we could/should do more thorough analysis of the testing later or if we should chuck the post-it notes in the bin and move on
  • Our end-of-sprint decision was to have a workshop the following week where we’d plan a roadmap based on the findings — could this be considered “cheating” as we’re only delaying making a decision?

 

The team observing the interviews on Friday

 

A wall of interview notes

 

The Sprint Book notes that you can have one of two results at the end of your sprint: an efficient failure, or a flawed success. If your prototype doesn’t go down well with the participants, your team has only spent 5 days working on it, rather than weeks or potentially months — you’ve failed efficiently. And if the prototype receives positive feedback from participants, most likely there will still be areas that can be improved and retested — you’ve succeeded imperfectly.

At the end of Friday we all agreed that our prototype was a flawed success: we tested things we’d never have thought to try before, and they received great feedback, but some aspects certainly needed a lot more work to get right. An excellent conclusion to 5 intense days of work!

Final words

Despite the hard work involved in planning and getting the logistics right, running the web team’s trial design sprint was fun.

The web team is small and stretched over many websites and products. We really wanted to test this approach so we could propose it to the other teams we work with as an efficient way to collaborate at key points in our release schedules.

We certainly achieved this goal. The people who participated directly in the sprint learned a great deal during the five days. Those in the web team who didn’t participate were impressed with what was achieved in one week and welcoming of the changes it initiated. And the teams we work with seem eager to try the process out in their teams, now that they’ve seen what kind of results can be produced in such a short time.

How about you? Have you run a design sprint? Do you have any advice for us before we do it again? Leave your thoughts in the comments section.

Read more
Stéphane Graber

Introduction

I maintain a number of development systems that are used as throw-away machines to reproduce LXC and LXD bugs by the upstream developers. I use MAAS to track who’s using what and to have the machines deployed with whatever version of Ubuntu or CentOS is needed to reproduce a given bug.

A number of those systems are proper servers with hardware BMCs on a management network that MAAS can drive using IPMI. Another set of systems are virtual machines that MAAS drives through libvirt.

But I’ve long had another system I wanted to get in there. That machine is a desktop computer but with a server grade SAS controller and internal and external arrays. That machine also has a Fiber Channel HBA and Infiniband card for even less common setups.

The trouble is that, being a desktop computer, it lacks any kind of remote management that MAAS supports. That machine does however have a good PCIe network card which provides reliable wake-on-lan.

Back in the day (MAAS 1.x), there was a wake-on-lan power type that would have covered my use case. This feature was however removed from MAAS 2.x (see LP: #1589140) and the development team suggests that users who want the old wake-on-lan feature instead install Ubuntu 14.04 and the old MAAS 1.x branch.

Implementing Wake on LAN in MAAS 2.x

I am, however, not particularly willing to install an old Ubuntu release and an old version of MAAS just for that one trivial feature, so I instead spent a bit of time to just implement the bits I needed and keep a patch around to be re-applied whenever MAAS changes.

MAAS doesn’t provide a plugin system for power types, so I unfortunately couldn’t just write a plugin and distribute that as an unofficial power type for those who need WOL. I instead had to resort to modifying MAAS directly to add the extra power type.

The code change needed to re-implement a wake-on-lan power type is pretty simple and only took me a few minutes to sort out. The patch can be found here: https://dl.stgraber.org/maas-wakeonlan.diff

To apply it to your MAAS, do:

sudo apt install wakeonlan
wget https://dl.stgraber.org/maas-wakeonlan.diff
sudo patch -p1 -d /usr/lib/python3/dist-packages/provisioningserver/ < maas-wakeonlan.diff
sudo systemctl restart maas-rackd.service maas-regiond.service

Once done, you’ll now see this in the web UI:

After selecting the new “Wake on LAN” power type, enter the MAC address of the network interface that you have WOL enabled on and save the change.

MAAS will then be able to turn the system on, allowing for the normal commissioning and deployment stages. For everything else, this power type behaves like the “Manual” type, asking the user to manually go shut down or reboot the system, as you can’t do that through Wake on LAN.
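For the curious, turning a machine on over Wake on LAN really just means broadcasting a “magic packet”: 6 bytes of 0xFF followed by the target MAC address repeated 16 times, which is what the wakeonlan tool used above does. Here is a rough Python sketch of the same idea (the MAC address is a placeholder); it is purely illustrative and not the MAAS patch itself.

#!/usr/bin/env python3
# Send a wake-on-lan magic packet: 6x 0xFF followed by the MAC repeated 16 times.
import socket

def wake(mac):
    payload = bytes.fromhex("FF" * 6 + mac.replace(":", "") * 16)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(payload, ("255.255.255.255", 9))

wake("00:11:22:33:44:55")  # placeholder MAC address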

Note that you’ll have to re-apply part of the patch whenever MAAS is updated. The patch modifies two files and adds a new one. The new file won’t be removed during an upgrade, but the two modified files will get reverted and need patching again.

Conclusion

This is certainly a hack and if your system supports anything better than Wake on LAN, or you’re willing to buy a supported PDU just for that one system, then you should do that instead.

But if the inability to turn a system on is all that stands in your way from adding it to your MAAS, as was the case for me, then that patch may help you.

I hope that in time MAAS will either get that feature back in some way or get a plugin system that I can use to ship that extra power type in its own separate package without needing to alter any of MAAS’ own files.

Read more
Alan Griffiths

MirAL 1.3.2

There’s a bugfix MirAL release (1.3.2) available in ‘Zesty Zapus’ (Ubuntu 17.04) and the so-called “stable phone overlay” ppa for ‘Xenial Xerus’ (Ubuntu 16.04LTS). MirAL is a project aimed at simplifying the development of Mir servers and particularly providing a stable ABI and sensible default behaviors.

The bugfixes in 1.3.2 are:

In libmiral, a couple of “fails to build from source” fixes:

  • Fix FTBFS against Mir < 0.26 (Xenial, Yakkety)
  • Update to fix FTBFS against lp:mir (and clang)

In the miral-shell example, a crash fixed:

  • With latest zesty’s libstdc++-6-dev, miral-shell will crash when trying to draw its background text. (LP: #1677550)

Some of the launch scripts have been updated to reflect a change to the way GDK chooses the graphics backend:

  • Change the server and client launch scripts to avoid using the default Mir socket (LP: #1675794)
  • Update miral-xrun to match GDK changes (LP: #1675115)

In addition, a misspelling of “management” has been corrected:

  • miral/set_window_management_policy.h

Read more
Cemil Azizoglu

Yeay, the new Mesa (17.0.2-1ubuntu2) has landed! (Many thanks to Timo.) This new Mesa incorporates a new EGL backend for Mir (as a distro patch). We will be moving away from the old backend by Mir 1.0, but for now both the new and old backends coexist.

This new backend has been implemented as a new platform in Mesa EGL so that we can easily rip out the old platform when we are ready. Being ready means switching _all_ the EGL clients out there to the new Mesa EGL types exported by this backend.

In case you are wondering, the new EGL types are [1]:

MirConnection* –> EGLNativeDisplayType

MirSurface* –> EGLNativeWindowType

Note that we currently use MirRenderSurface for what will soon be renamed to MirSurface. So at the moment, technically we have MirRenderSurface* as the EGLNativeWindowType.

Once we feel confident we will be pushing this patch upstream as well.

There should be no visible differences in your EGL applications due to this change, which is a good thing. If you are curious about the code differences that this new backend introduces, check out the ‘eglapp’ wrapper that we use in a number of our example apps:

http://bazaar.launchpad.net/~mir-team/mir/development-branch/view/head:/examples/eglapp.c

The new backend is activated by the ‘-r’ switch which sets the ‘new_egl’ flag, so you can see what is done differently in the code by looking at how this flag makes the code change.

Our interfaces are maturing and we are a big step closer to Mir 1.0.

-Cemil

[1] Mir does not support pixmaps.

Read more
Brandon Schaefer

When Choosing a Backend Fails

There was a recent GDK release into zesty that now probes for Mir over X11. This can cause issues when still using an X11 desktop such as Unity7 when a Mir server is running at the same time.

A common way to test Mir is to run it on top of X, which is called Mir-on-X. This means there are now two display servers running at the same time.

An example of an issue this can cause involves gnome-terminal-server: once the Mir server is opened, it will attempt to spawn its clients on Mir instead of X11. When you then try to open a new terminal, gnome-terminal-server crashes, since it tries to spawn the new terminal on Mir while your existing terminals were already spawned on X. As you can imagine, this is frustrating for your workflow!

A simple workaround is to add this to your ~/.profile:

if [ "$XDG_CURRENT_DESKTOP" = "Unity:Unity7" ]; then
    dbus-update-activation-environment --systemd GDK_BACKEND=x11
fi

Depending on your desktop the “Unity:Unity7” bit will change.

As more toolkits start to pick other display servers as their first choice, more of these issues will become possible. Other environment variables to consider:

SDL_VIDEODRIVER
QT_QPA_PLATFORM

A bit more detail on the issue can be found here:

Choosing a Backend

Read more
Cemil Azizoglu

Hi, I’ve been wanting to have a blog for a while now. I am not sure if I’ll have the time to post on a regular basis but I’ll try.

First things first : My name is Cemil (pronounced JEH-mil), a.k.a. ‘camako’ on IRC – I work as a developer and am the team-lead in the Mir project.

Recently, I’ve been working on Mir 1.0 tasks, new Mesa EGL platform backend for Mir, Vulkan Mir WSI driver for Mesa, among other things.

Here’s something pretty for you to look at for now :

https://plus.google.com/113725654283519068012/posts/8jmrQnpJxMc

-Cemil

Read more
Stéphane Graber

LXD logo

USB devices in containers

It can be pretty useful to pass USB devices to a container. Be that some measurement equipment in a lab or maybe more commonly, an Android phone or some IoT device that you need to interact with.

Similar to what I wrote recently about GPUs, LXD supports passing USB devices into containers. Again, similarly to the GPU case, what’s actually passed into the container is a Unix character device, in this case, a /dev/bus/usb/ device node.

This restricts USB passthrough to those devices and software which use libusb to interact with them. For devices which use a kernel driver, the module should be installed and loaded on the host, and the resulting character or block device be passed to the container directly.
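As a quick way to check what a container actually sees through libusb once a device has been added (which we’ll do below), you can enumerate devices from inside the container. Here’s a minimal sketch using pyusb, assuming the python3-usb package is installed in the container; it’s just an illustration, not part of the LXD tooling.

import usb.core

# List the USB devices visible through the /dev/bus/usb nodes passed to the container.
for dev in usb.core.find(find_all=True):
    print("Bus {:03d} Device {:03d}: ID {:04x}:{:04x}".format(
        dev.bus, dev.address, dev.idVendor, dev.idProduct))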

Note that for this to work, you’ll need LXD 2.5 or higher.

Example (Android debugging)

As an example which quite a lot of people should be able to relate to, let’s run an LXD container with the Android debugging tools installed, accessing a USB-connected phone.

This would for example allow you to have your app’s build system and CI run inside a container and interact with one or multiple devices connected over USB.

First, plug your phone over USB, make sure it’s unlocked and you have USB debugging enabled:

stgraber@dakara:~$ lsusb
Bus 002 Device 003: ID 0451:8041 Texas Instruments, Inc. 
Bus 002 Device 002: ID 0451:8041 Texas Instruments, Inc. 
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 021: ID 17ef:6047 Lenovo 
Bus 001 Device 031: ID 046d:082d Logitech, Inc. HD Pro Webcam C920
Bus 001 Device 004: ID 0451:8043 Texas Instruments, Inc. 
Bus 001 Device 005: ID 046d:0a01 Logitech, Inc. USB Headset
Bus 001 Device 033: ID 0fce:51da Sony Ericsson Mobile Communications AB 
Bus 001 Device 003: ID 0451:8043 Texas Instruments, Inc. 
Bus 001 Device 002: ID 072f:90cc Advanced Card Systems, Ltd ACR38 SmartCard Reader
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

Spot your phone in that list, in my case, that’d be the “Sony Ericsson Mobile” entry.

Now let’s create our container:

stgraber@dakara:~$ lxc launch ubuntu:16.04 c1
Creating c1
Starting c1

And install the Android debugging client:

stgraber@dakara:~$ lxc exec c1 -- apt install android-tools-adb
Reading package lists... Done
Building dependency tree 
Reading state information... Done
The following NEW packages will be installed:
 android-tools-adb
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 68.2 kB of archives.
After this operation, 198 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu xenial/universe amd64 android-tools-adb amd64 5.1.1r36+git20160322-0ubuntu3 [68.2 kB]
Fetched 68.2 kB in 0s (0 B/s) 
Selecting previously unselected package android-tools-adb.
(Reading database ... 25469 files and directories currently installed.)
Preparing to unpack .../android-tools-adb_5.1.1r36+git20160322-0ubuntu3_amd64.deb ...
Unpacking android-tools-adb (5.1.1r36+git20160322-0ubuntu3) ...
Processing triggers for man-db (2.7.5-1) ...
Setting up android-tools-adb (5.1.1r36+git20160322-0ubuntu3) ...

We can now attempt to list Android devices with:

stgraber@dakara:~$ lxc exec c1 -- adb devices
* daemon not running. starting it now on port 5037 *
* daemon started successfully *
List of devices attached

Since we’ve not passed any USB device yet, the empty output is expected.

Now, let’s pass the specific device listed in “lsusb” above:

stgraber@dakara:~$ lxc config device add c1 sony usb vendorid=0fce productid=51da
Device sony added to c1

And try to list devices again:

stgraber@dakara:~$ lxc exec c1 -- adb devices
* daemon not running. starting it now on port 5037 *
* daemon started successfully *
List of devices attached 
CB5A28TSU6 device

To get a shell, you can then use:

stgraber@dakara:~$ lxc exec c1 -- adb shell
* daemon not running. starting it now on port 5037 *
* daemon started successfully *
E5823:/ $

LXD USB devices support hotplug by default. So unplugging the device and plugging it back on the host will have it removed and re-added to the container.

The “productid” property isn’t required; you can set only the “vendorid” so that any device from that vendor will be automatically attached to the container. This can be very convenient when interacting with a number of similar devices or devices which change productid depending on what mode they’re in.

stgraber@dakara:~$ lxc config device remove c1 sony
Device sony removed from c1
stgraber@dakara:~$ lxc config device add c1 sony usb vendorid=0fce
Device sony added to c1
stgraber@dakara:~$ lxc exec c1 -- adb devices
* daemon not running. starting it now on port 5037 *
* daemon started successfully *
List of devices attached 
CB5A28TSU6 device

The optional “required” property turns off the hotplug behavior, requiring the device be present for the container to be allowed to start.

More details on USB device properties can be found here.

Conclusion

We are surrounded by a variety of odd USB devices, a good number of which come with possibly dodgy software, requiring a specific version of a specific Linux distribution to work. It’s sometimes hard to accommodate those requirements while keeping a clean and safe environment.

LXD USB device passthrough helps a lot in such cases, so long as the USB device uses a libusb based workflow and doesn’t require a specific kernel driver.

If you want to add a device which does use a kernel driver, locate the /dev node it creates, check if it’s a character or block device and pass that to LXD as a unix-char or unix-block type device.

Extra information

The main LXD website is at: https://linuxcontainers.org/lxd
Development happens on Github at: https://github.com/lxc/lxd
Mailing-list support happens on: https://lists.linuxcontainers.org
IRC support happens in: #lxcontainers on irc.freenode.net
Try LXD online: https://linuxcontainers.org/lxd/try-it

Read more
Michael Hall

Late last year Amazon introduced a new EC2 image customized for Machine Learning (ML) workloads. To make things easier for data scientists and researchers, Amazon worked on including a selection of ML libraries in these images so they wouldn’t have to go through the process of downloading and installing them (and often times building them) themselves.

But while this saved work for the researchers, it was no small task for Amazon’s engineers. To keep offering the latest version of these libraries they had to repeat this work every time there was a new release, which was quite often for some of them. Worst of all, they didn’t have a ready-made way to update those libraries on instances that were already running!

By this time they’d heard about Snaps and the work we’ve been doing with them in the cloud, so they asked if it might be a solution to their problems. Normally we wouldn’t Snap libraries like this; we would encourage applications to bundle them into their own Snap package. But these libraries had an unusual use case: the applications that needed them weren’t meant to be distributed. Instead the application would exist to analyze a specific data set for a specific person. So as odd as it may sound, the application developer was the end user here, and the library was the end product, which made it fit into the Snap use case.

To get them started I worked on developing a proof of concept based on MXNet, one of their most used ML libraries. The source code for it is part C++, part Python, and Snapcraft makes working with both together a breeze, even with the extra preparation steps needed by MXNet’s build instructions. My snapcraft.yaml could first compile the core library and then build the Python modules that wrap it, pulling in dependencies from the Ubuntu archives and PyPI as needed.

This was all that was needed to provide a consumable Snap package for MXNet. After installing it you would just need to add the snap’s path to your LD_LIBRARY_PATH and PYTHONPATH environment variables so it would be found, but after that everything Just Worked! For an added convenience I provided a python binary in the snap, wrapped in a script that would set these environment variables automatically, so any external code that needed to use MXNet from the snap could simply be called with /snap/bin/mxnet.python rather than /usr/bin/python (or, rather, just mxnet.python because /snap/bin/ is already in PATH).
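As a quick sanity check that the snapped library is usable, something like the following can be run with the wrapped interpreter (for example /snap/bin/mxnet.python); this is just an illustrative sketch, not part of the snap itself.

import mxnet as mx

# Build a small array on the CPU and do a trivial computation with it.
a = mx.nd.ones((2, 3))
b = a * 2 + 1
print(b.asnumpy())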

I’m now working with upstream MXNet to get them building regular releases of this snap package to make it available to Amazon’s users and anyone else. The Amazon team is also seeking similar snap packages from their other ML libraries. If you are a user or contributor to any of these libraries, and you want to make it easier than ever for people to get the latest and greatest versions of them, let’s get together and make it happen! My MXNet example linked to above should give you a good starting point, and we’re always happy to help you with your snapcraft.yaml in #snapcraft on rocket.ubuntu.com.

If you’re just curious to try it out yourself, you can download my snap and then follow along with the MXNet tutorial, using the above-mentioned mxnet.python for your interactive Python shell.

Read more
Alan Griffiths

miral gets cut & paste

For some time now I’ve been intending to investigate the cut & paste mechanisms in the Unity8/Mir stack with the intention of ensuring they are supported in MirAL.

I’ve never had the time to do this, so I was surprised to discover that cut & paste is now working! (At least on Zesty.)

I assume that this is piggy-backing off the support being added to enable the “experimental” Unity8 desktop session, so I hope that this “magic” continues to work.

Read more
Michael Hall

Java is a well-established language for developing web applications, in no small part because of its industry-standard framework for building them: Servlets and JSP. Another important part of this standard is the Web Archive, or WAR, file format, which defines how to provide a web application’s executables and how they should be run in a way that is independent of the application server that will be running them.

WAR files make life easier for developers by separating the web application from the web server. Unfortunately this doesn’t actually make it easier to deploy a webapp; it only shifts some of the burden off of the developers and onto the user, who still needs to set up and configure an application server to host it. One popular option is Apache’s Tomcat webapp server, which is both lightweight and packs enough features to support the needs of most webapps.

And here is where Snaps come in. By combining both the application and the server into a single, installable package you get the best of both, and with a little help from Snapcraft you don’t have to do any extra work.

Snapcraft supports a modular build configuration by having multiple “parts”, each of which provides some aspect of your complete runtime environment in a way that is configurable and reusable. This is extended to a feature called “remote parts”, which are pre-defined parts you can easily pull into your snap by name. It’s this combination of reusable and remote parts that is going to make snapping up Java web applications incredibly easy.

The remote part we are going to use is the “tomcat” part, which will build the Tomcat application server from upstream source and bundle it in your snap ready to go. All that you, as the web developer, need to provide is your .war file. Below is a simple snapcraft.yaml that will bundle Tomcat’s “sample” war file into a self-contained snap package.

name: tomcat-sample
version: '0.1'
summary: Sample webapp using tomcat part
description: |
 This is a basic webapp snap using the remote Tomcat part

grade: stable
confinement: strict

parts:
  my-part:
    plugin: dump
    source: .
    organize:
      sample.war: ./webapps/sample.war
    after: [tomcat]

apps:
  tomcat:
    command: tomcat-launch
    daemon: simple
    plugs: [network-bind]

Let’s go through the important bits one at a time, starting with the part named “my-part”. This uses the simple “dump” plugin which is just going to copy everything in its source (the current directory in this case) into the resulting snap. Here we have just the sample.war file, which we are going to move into a “webapps” directory, because that is where the Tomcat part is going to look for war files.

Now for the magic: by specifying that “my-part” should come after the “tomcat” part (using after: [tomcat]), which isn’t defined elsewhere in the snapcraft.yaml, we trigger Snapcraft to look for a remote part by that same name, which conveniently exists for us to use. This remote part will do two things: first it will download and build the Tomcat source code, and then it will generate a “tomcat-launch” shell script that we’ll use later. These two parts, “my-part” and “tomcat”, will be combined in the final snap, with the Tomcat server automatically knowing about and installing the sample.war webapp.

The “apps” section of the snapcraft.yaml defines the application to be run. In this simple example all we need to execute is the “tomcat-launch” script that was created for us. This sets up the Tomcat environment variables and runtime directories so that it can run fully confined within the snap. And by declaring it to be a simple daemon we are additionally telling it to auto-start as soon as it’s installed (and after any reboot) which will be handled by systemd.

Now when you run “snapcraft” on this config, you will end up with the file tomcat-sample_0.1_amd64.snap which contains your web application, the Tomcat application server, and a headless Java JRE to run it all. That way the only thing your users need to do to run your app is to “snap install tomcat-sample” and everything will be up and running at http://localhost:8080/sample/ right away, no need to worry about installing dependencies or configuring services.


If you have a webapp that you currently deploy as a .war file, you can snap it yourself in just a few minutes, use the snapcraft.yaml defined above and replace the sample data with your own. To learn more about Snaps and Snapcraft in general you can follow this tutorial as well as learning how to publish your new snap to the store.

Read more
Tom Macfarlane

Our stand occupied the same space as last year with a couple of major
changes this time around – the closure of a previously adjacent aisle
resulting in an increase in overall stand space (from 380 to 456 square
metres). With the stand now open on just two sides, this presented the
design team with some difficult challenges:

  • Maximising sight lines and impact upon approach
  • Utilising our existing components – hanging banners, display units,
    alcoves, meeting rooms – to work effectively within a larger space
  • Directing the flow of visitors around the stand

Design solution

Some key design decisions and smaller details:

  • Rotating the hanging fabric banners 90 degrees and moving them
    to the very front of the stand
  • Repositioning the welcome desk to maximise visibility from
    all approaches
  • Improved lighting throughout – from overhead banner illumination
    to alcoves and within all meeting rooms
  • Store room end wall angled 45 degrees to increase the initial sight line
  • Raised LED screens for increased visibility
  • Four new alcoves with discrete fixings for all 10x alcove screens
  • Bespoke acrylic display units for AR helmets and developer boards
  • Streamlined meeting room tables with new cable management
  • Separate store and staff rooms

Result

With thoughtful planning and attention to detail, our brand presence
at this year’s MWC was the strongest yet.

Initial design sketches

Plan and sight line 3D render

 


Design intent drawings

 

 

 

 

 

3D lettering and stand graphics

 

 

 

 

 

Read more
LaMont Jones

The question came up “how do I add an authoritative (secondary) name server for a domain that is managed by MAAS?”

Why would I want to do that?

There are various reasons, including that the region controller may just be busy enough, or the MAAS region spread out enough, that we don’t want to have all DNS go through it.  Another reason would be to avoid exposing the region controller to the internet, while still allowing it to provide authoritative DNS data for machines inside the region.

How do I do that?

First, we’ll need to create a secondary nameserver.  For purposes of simplicity, we’ll assume that it’s an Ubuntu machine named mysecondary.example.com, and that you have installed the bind9 package.  We’ll also assume that you have named the domain maas, that the region controller is named region.example.com with an upstream interface having the IP address a.b.c.d, and that you have a MAAS session called admin.

On mysecondary.example.com, we add this to /etc/bind/named.conf.local:

zone "maas" { type slave; file "db.maas"; masters { a.b.c.d; }; };

Then reload named there via “rndc reload”.

With the MAAS CLI, we then say (note the trailing “.” on rrdata):

maas admin dnsresource-records create name=@ domain=maas rrtype=ns rrdata=mysecondary.example.com.

At that point, mysecondary is both authoritative, and named in the NS RRset for the domain.
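If you want to verify the delegation from a client machine, one option is to query the secondary directly for the NS records of the zone. Here’s a minimal sketch using dnspython (the python3-dnspython package); the nameserver IP below is a placeholder for mysecondary.example.com’s address.

import dns.resolver

# Ask mysecondary directly (bypassing /etc/resolv.conf) for the zone's NS records.
resolver = dns.resolver.Resolver(configure=False)
resolver.nameservers = ["w.x.y.z"]  # placeholder: mysecondary.example.com's IP
for record in resolver.query("maas.", "NS"):
    print(record.target)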

What else can I do?

If you call the MAAS domain somename.example.com, then you could add NS records to the example.com DNS zone delegating that zone to the MAAS region and its secondaries.

What are the actual limitations?

  • The region controller is always listed as a name server for the domain. For domains other than the default, see also bug 1672220 about address records.
  • If MAAS is told that it’s authoritative for a domain, it IS the master/primary.
  • The MAAS region does not have zones that are other than “type master”.

Read more
Stéphane Graber

LXD logo

GPU inside a container

LXD supports GPU passthrough but this is implemented in a very different way than what you would expect from a virtual machine. With containers, rather than passing a raw PCI device and have the container deal with it (which it can’t), we instead have the host setup with all needed drivers and only pass the resulting device nodes to the container.

This post focuses on NVidia and the CUDA toolkit specifically, but LXD’s passthrough feature should work with all other GPUs too. NVidia is just what I happen to have around.

The test system used below is a virtual machine with two NVidia GT 730 cards attached to it. Those are very cheap, low performance GPUs, that have the advantage of existing in low-profile PCI cards that fit fine in one of my servers and don’t require extra power.
For production CUDA workloads, you’ll want something much better than this.

Note that for this to work, you’ll need LXD 2.5 or higher.

Host setup

Install the CUDA tools and drivers on the host:

wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
sudo apt update
sudo apt install cuda

Then reboot the system to make sure everything is properly set up. After that, you should be able to confirm that your NVidia GPU is working properly with:

ubuntu@canonical-lxd:~$ nvidia-smi 
Tue Mar 21 21:28:34 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 0000:02:06.0     N/A |                  N/A |
| 30%   30C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GT 730      Off  | 0000:02:08.0     N/A |                  N/A |
| 30%   26C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
|    1                  Not Supported                                         |
+-----------------------------------------------------------------------------+

And you can check that the CUDA tools work properly with:

ubuntu@canonical-lxd:~$ /usr/local/cuda-8.0/extras/demo_suite/bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: GeForce GT 730
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			3059.4

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			3267.4

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			30805.1

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Container setup

First let’s just create a regular Ubuntu 16.04 container:

ubuntu@canonical-lxd:~$ lxc launch ubuntu:16.04 c1
Creating c1
Starting c1

Then install the CUDA demo tools in there:

lxc exec c1 -- wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
lxc exec c1 -- dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
lxc exec c1 -- apt update
lxc exec c1 -- apt install cuda-demo-suite-8-0 --no-install-recommends

At which point, you can run:

ubuntu@canonical-lxd:~$ lxc exec c1 -- nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

This is expected, as LXD hasn’t been told to pass any GPU yet.

LXD GPU passthrough

LXD allows for pretty specific GPU passthrough; the details can be found here.
First let’s start with the most generic one: just allow access to all GPUs:

ubuntu@canonical-lxd:~$ lxc config device add c1 gpu gpu
Device gpu added to c1
ubuntu@canonical-lxd:~$ lxc exec c1 -- nvidia-smi
Tue Mar 21 21:47:54 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 0000:02:06.0     N/A |                  N/A |
| 30%   30C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GT 730      Off  | 0000:02:08.0     N/A |                  N/A |
| 30%   27C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
|    1                  Not Supported                                         |
+-----------------------------------------------------------------------------+
ubuntu@canonical-lxd:~$ lxc config device remove c1 gpu
Device gpu removed from c1

Now just pass whichever is the first GPU:

ubuntu@canonical-lxd:~$ lxc config device add c1 gpu gpu id=0
Device gpu added to c1
ubuntu@canonical-lxd:~$ lxc exec c1 -- nvidia-smi
Tue Mar 21 21:50:37 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 0000:02:06.0     N/A |                  N/A |
| 30%   30C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
ubuntu@canonical-lxd:~$ lxc config device remove c1 gpu
Device gpu removed from c1

You can also specify the GPU by vendorid and productid:

ubuntu@canonical-lxd:~$ lspci -nnn | grep NVIDIA
02:06.0 VGA compatible controller [0300]: NVIDIA Corporation GK208 [GeForce GT 730] [10de:1287] (rev a1)
02:07.0 Audio device [0403]: NVIDIA Corporation GK208 HDMI/DP Audio Controller [10de:0e0f] (rev a1)
02:08.0 VGA compatible controller [0300]: NVIDIA Corporation GK208 [GeForce GT 730] [10de:1287] (rev a1)
02:09.0 Audio device [0403]: NVIDIA Corporation GK208 HDMI/DP Audio Controller [10de:0e0f] (rev a1)
ubuntu@canonical-lxd:~$ lxc config device add c1 gpu gpu vendorid=10de productid=1287
Device gpu added to c1
ubuntu@canonical-lxd:~$ lxc exec c1 -- nvidia-smi
Tue Mar 21 21:52:40 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 0000:02:06.0     N/A |                  N/A |
| 30%   30C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GT 730      Off  | 0000:02:08.0     N/A |                  N/A |
| 30%   27C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
|    1                  Not Supported                                         |
+-----------------------------------------------------------------------------+
ubuntu@canonical-lxd:~$ lxc config device remove c1 gpu
Device gpu removed from c1

This adds them both, as they are exactly the same model in my setup.

But for such cases, you can also select using the card’s PCI address with:

ubuntu@canonical-lxd:~$ lxc config device add c1 gpu gpu pci=0000:02:08.0
Device gpu added to c1
ubuntu@canonical-lxd:~$ lxc exec c1 -- nvidia-smi
Tue Mar 21 21:56:52 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 0000:02:08.0     N/A |                  N/A |
| 30%   27C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
ubuntu@canonical-lxd:~$ lxc config device remove c1 gpu 
Device gpu removed from c1

And lastly, let’s confirm that we get the same result as on the host when running a CUDA workload:

ubuntu@canonical-lxd:~$ lxc config device add c1 gpu gpu
Device gpu added to c1
ubuntu@canonical-lxd:~$ lxc exec c1 -- /usr/local/cuda-8.0/extras/demo_suite/bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: GeForce GT 730
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			3065.4

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			3305.8

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			30825.7

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Conclusion

LXD makes it very easy to share one or multiple GPUs with your containers.
You can either dedicate specific GPUs to specific containers or just share them.

There is none of the overhead involved with the usual PCI-based passthrough, and only a single instance of the driver is running, with the containers acting just like normal host user processes would.

This does however require that your containers run a version of the CUDA tools which supports whatever version of the NVidia drivers is installed on the host.
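
One quick way to confirm that the host and the container agree is to compare the driver version reported on both sides; nvidia-smi has a query mode that makes this easy (c1 here is the container used above):

nvidia-smi --query-gpu=driver_version --format=csv,noheader
lxc exec c1 -- nvidia-smi --query-gpu=driver_version --format=csv,noheader

If the container-side command errors out or the values differ, the container’s NVidia userspace likely needs updating to match the host driver.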

Extra information

The main LXD website is at: https://linuxcontainers.org/lxd
Development happens on Github at: https://github.com/lxc/lxd
Mailing-list support happens on: https://lists.linuxcontainers.org
IRC support happens in: #lxcontainers on irc.freenode.net
Try LXD online: https://linuxcontainers.org/lxd/try-it

Read more
Alan Griffiths

MirAL 1.3.1

There’s a bugfix MirAL release (1.3.1) available in ‘Zesty Zapus’ (Ubuntu 17.04) and the so-called “stable phone overlay” PPA for ‘Xenial Xerus’ (Ubuntu 16.04 LTS). MirAL is a project aimed at simplifying the development of Mir servers, and in particular at providing a stable ABI and sensible default behaviors.

Unsurprisingly, given the project’s original goal, the ABI is unchanged.

The bugfixes in 1.3.1 are:

In libmiral a focus management fix:

When a dialog is hidden ensure that the active window focus goes to the parent. (LP: #1671072)

In the miral-shell example, two crashes fixed:

If a surface is deleted before its decoration is painted, miral-shell can crash or hang on exit (LP: #1673038)

If the specified “titlebar” font doesn’t exist the server crashes (LP: #1671028)

In addition, a misspelling of “management” has been corrected:

SetWindowManagmentPolicy => SetWindowManagementPolicy

Read more
Dustin Kirkland


Canonical announced the Ubuntu 12.04 LTS (Precise Pangolin) release almost 5 years ago, on April 26, 2012. As with all LTS releases, Canonical has provided ongoing security patches and bug fixes for a period of 5 years. The Ubuntu 12.04 LTS (Long Term Support) period will end on Friday, April 28, 2017.

Following the end-of-life of Ubuntu 12.04 LTS, Canonical is offering Ubuntu 12.04 ESM (Extended Security Maintenance), which provides important security fixes for the kernel and the most essential user space packages in Ubuntu 12.04.  These updates are delivered in a secure, private archive exclusively available to Ubuntu Advantage customers on a per-node basis.

All Ubuntu 12.04 LTS users are encouraged to upgrade to Ubuntu 14.04 LTS or Ubuntu 16.04 LTS. But for those who cannot upgrade immediately, Ubuntu 12.04 ESM updates will help ensure the on-going security and integrity of Ubuntu 12.04 systems.

Users interested in Ubuntu 12.04 ESM updates can purchase Ubuntu Advantage at http://buy.ubuntu.com/. Credentials for the private archive will be available by the end-of-life date for Ubuntu 12.04 LTS (April 28, 2017).

Questions?  Post in the comments below and join us for a live webinar, "HOWTO: Ensure the Ongoing Security Compliance of your Ubuntu 12.04 Systems", on Wednesday, March 22nd at 4pm GMT / 12pm EDT / 9am PDT.  Here, we'll discuss Ubuntu 12.04 ESM and perform a few live upgrades of Ubuntu 12.04 LTS systems.

Cheers,
Dustin

Read more
Alan Griffiths

Mir and Zesty

Mir is continuing to make progress towards a 1.0 release and, meanwhile, Zesty Zapus (Ubuntu 17.04) is continuing to make progress towards final freeze.

Currently the version of Mir in Zesty is 0.26.1 and we’re not planning any major changes for the 17.04 series. We’re probably going to make a bugfix release (0.26.2). The other possibility is that work on supporting hybrid graphics is completed in time for adequate testing for 17.04. In the latter case we’ll be releasing Mir 0.27 to get that shipped.

For this and other reasons it isn’t yet clear whether there will be a 0.27 release before we move to 1.0.

The significance of a 1.0 release is that it will be the time we break the mirclient ABI and delete a lot of deprecated APIs, which will have a significant effect on downstream projects. We’ve tried to prepare by marking the deprecations in 0.26 and updating downstream projects accordingly. But while this preparation means that most downstream projects “only need recompiling”, this is something we want to do at the start of a release cycle, not at the end.

The argument for a 0.27 release is that there is functionality we want to release and that this can be done without the disruption of an ABI break. So even if we don’t release 0.27 for 17.04 we may well do so once 17.10 is “open” in order to make this work available for Unity8 developers to use.

Either way, sometime early in the 17.10 cycle we’re going to release Mir 1.0. This will clear the way for Mir support in Mesa and Vulkan.

Read more
Alan Griffiths

Choosing a backend

I got drawn into a discussion today and swiftly realized there is no right answer. But there should be!

The question is deceptively simple: Which order should graphics toolkits probe for backends?

My contention is that the answer is: “it depends”.

Suppose that I’m running a traditional X11 based desktop and am testing with a new technology (obviously Mir, but the same applies to Wayland) running as a window on top of it. (I.e. Mir-on-X or Wayland-on-X)

In this case I want any new application to *default* to connecting to the main X11 desktop – I don’t want my test session to “capture” any applications launched normally.

Now suppose I’m running a new technology desktop that provides an X11 socket as a backup (Xmir/Xwayland). In this case I want any new application to *default* to connecting to the main Mir/Wayland desktop – only if the toolkit doesn’t support Mir/Wayland should it connect to the X11 socket.

Now GDK, for example, provides for this with GDK_BACKEND=mir,wayland,x11 or GDK_BACKEND=x11,mir,wayland (as needed). But that is only one toolkit: off the top of my head, Qt has QT_QPA_PLATFORM and SDL has SDL_VIDEODRIVER. (I’m sure there are others.)
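
For concreteness, this is roughly what it looks like on the command line today (the application names are only placeholders, and the accepted backend values depend on how each toolkit was built):

GDK_BACKEND=mir,wayland,x11 some-gtk-app    # GTK: try Mir, then Wayland, then X11
SDL_VIDEODRIVER=x11 some-sdl-app            # SDL: force the X11 video driver
QT_QPA_PLATFORM=xcb some-qt-app             # Qt: pick the X11 (xcb) platform plugin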

What is needed is a standard environment variable that all toolkits (and other graphics libs) can use to prioritize backends. One of my colleagues suggested XDG_TOOLKIT_BACKEND (working much the way that GDK_BACKEND does).

That only helps if all the toolkits take notice. Is it worth pursuing?

Read more
abeato

In the conclusions to my last post, “Modifying System Call Arguments With ptrace”, I mentioned that one of the main drawbacks of the approach explained there for modifying system call arguments was that there is a process switch for each system call performed by the tracee. I also suggested a possible approach to overcome that issue using ptrace jointly with seccomp, with the latter making sure the tracer gets only the system calls we are interested in. In this post I develop this idea further and show how this can be achieved.

For this, I have created a little example that can be found on GitHub, alongside the example used in the previous post. The main idea is to use seccomp with a Berkeley Packet Filter (BPF) that will specify the conditions under which the tracer gets interrupted.

Now we will go through the source code, with emphasis on the parts that differ from the original example. Skipping the include directives and the forward declarations we get to main():

int main(int argc, char **argv)
{
    pid_t pid;
    int status;

    if (argc < 2) {
        fprintf(stderr, "Usage: %s <prog> <arg1> ... <argN>\n", argv[0]);
        return 1;
    }

    if ((pid = fork()) == 0) {
        /* If open syscall, trace */
        struct sock_filter filter[] = {
            BPF_STMT(BPF_LD+BPF_W+BPF_ABS, offsetof(struct seccomp_data, nr)),
            BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_open, 0, 1),
            BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_TRACE),
            BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
        };
        struct sock_fprog prog = {
            .filter = filter,
            .len = (unsigned short) (sizeof(filter)/sizeof(filter[0])),
        };
        ptrace(PTRACE_TRACEME, 0, 0, 0);
        /* To avoid the need for CAP_SYS_ADMIN */
        if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == -1) {
            perror("prctl(PR_SET_NO_NEW_PRIVS)");
            return 1;
        }
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) == -1) {
            perror("when setting seccomp filter");
            return 1;
        }
        kill(getpid(), SIGSTOP);
        return execvp(argv[1], argv + 1);
    } else {
        waitpid(pid, &status, 0);
        ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_TRACESECCOMP);
        process_signals(pid);
        return 0;
    }
}

The main change here compared to the original code is the set-up of a BPF in the tracee, right after the call to fork(). BPFs have an intimidating syntax at first glance, but once you grasp the basic concepts behind them they are actually quite easy to read. BPFs are defined as a sort of virtual machine (VM) which has one data register or accumulator, one index register, and an implicit program counter (PC). Its “assembly” instructions are defined as a structure with this format:

struct sock_filter {
    u_short code;
    u_char  jt;
    u_char  jf;
    u_long k;
};

There are codes (opcodes) for loading into the accumulator, jumping, and so on. jt and jf are increments on the program counter that are used in jump instructions, while k is an auxiliary value whose usage depends on the opcode.

BPFs have an addressable data space: in the networking case it is a packet datagram, while for seccomp it is the following structure:

struct seccomp_data {
    int   nr;                   /* System call number */
    __u32 arch;                 /* AUDIT_ARCH_* value
                                   (see <linux/audit.h>) */
    __u64 instruction_pointer;  /* CPU instruction pointer */
    __u64 args[6];              /* Up to 6 system call arguments */
};

So basically what BPFs do in seccomp is to operate on this data, and return a value that tells the kernel what to do next: allow the process to perform the call (SECCOMP_RET_ALLOW), kill it (SECCOMP_RET_KILL), or other options as specified in the seccomp man page.

As can be seen, struct seccomp_data contains more than enough information for our purposes: we can filter based on the system call number and on the arguments.

With all this information we can now look at the filter definition. BPF filters are defined as an array of sock_filter structures, where each entry is a BPF instruction. In our case we have:

BPF_STMT(BPF_LD+BPF_W+BPF_ABS, offsetof(struct seccomp_data, nr)),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_open, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_TRACE),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),

BPF_STMT and BPF_JUMP are a couple of simple macros that fill in the sock_filter structure. They differ in their arguments, which include jump offsets in the case of BPF_JUMP. The first argument is in both cases the “opcode”, which is built with macros as mnemonic helpers: for instance, the first one loads into the accumulator (BPF_LD) a word (BPF_W) using absolute addressing (BPF_ABS). More about this can be read here, for instance.

Analysing the filter in more detail, the first instruction asks the VM to load the syscall number, nr, into the accumulator. The second one compares it to the number for the open syscall and tells the VM not to modify the program counter if they are equal (PC+0), so the third instruction runs next, or to jump to PC+1 otherwise, which lands on the 4th instruction (while this instruction executes, the PC already points at the 3rd one). So if this is an open syscall we return SECCOMP_RET_TRACE, which will invoke the tracer; otherwise we return SECCOMP_RET_ALLOW, which lets the tracee run the syscall without further impediment.

Moving forward, the first call to prctl sets PR_SET_NO_NEW_PRIVS, which prevents child processes from having more privileges than the parent. This is needed to make the following call to prctl, which sets the seccomp filter using the PR_SET_SECCOMP option, succeed even when not running as root. After that, we call execvp() as in the ptrace-only example.

Switching to what the parent does, we see that the changes are very few. In main(), we set the PTRACE_O_TRACESECCOMP option, which makes the tracee stop when a filter returns SECCOMP_RET_TRACE and signals the event to the tracer. The other change in this function is that we no longer need to set PTRACE_O_TRACESYSGOOD, as we are being interrupted by seccomp, not because of system calls.

Moving now to the next function,

static void process_signals(pid_t child)
{
    const char *file_to_redirect = "ONE.txt";
    const char *file_to_avoid = "TWO.txt";

    while(1) {
        char orig_file[PATH_MAX];

        /* Wait for open syscall start */
        if (wait_for_open(child) != 0) break;

        /* Find out file and re-direct if it is the target */

        read_file(child, orig_file);
        printf("[Opening %s]\n", orig_file);

        if (strcmp(file_to_avoid, orig_file) == 0)
            redirect_file(child, file_to_redirect);
    }
}

we see here that we now invoke wait_for_open() only once. Unlike when we traced each syscall, which interrupted the tracer both before and after the syscall’s execution, seccomp will interrupt us only before the call is processed. We also add a trace here for demonstration purposes.

After that, we have

static int wait_for_open(pid_t child)
{
    int status;

    while (1) {
        ptrace(PTRACE_CONT, child, 0, 0);
        waitpid(child, &status, 0);
        printf("[waitpid status: 0x%08x]\n", status);
        /* Is it our filter for the open syscall? */
        if (status >> 8 == (SIGTRAP | (PTRACE_EVENT_SECCOMP << 8)) &&
            ptrace(PTRACE_PEEKUSER, child,
                   sizeof(long)*ORIG_RAX, 0) == __NR_open)
            return 0;
        if (WIFEXITED(status))
            return 1;
    }
}

Here we use PTRACE_CONT instead of PTRACE_SYSCALL. We get interrupted every time there is a match in the BPF, as we have set the PTRACE_O_TRACESECCOMP option, and we let the tracee run until that happens. The other change here, besides a trace, is how we check whether we have received the event we are interested in, as the status word is obviously different. The details can be seen in ptrace’s man page. Note also that we could actually skip the test for __NR_open, as the BPF will interrupt us only for open syscalls.

The rest of the code, which is the part that actually changes the argument to the open syscall, is exactly the same. Now, let’s check if this works as advertised:

$ git clone https://github.com/alfonsosanchezbeato/ptrace-redirect.git
$ cd ptrace-redirect/
$ cat ONE.txt 
This is ONE.txt
$ cat TWO.txt 
This is TWO.txt
$ gcc redir_filter.c -o redir_filter
$ ./redir_filter cat TWO.txt 
[waitpid status: 0x0000057f]
[waitpid status: 0x0007057f]
[Opening /etc/ld.so.cache]
[waitpid status: 0x0007057f]
[Opening /lib/x86_64-linux-gnu/libc.so.6]
[waitpid status: 0x0007057f]
[Opening /usr/lib/locale/locale-archive]
[waitpid status: 0x0007057f]
[Opening TWO.txt]
This is ONE.txt
[waitpid status: 0x00000000]

It does indeed! Note that the traces show that the tracer gets interrupted only by the open syscall (besides an initial trap and when the child exits). If we added the same traces to the ptrace-only program, we would see many more calls.

Finally, a word of caution regarding syscall numbers: in this post and in the previous one we are assuming an x86-64 architecture, so the programs would need to be adapted to be used on other architectures. There is also an important catch here: we are implicitly assuming that the child process run by the execvp() call is also x86-64, as we are filtering using the syscall number for that architecture. This implies that the example will not work if the child program is compiled for i386. To make it work properly in that case as well, we must check the architecture in the BPF, by looking at “arch” in seccomp_data, and use the appropriate syscall number in each case; a rough sketch of that check follows below. We would also need to check the arch before looking at the tracee registers; see an example of how to do this here (alternatively, we could make the BPF return this data in the SECCOMP_RET_DATA bits of its return value, which the tracer can retrieve via PTRACE_GETEVENTMSG). Needless to say, for arm64/32 we would have similar issues.
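
As a rough sketch of that first part (untested here, and assuming <linux/audit.h> is included so that AUDIT_ARCH_X86_64 is available), the filter could check the architecture up front and simply allow, rather than trace, syscalls coming from an unexpected architecture:

struct sock_filter filter[] = {
    /* Load the architecture field */
    BPF_STMT(BPF_LD+BPF_W+BPF_ABS, offsetof(struct seccomp_data, arch)),
    /* If it is x86-64, skip the next instruction and go on to check nr */
    BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, AUDIT_ARCH_X86_64, 1, 0),
    /* Unexpected architecture: allow the call without tracing it */
    BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
    /* Same open-syscall check as before */
    BPF_STMT(BPF_LD+BPF_W+BPF_ABS, offsetof(struct seccomp_data, nr)),
    BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_open, 0, 1),
    BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_TRACE),
    BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
};

A fuller solution would instead jump to a second check that uses the i386 open syscall number, and the tracer would then also need to read the registers according to the child’s architecture, as mentioned above.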

Read more