Canonical Voices

Posts tagged with 'canonical'

Dustin Kirkland

Transcoding video is a very resource intensive process.

It can take many minutes to process a small, 30-second clip, or even hours to process a full movie.  There are numerous, excellent, open source video transcoding and processing tools freely available in Ubuntu, including libav-tools, ffmpeg, mencoder, and handbrake.  Surprisingly, however, none of those support parallel computing easily or out of the box.  And disappointingly, I couldn't find any MPI support readily available either.

I happened to have an Orange Box for a few days recently, so I decided to tackle the problem and develop a scalable, parallel video transcoding solution myself.  I'm delighted to share the result with you today!

When it comes to commercial video production, it can take thousands of machines and hundreds of compute hours to render a full movie.  I had the distinct privilege some time ago to visit WETA Digital in Wellington, New Zealand and tour the render farm that processed The Lord of the Rings trilogy, Avatar, The Hobbit, and more.  And just a few weeks ago, I visited another quite visionary, cloud-savvy digital film processing firm in Hollywood, called Digital Film Tree.

While Windows and Mac OS may be the first platforms that come to mind when you think about front-end video production, Linux is far more widely used for batch video processing, with Ubuntu, in particular, being used extensively at both WETA Digital and Digital Film Tree, among others.

While I could have worked with any of a number of tools, I settled on avconv (the successor(?) of ffmpeg), as it was the first one that I got working well on my laptop, before scaling it out to the cluster.

I designed an approach on my whiteboard, in fact quite similar to some work I did parallelizing and scaling the john-the-ripper password quality checker.

At a high level, the algorithm looks like this:
  1. Create a shared network filesystem, simultaneously readable and writable by all nodes
  2. Have the master node split the work into even sized chunks for each worker
  3. Have each worker process their segment of the video, and raise a flag when done
  4. Have the master node wait for each of the all-done flags, and then concatenate the result
And that's exactly what I implemented in a new transcode charm and transcode-cluster bundle.  It provides linear scalability and performance improvements as you add additional units to the cluster.  A transcode job that takes 24 minutes on a single node is down to 3 minutes on 8 worker nodes in the Orange Box, using Juju and MAAS against physical hardware nodes.
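For a rough sense of the master-side arithmetic, here is a minimal sketch (the variable names are mine, not necessarily the charm's): each worker gets an even-sized time slice of the input, which later becomes the -ss/-t arguments to avconv.

# duration: total length of the input in seconds; num_nodes: number of workers
length=$(( (duration + num_nodes - 1) / num_nodes ))   # even-sized chunks, rounded up
for node in $(seq 0 $((num_nodes - 1))); do
    start_time=$(( node * length ))
    echo "worker $node: start=${start_time}s length=${length}s"
done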


For the curious, the real magic is in the config-changed hook, which has decent inline documentation.



The trick, for anyone who might make their way into this by way of various StackExchange questions and (incorrect) answers, is in the command that splits up the original video (around line 54):

avconv -ss $start_time -i $filename -t $length -s $size -vcodec libx264 -acodec aac -bsf:v h264_mp4toannexb -f mpegts -strict experimental -y ${filename}.part${current_node}.ts

And the one that puts it back together (around line 72):

avconv -i concat:"$concat" -c copy -bsf:a aac_adtstoasc -y ${filename}_${size}_x264_aac.${format}
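In case it helps, here is a rough sketch (again with assumed variable names, not the charm's exact code) of how the pipe-separated $concat list of per-node segments could be assembled before that final command:

concat=""
for node in $(seq 0 $((num_nodes - 1))); do
    # append each worker's transport-stream segment, separated by '|'
    concat="${concat:+$concat|}${filename}.part${node}.ts"
done
# e.g. concat="input.mp4.part0.ts|input.mp4.part1.ts|input.mp4.part2.ts"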

I found this post and this documentation particularly helpful in understanding and solving the problem.

In any case, once deployed, my cluster bundle looks like this.  8 units of transcoders, all connected to a shared filesystem, and performance monitoring too.


I was able to leverage the shared-fs relation provided by the nfs charm, as well as the ganglia charm to monitor the utilization of the cluster.  You can see the spikes in the cpu, disk, and network in the graphs below, during the course of a transcode job.




For my testing, I downloaded the movie Code Rush, freely available under the CC-BY-NC-SA 3.0 license.  If you haven't seen it, it's an excellent documentary about the open source software around Netscape/Mozilla/Firefox and the dotcom bubble of the late 1990s.

Oddly enough, the stock, 746MB high quality MP4 video doesn't play in Firefox, since it's an mpeg4 stream, rather than H264.  Fail.  (Yes, of course I could have used mplayer, vlc, etc., that's not the point ;-)


Perhaps one of the most useful, intriguing features of HTML5 is its support for embedding multimedia, video, and sound into webpages.  HTML5 even supports multiple video formats.  Sounds nice, right?  If only it were that simple...  As it turns out, different browsers have, and lack, support for different formats.  While there is no one format to rule them all, MP4 is supported by the majority of browsers, including the two that I use (Chromium and Firefox).  This matrix from w3schools.com illustrates the mess.

http://www.w3schools.com/html/html5_video.asp

The file format, however, is only half of the story.  The audio and video contents within the file also have to be encoded and compressed with very specific codecs in order to work properly within the browsers.  For MP4, the video has to be encoded with H264, and the audio with AAC.
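If you are unsure what a given file actually contains, the libav tools mentioned above can tell you; a quick check (the filename is just an example):

avprobe input.mp4    # prints the container format and the audio/video codec of each stream
# (use "ffprobe input.mp4" instead if you have the ffmpeg tools rather than libav-tools)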

Among the various brands of phones, webcams, digital cameras, etc., the output format and codecs are seriously all over the map.  If you've ever wondered what's happening when you upload a video to YouTube or Facebook, and why it takes a while before it's ready to be viewed: it's being transcoded and scaled in the background.

In any case, I find it quite useful to transcode my videos to MP4/H264/AAC format.  And for that, a scalable, parallel computing approach to video processing would be quite helpful.

During the course of the 3 minute run, I liked watching the avconv log files of all of the nodes, using Byobu and Tmux in a tiled split screen format, like this:


Also, the transcode charm installs an Apache2 webserver on each node, so you can expose the service and point a browser to any of the nodes, where you can find the input, output, and intermediary data files, as well as the logs and DONE flags.
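If you deploy it yourself, exposing the service and finding an address to browse to is the usual Juju workflow; roughly:

juju expose transcode                  # open the charm's web port to the outside world
juju status | grep public-address      # pick any unit's address and point a browser at it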



Once the job completes, I can simply click on the output file, Code_Rush.mp4_1280x720_x264_aac.mp4, and see that it's now perfectly viewable in the browser!


In case you're curious, I have verified the same charm with a couple of other OGG, AVI, MPEG, and MOV input files, too.


Beyond transcoding the format and codecs, I have also added configuration support within the charm itself to scale the video frame size, too.  This is useful to take a larger video, and scale it down to a more appropriate size, perhaps for a phone or tablet.  Again, this resource intensive procedure perfectly benefits from additional compute units.
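As a purely hypothetical illustration (the real option names live in the charm's config.yaml, so check there first), changing the target frame size would then be a single configuration command:

juju get transcode                   # list the charm's actual configuration options
juju set transcode size=1280x720     # 'size' is a hypothetical option name for the frame scaling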


File format, audio/video codec, and frame size changes are hardly the extent of video transcoding workloads.  There are hundreds of options and thousands of combinations, as the manpages of avconv and mencoder attest.  All of my scripts and configurations are free software, open source.  Your contributions and extensions are certainly welcome!

In the mean time, I hope you'll take a look at this charm and consider using it, if you have the need to scale up your own video transcoding ;-)

Cheers,
Dustin

Read more
Dustin Kirkland

It's that simple.
It was about 4pm on Friday afternoon, when I had just about wrapped up everything I absolutely needed to do for the day, and I decided to kick back and have a little fun with the remainder of my work day.

 It's now 4:37pm on Friday, and I'm now done.

Done with what?  The Yo charm, of course!

The Internet has been abuzz this week about how the Yo app received a whopping $1 million in venture funding.  (Forbes notes that this is a pretty surefire indication that there's another internet bubble about to burst...)

It's little more than the first program any kid writes -- hello world!

Subsequently I realized that we don't really have a "hello world" charm.  And so here it is, yo.

$ juju deploy yo

Deploying a webpage that says "Yo" is hardly the point, of course.  Rather, this is a fantastic way to see the absolute simplest form of a Juju charm.  Grab the source, and go explore it yo-self!

$ charm-get yo
$ tree yo
├── config.yaml
├── copyright
├── hooks
│   ├── config-changed
│   ├── install
│   ├── start
│   ├── stop
│   ├── upgrade-charm
│   └── website-relation-joined
├── icon.svg
├── metadata.yaml
└── README.md
1 directory, 11 files



  • The config.yaml lets you set and dynamically change the service itself (the color and size of the font that renders "Yo").
  • The copyright is simply boilerplate GPLv3
  • The icon.svg is just a vector graphics "Yo."
  • The metadata.yaml explains what this charm is, how it can relate to other charms
  • The README.md is a simple getting-started document
  • And the hooks...
    • config-changed is the script that runs when you change the configuration -- basically, it uses sed to inline edit the index.html Yo webpage
    • install simply installs apache2 and overwrites /var/www/index.html
    • start and stop simply start and stop the apache2 service
    • upgrade-charm is currently a no-op
    • website-relation-joined sets and exports the hostname and port of this system
The website relation is very important here...  Declaring and defining this relation instantly lets me relate this charm with dozens of other services.  As you can see in the screenshot at the top of this post, I was able to easily relate the varnish website accelerator in front of the Yo charm.
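For reference, a website-relation-joined hook in a charm like this is typically only a couple of lines; a minimal sketch (not necessarily the yo charm's exact script), using the standard Juju hook tools:

#!/bin/bash
# Advertise where our Apache instance is listening, so related services
# (such as the varnish charm mentioned above) know how to reach us.
relation-set hostname=$(unit-get private-address) port=80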

Hopefully this simple little example might help you examine the anatomy of a charm for the first time, and perhaps write your own first charm!

Cheers,

Dustin

Read more
mandel

dbus-cpp is the library that we are currently using to perform D-Bus calls from C++ within Ubuntu Touch. Those who are used to dbus-cpp know that the D-Bus interface can be defined via structures that are later checked at compile time. The following is an example of those definitions:

struct Geoclue
{
    struct Master
    {
        struct Create
        {
            inline static std::string name()
            {
                return "Create";
            } typedef Master Interface;
            inline static const std::chrono::milliseconds default_timeout() { return std::chrono::seconds{1}; }
        };
    };
 
    struct MasterClient
    {
        struct SetRequirements
        {
            inline static std::string name()
            {
                return "SetRequirements";
            } typedef MasterClient Interface;
            inline static const std::chrono::milliseconds default_timeout() { return std::chrono::seconds{1}; }
        };
        struct GetAddressProvider
        {
            inline static std::string name()
            {
                return "GetAddressProvider";
            } typedef MasterClient Interface;
            inline static const std::chrono::milliseconds default_timeout() { return std::chrono::seconds{1}; }
        };
        struct GetPositionProvider
        {
            inline static std::string name()
            {
                return "GetPositionProvider";
            } typedef MasterClient Interface;
            inline static const std::chrono::milliseconds default_timeout() { return std::chrono::seconds{1}; }
        };
    };
 
    struct Address
    {
        struct GetAddress
        {
            inline static std::string name()
            {
                return "GetAddress";
            } typedef Address Interface;
            inline static const std::chrono::milliseconds default_timeout() { return std::chrono::seconds{1}; }
        };
    };
 
    struct Position
    {
        struct GetPosition
        {
            inline static std::string name()
            {
                return "GetPosition";
            } typedef Position Interface;
            inline static const std::chrono::milliseconds default_timeout() { return std::chrono::seconds{1}; }
        };
        struct Signals
        {
            struct PositionChanged
            {
                inline static std::string name()
                {
                    return "PositionChanged";
                };
                typedef Position Interface;
                typedef std::tuple<int32_t, int32_t, double, double, double, dbus::types::Struct<std::tuple<int32_t, double, double>>> ArgumentType;
            };
        };
    };
};

This is quite a lot of work, and quite repetitive. We noticed that and, in some of our projects, used a set of macros to make us type less. We will now provide the following macros as part of dbus-cpp for other developers to use:

/*
 * Copyright © 2014 Canonical Ltd.
 *
 * This program is free software: you can redistribute it and/or modify it
 * under the terms of the GNU Lesser General Public License version 3,
 * as published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public License
 * along with this program. If not, see <http://www.gnu.org/licenses/>.
 *
 * Authored by: Thomas Voß <thomas.voss@canonical.com>
 */
 
#ifndef CORE_DBUS_MACROS_H_
#define CORE_DBUS_MACROS_H_
 
#include <core/dbus/types/object_path.h>
 
#include <chrono>
#include <string>
 
#define DBUS_CPP_METHOD_WITH_TIMEOUT_DEF(Name, Itf, Timeout) \
 struct Name \
 { \
 typedef Itf Interface; \
 inline static const std::string& name() \
 { \
 static const std::string s{#Name}; \
 return s; \
 } \
 inline static const std::chrono::milliseconds default_timeout() { return std::chrono::milliseconds{Timeout}; } \
 };\

#define DBUS_CPP_METHOD_DEF(Name, Itf) DBUS_CPP_METHOD_WITH_TIMEOUT_DEF(Name, Itf, 2000)
 
#define DBUS_CPP_SIGNAL_DEF(Name, Itf, ArgType) \
 struct Name \
 { \
 inline static std::string name() \
 { \
 return #Name; \
 }; \
 typedef Itf Interface; \
 typedef ArgType ArgumentType; \
 };\

#define DBUS_CPP_READABLE_PROPERTY_DEF(Name, Itf, Type) \
 struct Name \
 { \
 inline static std::string name() \
 { \
 return #Name; \
 }; \
 typedef Itf Interface; \
 typedef Type ValueType; \
 static const bool readable = true; \
 static const bool writable = false; \
 }; \

#define DBUS_CPP_WRITABLE_PROPERTY_DEF(Name, Itf, Type) \
 struct Name \
 { \
 inline static std::string name() \
 { \
 return #Name; \
 }; \
 typedef Itf Interface; \
 typedef Type ValueType; \
 static const bool readable = true; \
 static const bool writable = true; \
 }; \

#endif // CORE_DBUS_MACROS_H_

Using those macros, the above example looks as follows:

struct Geoclue
{
    struct Master
    {
        DBUS_CPP_METHOD_DEF(Create, Master);
    };
 
    struct MasterClient
    {
        DBUS_CPP_METHOD_DEF(SetRequirements, MasterClient);
        DBUS_CPP_METHOD_DEF(GetAddressProvider, MasterClient);
        DBUS_CPP_METHOD_DEF(GetPositionProvider, MasterClient);
    };
 
    struct Address
    {
        DBUS_CPP_METHOD_DEF(GetAddress, Address);
    };
 
    struct Position
    {
        DBUS_CPP_METHOD_DEF(GetPosition, Position);
        struct Signals
        {
            // DBUS_CPP_SIGNAL_DEF also takes the signal's argument type
            typedef std::tuple<int32_t, int32_t, double, double, double, dbus::types::Struct<std::tuple<int32_t, double, double>>> PositionChangedArgs;
            DBUS_CPP_SIGNAL_DEF(PositionChanged, Position, PositionChangedArgs);
        };
    };
};

As you can see, the amount of code that needs to be written is a lot less, and a new developer can easily understand what is going on. Happy coding ;)

Read more
Michael Hall

Two years ago, my wife and I made the decision to home-school our two children.  It was the best decision we could have made: our kids are getting a better education, and with me working from home since joining Canonical I’ve been able to spend more time with them than ever before. We also get to try and do some really fun things, which is what sets the stage for this story.

Both my kids love science, absolutely love it, and it’s one of our favorite subjects to teach.  A couple of weeks ago my wife found an inexpensive USB microscope, which lets you plug it into a computer and take pictures using desktop software.  It’s not a scientific microscope, nor is it particularly powerful or clear, but for the price it was just right to add a new aspect to our elementary science lessons. All we had to do was plug it in and start exploring.

My wife has a relatively new (less than a year old) laptop running Windows 8.  It’s not high-end, but it’s all new hardware, new software, etc.  So when we plugged in our simple USB microscope…….it failed.  As in, it didn’t do anything.  Windows seemed to be trying to figure out what to do with it, over and over and over again, but to no avail.

My laptop, however, is running Ubuntu 14.04, the latest stable and LTS release.  My laptop is a couple of years old, but a classic: a Lenovo X220. It’s great hardware to go with Ubuntu and I’ve had nothing but good experiences with it.  So of course, when I decided to give our new USB microscope a try……it failed.  The connection was fine, the log files clearly showed that it was being identified, but nothing was able to see it as a video input device or make use of it.

Now, if that’s where our story ended, it would fall right in line with a Shakespearean tragedy. But while both Windows and Ubuntu failed to “just work” with this microscope, both failures were not equal. Because the Windows drivers were all closed source, my options ended with that failure.

But on Ubuntu, the drivers were open; all I needed to do was find a fix. It took a while, but I eventually found a 2.5 year old bug report for an identical chipset to my microscope, and somebody proposed a code fix in the comments.  Now, the original reporter never responded to say whether or not the fix worked, and it was clearly never included in the driver code, but it was an opportunity.  Now I’m no kernel hacker, nor driver developer, in fact I probably shouldn’t be trusted to write any amount of C code at all.  But because I had Ubuntu, getting the source code of my current driver, as well as all the tools and dependencies needed to build it, took only a couple of terminal commands.  The patch was too old to cleanly apply to the current code, but it was easy enough to figure out where the changes should go, and after a couple of tries to properly build just the driver (and not the full kernel or every driver in it), I had a new binary kernel module that would load without error.  Then, when I plugged my USB microscope in again, it worked!

Red Onion skin at 120x magnification

People use open source for many reasons.  Some people use it because it’s free as in beer; for them it’s on the same level as freeware or shareware, only the cost matters. For others it’s about ethics: they would choose open source even if it cost them money or didn’t work as well, because they feel it’s morally right, and that proprietary software is morally wrong. I use open source because of USB microscopes. Because when they don’t work, open source gives me a chance to change that.

Read more
David Murphy (schwuk)

Ars Technica has a great write up by Lee Hutchinson on our Orange Box demo and training unit.

You can’t help but have your attention grabbed by it!

As the comments are quick to point out – at the expense of the rest of the piece – the hardware isn’t the compelling story here. While you can buy your own, you can almost certainly hand-build an equivalent-or-better setup for less money[1], but Ars recognises this:

Of course, that’s exactly the point: the Orange Box is that taste of heroin that the dealer gives away for free to get you on board. And man, is it attractive. However, as Canonical told me about a dozen times, the company is not making them to sell—it’s making them to use as revenue driving opportunities and to quickly and effectively demo Canonical’s vision of the cloud.

The Orange Box is about showing off MAAS & Juju, and – usually – OpenStack.

To see what Ars think of those, you should read the article.

I definitely echo Lee’s closing statement:

I wish my closet had an Orange Box in it. That thing is hella cool.


  1. Or make one out of wood like my colleague Gavin did! 

Read more
Dustin Kirkland

It's hard for me to believe that I have sat on this draft blog post for almost 6 years.  But I'm stuck on a plane this evening, inspired by Elon Musk and Tesla's (cleverly titled) announcement, "All Our Patents Are Belong To You."  Musk writes:

Yesterday, there was a wall of Tesla patents in the lobby of our Palo Alto headquarters. That is no longer the case. They have been removed, in the spirit of the open source movement, for the advancement of electric vehicle technology.
When I get home, I'm going to take down a plaque that has proudly hung in my own home office for nearly 10 years now.  In 2004, I was named an IBM Master Inventor, recognizing sustained contributions to IBM's patent portfolio.

Musk continues:
When I started out with my first company, Zip2, I thought patents were a good thing and worked hard to obtain them. And maybe they were good long ago, but too often these days they serve merely to stifle progress, entrench the positions of giant corporations and enrich those in the legal profession, rather than the actual inventors. After Zip2, when I realized that receiving a patent really just meant that you bought a lottery ticket to a lawsuit, I avoided them whenever possible.
And I feel the exact same way!  When I was an impressionable newly hired engineer at IBM, I thought patents were wonderful expressions of my own creativity.  IBM rewarded me for the work, and recognized them as important contributions to my young career.  Remember, in 2003, IBM was defending the Linux world against evil SCO.  (Confession: I think I read Groklaw every single day.)

Yeah, I filed somewhere around 75 patents in about 4 years, 47 of which have been granted by the USPTO to date.

I'm actually really, really proud of a couple of them.  I was the lead inventor on a couple of early patents defining the invention you might know today as Swype (Android) or Shapewriter (iPhone) on your mobile devices.  In 2003, I called it QWERsive, as it was basically applying "cursive handwriting" to a "qwerty keyboard."  Along with one of my co-inventors, we actually presented a paper at the 27th UNICODE conference in Berlin in 2005, and IBM sold the patent to Lenovo a year later.  (To my knowledge, thankfully that patent has never been enforced, as I use Swype every single day.)

QWERsive

But that enthusiasm evaporated very quickly between 2005 and 2007, as I reviewed thousands of invention disclosures by my IBM colleagues, and hundreds of software patents by IBM competitors in the industry.

I spent most of 2005 working onsite at Red Hat in Westford, MA, and came to appreciate how much more efficiently innovation happened in a totally open source world, free of invention disclosures, black out periods, gag orders, and software patents.  I met open source activists in the free software community, such as Jon maddog Hall, who explained the wake of destruction behind, and the impending doom ahead, in a world full of software patents.

Finally, in 2008, I joined an amazing little free software company called Canonical, which was far too busy releasing Ubuntu every 6 months on time, and building an amazing open source software ecosystem, to fart around with software patents.  To my delight, our founder, Mark Shuttleworth, continues to share the same enlightened view, as he states in this TechCrunch interview (2012):
“People have become confused,” Shuttleworth lamented, “and think that a patent is incentive to create at all.” No one invents just to get a patent, though — people invent in order to solve problems. According to him, patents should incentivize disclosure. Software is not something you can really keep secret, and as such Shuttleworth’s determination is that “society is not benefited by software patents at all.” Software patents, he said, are a bad deal for society. The remedy is to shorten the duration of patents, and reduce the areas people are allowed to patent. “We’re entering a third world war of patents,” Shuttleworth said emphatically. “You can’t do anything without tripping over a patent!” One cannot possibly check all possible patents for your invention, and the patent arms race is not about creation at all.
And while I'm still really proud of some of my ideas today, I'm ever so ashamed that they're patented.

If I could do what Elon Musk did with Tesla's patent portfolio, you have my word, I absolutely would.  However, while my name is listed as the "inventor" on four dozen patents, all of them are "assigned" to IBM (or Lenovo).  That is to say, they're not mine to give, or open up.

What I can do is speak up, and formally apologize.  I'm sorry I filed software patents.  A lot of them.  I have no intention of ever doing so again.  The system desperately needs a complete overhaul.  Both the technology and business worlds are healthier, better, more innovative environments without software patents.

I do take some consolation in the fact that IBM seems to be "one of the good guys", in so much as our modern-day IBM has not been as litigious as others, and hasn't, to my knowledge, used any of the patents for which I'm responsible in an offensive manner.

No longer hanging on my wall.  Tucked away in a box in the attic.
But there are certainly those that do.  Patent trolls.

Another former employer of mine, Gazzang, was acquired earlier this month (June 3rd) by Cloudera -- a super sharp, up-and-coming big data open source company with very deep pockets and tremendous market potential.  Want to guess what happened 3 days later?  A super shady patent infringement lawsuit is filed, of course!
Protegrity Corp v. Gazzang, Inc.
Complaint for Patent Infringement. Civil Action No. 3:14-cv-00825; no judge yet assigned. Filed on June 6, 2014 in the U.S. District Court for the District of Connecticut. Patents in case: 7,305,707, “Method for intrusion detection in a database system” by Mattsson. Prosecuted by Neuner; George W. Cohen; Steven M. Edwards Angell Palmer & Dodge LLP. Includes 22 claims (2 indep.). Was application 11/510,185. Granted 12/4/2007.
Yuck.  And the reality is that happens every single day, and in places where the stakes are much, much higher.  See: Apple v. Google, for instance.

Musk concludes his post:
Technology leadership is not defined by patents, which history has repeatedly shown to be small protection indeed against a determined competitor, but rather by the ability of a company to attract and motivate the world’s most talented engineers. We believe that applying the open source philosophy to our patents will strengthen rather than diminish Tesla’s position in this regard.
What a brave, bold, ballsy, responsible assertion!

I've never been more excited to see someone back up their own rhetoric against software patents, with such a substantial, palpable, tangible assertion.  Kudos, Elon.

Moreover, I've also never been more interested in buying a Tesla.   Coincidence?

Maybe it'll run an open source operating system and apps, too.  Do that, and I'm sold.

:-Dustin

Read more
jdstrand

Last time I discussed AppArmor, I talked about new features in Ubuntu 13.10 and a bit about ApplicationConfinement for Ubuntu Touch. With the release of Ubuntu 14.04 LTS, several improvements were made:

  • Mediation of signals
  • Mediation of ptrace
  • Various policy updates for 14.04, including new tunables, better support for XDG user directories, and Unity7 abstractions
  • Parser policy compilation performance improvements
  • Google Summer of Code (SUSE sponsored) python rewrite of the userspace tools

Signal and ptrace mediation

Prior to Ubuntu 14.04 LTS, a confined process could send signals to other processes (subject to DAC) and ptrace other processes (subject to DAC and YAMA). AppArmor on 14.04 LTS adds mediation of both signals and ptrace which brings important security improvements for all AppArmor confined applications, such as those in the Ubuntu AppStore and qemu/kvm machines as managed by libvirt and OpenStack.

When developing policy for signal and ptrace rules, it is important to remember that AppArmor does a cross check such that AppArmor verifies that:

  • the process sending the signal/performing the ptrace is allowed to send the signal to/ptrace the target process
  • the target process receiving the signal/being ptraced is allowed to receive the signal from/be ptraced by the sender process

Signal(7) permissions use the ‘signal’ rule with the ‘receive/send’ permissions governing signals. PTrace permissions use the ‘ptrace’ rule with the ‘trace/tracedby’ permissions governing ptrace(2) and the ‘read/readby’ permissions governing certain proc(5) filesystem accesses, kcmp(2), futexes (get_robust_list(2)) and perf trace events.

Consider the following denial:

Jun 6 21:39:09 localhost kernel: [221158.831933] type=1400 audit(1402083549.185:782): apparmor="DENIED" operation="ptrace" profile="foo" pid=29142 comm="cat" requested_mask="read" denied_mask="read" peer="unconfined"

This demonstrates that the ‘cat’ binary running under the ‘foo’ profile was unable to read the contents of a /proc entry (in my test, /proc/11300/environ). To allow this process to read /proc entries for unconfined processes, the following rule can be used:

ptrace (read) peer=unconfined,

If the receiving process was confined, the log entry would say ‘peer=”<profile name>”‘ and you would adjust the ‘peer=unconfined’ in the rule to match that in the log denial. In this case, because unconfined processes implicitly can be readby all other processes, we don’t need to specify the cross check rule. If the target process was confined, the profile for the target process would need a rule like this:

ptrace (readby) peer=foo,

Likewise for signal rules, consider this denial:

Jun 6 21:53:15 localhost kernel: [222005.216619] type=1400 audit(1402084395.937:897): apparmor="DENIED" operation="signal" profile="foo" pid=29069 comm="bash" requested_mask="send" denied_mask="send" signal=term peer="unconfined"

This shows that ‘bash’ running under the ‘foo’ profile tried to send the ‘term’ signal to an unconfined process (in my test, I used ‘kill 11300′) and was blocked. Signal rules use ‘receive’ and ‘send’ to determine access, so we can add a rule like so to allow sending of the signal:

signal (send) set=("term") peer=unconfined,

Like with ptrace, a cross-check is performed with signal rules but implicit rules allow unconfined processes to send and receive signals. If pid 11300 were confined, you would adjust the ‘peer=’ in the rule of the foo profile to match the denial in the log, and then adjust the target profile to have something like:

signal (receive) set=("term") peer=foo,
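After adding rules like these, you reload the affected profiles and re-run the original operation to confirm the denials are gone; for example (the profile filenames are assumed):

sudo apparmor_parser -r /etc/apparmor.d/foo    # reload the updated 'foo' profile
sudo aa-status | grep foo                      # confirm the profile is loaded in enforce mode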

Signal and ptrace rules are very flexible and the AppArmor base abstraction in Ubuntu 14.04 LTS has several rules to help make profiling and transitioning to the new mediation easier:

# Allow other processes to read our /proc entries, futexes, perf tracing and
# kcmp for now
ptrace (readby),
 
# Allow other processes to trace us by default (they will need
# 'trace' in the first place). Administrators can override
# with:
# deny ptrace (tracedby) ...
ptrace (tracedby),
 
# Allow unconfined processes to send us signals by default
signal (receive) peer=unconfined,
 
# Allow us to signal ourselves
signal peer=@{profile_name},
 
# Checking for PID existence is quite common so add it by default for now
signal (receive, send) set=("exists"),

Note the above uses the new ‘@{profile_name}’ AppArmor variable, which is particularly handy with ptrace and signal rules. See man 5 apparmor.d for more details and examples.

14.10

Work still remains and some of the things we’d like to do for 14.10 include:

  • Finishing mediation for non-networking forms of IPC (eg, abstract sockets). This will be done in time for the phone release.
  • Have services integrate with AppArmor and the upcoming trust-store to become trusted helpers (also for phone release)
  • Continue work on networking IPC (for 15.04)
  • Continue to work with the upstream kernel on kdbus
  • Work continued on LXC stacking and we hope to have stacked profiles within the current namespace for 14.10. Full support for stacked profiles where different host and container policy for the same binary at the same time should be ready by 15.04
  • Various fixes to the python userspace tools for remaining bugs. These will also be backported to 14.04 LTS

Until next time, enjoy!


Filed under: canonical, security, ubuntu, ubuntu-server

Read more
David Murphy (schwuk)

This is really inspiring to me, on several levels: as an Ubuntu member, as a Canonical employee, and as a school governor.

Not only are they deploying Ubuntu and other open-source software to their students, they are encouraging those students to tinker with their laptops, and – better yet – some of those same students are directly involved in developing, distributing, and providing support for their peers. All of those students will take incredibly valuable experience with them into their future careers.

Well done.

Read more
mark

This is a series of posts on reasons to choose Ubuntu for your public or private cloud work & play.

We run an extensive program to identify issues and features that make a difference to cloud users. One result of that program is that we pioneered dynamic image customisation and wrote cloud-init. I’ll tell the story of cloud-init as an illustration of the focus the Ubuntu team has on making your devops experience fantastic on any given cloud.

 

Ever struggled to find the “right” image to use on your favourite cloud? Ever wondered how you can tell if an image is safe to use, what keyloggers or other nasties might be installed? We set out to solve that problem a few years ago and the resulting code, cloud-init, is one of the more visible pieces Canonical designed and built, and very widely adopted.

Traditionally, people used image snapshots to build a portfolio of useful base images. You’d start with a bare OS, add some software and configuration, then snapshot the filesystem. You could use those snapshots to power up fresh images any time you need more machines “like this one”. And that process works pretty amazingly well. There are hundreds of thousands, perhaps millions, of such image snapshots scattered around the clouds today. It’s fantastic. Images for every possible occasion! It’s a disaster. Images with every possible type of problem.

The core issue is that an image is a giant binary blob that is virtually impossible to audit. Since it’s a snapshot of an image that was running, and to which anything might have been done, you will need to look in every nook and cranny to see if there is a potential problem. Can you afford to verify that every binary is unmodified? That every configuration file and every startup script is safe? No, you can’t. And for that reason, that whole catalogue of potential is a catalogue of potential risk. If you wanted to gather useful data sneakily, all you’d have to do is put up an image that advertises itself as being good for a particular purpose and convince people to run it.

There are other issues, even if you create the images yourself. Each image slowly gets out of date with regard to security updates. When you fire it up, you need to apply all the updates since the image was created, if you want a secure machine. Eventually, you’ll want to re-snapshot for a more up-to-date image. That requires administration overhead and coordination, so most people don’t do it.

That’s why we created cloud-init. When your virtual machine boots, cloud-init is run very early. It looks out for some information you send to the cloud along with the instruction to start a new machine, and it customises your machine at boot time. When you combine cloud-init with the regular fresh Ubuntu images we publish (roughly every two weeks for regular updates, and whenever a security update is published), you have a very clean and elegant way to get fresh images that do whatever you want. You design your image as a script which customises the vanilla, base image. And then you use cloud-init to run that script against a pristine, known-good standard image of Ubuntu. Et voila! You now have purpose-designed images of your own on demand, always built on a fresh, secure, trusted base image.

Auditing your cloud infrastructure is now straightforward, because you have the DNA of that image in your script. This is devops thinking, turning repetitive manual processes (hacking and snapshotting) into code that can be shared and audited and improved. Your infrastructure DNA should live in a version control system that requires signed commits, so you know everything that has been done to get you where you are today. And all of that is enabled by cloud-init. And if you want to go one level deeper, check out Juju, which provides you with off-the-shelf scripts to customise and optimise that base image for hundreds of common workloads.

Read more
Michael Hall

Last year the main Ubuntu download page was changed to include a form for users to make a donation to one or more parts of Ubuntu, including to the community itself. Those donations made for “Community projects” were made available to members of our community who knew of ways to use them that would benefit the Ubuntu project.

Every dollar given out is an investment in Ubuntu and the community that built it. This includes sponsoring community events, sending community representatives to those events with booth supplies and giveaway items, purchasing hardware to improve development and testing, and more.

But these expenses don’t cover the time, energy, and talent that went along with them, without which the money itself would have been wasted.  Those contributions, made by the recipients of these funds, can’t be adequately documented in a financial report, so thank you to everybody who received funding for their significant and sustained contributions to Ubuntu.

As part of our commitment to openness and transparency we said that we would publish a report highlighting both the amount of donations made to this category, and how and where that money was being used. Linked below is the first of those reports.

View the Report

Read more
Michael Hall

A couple of months ago Jono announced the dates for the Ubuntu Online Summit, June 10th – 12th, and those dates are almost upon us now.  The schedule is open, the track leads are on board, all we need now are sessions.  And that’s where you come in.

Ubuntu Online Summit is a change for us: we’re trying to mix the previous online UDS events with our Open Week, Developer Week and User Days events, to try and bring people from every part of our community together to celebrate, educate, and improve Ubuntu. So in addition to the usual planning sessions we had at UDS, we’re also looking for presentations from our various community teams on the work they do, walk-throughs for new users learning how to use Ubuntu, as well as instructional sessions to help new distro developers, app developers, and cloud devops get the most out of it as a platform.

What we need from you are sessions.  It’s open to anybody, on any topic, any way you want to do it.  The only requirement is that you can start and run a Google+ OnAir Hangout, since those are what provide the live video streaming and recording for the event.  There are two ways you can propose a session: the first is to register a Blueprint in Launchpad, which is good for planning sessions that will result in work items; the second is to propose a session directly in Summit, which is good for any kind of session.  Instructions for how to do both are available on the UDS Website.

There will be Track Leads available to help you get your session on the schedule, and provide some technical support if you have trouble getting your session’s hangout setup. When you propose your session (or create your Blueprint), try to pick the most appropriate track for it, that will help it get approved and scheduled faster.

Ubuntu Development

Many of the development-oriented tracks from UDS have been rolled into the Ubuntu Development track. So anything that would previously have been in Client, Core/Foundations or Cloud and Server will be in this one track now. The track leads come from all parts of Ubuntu development, so whatever your session’s topic, there will be a lead who is familiar with it.

Track Leads:

  • Łukasz Zemczak
  • Steve Langasek
  • Leann Ogasawara
  • Antonio Rosales
  • Marc Deslauriers

Application Development

Introduced a few cycles back, the Application Development track will continue to have a focus on improving the Ubuntu SDK, tools and documentation we provide for app developers.  We also want to introduce sessions focused on teaching app development using the SDK, the various platform services available, as well as taking a deeper dive into specific parts of the Ubuntu UI Toolkit.

Track Leads:

  • Michael Hall
  • David Planella
  • Alan Pope
  • Zsombor Egri
  • Nekhelesh Ramananthan

Cloud DevOps

This is the counterpart of the Application Development track for those with an interest in the cloud.  This track will have a dual focus on planning improvements to the DevOps tools like Juju, as well as bringing DevOps up to speed with how to use them in their own cloud deployments.  Learn how to write charms, create bundles, and manage everything in a variety of public and private clouds.

Track Leads:

  • Jorge Castro
  • Marco Ceppi
  • Patricia Gaughen
  • Jose Antonio Rey

Community

The community track has been a staple of UDS for as long as I can remember, and it’s still here in the Ubuntu Online Summit.  However, just like the other tracks, we’re looking beyond just planning ways to improve the community structure and processes.  This time we also want to have sessions showing users how they can get involved in the Ubuntu community, what teams are available, and what tools they can use in the process.

Track Leads:

  • Daniel Holbach
  • Jose Antonio Rey
  • Laura Czajkowski
  • Svetlana Belkin
  • Pablo Rubianes

Users

This is a new track and one I’m very excited about. We are all users of Ubuntu, and whether we’ve been using it for a month or a decade, there are still things we can all learn about it. The focus of the Users track is to highlight ways to get the most out of Ubuntu, on your laptop, your phone or your server.  From detailed how-to sessions, to tips and tricks, and more, this track can provide something for everybody, regardless of skill level.

Track Leads:

  • Elizabeth Krumbach Joseph
  • Nicholas Skaggs
  • Valorie Zimmerman

So once again, it’s time to get those sessions in.  Visit this page to learn how, then start thinking of what you want to talk about during those three days.  Help the track leads out by finding more people to propose more sessions, and let’s get that schedule filled out. I look forward to seeing you all at our first ever Ubuntu Online Summit.

Read more
Michael Hall

I’ve just finished the last day of a week-long sprint for Ubuntu application development. There were many people here: designers, SDK developers, QA folks and, which excited me the most, several of the Core Apps developers from our community!

I haven’t been in attendance at many conferences over the past couple of years, and without an in-person UDS I haven’t had a chance to meet up and hang out with anybody outside of my own local community. So this was a very nice treat for me personally to spend the week with such awesome and inspiring contributors.

It wasn’t a vacation though; sprints are lots of work, more work than UDS.  All of us were jumping back and forth between high-information-density discussions on how to implement things, and then diving into some long heads-down work to get as much implemented as we could. It was intense, and now we’re all quite tired, but we all worked together well.

I was particularly pleased to see the community guys jumping right in and thriving in what could have very easily been an overwhelming event. Not only did they all accomplish a lot of work, fix a lot of bugs, and implement some new features, but they also gave invaluable feedback to the developers of the toolkit and tools. They never cease to amaze me with their talent and commitment.

It was a little bitter-sweet though, as this was also the last sprint with Jono at the head of the community team.  As most of you know, Jono is leaving Canonical to join the XPrize foundation.  It is an exciting opportunity to be sure, but his experience and his insights will be sorely missed by the rest of us. More importantly though he is a friend to so many of us, and while we are sad to see him leave, we wish him all the best and can’t wait to hear about the things he will be doing in the future.

Read more
Louis

A few years ago, I started participating in the packaging of makedumpfile and kdump-tools for Debian and Ubuntu. I am currently applying for the formal status of Debian Maintainer to continue that task.

For a while now, I have been noticing that our version of the kernel dump mechanism was lacking a functionality that has been available on RHEL & SLES for a long time : remote kernel crash dumps. On those distributions, it is possible to define a remote server to be the receptacle of the kernel dumps of other systems. This can be useful for centralization or to capture dumps on systems with limited or no local disk space.

So I am proud to announce the first functional beta-release of kdump-tools with remote kernel crash dump functionality for Debian and Ubuntu !

For those of you eager to test or not interested in the details, you can find a packaged version of this work in a Personal Package Archive (PPA) here :

https://launchpad.net/~louis-bouchard/+archive/networked-kdump

New functionality : remote SSH and NFS

In the current version available in Debian and Ubuntu, the kernel crash dumps are stored on local filesystems. Starting with version 1.5.1, they are stored in a timestamped directory under /var/crash. The new functionality allows you to define either a remote host accessible through SSH or an NFS mount point as the receptacle for the kernel crash dumps.

A new section of the /etc/default/kdump-tools file has been added :

# ---------------------------------------------------------------------------
# Remote dump facilities:
# SSH - username and hostname of the remote server that will receive the dump
# and dmesg files.
# SSH_KEY - Full path of the ssh private key to be used to login to the remote
# server. use kdump-config propagate to send the public key to the
# remote server
# HOSTTAG - Select if hostname of IP address will be used as a prefix to the
# timestamped directory when sending files to the remote server.
# 'ip' is the default.
# NFS - Hostname and mount point of the NFS server configured to receive
# the crash dump. The syntax must be {HOSTNAME}:{MOUNTPOINT} 
# (e.g. remote:/var/crash)
#
# SSH="<user@server>"
#
# SSH_KEY="<path>"
#
# HOSTTAG="hostname|[ip]"
# 
# NFS="<nfs mount>"
#

The kdump-config command also gains a new option : propagate which is used to send a public ssh key to the remote server so passwordless ssh commands can be issued to the remote SSH host.

Those options and commands are nothing new : I simply based my work on existing functionality from RHEL & SLES. So if you are well acquainted with RHEL remote kernel crash dump mechanisms, you will not be lost on Debian and Ubuntu. So I want to thank those who built the functionality on those distributions; it was a great help in getting them ported to Debian.

Testing on Debian

First of all, you must enable the kernel crash dump mechanism at the kernel level. I will not go into details as it is slightly off topic, but you should (a short sketch follows this list) :

  1. Add crashkernel=128M to /etc/default/grub in GRUB_CMDLINE_LINUX_DEFAULT
  2. Run update-grub
  3. reboot
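For example, on Debian as root (using the same 128M reservation suggested above):

# in /etc/default/grub, make sure the line reads something like:
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet crashkernel=128M"
update-grub
reboot
# once the system is back up, confirm the memory reservation took effect:
grep crashkernel /proc/cmdline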

Install the beta packages

The package in the PPA can be installed on Debian with add-apt-repository. This command is in the software-properties-common package so you will have to install it first :

$ apt-get install software-properties-common
$ add-apt-repository ppa:louis-bouchard/networked-kdump

Since you are on Debian, the result of the last command will be wrong, as the series defined in the PPA is for Utopic. Just use the following command to fix that :

$ sed -i -e 's/sid/utopic/g' /etc/apt/sources.list.d/louis-bouchard-networked-kdump-sid.list 
$ apt-get update
$ apt-get install kdump-tools makedumpfile

Configure kdump-tools for remote SSH capture

Edit the file /etc/default/kdump-tools and enable the kdump mechanism by setting USE_KDUMP to 1 . Then set the SSH variable to the remote hostname & credentials that you want to use to send the kernel crash dump. Here is an example :

USE_KDUMP=1
...
SSH="ubuntu@TrustyS-netcrash"

You will need to propagate the ssh key to the remote SSH host, so make sure that you have the password of the remote server’s user you defined (ubuntu in my case) for this command :

root@sid:~# kdump-config propagate
Need to generate a new ssh key...
The authenticity of host 'trustys-netcrash (192.168.122.70)' can't be established.
ECDSA key fingerprint is 04:eb:54:de:20:7f:e4:6a:cc:66:77:d0:7c:3b:90:7c.
Are you sure you want to continue connecting (yes/no)? yes
ubuntu@trustys-netcrash's password: 
propagated ssh key /root/.ssh/kdump_id_rsa to server ubuntu@TrustyS-netcrash

If you have an existing ssh key that you want to use, you can use the SSH_KEY option to point to your own key in /etc/default/kdump-tools :

SSH_KEY="/root/.ssh/mykey_id_rsa"

Then run the propagate command as previously :

root@sid:~/.ssh# kdump-config propagate
Using existing key /root/.ssh/mykey_id_rsa
ubuntu@trustys-netcrash's password: 
propagated ssh key /root/.ssh/mykey_id_rsa to server ubuntu@TrustyS-netcrash

It is a safe practice to verify that the remote SSH host can be accessed without password. You can use the following command to test (with your own remote server as defined in the SSH variable in /etc/default/kdump-tools) :

root@sid:~/.ssh# ssh -i /root/.ssh/mykey_id_rsa ubuntu@TrustyS-netcrash pwd
/home/ubuntu

If the passwordless connection can be achieved, then everything should be all set. You can proceed with a real crash dump test if your setup allows for it (not a production environment for instance).

Configure kdump-tools for remote NFS capture

Edit the /etc/default/kdump-tools file and set the NFS variable with the NFS mount point that will be used to transfer the crash dump :

NFS="TrustyS-netcrash:/var/crash"

The format needs to be the syntax that normally would be used to mount the NFS filesystem. You should test that your NFS filesystem is indeed accessible by mounting it manually :

root@sid:~/.ssh# mount -t nfs TrustyS-netcrash:/var/crash /mnt
root@sid:~/.ssh# df /mnt
Filesystem 1K-blocks Used Available Use% Mounted on
TrustyS-netcrash:/var/crash 6815488 1167360 5278848 19% /mnt
root@sid:~/.ssh# umount /mnt

Once you are sure that your NFS setup is correct, then you can proceed with a real crash dump test.

Testing on Ubuntu

As you would expect, setting things on Ubuntu is quite similar to Debian.

Install the beta packages

The package in the PPA can be installed on Debian with add-apt-repository. This command is in the software-properties-common package so you will have to install it first :

$ sudo add-apt-repository ppa:louis-bouchard/networked-kdump

Packages are available for Trusty and Utopic.

$ sudo apt-get update
$ sudo apt-get -y install linux-crashdump

Configure kdump-tools for remote SSH capture

Edit the file /etc/default/kdump-tools and enable the kdump mechanism by setting USE_KDUMP to 1 . Then set the SSH variable to the remote hostname & credentials that you want to use to send the kernel crash dump. Here is an example :

USE_KDUMP=1
...
SSH="ubuntu@TrustyS-netcrash"

You will need to propagate the ssh key to the remote SSH host, so make sure that you have the password of the remote server’s user you defined (ubuntu in my case) for this command :

ubuntu@TrustyS-testdump:~$ sudo kdump-config propagate
[sudo] password for ubuntu: 
Need to generate a new ssh key...
The authenticity of host 'trustys-netcrash (192.168.122.70)' can't be established.
ECDSA key fingerprint is 04:eb:54:de:20:7f:e4:6a:cc:66:77:d0:7c:3b:90:7c.
Are you sure you want to continue connecting (yes/no)? yes
ubuntu@trustys-netcrash's password: 
propagated ssh key /root/.ssh/kdump_id_rsa to server ubuntu@TrustyS-netcrash
ubuntu@TrustyS-testdump:~$

If you have an existing ssh key that you want to use, you can use the SSH_KEY option to point to your own key in /etc/default/kdump-tools :

SSH_KEY="/root/.ssh/mykey_id_rsa"

Then run the propagate command as previously :

ubuntu@TrustyS-testdump:~$ kdump-config propagate
Using existing key /root/.ssh/mykey_id_rsa
ubuntu@trustys-netcrash's password: 
propagated ssh key /root/.ssh/mykey_id_rsa to server ubuntu@TrustyS-netcrash

It is a safe practice to verify that the remote SSH host can be accessed without password. You can use the following command to test (with your own remote server as defined in the SSH variable in /etc/default/kdump-tools) :

ubuntu@TrustyS-testdump:~$sudo ssh -i /root/.ssh/mykey_id_rsa ubuntu@TrustyS-netcrash pwd
/home/ubuntu

If the passwordless connection can be achieved, then everything should be all set.

Configure kdump-tools for remote NFS capture

Edit the /etc/default/kdump-tools file and set the NFS variable with the NFS mount point that will be used to transfer the crash dump :

NFS="TrustyS-netcrash:/var/crash"

The format needs to be the syntax that normally would be used to mount the NFS filesystem. You should test that your NFS filesystem is indeed accessible by mounting it manually (you might need to install the nfs-common package) :

ubuntu@TrustyS-testdump:~$ sudo mount -t nfs TrustyS-netcrash:/var/crash /mnt 
ubuntu@TrustyS-testdump:~$ df /mnt
Filesystem 1K-blocks Used Available Use% Mounted on
TrustyS-netcrash:/var/crash 6815488 1167488 5278720 19% /mnt
ubuntu@TrustyS-testdump:~$ sudo umount /mnt

Once you are sure that your NFS setup is correct, then you can proceed with a real crash dump test.

 Miscellaneous commands and options

A few other things are under the control of the administrator

The HOSTTAG modifier

When sending the kernel crash dump, kdump-config will use the IP address of the server as a prefix to the timestamped directory on the remote host. You can use the HOSTTAG variable to change that default. Simply define in /etc/default/kdump-tools :

HOSTTAG="hostname"

The hostname of the server will be used as a prefix instead of the IP address.

Currently, this is only implemented for the SSH method, but it will be available for NFS as well in the final version.

kdump-config show

To verify the configuration that you have defined in /etc/default/kdump-tools, you can use kdump-config’s show command to review your options.

ubuntu@TrustyS-testdump:~$ sudo kdump-config show
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR: /var/crash
crashkernel addr: 0x2d000000
SSH: ubuntu@TrustyS-netcrash
SSH_KEY: /root/.ssh/kdump_id_rsa
HOSTTAG: ip
current state: ready to kdump
kexec command:
 /sbin/kexec -p --command-line="BOOT_IMAGE=/vmlinuz-3.13.0-24-generic root=/dev/mapper/TrustyS--vg-root ro console=ttyS0,115200 irqpoll maxcpus=1 nousb" --initrd=/boot/initrd.img-3.13.0-24-generic /boot/vmlinuz-3.13.0-24-generic

If the remote crash kernel dump functionality is setup, you will see the options listed in the output of the commands.

Conclusion

As outlined at the beginning, this is the first functional beta version of the code. If you are curious, you can find the code I am working on here :

http://anonscm.debian.org/gitweb/?p=collab-maint/makedumpfile.git;a=shortlog;h=refs/heads/networked_kdump_beta1

Don’t hesitate to test & let me know if you find issues

Read more
beuno

Now that all the responsible disclosure processes have been followed through, I’d like to tell everyone a story of my very bad week last week. Don’t worry, it has a happy ending.

 

Part 1: Exposition

On May 5th we got a support request from a user who observed confusing behaviour in one of our systems. Our support staff immediately escalated it to me, and my team sprang into action for what ended up being a 48-hour rollercoaster ride that ended with us reporting a security bug upstream to Django.

The bug, in a nutshell, is that when the following conditions line up, a system could end up serving one user a response that was meant for another:

- You are authenticating requests with cookies, OAuth or other authentication mechanisms
- The user is using any version of Internet Explorer or Chrome Frame (to be more precise, anything with “MSIE” in the request user agent)
- You (or an ISP in the middle) are caching requests between Django and the internet (except Varnish’s default configuration, for reasons we’ll get to)
- You are serving the same URL with different content to different users

We rarely saw this combination of conditions because users of services provided by Canonical generally have a bias towards not using Internet Explorer, as you’d expect from a company that develops the world’s most used Linux distribution.

 

Part 2: Rising Action

Now, one may think that the bug is obvious, and wonder how it went unnoticed since 2008, but this really was one of those elusive “ninja-bugs” you hear about on the Internet, and it took us quite a bit of effort to track it down.

In debugging situations such as this, the first step is generally to figure out how to reproduce the bug. In fact, figuring out how to reproduce it is often the lion’s share of the effort of fixing it. However, no matter how much we tried, we could not reproduce it. No matter what we changed, we always got back the right response. This was good, because it ruled out a widespread problem in our systems, but it did not get us closer to figuring out the problem.

Putting aside reproducing it for a while, we then moved on to combing very carefully through our code, trying to find any hints of what could be causing this. Several of us looked at it with fresh eyes so we wouldn’t be tainted by having developed or reviewed the code, but we all still came up empty each and every time. Our code seemed perfectly correct.

We then went on to a close examination of all related requests to get new clues to where the problem was hiding. But we had a big challenge with this. As developers we don’t get access to any production information that could identify people. This is good for user privacy, of course, but made it hard to produce useful logs. We invested some effort to work around this while maintaining user privacy by creating a way to anonymise the logs in a way that would still let us find patterns in them. This effort turned up the first real clue.

We use Squid to cache data for each user, so that when they re-request the same data, it’s sitting right there in memory and can be quickly served to them without having to recreate it from the databases and other services. In those anonymized Squid logs, we saw cookie-authenticated requests whose responses didn’t contain an HTTP Vary header at all, where we expected at the very least “Vary: Cookie” to ensure Squid would only ever serve the correct content to each user. So we then knew what was happening, but not why. We immediately pulled Squid out of the middle to stop this from happening.
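
For reference, Django ships a helper for exactly this kind of cache keying. A minimal sketch of a middleware that stamps “Vary: Cookie” onto responses to cookie-carrying requests might look like the following (illustrative only, using the old-style middleware API of the time; this is not the code we actually run):

# Illustrative sketch only -- not the middleware we actually run.
# Ensures any response to a request that carried cookies is keyed
# per-cookie by upstream caches such as Squid.
from django.utils.cache import patch_vary_headers

class VaryOnCookieMiddleware(object):
    def process_response(self, request, response):
        if request.COOKIES:
            # Appends "Cookie" to the Vary header without clobbering
            # any values the view already set.
            patch_vary_headers(response, ('Cookie',))
        return response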

Why was Squid not seeing any Vary headers? There were many possible culprits, so a *lot* of people got involved in searching for the problem. We combed through everything in our frontend stack (Apache, HAProxy and Squid) that could conceivably be removing Vary headers.

This was made all the harder because we had not yet fully Juju-charmed every service, so we could not easily access all configurations and test theories locally. Sometimes technical debt really gets expensive!

After this exhaustive search, we determined that nothing in our code was removing the headers. So we started following the code up into the Django middlewares, and went as far as logging the exact headers Django was sending out at the last middleware layer. Still nothing.

 

Part 3: The Climax

Until we got a break. Logs were still being generated, and eventually a pattern emerged. All the initial requests that had no Vary headers seemed for the most part to be from Internet Explorer. It didn’t make sense that a browser could remove headers that were returned from a server, but knowing this took us to the right place in the Django code, and because Django is open source, there was no friction in inspecting it deeply.  That’s when we saw it.

In a function called fix_IE_for_vary, we saw the offending line of code.

del response['Vary']

We finally found the cause.

It turns out IE 6 and 7 didn’t implement the HTTP Vary header fully, so there’s a workaround in Django to remove it for any content that isn’t HTML or plain text. In hindsight, if Django had implemented this as a middleware instead, even one enabled by default, it would have been more likely to be revised earlier. Hindsight is always 20/20 though, and it’s easy to sit back and theorise on how things should have been done.
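
The workaround amounted to roughly the following (a paraphrase, not the exact upstream code): if the user agent looked like IE and the content type wasn’t on a short “Vary-safe” list, the header was simply dropped.

# Rough paraphrase of the old workaround -- not the exact Django code.
def fix_IE_for_vary(request, response):
    safe_mime_types = ('text/html', 'text/plain', 'text/sgml')
    user_agent = request.META.get('HTTP_USER_AGENT', '')
    content_type = response.get('Content-Type', '').split(';')[0]
    if 'MSIE' in user_agent and content_type not in safe_mime_types:
        # IE 6/7 mishandled Vary on downloads, so the header was removed --
        # taking with it the hint upstream caches needed to key per user.
        del response['Vary']
    return response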

So if you’ve been serving any data that isn’t HTML or plain text, with a caching layer in the middle that implements Vary header management to spec (Varnish doesn’t trust it by default, and checks the cookie in the request anyway), you may have served a response to the wrong user.

Newer versions of Internet Explorer have since fixed this, but who knew in 2008 that IE 9 would only come three years later?

 

Part 4: Falling Action

We immediately applied a temporary fix to all our running Django instances at Canonical and involved our security team to follow standard responsible disclosure processes. The Canonical security team was now in the driving seat and worked to assign a CVE number and email the Django security contact with details on the bug, how to reproduce it, and links to the specific code in the Django tree.

The Django team immediately and professionally acknowledged the bug and began researching possible solutions, as well as any other parts of the code where this scenario could occur. There was continuous communication among our teams for the next few days while we agreed on lead times for distributions to receive and prepare the security fix.

 

Part 5: Resolution

I can’t highlight enough how important it is to follow these well-established processes to make sure we keep the Internet at large a generally safe place.
To summarise, if you’re running Django, please update to the latest security release as quickly as possible, and disable any internal caching until then to minimise the chances of hitting this bug.

If you're running Squid and want to check whether you could be affected, we put together a small Python script to run against your logs that you can use as a base; you may need to tweak it for your log format. Be sure to run it only against cookie-authenticated URLs, otherwise you will hit a lot of false positives.
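
A minimal sketch along those lines (not the original script) follows; it assumes a custom Squid logformat that records, tab-separated, the request URL, the request Cookie header and the response Vary header, so adapt the field handling to whatever your logs actually contain.

# Illustrative sketch, not the original script.
# Assumes a custom Squid logformat along the lines of:
#   logformat varycheck %ru\t%{Cookie}>h\t%{Vary}<h
# Feed the log on stdin: python check_vary.py < access.log
import sys

for line in sys.stdin:
    fields = line.rstrip('\n').split('\t')
    if len(fields) < 3:
        continue
    url, cookie, vary = fields[0], fields[1], fields[2]
    # A cookie-authenticated response with no "Vary: Cookie" could have
    # been cached once and then served to other users.
    if cookie and cookie != '-' and 'cookie' not in vary.lower():
        print('suspicious: %s (Vary: %s)' % (url, vary))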

Read more
Dustin Kirkland


It was September of 2009.  I answered a couple of gimme trivia questions and dropped my business card into a hat at a Linux conference in Portland, Oregon.  A few hours later, I received an email...I had just "won" a developer edition HTC Dream -- the Android G1.  I was quite anxious to have a hardware platform where I could experiment with Android.  I had, of course, already downloaded the SDK, compiled Android from scratch, and fiddled with it in an emulator.  But that experience fell far short of Android running on real hardware.  Until the G1.  The G1 was the first device to truly showcase the power and potential of the Android operating system.

And with that context, we are delighted to introduce the Orange Box!


The Orange Box


Conceived by Canonical and custom built by TranquilPC, the Orange Box is a 10-node cluster computer that fits in a suitcase.

Ubuntu, MAAS, Juju, Landscape, OpenStack, Hadoop, CloudFoundry, and more!

The Orange Box provides a spectacular development platform, showcasing in mere minutes the power of hardware provisioning and service orchestration with Ubuntu, MAAS, Juju, and Landscape.  OpenStack, Hadoop, CloudFoundry, and hundreds of other workloads deploy in minutes, to real hardware -- not just instances in AWS!  It also makes one hell of a Steam server -- there's a charm for that ;-)


OpenStack, deployed by Juju, takes merely 6 minutes on an Orange Box

Most developers here certainly recognize the term "SDK", or "Software Development Kit"...  You can think of the Orange Box as an "HDK", or "Hardware Development Kit".  Pair an Orange Box with MAAS and Juju, and you have yourself a compact cloud.  Or a portable big data number cruncher.  Or a lightweight cluster computer.


The underside of an Orange Box, with its cover off


Want to get your hands on one?

Drop us a line, and we'd be delighted to hand-deliver an Orange Box to your office, and conduct 2 full days of technical training, covering MAAS, Juju, Landscape, and OpenStack.  The box is yours for 2 weeks, as you experiment with the industry leading Ubuntu ecosystem of cloud technologies at your own pace and with your own workloads.  We'll show back up, a couple of weeks later, to review what you learned and discuss scaling these tools up, into your own data center, on your own enterprise hardware.  (And if you want your very own Orange Box to keep, you can order one from our friends at TranquilPC.)


Manufacturers of the Orange Box

Gear head like me?  Interested in the technical specs?


Remember those posts late last year about Intel NUCs?  Someone took notice, and we set out to build this ;-)


Each Orange Box chassis contains:
  • 10x Intel NUCs
  • All 10x Intel NUCs contain
    • Intel HD Graphics 4000 GPU
    • 16GB of DDR3 RAM
    • 120GB SSD root disk
    • Intel Gigabit ethernet
  • D-Link DGS-1100-16 managed gigabit switch with 802.1q VLAN support
    • All 10 nodes are internally connected to this gigabit switch
  • 100-240V AC/DC power supply
    • Adapter supplied for US, UK, and EU plug types
    • 19V DC power supplied to each NUC
    • 5V DC power supplied to internal network switch


Intel NUC D53427RKE board

That's basically an Amazon EC2 m3.xlarge ;-)

The first node, node0, additionally contains:
  • A 2TB Western Digital HDD, preloaded with a full Ubuntu archive mirror
  • USB and HDMI ports are wired and accessible from the rear of the box

Most planes fly in clouds...this cloud flies in planes!


In aggregate, this micro cluster effectively fields 40 cores, 160GB of RAM, 1.2TB of solid state storage, and is connected over an internal gigabit network fabric.  A single fan quietly cools the power supply, while all of the nodes are passively cooled by aluminum heat sinks spanning each side of the chassis. All in a chassis the size of a tower PC!

It fits in a suitcase, and can travel anywhere you go.


Pelican iM2875 Storm Case

How are we using them at Canonical?

If you're here at the OpenStack Summit in Atlanta, GA, you'll see at least a dozen Orange Boxes, in our booth, on stage during Mark Shuttleworth's keynote, and in our breakout conference rooms.


Canonical sales engineer, Ameet Paranjape,
demonstrating OpenStack on the Orange Box in the Ubuntu booth
at the OpenStack Summit in Atlanta, GA
We are also launching an update to our OpenStack Jumpstart program, where we'll deliver an Orange Box and 2 full days of training to your team, and leave you the box while you experiment with OpenStack, MAAS, Juju, Hadoop, and more for 2 weeks.  Without disrupting your core network or production data center workloads, you can prototype your OpenStack experience within a private sandbox environment.  You can experiment with various storage alternatives, practice scaling services, and destroy and rebuild the environment repeatedly.  Safe.  Risk free.


This is Cloud, for the Free Man.

:-Dustin

Read more
Dustin Kirkland


Upon learning about the Heartbleed vulnerability in OpenSSL, my first thoughts were pretty desperate.  I basically lost all faith in humanity's ability to write secure software.  It's really that bad.

I spent the next couple of hours drowning in the sea of passwords and certificates I would personally need to change...ugh :-/

As the hangover of that sobering reality arrived, I started thinking about various systems over the years that I've designed, implemented, or was otherwise responsible for, and how Heartbleed affected those services.  Another throbbing headache set in.

I patched DivItUp.com within minutes of Ubuntu releasing an updated OpenSSL package, and re-keyed the SSL certificate as soon as GoDaddy declared that it was safe for re-keying.

Likewise, the Ubuntu entropy service was patched and re-keyed, along with all Ubuntu-related https services, by Canonical IT.  I pushed a new package of the pollinate client with updated certificate changes to Ubuntu 14.04 LTS (trusty) the same day.

That said, I did enjoy a bit of measured satisfaction, in one controversial design decision that I made in January 2012, when creating Gazzang's zTrustee remote key management system.

All default network communications, between zTrustee clients and servers, are encrypted twice.  The outer transport layer network traffic, like any https service, is encrypted using OpenSSL.  But the inner payloads are also signed and encrypted using GnuPG.
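
To make that double-wrapping concrete, here's a minimal sketch of the inner layer using the python-gnupg library (illustrative only, not zTrustee code), assuming the signing key and the recipient's public key are already in the local keyring:

# Illustrative sketch only -- not zTrustee code.
import gnupg

gpg = gnupg.GPG(gnupghome='/home/ubuntu/.gnupg')  # hypothetical keyring path

def wrap_payload(plaintext, recipient_fpr, signer_fpr, passphrase):
    # Sign and encrypt the payload *before* it is handed to the TLS
    # transport, so a transport-layer leak exposes only GPG ciphertext.
    result = gpg.encrypt(plaintext, recipient_fpr, sign=signer_fpr,
                         passphrase=passphrase, armor=True)
    if not result.ok:
        raise RuntimeError('GPG encryption failed: %s' % result.status)
    return str(result)  # the ASCII-armored blob that travels inside TLS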

Hundreds of times, zTrustee and I were questioned or criticized about that design -- by customers, prospects, partners, and probably competitors.

In fact, at one time, there was pressure from a particular customer/partner/prospect, to disable the inner GPG encryption entirely, and have zTrustee rely solely on the transport layer OpenSSL, for performance reasons.  Tried as I might, I eventually lost that fight, and we added the "feature" (as a non-default option).  That someone might have some re-keying to do...

But even in the face of the Internet-melting Heartbleed vulnerability, I'm absolutely delighted that the inner payloads of zTrustee communications are still protected by GnuPG asymmetric encryption and are NOT vulnerable to Heartbleed style snooping.

In fact, these payloads are some of the very encryption keys that guard YOUR health care and financial data stored in public and private clouds around the world by Global 2000 companies.

Truth be told, the insurance against crypto library vulnerabilities zTrustee bought by using GnuPG and OpenSSL in combination was really the secondary objective.

The primary objective was actually to leverage asymmetric encryption, to both sign AND encrypt all payloads, in order to cryptographically authenticate zTrustee clients, ensure payload integrity, and enforce key revocations.  We technically could have used OpenSSL for both layers and even realized a few performance benefits -- OpenSSL is faster than GnuPG in our experience, and can leverage crypto accelerator hardware more easily.  But I insisted that the combination of GPG over SSL would buy us protection against vulnerabilities in either protocol, and that was worth any performance cost in a key management product like zTrustee.

In retrospect, this makes me wonder why diverse, redundant, backup encryption isn't more prevalent in the design of security systems...

Every elevator you have ever used has redundant safety mechanisms.  Your car has both seat belts and air bags.  Your friendly cashier will double bag your groceries if you ask.  And I bet you've tied your shoes with a double knot before.

Your servers have redundant power supplies.  Your storage arrays have redundant hard drives.  You might even have two monitors.  You might be carrying a laptop, a tablet, and a smart phone.

Moreover, important services on the Internet are often highly available, redundant, fault tolerant or distributed by design.

But the underpinnings of the privacy and integrity of the very Internet itself are usually protected only once, with transport-layer encryption of the traffic in motion.

At this point, can we afford the performance impact of additional layers of security?  Or, rather, at this point, can we afford not to use all available protection?

Dustin

p.s. I use both dm-crypt and eCryptFS on my Ubuntu laptop ;-)

Read more
Dustin Kirkland





This article is cross-posted on Docker's blog as well.

There is a design pattern, occasionally found in nature, in which the most elegant and impressive solutions seem utterly intuitive in retrospect.



For me, Docker is just that sort of game-changing, hyper-innovative technology that, at its core, somehow seems straightforward, beautiful, and obvious.



Linux containers, repositories of popular base images, snapshots using modern copy-on-write filesystem features.  Brilliant, yet so simple.  Docker.io for the win!


I clearly recall, nine long months ago, being intrigued by a fervor of HackerNews excitement pulsing around the nascent Docker technology.  I followed a set of instructions on a very well designed and tastefully manicured web page in order to launch my first Docker container.  Something like: start with Ubuntu 13.04, downgrade the kernel, reboot, add an out-of-band package repository, install an oddly named package, import some images, perhaps debug or ignore some errors, and then launch.  In a few moments, I could clearly see the beginnings of a brave new world of lightning-fast, cleanly managed, incrementally saved, highly dense operating system containers.

Ubuntu inside of Ubuntu, Inception style.  So.  Much.  Potential.



Fast forward to today -- April 18, 2014 -- and the combination of Docker and Ubuntu 14.04 LTS has raised the bar, introducing a new echelon of usability and convenience, coupled with the trust and track record of enterprise-grade Long Term Support from Canonical and the Ubuntu community.
Big thanks, by the way, to Paul Tagliamonte, upstream Debian packager of Docker.io, as well as all of the early testers and users of Docker during the Ubuntu development cycle.
Docker is now officially in Ubuntu.  That makes Ubuntu 14.04 LTS the first enterprise grade Linux distribution to ship with Docker natively packaged, continuously tested, and instantly installable.  Millions of Ubuntu servers are now never more than three commands away from launching or managing Linux container sandboxes, thanks to Docker.


sudo apt-get install docker.io
sudo docker.io pull ubuntu
sudo docker.io run -i -t ubuntu /bin/bash


And after that last command, Ubuntu is now running within Docker, inside of a Linux container.

Brilliant.

Simple.

Elegant.

User friendly.

Just the way we've been doing things in Ubuntu for nearly a decade. Thanks to our friends at Docker.io!


Cheers,
:-Dustin

Read more
Michael Hall

Ever since we started building the Ubuntu SDK, we’ve been trying to find ways of bringing the vast number of Android apps that exist over to Ubuntu. As with any new platform, there’s a chasm between Android apps and native apps that can only be crossed through the effort of porting.

There are simple solutions, of course, like providing an Android runtime on Ubuntu. On other platforms, those have been shown to present Android apps as second-class citizens that can’t benefit from the new platform’s unique features. Worse, they don’t provide a way for apps to gradually become first-class citizens, so the chasm between Android and native still exists, which means the vast majority of apps supported this way will never improve.

There are also complicated solutions, like code conversion, that try to translate Android/Java code into the native platform’s language and toolkit, preserving logic and structure along the way. But doing this right becomes such a monumental task that making a tool to do it is virtually impossible, and the amount of cleanup and checking that needs to be done by an actual developer quickly rises to the same level of effort as a manual port. This approach also fails to take advantage of differences between the platforms, and will re-create the old way of doing things even when it doesn’t make sense on the new platform.

NDR takes a different approach to these: it doesn’t let you run your Android code on Ubuntu, nor does it try to convert your Android code to native code. Instead, NDR re-creates the general framework of your Android app as a native Ubuntu app, converting Activities to Pages, for example, to give you a skeleton project on which you can build your port. It won’t get you over the chasm, but it will show you the path to take and give you a head start on it. You will just need to fill it in with the logic code to make it behave like your Android app. NDR won’t provide any of the logic for you, and chances are you’ll want to do it slightly differently than you did on Android anyway, due to the differences between the two platforms.
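
As a toy illustration of that Activities-to-Pages idea (this is not NDR's actual code), you could imagine pulling the activity names out of an app's AndroidManifest.xml and writing out a stub QML Page for each one; adjust the component import version to whatever your SDK provides:

# Toy illustration of the Activities-to-Pages mapping -- not NDR itself.
import os
import xml.etree.ElementTree as ET

ANDROID_NS = '{http://schemas.android.com/apk/res/android}'

def activity_names(manifest_path):
    # Pull the activity class names out of AndroidManifest.xml.
    tree = ET.parse(manifest_path)
    for node in tree.iter('activity'):
        name = node.get(ANDROID_NS + 'name', '')
        if name:
            yield name.rsplit('.', 1)[-1]

def write_page_stub(name, outdir):
    # Emit a bare Ubuntu SDK Page for the developer to fill in.
    qml = ('import QtQuick 2.0\n'
           'import Ubuntu.Components 1.1\n\n'  # adjust to your SDK version
           'Page {\n'
           '    title: i18n.tr("%s")\n'
           '    // TODO: port the corresponding Activity logic here\n'
           '}\n' % name)
    with open(os.path.join(outdir, name + '.qml'), 'w') as f:
        f.write(qml)

outdir = 'ubuntu-port'
if not os.path.isdir(outdir):
    os.makedirs(outdir)
for activity in activity_names('AndroidManifest.xml'):
    write_page_stub(activity, outdir)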

To test NDR during development, I chose the Telegram app, because it is open source, popular, and largely uses Android’s layout definitions and components. NDR will be less useful for apps such as games, which use their own UI components and draw directly to a canvas, but it’s pretty good at converting apps that use Android’s components and UI builder.

After only a couple of days of hacking, I was able to get NDR to generate enough of an Ubuntu SDK application that, with a little bit of manual cleanup, it was recognizably similar to the Android app.

This proves, in my opinion, that bootstrapping an Ubuntu port based on Android source code is not only possible, but is a viable way of supporting Android app developers who want to cross that chasm and target their apps for Ubuntu as well. I hope it will open the door for high-quality, native Ubuntu app ports from the Android ecosystem.  There is still much more NDR can do to make this easier, and having people with more Android experience than me (that would be none) would certainly make it a more powerful tool, so I’m making it a public, open source project on Launchpad and am inviting anybody who has an interest in this to help me improve it.

Read more
ssweeny

Another very exciting release of Ubuntu for desktops and phones (oh, and I guess servers and cloud too) is out the door!

This is a Long Term Support release, which means it’s supported for five years, and it’s the release I’ll be trying to install on friends’ and family’s computers at every opportunity.

As usual, you can take a tour or go straight to the download page.

Read more
olli

Ubuntu 14.04 will be released today and you couldn’t resist the itch to go try the Unity 8 preview session on the Desktop. How underwhelming… there are almost no apps, and some don’t even work and overall it’s actually pretty unexciting… let’s change that in the next few chapters. First things first though… let’s look […]

Read more