Canonical Voices

April Wang

Thanks to a hackathon, I flew to Shenzhen for the first time in the scorching heat of August, and this humid, sweltering city really made my glasses drop (I could not stop sweating, so my glasses literally kept sliding down my nose). The venue was the Huaqiangbei Maker Center, in the Huaqiangbei shopping district of Futian; weekday or weekend, the streets below always seem to be bustling with people. Built by the Huaqiang Group, the Huaqiangbei Maker Center is China's first comprehensive innovation and entrepreneurship platform offering one-stop services to founders. Its first phase covers 5,000 m² in the rooftop garden on the 7th floor of Huaqiang Plaza Tower B, an oasis amid the bustle of Huaqiangbei. The whole space is covered in graffiti-style street art, and simply standing in it you can feel the energy that sparks inspiration, so choosing it for the hackathon was only natural.

Canonical has always believed that the best way to spur innovation is to put the technology innovators need directly into their hands. Besides the Ubuntu phone platform, this Shenzhen hackathon also featured Ubuntu Snappy Core, a secure and easy-to-use operating system for smart hardware. In the TechTalk session we gave a hands-on introduction to developing for this latest technology with KVM; if you missed it, you can download the slides and watch the video here. Teams that combined the Ubuntu phone platform with Snappy were also eligible for a special IoT prize, which made the event all the more interesting to watch.

At 10:30 am on the 22nd the countdown started and the hackathon officially entered its hacking phase: 30 hours straight with no food, no drink, and no sleep.

Just kidding, of course. There was plenty to eat, drink, and play, including late-night hotpot and foosball.

Since this was a hackathon, the main attraction was naturally the projects that came out of the hacking party. Here are a few of the teams that demoed on site.

QML Git-OSC, developed by the OSChina team, is a QML-based Ubuntu phone app that lets programmers browse the details and code of their repositories on Git@OSC straight from an Ubuntu phone. As an app built for people who write code, the team took the Best Hands-On award: a Cherry mechanical keyboard.

iFace was the second demo. Team member Mago gave a wryly humorous introduction to an app for "this age where looks are everything": log in with face verification and use your looks to meet people to chat with online. A team this in tune with the times, and this good at talking, unsurprisingly walked away with the Smooth Talker award sponsored by Ubuntu Kylin: a portable speaker.

The most eye-catching team of the hackathon, E Minor, produced two projects within the 30 hours: LibreOffice Impress Remote and Project MrRobot, which I had been looking forward to since day one. Impress Remote does exactly what its name says: it instantly turns your Ubuntu phone into a remote control for your Impress presentations, simple but extremely practical. Project MrRobot is an app that lets an Ubuntu phone drive the adorable Rapiro robot by voice, buttons, or shaking the phone. Both projects are fully open source, and you can find their code here and here. The team easily took both our Best Looks award (an Ubuntu backpack) and the Best Geek award (a portable projector specially sponsored by ASUS). Special thanks to Ms. Qin Xiahong (vice president of Lingyou Tech), one of our on-site judges, who presented the prize. Project MrRobot was also picked up by Softpedia the day after the event ended.

IoT Ranger is an app made for people who just cannot stop worrying about the computer they left at home: it lets Ubuntu phone users check the machine's running state at any time. Built with Cordova, this Ubuntu phone app cleverly uses a web service framework running in a KVM environment to tie Ubuntu Snappy Core together with Ubuntu phone app development. It was the undisputed winner of the special IoT prize and took home the BeagleBone Black we had prepared.

There were a few more demos on site that I will not cover one by one; you can find write-ups of every project on the hackathon page of the Ubuntu developer site (cn.developer.ubuntu.com). We saw so much in just 30 hours that I am already looking forward to the next event. I hope every team manages to deploy their project and make it in the wider world. Writing code is hard work, but staying up all night with like-minded people has to be the best part. Here is a photo of the T-shirts made at the event.

Finally, thanks again to the event's special sponsor ASUS, who, in addition to the Best Geek award, provided generous check-in and demo prizes; thanks to the online and offline co-organizers and community platforms that made this hackathon possible (Git@OSC, SegmentFault, OSChina, KAIYUANSHE, LinuxEden, Linux China, Linuxtoy.org, QTCN, MeeGo, Nanjiquan, Shenzhen Open Innovation Lab, Tencent Open Platform, Ubuntu Kylin, and the Sino-Finnish Design Park); thanks to the on-site judges; and thanks to the venue sponsor, the Huaqiangbei Maker Center, whose relaxed and easy-going atmosphere gave everyone endless creative inspiration.

 

Read more
facundo

The hostel that hosted PyCamp


Among the photos I took at the PyCamp a couple of weeks ago is this one, which I liked so much that I'm posting it here on its own, a bit larger...

The hostel that hosted PyCamp 2015 in La Serranita

It's the hostel where the event was held (where we slept and worked... meals were somewhere else). A very, very pretty building on multiple levels.

More PyCamp photos here.

Read more
Joseph Salisbury

Meeting Minutes

IRC Log of the meeting.

Meeting minutes.

Agenda

20150825 Meeting Agenda


Release Metrics and Incoming Bugs

Release metrics and incoming bug data can be reviewed at the following link:

  • http://kernel.ubuntu.com/reports/kt-meeting.txt


Status: CVEs

The current CVE status can be reviewed at the following link:

  • http://kernel.ubuntu.com/reports/kernel-cves.html


Status: Wily Development Kernel

We have rebased our Wily master-next branch to the latest upstream
v4.2-rc8 and uploaded to our ~canonical-kernel-team PPA. The fglrx DKMS
package is still failing to build with this latest kernel. We are
actively investigating to get this resolved.
-----
Important upcoming dates:

  • https://wiki.ubuntu.com/WilyWerewolf/ReleaseSchedule
    Thurs Aug 27 – Beta 1 (~2 days away)
    Thurs Sep 24 – Final Beta (~4 weeks away)
    Thurs Oct 8 – Kernel Freeze (~6 weeks away)
    Thurs Oct 15 – Final Freeze (~7 weeks away)
    Thurs Oct 22 – 15.10 Release (~8 weeks away)


Status: Stable, Security, and Bugfix Kernel Updates – Precise/Trusty/Utopic/Vivid

Status for the main kernels, until today:

  • Precise – Verification & Testing
  • Trusty – Verification & Testing
  • lts-Utopic – Verification & Testing
  • Vivid – Verification & Testing

    Current opened tracking bugs details:

  • http://kernel.ubuntu.com/sru/kernel-sru-workflow.html
    For SRUs, SRU report is a good source of information:
  • http://kernel.ubuntu.com/sru/sru-report.html

    Schedule:

    cycle: 16-Aug through 05-Sep
    ====================================================================
    14-Aug Last day for kernel commits for this cycle
    15-Aug – 22-Aug Kernel prep week.
    23-Aug – 29-Aug Bug verification & Regression testing.
    30-Aug – 05-Sep Regression testing & Release to -updates.


Open Discussion or Questions? Raise your hand to be recognized

No open discussion.

Read more
Tristram Oaten

Publishing Vanilla

We’ve got a new CSS framework at Canonical, named Vanilla. My colleague Ant has a great write-up introducing Vanilla. Essentially it’s a CSS microframework powered by Sass. The build process consists of two steps, an open source build, and a private build.

Open Source Build

While there are inevitably components that need to be kept private (keys, tokens, etc.), being Canonical we want to keep as much of the build in the open as we can, in addition to the code itself. We also wanted the build to be as automated and as close to CI/CD principles as possible. Here's what happens:

Committing to our GitHub repository kicks off a Travis build that runs our gulp tests, which include sass-lint. We also use david-dm.org to make sure our npm dependencies are up to date. All of these have nice badges we can link to right from our GitHub page, so the first thing people see is the health of our project. I really like this: it keeps us honest, and it informs the community.

Not everything can be done with Travis, however: publishing Vanilla to npm and updating our project page and demo site require private credentials. For this confidential build, we use Jenkins (formerly Hudson, a Java-based build management system).

Private Build with Jenkins

Our Jenkins build does a few things:

  1. Increment the package.json version number
  2. npm publish (package)
  3. Build Sass with npm install
  4. Upload css to our assets server
  5. Update Sassdoc
  6. Update demo site with new CSS

Robin put this functionality together in a neat bash script: publish.sh.

We use this script in a Jenkins build that we kick off with one of a few parameters – point, minor or major – to indicate which part of the version to bump in package.json. This gives our devs push-button releases on the fly, with the same build, from bugfixes all the way up to stable releases (1.0.0).
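
For illustration, here is a rough sketch of the kind of version bump that first step performs on package.json, written in Python rather than the bash the real publish.sh uses; the file path and function name are just placeholders:

# Rough sketch of the version bump; the real pipeline does this inside
# Robin's publish.sh, not in Python.
import json

def bump_version(path, part):
    """part is one of "major", "minor" or "point"."""
    with open(path) as f:
        pkg = json.load(f)
    major, minor, point = (int(n) for n in pkg["version"].split("."))
    if part == "major":
        major, minor, point = major + 1, 0, 0
    elif part == "minor":
        minor, point = minor + 1, 0
    else:
        point += 1
    pkg["version"] = "%d.%d.%d" % (major, minor, point)
    with open(path, "w") as f:
        json.dump(pkg, f, indent=2)
    return pkg["version"]

# e.g. bump_version("package.json", "minor")  # 1.0.3 -> 1.1.0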

After less than 30 seconds, our demo site, which showcases framework elements and their usage, is updated. This demo is styled with the latest version of Vanilla, and also serves as documentation and a test of the CSS. We take advantage of GitHub's HTML publishing feature, GitHub Pages. Anyone can grab – or even hotlink – the files on our release page.

The Future

It’d be nice for the regression test (which we currently just eyeball) to be automated, perhaps with a visual diff tool such as PhantomCSS or a bespoke solution with Selenium.

Wrap-up

Vanilla is ready to hack on, go get it here and tell us what you think! (And yes, you can get it in colours other than Ubuntu Orange)

Read more
facundo


As almost always when I go to Córdoba, I travelled there and back by bus (I generally sleep reasonably well on the bus, so I make use of the night and don't lose half a day "just travelling"). This time, though, I went a day early because I had an errand to run in Córdoba Capital, so on Thursday I stayed at Nati and Matías's place, working during the day and playing board games at night.

On Friday morning we made the trip to La Serranita with the guys. We arrived mid-morning, and then came the usual, much-anticipated moment: greeting and hugging old friends you don't get to see very often (plus four or five new people you haven't met yet :) ).

The freshly gathered group, discussing the proposals

How the activities ended up being scheduled


The venue was great. I could perhaps complain that the main room was a bit cramped and that meals were at a hostel four blocks away, but everything else was more than fine. Not just the facilities (rooms, park, barbecue shed, etc.), but the people looking after us too. A treat.

We even had good internet at this PyCamp, since we were on the open community network that Nico Echaniz and friends set up in La Quintana and the surrounding towns. That said, we noticed that whenever we had connectivity problems the culprit was the router we were using (even though we ended up installing a very good one). We decided we should invest in somewhat more professional hardware (not "good home-grade" gear, but "professional" gear)... we'll see what it costs, but I think we'll spend some money there, since it will serve not only PyCamp but other events as well.

A terrace two levels above the work room

A small park on one of the levels

On the projects-and-Python front: the usual... a thousand things happen, from working on stable projects to going wild on brand-new ideas; things started long ago get improved, features get added, some projects are started and finished within those four days, others are started and then live on for a long time, etc... but the most important part is not there.

The core of the event is human. Talking with people you have known forever, bouncing around ideas and new projects, or simply chatting. And meeting new people. Folks doing crazy things or not, with cool jobs or not, with interesting lives or not. But talking, sharing time, seeing how other people approach a project, what they contribute, how to help them, how to pass on experience.

Programming in Python is almost an excuse for everything else to happen. And I say "almost" because, yes, of course, what gets programmed and built is great too :D

In the dining room, having lunch

In the main work room (it wasn't big, but it wasn't the only one)

On that front, I mainly worked on two projects. One was the filesync server, recently released as open source, with a very large change that I started that same Thursday at Nati's place and continued intermittently over the four days of PyCamp.

The other project I put a lot of time into is fades, which I develop mainly with Nico. A lot of people who liked what fades offers got involved and contributed a ton of great ideas. And not just ideas! Also code: branches we have merged or still have to review. We'll land it all eventually, and we want to make a release in the next few weeks. Stay tuned, because fades now does things that will blow your mind :D

But I didn't only work on that. I also ported Tritcask to run on both Python 2 and Python 3 (I started this on my own, but 70% of the work was done together with SAn). We looked into how to detect how much of a subtitle file matches the voices in a video, so we can tell whether it is properly synchronised or not. I wrote some asynchronous code using asyncio. I talked with SAn, DiegoM, Bruno, and Nico Echaniz about a kind of Federated Repository of Educational Content. And I helped people work on Python itself during a short Python Bug Day (Jairo solved an issue and a half!!).

On the way to the river

Walking along the riverbank, jumping from rock to rock

The best asado of any PyCamp, ever

And I sunbathed. And held a real sword for the first time. And walked along the river jumping from rock to rock. And ate an amazing asado (among the hundred or so kilos of food each of us put away per day). And got to know La Serranita. And talked endlessly. And tried a virtual-reality system. And played lots of board games.

And hugged friends.

Read more
Daniel Holbach

In the flurry of uploads for the C++ ABI transition and other frantic work (Thursday is Feature Freeze day), this gem may have gone unnoticed:

snapcraft (0.1) wily; urgency=low

  * Initial release

What does this mean? If you’re on wily, you can easily try out snapcraft and get started turning software into snaps. We have some initial docs available on the developer site which should help you find your way around.

This is a 0.1 release, so there are bugs and there might be bigger changes coming your way, but there will also be more docs, more plugins and more good stuff in general. If you’re curious, you might want to sign up for the daily build (just add the ppa:snappy-dev/snapcraft-daily PPA).

Here’s a brilliant example of what snapcraft can do for you: packaging a Java app was never this easy.

If you’re more into client apps, check out Ted’s article on how to create a QML snap.

As you can easily see: the future is on its way, and upstreams and app developers will have a much easier time sharing their software.

As I said above: snapcraft is still a 0.1 release. If you want to let us know your feedback and find bugs or propose merges, you can find snapcraft in Launchpad.

Read more
Robin Winslow


I recently tried to set up OpenID for one of our sites to support authentication with login.ubuntu.com, and it took me much longer than I’d anticipated because our site is behind a reverse proxy.

My problem

I was trying to set up OpenID with the django-openid-auth plugin. Normally our sites don’t include absolute links (https://example.com/hello-world) back to themselves, because relative URLs (/hello-world) work perfectly well, so normally Django doesn’t need to know the domain name that it’s hosted at.

However, when authenticating with OpenID, our website needs to send the user off to login.ubuntu.com with a callback URL so that, once they’re successfully authenticated, they can be directed back to our site. This means that django-openid-auth needs to ask Django for an absolute URL to send off to the authenticator (e.g. https://example.com/openid/complete).

The problem with proxies

In our setup, the Django app is served with a light Gunicorn server behind an Apache front-end which handles HTTPS negotiation:

User <-> Apache <-> Gunicorn (Django)

(There’s actually an additional HAProxy load-balancer in between, which I thought was complicating matters, but it turns out HAProxy was just passing through requests absolutely untouched and so was irrelevant to the problem.)

Apache was set up as a reverse proxy to Django, meaning that the user only ever talks to Apache; Apache then goes off to get the response from Django itself, via Django’s local network IP address – e.g. 10.0.0.3.

It turns out this is the problem. Because Apache, and not the user directly, is making the request to Django, Django sees the request come in at http://10.0.0.3/openid/login rather than https://example.com/openid/login. This meant that django-openid-auth was generating and sending the wrong callback URL of http://10.0.0.3/openid/complete to login.ubuntu.com.

How Django generates absolute URLs

django-openid-auth uses HttpRequest.build_absolute_uri which in turn uses HttpRequest.get_host to retrieve the domain. get_host then normally uses the HTTP_HOST header to generate the URL, or if it doesn’t exist, it uses the request URL (e.g.: http://10.0.0.3/openid/login).

However, after inspecting the code for get_host I discovered that if and only if settings.USE_X_FORWARDED_HOST is True then Django will look for the X-Forwarded-Host header first to generate this URL. This is the key to the solution.

Solving the problem – Apache

In our Apache config, we were initially using mod_rewrite to forward requests to Django.

RewriteEngine On
RewriteRule ^/?(.*)$ http://10.0.0.3/$1 [P,L]

However, when proxying with this method Apache2 doesn’t send the X-Forwarded-Host header that we need. So we changed it to use mod_proxy:

ProxyPass / http://10.0.0.3/
ProxyPassReverse / http://10.0.0.3/

This then means that Apache will send three headers to Django: X-Forwarded-For, X-Forwarded-Host and X-Forwarded-Server, which will contain the information for the original request.

In our case the Apache frontend used the HTTPS protocol, whereas Django was only using HTTP, so we had to pass that through as well by manually setting Apache to send an X-Forwarded-Proto header to Django. Our eventual config changes looked like this:

<VirtualHost *:443>
    ...
    RequestHeader set X-Forwarded-Proto 'https' env=HTTPS

    ProxyPass / http://10.0.0.3/
    ProxyPassReverse / http://10.0.0.3/
    ...
</VirtualHost>

This meant that Apache now passes through all the information Django needs to properly build absolute URLs; we just needed to make Django parse it properly.

Solving the problem – Django

By default, Django ignores all X-Forwarded headers. As mentioned earlier, you can set get_host to read the X-Forwarded-Host header by setting USE_X_FORWARDED_HOST = True, but we also needed one more setting to get HTTPS to work. These are the settings we added to our Django settings.py:

# Setup support for proxy headers
USE_X_FORWARDED_HOST = True
SECURE_PROXY_SSL_HEADER = ('HTTP_X_FORWARDED_PROTO', 'https')

After changing all these settings, we now have Apache passing all the relevant information (X-Forwarded-Host, X-Forwarded-Proto) so that Django is now able to successfully generate absolute URLs, and django-openid-auth now works a charm.
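
As a quick sanity check, a throwaway view like the following (our own debugging aid, not part of django-openid-auth) can confirm what Django now sees behind the proxy:

# Throwaway debug view: shows the host and scheme Django sees, and the
# absolute callback URL it would therefore build.
from django.http import HttpResponse

def debug_host(request):
    return HttpResponse(
        "get_host: %s\nis_secure: %s\ncallback: %s" % (
            request.get_host(),
            request.is_secure(),
            request.build_absolute_uri("/openid/complete"),
        ),
        content_type="text/plain",
    )

With the settings above in place, this should report the public domain and https rather than the backend's internal address.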

Read more
Prakash

Taiwanese firm Foxconn’s decision to invest a whopping USD 5 billion in India has caused unease in China, as it makes Foxconn the first top international firm to opt for India amid a slowdown in the Chinese economy.

“Foxconn chooses India over China for new plant,” read the headline in state-run china.org.cn while carrying the news of the Taiwanese electronic giant signing up to set up a big plant in Maharashtra.

“Foxconn’s latest India investment represents the leading electronic product maker’s intention to profit from the world’s fastest expanding market of smartphones. Foxconn, famous for making parts for Apple, will reportedly produce Xiaomi phones in the new factory, a rumour that Foxconn authorities did not clarify or comment,” it said.

Read More: http://www.financialexpress.com/article/industry/companies/foxconn-shift-to-india-causes-concerns-in-china/116976/

Read more
Michi Henning

A Fast Thumbnailer for Ubuntu

Over the past few months, James Henstridge, Xavi Garcia Mena, and I have implemented a fast and scalable thumbnailing service for Ubuntu and Ubuntu Touch. This post explains how we did it, and how we achieved our performance and reliability goals.

Introduction

On a phone as well as the desktop, applications need to display image thumbnails for various media, such as photos, songs, and videos. Creating thumbnails for such media is CPU-intensive and can be costly in bandwidth if images are retrieved over the network. In addition, different types of media require the use of different APIs that are non-trivial to learn. It makes sense to provide thumbnail creation as a platform API that hides this complexity from application developers and, to improve performance, to cache thumbnails on disk.

This article explains the requirements we had and how we implemented a thumbnailer service that is extremely fast and scalable, and robust in the face of power loss or crashes.

Requirements

We had a number of requirements we wanted to meet in our implementation.

  • Robustness
    In the event of a crash, the implementation must guarantee the integrity of on-disk data structures. This is particularly important on a phone, where we cannot expect the user to perform manual recovery (such as cleaning up damaged files). Because batteries can run out at any time, integrity must be guaranteed even in the face of power loss.
  • Scalability
    It is common for people to store many thousands of songs and photos on a device, so the cache must scale to at least tens of thousands of records. Thumbnails can range in size from a few kilobytes to well over a megabyte (for “thumbnails” at full-screen resolution), so the cache must deal efficiently with large records.
  • Re-usability
    Persistent and reliable on-disk storage of arbitrary records (ranging in size from a few bytes to potentially megabytes) is a common application requirement, so we did not want to create a cache implementation that is specific to thumbnails. Instead, the disk cache is provided as a stand-alone C++ API that can be used for any number of other purposes, such as a browser or HTTP cache, or to build an object file cache similar to ccache.
  • High performance
    The performance of the thumbnailer directly affects the user experience: it is not nice for the customer to look at “please wait a while” icons in, say, an image gallery while thumbnails are being loaded one by one. We therefore had to have a high-performance implementation that delivers cached thumbnails quickly (on the order of a millisecond per thumbnail on an Arm CPU). An efficient implementation also helps to conserve battery life.
  • Location independence and extensibility
    Canonical runs an image server at dash.ubuntu.com that provides album and artist artwork for many musicians and bands. Images from this server are used to display artwork in the music player for media that contains ID3 tags, but does not embed artwork in the media file. The thumbnailer must work with embedded images as well as remote images, and it must be possible to extend it for new types of media without unduly disturbing the existing code.
  • Low bandwidth consumption
    Mobile phones typically come with data caps, so the cache has to be frugal with network bandwidth.
  • Concurrency and isolation
    The implementation has to allow concurrent access by multiple applications, as well as concurrent access from a single implementation. Besides needing to be thread-safe, this means that a request for a thumbnail that is slow (such as downloading an image over the network) must not delay other requests.
  • Fault tolerance
    Mobile devices lose network access without warning, and users can add corrupt media files to their device. The implementation must be resilient to partial failures, such as incomplete network replies, dropped connections, and bad image data. Moreover, the recovery strategy for such failures must conserve battery and avoid repeated futile attempts to create thumbnails from media that cannot be retrieved or contains malformed data.
  • Security
    The implementation must ensure that applications cannot see (or, worse, overwrite) each other’s thumbnails or coerce the thumbnailer into delivering images from files that an application is not allowed to read.
  • Asynchronous API
    The customers of the thumbnailer are applications that are written in QML or Qt, which cannot block in the UI thread. The thumbnailer therefore must provide a non-blocking API. Moreover, the application developer should be able to get the best possible performance without having to use threads. Instead, concurrency must be internal to the implementation (which is able to put threads to use intelligently where they make sense), instead of the application throwing threads at the problem in the hope that it might make things faster when, in fact, it might just add overhead.
  • Monitoring
    The effectiveness of a cache cannot be assessed without statistics to show hit and miss rates, evictions, and other basic performance data, so it must provide a way to extract this information.
  • Error reporting
    When something goes wrong with a system service, typically the only way to learn about the problem is to look at log messages. In case of a failure, the implementation must leave enough footprints behind to allow someone to diagnose a failure after the fact with some chance of success.
  • Backward compatibility
    This project was a rewrite of an earlier implementation. Rather than delivering a “big bang” piece of software and potentially upsetting existing clients, we incrementally changed the implementation such that existing applications continued to work. (The only pre-existing interface was a QML interface that required no change.)

System architecture

Here is a high-level overview of the main system components.

External API

To the outside world, the thumbnailer provides two APIs.

One API is a QML plugin that registers itself as an image provider for QQuickAsyncImageProvider. This allows the caller to pass a URI that encodes a query for a local or remote thumbnail at a particular size; if the URI matches the registered provider, QML transfers control to the entry points in our plugin.

The second API is a Qt API that provides three methods:

QSharedPointer<Request> getThumbnail(QString const& filePath,
                                     QSize const& requestedSize);
QSharedPointer<Request> getAlbumArt(QString const& artist,
                                    QString const& album,
                                    QSize const& requestedSize);
QSharedPointer<Request> getArtistArt(QString const& artist,
                                     QString const& album,
                                     QSize const& requestedSize);

The getThumbnail() method extracts thumbnails from local media files, whereas getAlbumArt() and getArtistArt() retrieve artwork from the remote image server. The returned Request object provides a finished signal, and methods to test for success or failure of the request and to extract a thumbnail as a QImage. The request also provides a waitForFinished() method, so the API can be used synchronously.

Thumbnails are delivered to the caller in the size they are requested, subject to a (configurable) 1920-pixel limit. As an escape hatch, requests with width and height of zero deliver artwork at its original size, even if it exceeds the 1920-pixel limit. The scaling algorithm preserves the original aspect ratio and never scales up from the original, so the returned thumbnails may be smaller than their requested size.

DBus service

The thumbnailer is implemented as a DBus service with two interfaces. The first interface provides the server-side implementation of the three methods of the external API; the second interface is an administrative interface that can deliver statistics, clear the internal disk caches, and shut down the service. A simple tool, thumbnailer-admin, allows both interfaces to be called from the command line.

To conserve resources, the service is started on demand by DBus and shuts down after 30 seconds of idle time.

Image extraction

Image extraction uses an abstract base class. This interface is independent of media location and type. The actual image extraction is performed by derived implementations that download images from the remote server, extract them from local image files, or extract them from local streaming media files. This keeps knowledge of image location and encoding out of the main caching and error handling logic, and allows us to support new media types (whether local or remote) by simply adding extra derived implementations.

Image extraction is asynchronous, with currently three implementations:

  • Image downloader
    To retrieve artwork from the remote image server, the service talks to an abstract base class with asynchronous download_album() and download_artist() methods. This allows multiple downloads to run concurrently and makes it easy to add new local or remote image providers without disturbing the code for existing ones. A class derived from that abstract base implements a REST API with QNetworkAccessManager to retrieve images from dash.ubuntu.com.
  • Photo extractor
    The photo extractor is responsible for delivering images from local image files, such as JPEG or PNG files. It simply delegates that work to the image converter and scaler.
  • Audio and video thumbnail extractor
    To extract thumbnails from audio and video files, we use GStreamer. Due to reliability problems with some codecs that can hang or crash, we delegate the task to a separate vs-thumb executable. This shields the service from failures and also allows us to run several GStreamer pipelines concurrently without a crash of one pipeline affecting the others.

Image converter and scaler

We use a simple Image class with a synchronous interface to convert and scale different image formats to JPEG. The implementation uses Gdk-Pixbuf, which can handle many different input formats and is very efficient.

For JPEG source images, the code checks for the presence of EXIF data using libexif and, if it contains a thumbnail that is at least as large as the requested size, scales the thumbnail from the EXIF data. (For images taken with the camera on a Nexus 4, the original image size is 3264×1836, with an embedded EXIF thumbnail of 512×288. Scaling from the EXIF thumbnail is around one hundred times faster than scaling from the full-size image.)
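
As a rough illustration of that shortcut, here is the same idea sketched in Python with piexif and Pillow; the service itself uses libexif and Gdk-Pixbuf, so this is not its actual code:

# Sketch only: prefer the embedded EXIF thumbnail when it is large enough,
# otherwise fall back to decoding and scaling the full-size image.
import io
import piexif
from PIL import Image

def jpeg_thumbnail(path, requested):
    """Return a thumbnail no larger than `requested` (width, height)."""
    embedded = piexif.load(path).get("thumbnail")
    if embedded:
        candidate = Image.open(io.BytesIO(embedded))
        if candidate.width >= requested[0] and candidate.height >= requested[1]:
            candidate.thumbnail(requested)   # scale down, preserving aspect ratio
            return candidate
    full = Image.open(path)                  # fall back to the full-size image
    full.thumbnail(requested)
    return full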

Disk cache

The thumbnailer service optimizes performance and conserves bandwidth and battery by adopting a layered caching strategy.

Two-level caching with failure lookup

Internally, the service uses three separate on-disk caches:

  • Full-size cache
    This cache stores images that are expensive to retrieve (images that are remote or are embedded in audio and video files) at original resolution (scaled down to a 1920-pixel bounding box if the original image is larger). The default size of this cache is 50 MB, which is sufficient to hold around 400 images at 1920×1080 resolution. Images are stored in JPEG format (at a 90% quality setting).
  • Thumbnail cache
    This cache stores thumbnails at the size that was requested by the caller, such as 512×288. The default size of this cache is 100 MB, which is sufficient to store around 11,000 thumbnails at 512×288, or around 25,000 thumbnails at 256×144.
  • Failure cache
    The failure cache stores the keys for images that could not be extracted because of a failure. For remote images, this means that the server returned an authoritative answer “no such image exists”, or that we encountered an unexpected (non-authoritative) failure, such as the server not responding or a DNS lookup timing out. For local images, it means either that the image data could not be processed because it is damaged, or that an audio file does not contain embedded artwork.

The full-size cache exists because it is likely that an application will request thumbnails at different sizes for the same image. For example, when scrolling through a list of songs that shows a small thumbnail of the album cover beside each song, the user is likely to select one of the songs to play, at which point the media player will display the same cover in a larger size. By keeping full-size images in a separate (smallish) cache, we avoid performing an expensive extraction or download a second time. Instead, we create additional thumbnails by scaling them from the full-size cache (which uses an LRU eviction policy).

The thumbnail cache stores thumbnails that were previously retrieved, also using LRU eviction. Thumbnails are stored as JPEG at the default quality setting of 75%, at the actual size that was requested by the caller. Storing JPEG images (rather than, say, PNG) saves space and increases cache effectiveness. (The minimal quality loss from compression is irrelevant for thumbnails). Because we store thumbnails at the size they are actually needed, we may have several thumbnails for the same image in the cache (each thumbnail at a different size). But applications typically ask for thumbnails in only a small number of sizes, and ask for different sizes for the same image only rarely. So, the slight increase in disk space is minor and amply repaid by applications not having to scale thumbnails after they receive them from the cache, which saves battery and achieves better performance overall.

Finally, the failure cache is used to stop futile attempts to repeatedly extract a thumbnail when we know that the attempt will fail. It uses LRU eviction with an expiry time for each entry.

Cache lookup algorithm

When asked for a thumbnail at a particular size, the lookup and thumbnail generation proceed as follows:

  1. Check if a thumbnail exists in the requested size in the thumbnail cache. If so, return it.
  2. Check if a full-size image for the thumbnail exists in the full-size cache. If so, scale the new thumbnail from the full-size image, add the thumbnail to the thumbnail cache, and return it.
  3. Check if there is an entry for the thumbnail in the failure cache. If so, return an error.
  4. Attempt to download or extract the original image for the thumbnail. If the attempt fails, add an entry to the failure cache and return an error.
  5. If the original image was delivered by the remote server or was extracted locally from streaming media, add it to the full-size cache.
  6. Scale the thumbnail to the desired size, add it to the thumbnail cache, and return it.

Note that these steps represent only the logical flow of control for a particular thumbnail. The implementation executes these steps concurrently for different thumbnails.
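
For illustration, here is a hedged Python sketch of that per-thumbnail flow; the cache objects and the scale() and extract_original() helpers are placeholders rather than the service's actual (C++) API:

# Logical lookup flow for one thumbnail request (numbers match the steps above).
class ThumbnailError(Exception):
    """No thumbnail could be produced for the given key."""

def lookup(key, size, thumb_cache, full_cache, failure_cache,
           extract_original, scale):
    thumb = thumb_cache.get((key, size))
    if thumb is not None:                      # 1. thumbnail cache hit
        return thumb

    full = full_cache.get(key)
    if full is not None:                       # 2. scale from the full-size image
        thumb = scale(full, size)
        thumb_cache[(key, size)] = thumb
        return thumb

    if key in failure_cache:                   # 3. known failure
        raise ThumbnailError(key)

    result = extract_original(key)             # 4. download or extract the original
    if result is None:
        failure_cache.add(key)
        raise ThumbnailError(key)
    original, expensive = result

    if expensive:                              # 5. remote image or embedded artwork
        full_cache[key] = original

    thumb = scale(original, size)              # 6. scale, cache, and return
    thumb_cache[(key, size)] = thumb
    return thumb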

Designing for performance

Apart from fast on-disk caches (see below), the thumbnailer must make efficient use of I/O bandwidth and threads. This not only means making things fast, but also not needlessly wasting resources such as threads, memory, network connections, or file descriptors. Provided that enough requests are made to keep the service busy, we do not want it to ever wait for a download or image extraction to complete while there is something else that could be done in the meantime, and we want it to keep all CPU cores busy. In addition, requests that are slow (because they require a download or a CPU-intensive image extraction) must not block requests that are queued up behind them if those requests would result in cache hits that could be returned immediately.

To achieve a high degree of concurrency without blocking on long-running operations while holding precious resources, the thumbnailer uses a three-phase lookup algorithm:

  1. In phase 1, we look at the caches to determine if we have a hit or an authoritative miss. Phase 1 is very fast. (It takes around a millisecond to return a thumbnail from the cache on a Nexus 4.) However, cache lookup can briefly stall on disk I/O or require a lot of CPU to extract and scale an image. To get good performance, phase 1 requests are passed to a thread pool with as many threads as there are CPU cores. This allows the maximum number of lookups to proceed concurrently.
  2. Phase 2 is initiated if phase 1 determines that a thumbnail requires download or extraction, either of which can take on the order of seconds. (In case of extraction from local media, the task is CPU intensive; in case of download, most of the time is spent waiting for the reply from the server.) This phase is scheduled asynchronously from an event loop. This minimizes task switching and allows large numbers of requests to be queued while only using a few bytes for each request that is waiting in the queue.
  3. Phase 3 is really a repeat of phase 1: if phase 2 produces a thumbnail, it adds it to the cache; if phase 2 does not produce a thumbnail, it creates an entry in the failure cache. By simply repeating phase 1, the lookup then results in either a thumbnail or an error.

If phase 2 determines that a download or extraction is required, that work is performed concurrently: the service schedules several downloads and extractions in parallel. By default, it will run up to two concurrent downloads, and as many concurrent GStreamer pipelines as there are CPUs. This ensures that we use all of the available CPU cores. Moreover, download and extraction run concurrently with lookups for phase 1 and 3. This means that, even if a cache lookup briefly stalls on I/O, there is a good chance that another thread can make use of the CPU.

Because slow operations do not block lookup, this also ensures that a slow request does not stall requests for thumbnails that are already in the cache. In other words, it does not matter how many slow requests are in progress: requests that can be completed quickly are indeed completed quickly, regardless of what is going on elsewhere.

Overall, this strategy works very well. For example, with sufficient workload, the service achieves around 750% CPU utilization on an 8-core desktop machine, while still delivering cache hits almost instantaneously. (On a Nexus 4, cache hits take a little over 1 ms while concurrent extractions or downloads are in progress.)

A re-usable persistent cache for C++

The three internal caches are implemented by a small and flexible C++ API. This API is available as a separate reusable PersistentStringCache component (see persistent-cache-cpp) that provides a persistent store of arbitrary key–value pairs. Keys and values can be binary, and entries can be large. (Megabyte-sized values do not present a problem.)

The implementation uses leveldb, which provides a very fast NoSQL database that scales to multi-gigabyte sizes and provides integrity guarantees. In particular, if the calling process crashes, all inserts that completed at the API level will be intact after a restart. (In case of a power failure or kernel crash, a few buffered inserts can be lost, but the integrity of the database is still guaranteed.)

To use a cache, the caller instantiates it with a path name, a maximum size, and an eviction policy. The eviction policy can be set to either strict LRU (least-recently-used) or LRU with an expiry time. Once a cache reaches its maximum size, expired entries (if any) are evicted first and, if that does not free enough space for a new entry, entries are discarded in least-recently-used order until enough room is available to insert a new record. (In all other respects, expired entries behave like entries that were never added.)

A simple get/put API allows records to be retrieved and added, for example:

auto c = core::PersistentStringCache::open(
    "my_cache", 100 * 1024 * 1024, core::CacheDiscardPolicy::lru_only);
// Look for an entry and add it if there is a cache miss.
string key = "Bjarne";
auto value = c->get(key);
if (value) {
    cout << key << ": " << *value << endl;
} else {
    value = "C++ inventor";  // Provide a value for the key.
    c->put(key, *value);     // Insert it.
}

Running this program prints nothing on the first run, and “Bjarne: C++ inventor” on all subsequent runs.

The API also allows application-specific metadata to be added to records, provides detailed statistics, supports dynamic resizing of caches, and offers a simple adapter template that makes it easy to store complex user-defined types without the need to clutter the code with explicit serialization and deserialization calls. (In a pinch, if iteration is not needed, the cache can be used as a persistent map by setting an impossibly large cache size, in which case no records are ever evicted.)

Performance

Our benchmarks indicate good performance. (Figures are for an Intel Ivy Bridge i7-3770k 3.5 GHz machine with a 256 GB SSD.) Our test uses 60-byte string keys. Values are binary blobs filled with random data (so they are not compressible), 20 kB in size with a standard deviation of 7,000, so the majority of values are 13–27 kB in size. The cache size is 100 MB, so it contains around 5,000 records.

Filling the cache with 100 MB of records takes around 2.8 seconds. Thereafter, the benchmark does a random lookup with an 80% hit probability. In case of a cache miss, it inserts a new random record, evicting old records in LRU order to make room for the new one. For 100,000 iterations, the cache returns around 4,800 “thumbnails” per second, with an aggregate read/write throughput of around 93 MB/sec. At 90% hit rate, we see twice the performance at around 7,100 records/sec. (Writes are expensive once the cache is full due to the need to evict entries, which requires updating the main cache table as well as an index.)

Repeating the test with a 1 GB cache produces identical timings so (within limits) performance remains constant for large databases.

Overall, performance is restricted largely by the bandwidth to disk. With a 7,200 rpm disk, we measured around one third of the SSD performance.

Recovering from errors

The overall design of the thumbnailer delivers good performance when things work. However, our implementation has to deal with the unexpected, such as network requests that do not return responses, GStreamer pipelines that crash, request overload, and so on. What follows is a partial list of steps we took to ensure that things behave sensibly, particularly on a battery-powered device.

Retry strategy

The failure cache provides an effective way to stop the service from endlessly trying to create thumbnails that, in an earlier attempt, returned an error.

For remote images, we know that, if the server has (authoritatively) told us that it has no artwork for a particular artist or album, it is unlikely that artwork will appear any time soon. However, the server may be updated with more artwork periodically. To deal with this, we add an expiry time of one week to the entries in the failure cache. That way, we do not try to retrieve the same image again until at least one week has passed (and only if we receive a request for a thumbnail for that image again later).

As opposed to authoritative answers from the image server (“I do not have artwork for this artist.”), we can also encounter transient failures. For example, the server may currently be down, or there may be some other network-related issue. In this case, we remember the time of the failure and do not try to contact the remote server again for two hours. This conserves bandwidth and battery power.

The device may also be disconnected from the network, in which case any attempt to retrieve a remote image is doomed. Our implementation returns failure immediately on a cache miss for a remote image if no network is present or the device is in flight mode. (We do not add an entry to the failure cache in this case.)

For local files, we know that, if an attempt to get a thumbnail for a particular file has failed, future attempts will fail as well. This means that the only way for the problem to get fixed is by modifying or replacing the actual media file. To deal with this, we add the inode number, modification time, and inode modification time to the key for local images. If a user replaces, say, a music file with a new one that contains artwork, we automatically pick up the new version of the file because its key has changed; the old version will eventually fall out of the cache.
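
A minimal sketch of such a key, assuming the fields named above (the service's real key format is an implementation detail):

# Build a cache key that changes whenever the local file is modified or
# replaced: path plus inode number, modification time, and inode change time.
import os

def local_cache_key(path):
    st = os.stat(path)
    return "%s:%d:%d:%d" % (path, st.st_ino, int(st.st_mtime), int(st.st_ctime))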

Download and extraction failures

We monitor downloads and extractions for timely completion. (Timeouts for downloads and extractions can be configured separately.) If the server does not respond within 10 seconds, we abandon the attempt and treat it as a transient network error. Similarly, the vs-thumb processes that extract images from audio and video files can hang. We monitor these processes and kill them if they do not produce a result within 10 seconds.
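
For illustration, this is roughly how one might cap a helper process at 10 seconds in Python; the service does the equivalent in C++ around its vs-thumb processes:

# Run an extraction helper with a hard 10-second cap; a hang or a crash is
# treated like any other extraction failure.
import subprocess

def run_extractor(cmd, timeout=10):
    try:
        return subprocess.run(cmd, capture_output=True, timeout=timeout, check=True)
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
        return None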

Database corruption

Assuming an error-free implementation of leveldb, database corruption is impossible. However, in practice, an errant command could scribble over the database files. If leveldb detects that the database is corrupted, the recovery strategy is simple: we delete the on-disk cache and start again from scratch. Because the cache contents are ephemeral anyway, this is fine (other than slower operation until the working set of thumbnails makes it into the cache again).

Dealing with backlog

The asynchronous API provided by the service allows an application to submit an unlimited number of requests. Lots of requests happen if, for example, the user has inserted a flash card with thousands of photos into the device and then requests a gallery view for the collection. If the service’s client-side API blindly forwards requests via DBus, this causes a problem because DBus terminates the connection once there are more than around 400 outstanding requests.

To deal with this, we limit the number of outstanding requests to 200 and send another request via DBus only when an earlier request completes. Additional requests are queued in memory. Because this happens on the client side, the number of outstanding requests is limited only by the amount of memory that is available to the client.
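
The throttling amounts to a counting semaphore on the client side; a small sketch (illustrative only, the real limiting lives in the thumbnailer's client library):

# At most 200 requests are on the bus at once; the rest simply wait in the
# client's memory until a slot frees up.
import asyncio

MAX_OUTSTANDING = 200
_slots = asyncio.Semaphore(MAX_OUTSTANDING)

async def submit(request, send_over_dbus):
    async with _slots:
        return await send_over_dbus(request)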

A related problem arises if a client submits many requests for a thumbnail for the same image. This happens when, for example, the user looks at a list of tracks: tracks that belong to the same album have the same artwork. If artwork needs to be retrieved from the remote server, naively forwarding cache misses for each thumbnail to the server would end up re-downloading the same image several times.

We deal with this by maintaining an in-memory map of all remote download requests that are currently in progress. If phase 1 reports a cache miss, before initiating a download, we add the key for the remote image to the map and remove it again once the download completes. If more requests for the same image encounter a cache miss while the download for the original request is still in progress, the key for the in-progress download is still in the map, and we hold additional requests for the same image until the download completes. We then schedule the held requests as usual and create their thumbnails from the image that was cached by the first request.
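
The same idea, sketched with asyncio (illustrative; the service implements this in C++): the first miss starts the download, and later misses for the same key await the same future.

# Coalesce concurrent cache misses for the same remote image into one download.
import asyncio

_in_progress = {}                      # image key -> Future for its download

async def fetch_remote(key, download):
    fut = _in_progress.get(key)
    if fut is None:                    # first miss: start the download
        fut = asyncio.ensure_future(download(key))
        _in_progress[key] = fut
        fut.add_done_callback(lambda _: _in_progress.pop(key, None))
    return await fut                   # later misses share the same result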

Security

The thumbnailer runs with normal user privileges. We use AppArmor’s aa_query_label() function to verify that the calling client has read access to a file it wants a thumbnail for. This prevents one application from accessing thumbnails produced by a different application, unless both applications can read the original file. In addition, we place the entire service under an AppArmor profile to ensure that it can write only to its own cache directory.

Conclusion

Overall, we are very pleased with the design and performance of the thumbnailer. Each component has a clearly defined role with a clean interface, which made it easy for us to experiment and to refine the design as we went along. The design is extensible, so we can support additional media types or remote data sources without disturbing the existing code.

We used threads sparingly and only where we saw worthwhile concurrency opportunities. Using asynchronous interfaces for long-running operations kept resource usage to a minimum and allowed us to take advantage of I/O interleaving. In turn, this extracts the best possible performance from the hardware.

The thumbnailer now runs on Ubuntu Touch and is used by the gallery, camera, and music apps, as well as for all scopes that display media thumbnails.

This article was originally published on Michi Henning's blog.

Read more
Prakash

If digital camera technology had been open sourced, would it have reached consumers earlier? If it weren't for patents and Kodak hiding the technology, the world might look different today.

In 1975, this Kodak employee invented the digital camera. His bosses made him hide it.

Read More: http://www.brw.com.au/p/tech-gadgets/made_this_kodak_employee_invented_QnYp4iCrFXYwagdCRzszeP

Read more
David Owen

What happens if you apply a technique from stock-analysis to software projects?

Continue reading "Stock-analysis applied to software projects"

Read more
Michi

Over the past few months, James Henstridge, Xavi Garcia Mena, and I have implemented a fast and scalable thumbnailing service for Ubuntu and Ubuntu Touch. This post explains how we did it, and how we achieved our performance and reliability goals.

Introduction

On a phone as well as the desktop, applications need to display image thumbnails for various media, such as photos, songs, and videos. Creating thumbnails for such media is CPU-intensive and can be costly in bandwidth if images are retrieved over the network. In addition, different types of media require the use of different APIs that are non-trivial to learn. It makes sense to provide thumbnail creation as a platform API that hides this complexity from application developers and, to improve performance, to cache thumbnails on disk.

This article explains the requirements we had and how we implemented a thumbnailer service that is extremely fast and scalable, and robust in the face of power loss or crashes.

Requirements

We had a number of requirements we wanted to meet in our implementation.

  • Robustness
    In the event of a crash, the implementation must guarantee the integrity of on-disk data structures. This is particularly important on a phone, where we cannot expect the user to perform manual recovery (such as cleaning up damaged files). Because batteries can run out at any time, integrity must be guaranteed even in the face of power loss.
  • Scalability
    It is common for people to store many thousands of songs and photos on a device, so the cache must scale to at least tens of thousands of records. Thumbnails can range in size from a few kilobytes to well over a megabyte (for “thumbnails” at full-screen resolution), so the cache must deal efficiently with large records.
  • Re-usability
    Persistent and reliable on-disk storage of arbitrary records (ranging in size from a few bytes to potentially megabytes) is a common application requirement, so we did not want to create a cache implementation that is specific to thumbnails. Instead, the disk cache is provided as a stand-alone C++ API that can be used for any number of other purposes, such as a browser or HTTP cache, or to build an object file cache similar to ccache.
  • High performance
    The performance of the thumbnailer directly affects the user experience: it is not nice for the customer to look at “please wait a while” icons in, say, an image gallery while thumbnails are being loaded one by one. We therefore had to have a high-performance implementation that delivers cached thumbnails quickly (on the order of a millisecond per thumbnail on an Arm CPU). An efficient implementation also helps to conserve battery life.
  • Location independence and extensibility
    Canonical runs an image server at dash.ubuntu.com that provides album and artist artwork for many musicians and bands. Images from this server are used to display artwork in the music player for media that contains ID3 tags, but does not embed artwork in the media file. The thumbnailer must work with embedded images as well as remote images, and it must be possible to extend it for new types of media without unduly disturbing the existing code.
  • Low bandwidth consumption
    Mobile phones typically come with data caps, so the cache has to be frugal with network bandwidth.
  • Concurrency and isolation
    The implementation has to allow concurrent access by multiple applications, as well as concurrent access from a single implementation. Besides needing to be thread-safe, this means that a request for a thumbnail that is slow (such as downloading an image over the network) must not delay other requests.
  • Fault tolerance
    Mobile devices lose network access without warning, and users can add corrupt media files to their device. The implementation must be resilient to partial failures, such as incomplete network replies, dropped connections, and bad image data. Moreover, the recovery strategy for such failures must conserve battery and avoid repeated futile attempts to create thumbnails from media that cannot be retrieved or contains malformed data.
  • Security
    The implementation must ensure that applications cannot see (or, worse, overwrite) each other’s thumbnails or coerce the thumbnailer into delivering images from files that an application is not allowed to read.
  • Asynchronous API
    The customers of the thumbnailer are applications that are written in QML or Qt, which cannot block in the UI thread. The thumbnailer therefore must provide a non-blocking API. Moreover, the application developer should be able to get the best possible performance without having to use threads. Instead, concurrency must be internal to the implementation (which is able to put threads to use intelligently where they make sense), instead of the application throwing threads at the problem in the hope that it might make things faster when, in fact, it might just add overhead.
  • Monitoring
    The effectiveness of a cache cannot be assessed without statistics to show hit and miss rates, evictions, and other basic performance data, so it must provide a way to extract this information.
  • Error reporting
    When something goes wrong with a system service, typically the only way to learn about the problem is to look at log messages. In case of a failure, the implementation must leave enough footprints behind to allow someone to diagnose a failure after the fact with some chance of success.
  • Backward compatibility
    This project was a rewrite of an earlier implementation. Rather than delivering a “big bang” piece of software and potentially upsetting existing clients, we incrementally changed the implementation such that existing applications continued to work. (The only pre-existing interface was a QML interface that required no change.)

System architecture

Here is a high-level overview of the main system components.

A Fast Thumbnailer for UbuntuExternal API

To the outside world, the thumbnailer provides two APIs.

One API is a QML plugin that registers itself as an image provider for QQuickAsyncImageProvider. This allows the caller to to pass a URI that encodes a query for a local or remote thumbnail at a particular size; if the URI matches the registered provider, QML transfers control to the entry points in our plugin.

The second API is a Qt API that provides three methods:

QSharedPointer<Request> getThumbnail(QString const& filePath,
                                     QSize const& requestedSize);
QSharedPointer<Request> getAlbumArt(QString const& artist,
                                    QString const& album,
                                    QSize const& requestedSize);
QSharedPointer<Request> getArtistArt(QString const& artist,
                                     QString const& album,
                                     QSize const& requestedSize);

The getThumbnail() method extracts thumbnails from local media files, whereas getAlbumArt() and getArtistArt() retrieve artwork from the remote image server. The returned Request object provides a finished signal, and methods to test for success or failure of the request and to extract a thumbnail as a QImage. The request also provides a waitForFinished() method, so the API can be used synchronously.

Thumbnails are delivered to the caller in the size they are requested, subject to a (configurable) 1920-pixel limit. As an escape hatch, requests with width and height of zero deliver artwork at its original size, even if it exceeds the 1920-pixel limit. The scaling algorithm preserves the original aspect ratio and never scales up from the original, so the returned thumbnails may be smaller than their requested size.

DBus service

The thumbnailer is implemented as a DBus service with two interfaces. The first interface provides the server-side implementation of the three methods of the external API; the second interface is an administrative interface that can deliver statistics, clear the internal disk caches, and shut down the service. A simple tool, thumbnailer-admin, allows both interfaces to be called from the command line.

To conserve resources, the service is started on demand by DBus and shuts down after 30 seconds of idle time.

Image extraction

Image extraction uses an abstract base class. This interface is independent of media location and type. The actual image extraction is performed by derived implementations that download images from the remote server, extract them from local image files, or extract them from local streaming media files. This keeps knowledge of image location and encoding out of the main caching and error handling logic, and allows us to support new media types (whether local or remote) by simply adding extra derived implementations.

Image extraction is asynchronous, with currently three implementations:

  • Image downloader
    To retrieve artwork from the remote image server, the service talks to an abstract base class with asynchronous download_album() and download_artist() methods. This allows multiple downloads to run concurrently and makes it easy to add new local or remote image providers without disturbing the code for existing ones. A class derived from that abstract base implements a REST API with QNetworkAccessManager to retrieve images from dash.ubuntu.com.
  • Photo extractor
    The photo extractor is responsible for delivering images from local image files, such as JPEG or PNG files. It simply delegates that work to the image converter and scaler.
  • Audio and video thumbnail extractor
    To extract thumbnails from audio and video files, we use GStreamer. Due to reliability problems with some codecs that can hang or crash, we delegate the task to a separate vs-thumb executable. This shields the service from failures and also allows us to run several GStreamer pipelines concurrently without a crash of one pipeline affecting the others.
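
A structural sketch of this arrangement might look as follows; the class and method names are illustrative rather than the thumbnailer's actual types.

#include <functional>
#include <string>
#include <utility>

// Abstract extractor: knows nothing about where the image lives or how it is encoded.
class ImageExtractor
{
public:
    virtual ~ImageExtractor() = default;
    // Starts extraction; exactly one of the callbacks fires when the work completes.
    virtual void extract(std::function<void(std::string image_data)> on_success,
                         std::function<void(std::string error)> on_failure) = 0;
};

// One derived class per media location/type, for example local streaming media.
class StreamingMediaExtractor : public ImageExtractor
{
public:
    explicit StreamingMediaExtractor(std::string path) : path_(std::move(path)) {}
    void extract(std::function<void(std::string)> on_success,
                 std::function<void(std::string)> on_failure) override
    {
        // Run the vs-thumb helper on path_ with a timeout and invoke the
        // appropriate callback when it exits (omitted in this sketch).
        (void)on_success;
        (void)on_failure;
    }
private:
    std::string path_;
};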

Image converter and scaler

We use a simple Image class with a synchronous interface to convert and scale different image formats to JPEG. The implementation uses Gdk-Pixbuf, which can handle many different input formats and is very efficient.

For JPEG source images, the code checks for the presence of EXIF data using libexif and, if it contains a thumbnail that is at least as large as the requested size, scales the thumbnail from the EXIF data. (For images taken with the camera on a Nexus 4, the original image size is 3264×1836, with an embedded EXIF thumbnail of 512×288. Scaling from the EXIF thumbnail is around one hundred times faster than scaling from the full-size image.)
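
A sketch of that fast path, using libexif and Gdk-Pixbuf directly (illustrative; the service's actual code differs in detail):

#include <gdk-pixbuf/gdk-pixbuf.h>
#include <libexif/exif-data.h>

// Returns the embedded EXIF thumbnail if it exists and is large enough,
// otherwise nullptr so the caller falls back to decoding the full image.
GdkPixbuf* thumbnail_from_exif(char const* path, int requested_size)
{
    ExifData* ed = exif_data_new_from_file(path);
    if (!ed || !ed->data || ed->size == 0)
    {
        if (ed) exif_data_unref(ed);
        return nullptr;                               // No embedded thumbnail.
    }
    GdkPixbufLoader* loader = gdk_pixbuf_loader_new();
    gdk_pixbuf_loader_write(loader, ed->data, ed->size, nullptr);
    gdk_pixbuf_loader_close(loader, nullptr);
    GdkPixbuf* thumb = gdk_pixbuf_loader_get_pixbuf(loader);
    if (thumb && gdk_pixbuf_get_width(thumb) >= requested_size)
        g_object_ref(thumb);                          // Large enough: hand it to the scaler.
    else
        thumb = nullptr;                              // Too small: use the full-size image instead.
    g_object_unref(loader);
    exif_data_unref(ed);
    return thumb;
}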

Disk cache

The thumbnailer service optimizes performance and conserves bandwidth and battery by adopting a layered caching strategy.

Two-level caching with failure lookup

Internally, the service uses three separate on-disk caches:

  • Full-size cache
    This cache stores images that are expensive to retrieve (images that are remote or are embedded in audio and video files) at original resolution (scaled down to a 1920-pixel bounding box if the original image is larger). The default size of this cache is 50 MB, which is sufficient to hold around 400 images at 1920×1080 resolution. Images are stored in JPEG format (at a 90% quality setting).
  • Thumbnail cache
    This cache stores thumbnails at the size that was requested by the caller, such as 512×288. The default size of this cache is 100 MB, which is sufficient to store around 11,000 thumbnails at 512×288, or around 25,000 thumbnails at 256×144.
  • Failure cache
    The failure cache stores the keys for images that could not be extracted because of a failure. For remote images, this means that the server returned an authoritative answer “no such image exists”, or that we encountered an unexpected (non-authoritative) failure, such as the server not responding or a DNS lookup timing out. For local images, it means either that the image data could not be processed because it is damaged, or that an audio file does not contain embedded artwork.

The full-size cache exists because it is likely that an application will request thumbnails at different sizes for the same image. For example, when scrolling through a list of songs that shows a small thumbnail of the album cover beside each song, the user is likely to select one of the songs to play, at which point the media player will display the same cover in a larger size. By keeping full-size images in a separate (smallish) cache, we avoid performing an expensive extraction or download a second time. Instead, we create additional thumbnails by scaling them from the full-size cache (which uses an LRU eviction policy).

The thumbnail cache stores thumbnails that were previously retrieved, also using LRU eviction. Thumbnails are stored as JPEG at the default quality setting of 75%, at the actual size that was requested by the caller. Storing JPEG images (rather than, say, PNG) saves space and increases cache effectiveness. (The minimal quality loss from compression is irrelevant for thumbnails). Because we store thumbnails at the size they are actually needed, we may have several thumbnails for the same image in the cache (each thumbnail at a different size). But applications typically ask for thumbnails in only a small number of sizes, and ask for different sizes for the same image only rarely. So, the slight increase in disk space is minor and amply repaid by applications not having to scale thumbnails after they receive them from the cache, which saves battery and achieves better performance overall.

Finally, the failure cache is used to stop futile attempts to repeatedly extract a thumbnail when we know that the attempt will fail. It uses LRU eviction with an expiry time for each entry.

Cache lookup algorithm

When asked for a thumbnail at a particular size, the lookup and thumbnail generation proceed as follows:

  1. Check if a thumbnail exists in the requested size in the thumbnail cache. If so, return it.
  2. Check if a full-size image for the thumbnail exists in the full-size cache. If so, scale the new thumbnail from the full-size image, add the thumbnail to the thumbnail cache, and return it.
  3. Check if there is an entry for the thumbnail in the failure cache. If so, return an error.
  4. Attempt to download or extract the original image for the thumbnail. If the attempt fails, add an entry to the failure cache and return an error.
  5. If the original image was delivered by the remote server or was extracted locally from streaming media, add it to the full-size cache.
  6. Scale the thumbnail to the desired size, add it to the thumbnail cache, and return it.

Note that these steps represent only the logical flow of control for a particular thumbnail. The implementation executes these steps concurrently for different thumbnails.
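
Condensed into code, the logical flow looks roughly like this. The cache types and helper functions below are stand-ins for the sketch; the real service uses its persistent caches and interleaves these steps across threads.

#include <map>
#include <optional>
#include <set>
#include <string>

struct Caches
{
    std::map<std::string, std::string> thumbnails;   // key@size -> JPEG thumbnail
    std::map<std::string, std::string> full_size;    // key -> full-size image
    std::set<std::string> failures;                  // keys that failed earlier
};

// Placeholders, declared only for the sketch.
std::optional<std::string> scale(std::string const& image, int size);
std::optional<std::string> download_or_extract(std::string const& key);
bool expensive_to_recreate(std::string const& key);

std::optional<std::string> lookup(Caches& c, std::string const& key, int size)
{
    auto const sized_key = key + "@" + std::to_string(size);
    if (auto it = c.thumbnails.find(sized_key); it != c.thumbnails.end())
        return it->second;                            // 1. Thumbnail cache hit.
    if (auto it = c.full_size.find(key); it != c.full_size.end())
    {
        auto thumb = scale(it->second, size);         // 2. Scale from the full-size cache.
        if (thumb) c.thumbnails[sized_key] = *thumb;
        return thumb;
    }
    if (c.failures.count(key))
        return std::nullopt;                          // 3. Known failure: give up quickly.
    auto original = download_or_extract(key);         // 4. Slow path.
    if (!original)
    {
        c.failures.insert(key);
        return std::nullopt;
    }
    if (expensive_to_recreate(key))
        c.full_size[key] = *original;                 // 5. Keep expensive originals around.
    auto thumb = scale(*original, size);              // 6. Scale, cache, and return.
    if (thumb) c.thumbnails[sized_key] = *thumb;
    return thumb;
}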

Designing for performance

Apart from fast on-disk caches (see below), the thumbnailer must make efficient use of I/O bandwidth and threads. This means not only making things fast, but also not needlessly wasting resources such as threads, memory, network connections, or file descriptors. Provided that enough requests are made to keep the service busy, we do not want it to ever wait for a download or image extraction to complete while there is something else that could be done in the meantime, and we want it to keep all CPU cores busy. In addition, requests that are slow (because they require a download or a CPU-intensive image extraction) must not block requests that are queued up behind them if those requests would result in cache hits that could be returned immediately.

To achieve a high degree of concurrency without blocking on long-running operations while holding precious resources, the thumbnailer uses a three-phase lookup algorithm:

  1. In phase 1, we look at the caches to determine if we have a hit or an authoritative miss. Phase 1 is very fast. (It takes around a millisecond to return a thumbnail from the cache on a Nexus 4.) However, cache lookup can briefly stall on disk I/O or require a lot of CPU to extract and scale an image. To get good performance, phase 1 requests are passed to a thread pool with as many threads as there are CPU cores. This allows the maximum number of lookups to proceed concurrently.
  2. Phase 2 is initiated if phase 1 determines that a thumbnail requires download or extraction, either of which can take on the order of seconds. (In case of extraction from local media, the task is CPU intensive; in case of download, most of the time is spent waiting for the reply from the server.) This phase is scheduled asynchronously from an event loop. This minimizes task switching and allows large numbers of requests to be queued while only using a few bytes for each request that is waiting in the queue.
  3. Phase 3 is really a repeat of phase 1: if phase 2 produces a thumbnail, it adds it to the cache; if phase 2 does not produce a thumbnail, it creates an entry in the failure cache. By simply repeating phase 1, the lookup then results in either a thumbnail or an error.

If phase 2 determines that a download or extraction is required, that work is performed concurrently: the service schedules several downloads and extractions in parallel. By default, it will run up to two concurrent downloads, and as many concurrent GStreamer pipelines as there are CPUs. This ensures that we use all of the available CPU cores. Moreover, download and extraction run concurrently with lookups for phase 1 and 3. This means that, even if a cache lookup briefly stalls on I/O, there is a good chance that another thread can make use of the CPU.

Because slow operations do not block lookup, this also ensures that a slow request does not stall requests for thumbnails that are already in the cache. In other words, it does not matter how many slow requests are in progress: requests that can be completed quickly are indeed completed quickly, regardless of what is going on elsewhere.

Overall, this strategy works very well. For example, with sufficient workload, the service achieves around 750% CPU utilization on an 8-core desktop machine, while still delivering cache hits almost instantaneously. (On a Nexus 4, cache hits take a little over 1 ms while concurrent extractions or downloads are in progress.)

A re-usable persistent cache for C++

The three internal caches are implemented by a small and flexible C++ API. This API is available as a separate reusable PersistentStringCache component (see persistent-cache-cpp) that provides a persistent store of arbitrary key–value pairs. Keys and values can be binary, and entries can be large. (Megabyte-sized values do not present a problem.)

The implementation uses leveldb, which provides a very fast NoSQL database that scales to multi-gigabyte sizes and provides integrity guarantees. In particular, if the calling process crashes, all inserts that completed at the API level will be intact after a restart. (In case of a power failure or kernel crash, a few buffered inserts can be lost, but the integrity of the database is still guaranteed.)

To use a cache, the caller instantiates it with a path name, a maximum size, and an eviction policy. The eviction policy can be set to either strict LRU (least-recently-used) or LRU with an expiry time. Once a cache reaches its maximum size, expired entries (if any) are evicted first and, if that does not free enough space for a new entry, entries are discarded in least-recently-used order until enough room is available to insert a new record. (In all other respects, expired entries behave like entries that were never added.)

A simple get/put API allows records to be retrieved and added, for example:

#include <core/persistent_string_cache.h>  // persistent-cache-cpp (header path may differ)

#include <iostream>
#include <string>

using namespace std;

int main()
{
    auto c = core::PersistentStringCache::open(
        "my_cache", 100 * 1024 * 1024, core::CacheDiscardPolicy::lru_only);
    // Look for an entry and add it if there is a cache miss.
    string key = "Bjarne";
    auto value = c->get(key);
    if (value) {
        cout << key << ": " << *value << endl;
    } else {
        value = "C++ inventor";  // Provide a value for the key.
        c->put(key, *value);     // Insert it.
    }
}

Running this program prints nothing on the first run, and “Bjarne: C++ inventor” on all subsequent runs.

The API also allows application-specific metadata to be added to records, provides detailed statistics, supports dynamic resizing of caches, and offers a simple adapter template that makes it easy to store complex user-defined types without the need to clutter the code with explicit serialization and deserialization calls. (In a pinch, if iteration is not needed, the cache can be used as a persistent map by setting an impossibly large cache size, in which case no records are ever evicted.)

Performance

Our benchmarks indicate good performance. (Figures are for an Intel Ivy Bridge i7-3770k 3.5 GHz machine with a 256 GB SSD.) Our test uses 60-byte string keys. Values are binary blobs filled with random data (so they are not compressible), 20 kB in size with a standard deviation of 7,000 bytes, so the majority of values are 13–27 kB in size. The cache size is 100 MB, so it contains around 5,000 records.

Filling the cache with 100 MB of records takes around 2.8 seconds. Thereafter, the benchmark does a random lookup with an 80% hit probability. In case of a cache miss, it inserts a new random record, evicting old records in LRU order to make room for the new one. For 100,000 iterations, the cache returns around 4,800 “thumbnails” per second, with an aggregate read/write throughput of around 93 MB/sec. At 90% hit rate, we see twice the performance at around 7,100 records/sec. (Writes are expensive once the cache is full due to the need to evict entries, which requires updating the main cache table as well as an index.)

Repeating the test with a 1 GB cache produces identical timings so (within limits) performance remains constant for large databases.

Overall, performance is restricted largely by the bandwidth to disk. With a 7,200 rpm disk, we measured around one third of the performance with an SSD.

Recovering from errors

The overall design of the thumbnailer delivers good performance when things work. However, our implementation has to deal with the unexpected, such as network requests that do not return responses, GStreamer pipelines that crash, request overload, and so on. What follows is a partial list of steps we took to ensure that things behave sensibly, particularly on a battery-powered device.

Retry strategy

The failure cache provides an effective way to stop the service from endlessly trying to create thumbnails that, in an earlier attempt, returned an error.

For remote images, we know that, if the server has (authoritatively) told us that it has no artwork for a particular artist or album, it is unlikely that artwork will appear any time soon. However, the server may be updated with more artwork periodically. To deal with this, we add an expiry time of one week to the entries in the failure cache. That way, we do not try to retrieve the same image again until at least one week has passed (and only if we receive a request for a thumbnail for that image again later).

As opposed to authoritative answers from the image server (“I do not have artwork for this artist.”), we can also encounter transient failures. For example, the server may currently be down, or there may be some other network-related issue. In this case, we remember the time of the failure and do not try to contact the remote server again for two hours. This conserves bandwidth and battery power.

The device may also be disconnected from the network, in which case any attempt to retrieve a remote image is doomed. Our implementation returns failure immediately on a cache miss for a remote image if no network is present or the device is in flight mode. (We do not add an entry to the failure cache in this case.)

For local files, we know that, if an attempt to get a thumbnail for a particular file has failed, future attempts will fail as well. This means that the only way for the problem to get fixed is by modifying or replacing the actual media file. To deal with this, we add the inode number, modification time, and inode modification time to the key for local images. If a user replaces, say, a music file with a new one that contains artwork, we automatically pick up the new version of the file because its key has changed; the old version will eventually fall out of the cache.
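
For example, the per-file part of the key can be derived along the following lines (the exact key format used by the service is an assumption here):

#include <sys/stat.h>

#include <string>

// Builds a cache key that changes whenever the file's content or metadata changes.
std::string local_cache_key(std::string const& path)
{
    struct stat st{};
    if (stat(path.c_str(), &st) != 0)
        return path;  // Fall back to the path alone; the lookup will then simply miss.
    return path + ":" + std::to_string(st.st_ino) + ":" +      // inode number
           std::to_string(st.st_mtime) + ":" +                 // modification time
           std::to_string(st.st_ctime);                        // inode modification time
}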

Download and extraction failures

We monitor downloads and extractions for timely completion. (Timeouts for downloads and extractions can be configured separately.) If the server does not respond within 10 seconds, we abandon the attempt and treat it as a transient network error. Similarly, the vs-thumb processes that extract images from audio and video files can hang. We monitor these processes and kill them if they do not produce a result within 10 seconds.
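
A sketch of such a watchdog using QProcess (illustrative; the service's actual monitoring code may differ):

#include <QProcess>
#include <QString>
#include <QStringList>

// Runs the extraction helper and kills it if it does not finish within the timeout.
bool run_extractor(QString const& helper, QStringList const& args, int timeout_ms)
{
    QProcess proc;
    proc.start(helper, args);
    if (!proc.waitForFinished(timeout_ms))   // e.g. 10000 ms
    {
        proc.kill();                         // Hung pipeline: terminate the helper.
        proc.waitForFinished();              // Reap it so no zombie is left behind.
        return false;
    }
    return proc.exitStatus() == QProcess::NormalExit && proc.exitCode() == 0;
}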

Database corruption

Assuming an error-free implementation of leveldb, database corruption is impossible. However, in practice, an errant command could scribble over the database files. If leveldb detects that the database is corrupted, the recovery strategy is simple: we delete the on-disk cache and start again from scratch. Because the cache contents are ephemeral anyway, this is fine (other than slower operation until the working set of thumbnails makes it into the cache again).

Dealing with backlog

The asynchronous API provided by the service allows an application to submit an unlimited number of requests. Lots of requests happen if, for example, the user has inserted a flash card with thousands of photos into the device and then requests a gallery view for the collection. If the service’s client-side API blindly forwards requests via DBus, this causes a problem because DBus terminates the connection once there are more than around 400 outstanding requests.

To deal with this, we limit the number of outstanding requests to 200 and send another request via DBus only when an earlier request completes. Additional requests are queued in memory. Because this happens on the client side, the number of outstanding requests is limited only by the amount of memory that is available to the client.
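
The throttling can be sketched as a small helper on the client side (names are illustrative, not the actual client classes):

#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

class RequestLimiter
{
public:
    explicit RequestLimiter(std::size_t max_outstanding) : limit_(max_outstanding) {}

    // Called for each new request; send() performs the actual DBus call.
    void submit(std::function<void()> send)
    {
        if (outstanding_ < limit_)
        {
            ++outstanding_;
            send();
        }
        else
        {
            waiting_.push(std::move(send));   // Held in client memory only.
        }
    }

    // Called whenever a DBus reply arrives.
    void oneCompleted()
    {
        --outstanding_;
        if (!waiting_.empty())
        {
            auto next = std::move(waiting_.front());
            waiting_.pop();
            ++outstanding_;
            next();                           // Put the next queued request on the wire.
        }
    }

private:
    std::size_t limit_;
    std::size_t outstanding_ = 0;
    std::queue<std::function<void()>> waiting_;
};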

A related problem arises if a client submits many requests for a thumbnail for the same image. This happens when, for example, the user looks at a list of tracks: tracks that belong to the same album have the same artwork. If artwork needs to be retrieved from the remote server, naively forwarding cache misses for each thumbnail to the server would end up re-downloading the same image several times.

We deal with this by maintaining an in-memory map of all remote download requests that are currently in progress. If phase 1 reports a cache miss, before initiating a download, we add the key for the remote image to the map and remove it again once the download completes. If more requests for the same image encounter a cache miss while the download for the original request is still in progress, the key for the in-progress download is still in the map, and we hold additional requests for the same image until the download completes. We then schedule the held requests as usual and create their thumbnails from the image that was cached by the first request.
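
The coalescing logic can be sketched as follows (illustrative; the real service does this inside its phase-2 scheduling, and the sketch assumes all calls happen on one thread):

#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

class DownloadCoalescer
{
public:
    using Callback = std::function<void(std::string const& image)>;

    // start_download(key, on_done) must invoke on_done exactly once.
    explicit DownloadCoalescer(std::function<void(std::string, Callback)> start_download)
        : start_download_(std::move(start_download)) {}

    void request(std::string const& key, Callback cb)
    {
        auto it = in_progress_.find(key);
        if (it != in_progress_.end())
        {
            it->second.push_back(std::move(cb));   // Download already running: just wait for it.
            return;
        }
        in_progress_[key].push_back(std::move(cb));
        start_download_(key, [this, key](std::string const& image)
        {
            auto waiters = std::move(in_progress_[key]);
            in_progress_.erase(key);               // Remove the key once the download completes.
            for (auto const& waiter : waiters)
                waiter(image);                     // All thumbnails come from the one download.
        });
    }

private:
    std::function<void(std::string, Callback)> start_download_;
    std::map<std::string, std::vector<Callback>> in_progress_;
};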

Security

The thumbnailer runs with normal user privileges. We use AppArmor’s aa_query_label() function to verify that the calling client has read access to a file it wants a thumbnail for. This prevents one application from accessing thumbnails produced by a different application, unless both applications can read the original file. In addition, we place the entire service under an AppArmor profile to ensure that it can write only to its own cache directory.

Conclusion

Overall, we are very pleased with the design and performance of the thumbnailer. Each component has a clearly defined role with a clean interface, which made it easy for us to experiment and to refine the design as we went along. The design is extensible, so we can support additional media types or remote data sources without disturbing the existing code.

We used threads sparingly and only where we saw worthwhile concurrency opportunities. Using asynchronous interfaces for long-running operations kept resource usage to a minimum and allowed us to take advantage of I/O interleaving. In turn, this extracts the best possible performance from the hardware.

The thumbnailer now runs on Ubuntu Touch and is used by the gallery, camera, and music apps, as well as for all scopes that display media thumbnails.


Read more
Dustin Kirkland


Canonical is delighted to sponsor ContainerCon 2015, a Linux Foundation event in Seattle next week, August 17-19, 2015. It's quite exciting to see the A-list of sponsors, many of them newcomers to this particular technology, teaming with energy around containers. 

From chroots to BSD Jails and Solaris Zones, the concepts behind containers were established decades ago, and in fact traverse the spectrum of server operating systems. At Canonical, we've been working on containers in Ubuntu for more than half a decade, providing a home and resources for stewardship and maintenance of the upstream Linux Containers (LXC) project since 2010.

Last year, we publicly shared our designs for LXD -- a new stratum on top of LXC that brings the advantages of a traditional hypervisor to the faster, more efficient world of containers.

Those designs are now reality: the open source Golang code is readily available on GitHub, Ubuntu packages are available in a PPA for all supported releases of Ubuntu, and LXD is already in the Ubuntu 15.10 beta development tree. With ease, you can launch your first LXD containers in seconds, following this simple guide.

LXD is a persistent daemon that provides a clean RESTful interface to manage (start, stop, clone, migrate, etc.) any of the containers on a given host.

Hosts running LXD are handily federated into clusters of container hypervisors, and can work as Nova Compute nodes in OpenStack, for example, delivering Infrastructure-as-a-Service cloud technology at lower costs and greater speeds.

Here, LXD and Docker are quite complementary technologies. LXD furnishes a dynamic platform for "system containers" -- containers that behave like physical or virtual machines, supplying all of the functionality of a full operating system (minus the kernel, which is shared with the host). Such "machine containers" are the core of IaaS clouds, where users focus on instances with compute, storage, and networking that behave like traditional datacenter hardware.

LXD runs perfectly well along with Docker, which supplies a framework for "application containers" -- containers that enclose individual processes that often relate to one another as pools of micro services and deliver complex web applications.

Moreover, the Zen of LXD is the fact that the underlying container implementation is actually decoupled from the RESTful API that drives LXD functionality. We are most excited to discuss next week at ContainerCon our work with Microsoft around the LXD RESTful API, as a cross-platform container management layer.

Ben Armstrong, a Principal Program Manager Lead at Microsoft on the core virtualization and container technologies, has this to say:
“As Microsoft is working to bring Windows Server Containers to the world – we are excited to see all the innovation happening across the industry, and have been collaborating with many projects to encourage and foster this environment. Canonical’s LXD project is providing a new way for people to look at and interact with container technologies. Utilizing ‘system containers’ to bring the advantages of container technology to the core of your cloud infrastructure is a great concept. We are looking forward to seeing the results of our engagement with Canonical in this space.”
Finally, if you're in Seattle next week, we hope you'll join us for the technical sessions we're leading at ContainerCon 2015, including: "Putting the D in LXD: Migration of Linux Containers", "Container Security - Past, Present, and Future", and "Large Scale Container Management with LXD and OpenStack". Details are below.
Date: Monday, August 17 • 2:20pm - 3:10pm
Title: Large Scale Container Management with LXD and OpenStack
Speaker: Stéphane Graber
Abstract: http://sched.co/3YK6
Location: Grand Ballroom B
Schedule: http://sched.co/3YK6
Date: Wednesday, August 19 • 10:25am - 11:15am
Title: Putting the D in LXD: Migration of Linux Containers
Speaker: Tycho Andersen
Abstract: http://sched.co/3YTz
Location: Willow A
Schedule: http://sched.co/3YTz
Date: Wednesday, August 19 • 3:00pm - 3:50pm
Title: Container Security - Past, Present and Future
Speaker: Serge Hallyn
Abstract: http://sched.co/3YTl
Location: Ravenna
Schedule: http://sched.co/3YTl
Cheers,
Dustin

Read more
Michi Henning

Hello world!

Welcome to Canonical Voices. This is your first post. Edit or delete it, then start blogging!

Read more
Joseph Salisbury

Meeting Minutes

IRC Log of the meeting.

Meeting minutes.

Agenda

20150811 Meeting Agenda


Release Metrics and Incoming Bugs

Release metrics and incoming bug data can be reviewed at the following link:

  • http://kernel.ubuntu.com/reports/kt-meeting.txt


Status: Wily Development Kernel

We have rebased our Wily master-next branch to the latest upstream
v4.2-rc6 and uploaded to our ~canonical-kernel-team PPA. We are
still fixing up DKMS packages before we proceed uploading to the
archive.
—–
Important upcoming dates:

  • https://wiki.ubuntu.com/WilyWerewolf/ReleaseSchedule
    Thurs Aug 20 – Feature Freeze (~1 week away)
    Thurs Aug 27 – Beta 1 (~2 weeks away)
    Thurs Sep 24 – Final Beta (~6 weeks away)


Status: CVE’s

The current CVE status can be reviewed at the following link:

  • http://kernel.ubuntu.com/reports/kernel-cves.html


Status: Stable, Security, and Bugfix Kernel Updates – Precise/Trusty/Utopic/Vivid

Status for the main kernels, until today:

  • Precise – Verification & Testing
  • Trusty – Verification & Testing
  • lts-Utopic – Verification & Testing
  • Vivid – Verification & Testing

    Current opened tracking bugs details:

  • http://kernel.ubuntu.com/sru/kernel-sru-workflow.html
    For SRUs, SRU report is a good source of information:
  • http://kernel.ubuntu.com/sru/sru-report.html

    Schedule:

    cycle: 26-Jul through 15-Aug
    ====================================================================
    24-Jul Last day for kernel commits for this cycle
    26-Jul – 01-Aug Kernel prep week.
    02-Aug – 08-Aug Bug verification & Regression testing.
    09-Aug – 15-Aug Regression testing & Release to -updates.


Open Discussion or Questions? Raise your hand to be recognized

No open discussion.

Read more
Daniel Holbach

If you haven’t heard of it yet, every Tuesday we have the Ubuntu Community Q&A session at 15:00 UTC. It’s always up on http://ubuntuonair.com and you can watch old sessions on the YouTube channel. For casual Ubuntu users it’s a great way to get to know people who are working in the inner circles of Ubuntu and can answer questions, clear up misunderstandings or get specialists on the show.

Since Jono went to XPRIZE, our team at Canonical has been running them and I really enjoy these sessions. What I liked even more were the sessions where we had guests and got to talk about some more specific topics. In the past few weeks we had Olli Ries on, quite a few UbuCon organisers, some testing/QA heroes and many more.

If you have anyone you’d like to see interviewed or any specific topics you’d like to see covered, please drop a comment below and we’ll do our best to get them on in the next weeks!

Read more
Prakash

For a whole lot of people, especially those in developing countries, science – and with it, medicine – isn’t readily available to the majority of citizens. But Manu Prakash wants to change that.
Prakash, an assistant professor of bioengineering at Stanford, is a pioneer of “frugal science,” a term he coined to describe the movement toward building cheap versions of high tech tools. His endeavor aims to make medical devices both affordable and available to the masses.

The way Prakash sees it, labs don’t need the most expensive equipment out there in order to reach profound breakthroughs. “Today people look at these extraordinary labs and forget that in the 1800s they could still do the exact same science,” he told The New York Times.

So in 2014 he created a paper microscope, aptly named the Foldscope, that costs only 50 cents to produce.

Read More: http://www.businessinsider.in/A-paper-microscope-that-costs-only-50-cents-can-detect-malaria-from-just-a-drop-of-blood-and-it-could-revolutionize-medicine/articleshow/48259276.cms

Read more
Tim Peeters

Adaptive page layouts made easy

Convergent applications

We want to make it easy for app developers to write an app that can run on different form factors without changes in the code. This implies that an app should support screens of various sizes, and the layout of the app should be optimal for each screen size. For example, a messaging app running on a desktop PC in a big window could show a list of conversations in a narrow column on the left, and the selected conversation in a wider column on the right side. The same application on a phone would show only the list of conversations, or the selected conversation with a back-button to return to the list. It would also be useful if the app automatically switches between the 1-column and 2-column layouts when the user resizes the window, or attaches a large screen to the phone.

To accomplish this, we introduced the AdaptivePageLayout component in Ubuntu.Components 1.3. This version of Ubuntu.Components is still under development (expect an official release announcement soon), but if you are running the latest version of the Ubuntu UI Toolkit, you can already try it out by updating your Ubuntu.Components import to version 1.3. Note that you should not mix import versions, so when you update one of your components to 1.3, they should all be updated.

AdaptivePageLayout

AdaptivePageLayout is an Item with the following properties and functions:

  • property Page primaryPage
  • function addPageToCurrentColumn(sourcePage, newPage)
  • function addPageToNextColumn(sourcePage, newPage)
  • function removePages(page)

To understand how it works, imagine that internally, the AdaptivePageLayout keeps track of an infinite number of virtual columns that may be displayed on your screen. Not all virtual columns are visible on the screen. By default, depending on the width of your AdaptivePageLayout, either one or two columns are visible. When a Page is added to a virtual column that is not visible, it will instead be shown in the right-most visible column.

The Page defined as primaryPage will initially be visible in the first (left-most) column and all the other columns are empty (see figure 1).

Figure 1: Showing only primaryPage in layouts of 100 and 50 grid-units.

To show another Page in the first column, call addPageToCurrentColumn(), passing the current page (primaryPage) and the new page as parameters. The new page will then show up in the same column with a back button in the header to close the new page and return to the previous page (see figure 2). So far, AdaptivePageLayout is no different than a PageStack.

Figure 2: Page with back button in the first column.

The differences with PageStack become evident when you want to keep the first page visible in the first column while adding a new page to the next column. To do this, call addPageToNextColumn() with the same parameters as addPageToCurrentColumn() above. The new page will now show up in the following column on the screen (see figure 3).

Figure 3: Adding a page to the next column.

However, if you resize the window so that it fits only one column, the left column will be hidden, and the page that was in the right column will now have a back button. Resizing back to get the two-column layout will again give you the first page on the left, and the new page on the right. Call removePages(page) to remove page and all pages that were added after page was added. There is one exception: primaryPage is never removed, so removePages(primaryPage) will remove all pages except primaryPage and return your AdaptivePageLayout to its initial state.

AdaptivePageLayout automatically chooses between a one and two-column layout depending on the width of the window. It also automatically shows a back button in the correct column when one is needed and synchronizes the header size between the different columns (see figure 4).

Figure 4: Adding sections to any column increases the height of the header in every column.

Future extensions

The version of AdaptivePageLayout that is now in the UI toolkit is only the first version. What works now will keep working, but we will extend the API to support the following:

  • Layouts with more than two columns
  • Use different conditions for switching between layouts
  • User-resizable columns
  • Automatic and manual hiding of the header in single-column layouts
  • Custom proxy objects to support Autopilot tests for applications

Below you can read the full source code that was used to create the screenshots above. The screenshots do not cover all the possible orders in which pages can be added to the left and right columns, so I encourage you to run the code for yourself and discover its full behavior. We are looking forward to seeing your first applications using the new AdaptivePageLayout component soon :). Of course if there are any questions you can leave a comment below or ping members of the SDK team (I am t1mp) in #ubuntu-app-devel on Freenode IRC.

 

import QtQuick 2.4
import Ubuntu.Components 1.3

MainView {
    width: units.gu(100)
    height: units.gu(70)

    AdaptivePageLayout {
        id: layout
        anchors.fill: parent
        primaryPage: rootPage

        Page {
            id: rootPage
            title: i18n.tr("Root page")

            Column {
                anchors {
                    top: parent.top
                    left: parent.left
                    margins: units.gu(1)
                }
                spacing: units.gu(1)

                Button {
                    text: "Add page left"
                    onClicked: layout.addPageToCurrentColumn(rootPage, leftPage)
                }
                Button {
                    text: "Add page right"
                    onClicked: layout.addPageToNextColumn(rootPage, rightPage)
                }
                Button {
                    text: "Add sections page right"
                    onClicked: layout.addPageToNextColumn(rootPage, sectionsPage)
                }
            }
        }

        Page {
            id: leftPage
            title: i18n.tr("First column")

            Rectangle {
                anchors {
                    fill: parent
                    margins: units.gu(2)
                }
                color: UbuntuColors.orange

                Button {
                    anchors.centerIn: parent
                    text: "right"
                    onTriggered: layout.addPageToNextColumn(leftPage, rightPage)
                }
            }
        }

        Page {
            id: rightPage
            title: i18n.tr("Second column")

            Rectangle {
                anchors {
                    fill: parent
                    margins: units.gu(2)
                }
                color: UbuntuColors.green

                Button {
                    anchors.centerIn: parent
                    text: "Another page!"
                    onTriggered: layout.addPageToCurrentColumn(rightPage, sectionsPage)
                }
            }
        }

        Page {
            id: sectionsPage
            title: i18n.tr("Page with sections")
            head.sections.model: [i18n.tr("one"), i18n.tr("two"), i18n.tr("three")]

            Rectangle {
                anchors {
                    fill: parent
                    margins: units.gu(2)
                }
                color: UbuntuColors.blue
            }
        }
    }
}

 

Read more
Dustin Kirkland

The Golden Ratio is one of the oldest and most visible irrational numbers known to humanity.  Pi is perhaps more famous, but the Golden Ratio is found in more of our art, architecture, and culture throughout human history.

I think of the Golden Ratio as sort of "Pi in 1 dimension".  Whereas Pi is the ratio of a circle's circumference to its diameter, the Golden Ratio is the ratio of a whole to its larger part when that ratio equals the ratio of the larger part to the smaller part.

Visually, this diagram from Wikipedia helps explain it:


We find the Golden Ratio in the architecture of antiquity, from the Egyptians to the Greeks to the Romans, right up to the Renaissance and even modern times.



While the base of the pyramids are squares, the Golden Ratio can be observed as the base and the hypotenuse of a basic triangular cross section like so:


The floor plan of the Parthenon has a width/depth ratio matching the Golden Ratio...



For the first 300 years of printing, nearly all books were printed on pages whose length to width ratio matched that of the Golden Ratio.

Leonardo da Vinci used the Golden Ratio throughout his works.  I'm told that his Vitruvian Man displays the Golden Ratio...


From school, you probably remember that the Golden Ratio is approximately ~1.6 (and change).
There's a strong chance that your computer or laptop monitor has a 16:10 aspect ratio.  Does 1280x800 or 1680x1050 sound familiar?



That ~1.6 number is only an approximation, of course.  The Golden Ratio is in fact an irrational number and can be calculated to much greater precision through several different representations, including:
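
One of the simplest of those representations is the closed form:

φ = (1 + √5) / 2 = 1.6180339887…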


You can plug that number into your computer's calculator and crank out a dozen or so significant digits.


However, if you want to go much farther than that, Alexander Yee has created a program called y-cruncher, which has been used to calculate most of the famous constants to world record precision.  (Sorry free software readers of this blog -- y-cruncher is not open source code...)

I came across y-cruncher a few weeks ago when I was working on the mprime post, demonstrating how you can easily put any workload into a Docker container and then produce both Juju Charms and Ubuntu Snaps that package easily.  While I opted to use mprime in that post, I saved y-cruncher for this one :-)

Also, while doing some network benchmark testing of The Fan Networking among Docker containers, I experimented for the first time with some of Amazon's biggest instances, which have dedicated 10gbps network links.  While I had a couple of those instances up, I did some small scale benchmarking of y-cruncher.

Presently, none of the mathematical constant records are even remotely approachable with CPU and Memory alone.  All of them require multiple terabytes of disk, which act as a sort of swap space for temporary files, as bits are moved in and out of memory while the CPU crunches.  As such, approaching these records is an overwhelmingly I/O-bound endeavor -- not CPU or Memory bound, as you might imagine.

After a variety of tests, I settled on the AWS d2.2xlarge instance size as the most affordable instance size to break the previous Golden Ratio record (1 trillion digits, by Alexander Yee on his gaming PC in 2010).  I say "affordable", in that I could have cracked that record "2x faster" with a d2.4xlarge or d2.8xlarge; however, I would have paid much more (4x) for the total instance hours.  This was purely an economic decision :-)


Let's geek out on technical specifications for a second...  So what's in a d2.2xlarge?
  • 8x Intel Xeon CPUs (E5-2676 v3 @ 2.4GHz)
  • 60GB of Memory
  • 6x 2TB HDDs
First, I arranged all 6 of those 2TB disks into a RAID0 with mdadm, and formatted it with xfs (which performed better than ext4 or btrfs in my cursory tests).

$ sudo mdadm --create --verbose /dev/md0 --level=stripe --raid-devices=6 /dev/xvd?
$ sudo mkfs.xfs /dev/md0
$ df -h /mnt
/dev/md0 11T 34M 11T 1% /mnt

Here's a brief look at raw read performance with hdparm:

$ sudo hdparm -tT /dev/md0
Timing cached reads: 21126 MB in 2.00 seconds = 10576.60 MB/sec
Timing buffered disk reads: 1784 MB in 3.00 seconds = 593.88 MB/sec

The beauty here of RAID0 is that each of the 6 disks can be used to read and/or write simultaneously, perfectly in parallel.  600 MB/sec is pretty quick reads by any measure!  In fact, when I tested the d2.8xlarge, I put all 24x 2TB disks into the same RAID0 and saw nearly 2.4 GB/sec read performance across that 48TB array!

With /dev/md0 mounted on /mnt and writable by my ubuntu user, I kicked off y-cruncher with these parameters:

Program Version:       0.6.8 Build 9461 (Linux - x64 AVX2 ~ Airi)
Constant: Golden Ratio
Algorithm: Newton's Method
Decimal Digits: 2,000,000,000,000
Hexadecimal Digits: 1,660,964,047,444
Threading Mode: Thread Spawn (1 Thread/Task) ? / 8
Computation Mode: Swap Mode
Working Memory: 61,342,174,048 bytes ( 57.1 GiB )
Logical Disk Usage: 8,851,913,469,608 bytes ( 8.05 TiB )

Byobu was very handy here, being able to track in the bottom status bar my CPU load, memory usage, disk usage, and disk I/O, as well as connecting and disconnecting from the running session multiple times over the 4 days of running.


And approximately 79 hours later, it finished successfully!

Start Date:            Thu Jul 16 03:54:11 2015
End Date: Sun Jul 19 11:14:52 2015

Computation Time: 221548.583 seconds
Total Time: 285640.965 seconds

CPU Utilization: 315.469 %
Multi-core Efficiency: 39.434 %

Last Digits:
5027026274 0209627284 1999836114 2950866539 8538613661 : 1,999,999,999,950
2578388470 9290671113 7339871816 2353911433 7831736127 : 2,000,000,000,000

Amazingly, another person (whom I don't know), Ron Watkins, performed the exact same computation and published his results within 24 hours, on July 22nd/23rd.  As such, Ron and I are "sharing" credit for the Golden Ratio record.


Now, let's talk about the economics here, which I think are the most interesting part of this post.

Look at the above chart of records, which is published on the y-cruncher page; the vast majority of those records have been calculated on physical PCs -- most of them seem to be gaming PCs running Windows.

What's different about my approach is that I used Linux in the Cloud -- specifically Ubuntu in AWS.  I paid hourly (actually, my employer, Canonical, reimbursed me for that expense, thanks!)  It took right at 160 hours to run the initial calculation (79 hours) as well as the verification calculation (81 hours), at the current rate of $1.38/hour for a d2.2xlarge, which is a grand total of $220!

$220 is a small fraction of the cost of 6x 2TB disks, 60 GB of memory, or 8 Xeon cores, not to mention the electricity and cooling required to run a system of this size (~750W) for 160 hours.

If we say the first trillion digits were already known from the previous record, that comes out to approximately 4.5 billion record-digits per dollar, and 12.5 billion record-digits per hour!

Hopefully you find this as fascinating as I!

Cheers,
:-Dustin

Read more
Prakash

Are you ready to play everybody’s not-so-favorite guilt game: what was I doing at that age? Ann Makosinski, a high school student from British Columbia, Canada, has created a simple LED torch powered by body heat. So instead of having to recharge it or swap in a fresh pair of AAs every so often, you literally just need to hold it in your hand for it to start glowing.

Read More: http://www.gizmodo.com.au/2013/06/15-year-old-invents-incredible-new-kind-of-torch/

Read more