Canonical Voices

Posts tagged with 'gstreamer'

James Henstridge

One of the options available when configuring my ThinkPad was an infrared camera, the main selling point being “Windows Hello” facial recognition based login. While I wasn’t planning on keeping Windows on the system, I was curious to see what I could do with it under Linux. Hopefully this is of use to anyone else trying to get it to work.

The camera is manufactured by Chicony Electronics (probably a CKFGE03 or similar), and shows up as two USB devices:

04f2:b5ce Integrated Camera
04f2:b5cf Integrated IR Camera

Both devices are bound by the uvcvideo driver, showing up as separate video4linux devices. Interestingly, the IR camera seems to be assigned /dev/video0, so generally gets picked by apps in preference to the colour camera. Unfortunately, the image it produces comes up garbled:

So it wasn’t going to be quite so easy to get things working. Looking at the advertised capture modes, the camera supports Motion-JPEG and YUYV raw mode. So I tried capturing a few JPEG frames with the following GStreamer pipeline:

gst-launch-1.0 v4l2src device=/dev/video0 num-buffers=10 ! image/jpeg ! multifilesink location="frame-%02d.jpg"

Unlike in raw mode, the red illumination LEDs started flashing when in JPEG mode, which resulted in frames having alternating exposures. Here’s one of the better exposures:

What is interesting is that the JPEG frames have a different aspect ratio to the raw version: a more normal 640×480 rather than 400×480. So to start, I captured a few raw frames:

gst-launch-1.0 v4l2src device=/dev/video0 num-buffers=10 ! "video/x-raw,format=(string)YUY2" ! multifilesink location="frame-%02d.raw"

The illumination LEDs stayed on constantly while recording in raw mode. The contents of the raw frames show something strange:

00000000  11 48 30 c1 04 13 44 20  81 04 13 4c 20 41 04 13  |.H0...D ...L A..|
00000010  40 10 41 04 11 40 10 81  04 11 44 00 81 04 12 40  |@.A..@....D....@|
00000020  00 c1 04 11 50 10 81 04  12 4c 10 81 03 11 44 00  |....P....L....D.|
00000030  41 04 10 48 30 01 04 11  40 10 01 04 11 40 10 81  |A..H0...@....@..|
...

The advertised YUYV format encodes two pixels in four bytes, so you would expect any repeating patterns to occur at a period of four bytes. But the data in these frames seems to repeat at a period of five bytes.

Looking closer, it is actually repeating at a period of 10 bits, or four packed values for every five bytes. Furthermore, the 800-byte rows work out to 640 pixels when interpreted as packed 10-bit values (rather than the advertised 400 pixels), which matches the dimensions of the JPEG mode.

The following Python code can unpack the 10-bit pixel values:

def unpack(data):
    """Unpack 10-bit pixel values from groups of five bytes."""
    result = []
    for i in range(0, len(data), 5):
        # Assemble five bytes into one 40-bit little-endian block.
        block = (data[i] |
                 data[i+1] << 8 |
                 data[i+2] << 16 |
                 data[i+3] << 24 |
                 data[i+4] << 32)
        # Split the block into four 10-bit values.
        result.append((block >> 0) & 0x3ff)
        result.append((block >> 10) & 0x3ff)
        result.append((block >> 20) & 0x3ff)
        result.append((block >> 30) & 0x3ff)
    return result
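
For illustration, here’s a sketch of how one might use it to produce an 8-bit greyscale image. The PGM output and the two-bit shift used for the brightness adjustment are my choices, not necessarily what was done for the article’s images; the 640×480 geometry follows from the row size worked out above:

def raw_to_pgm(raw_path, pgm_path, width=640, height=480):
    # Read one captured frame, unpack the 10-bit values and write them
    # out as an 8-bit greyscale PGM image. Dropping the two low bits is
    # a crude brightness adjustment; a smarter mapping would stretch
    # the actual value range.
    with open(raw_path, 'rb') as f:
        pixels = unpack(f.read(width * height * 10 // 8))
    with open(pgm_path, 'wb') as f:
        f.write(b'P5\n%d %d\n255\n' % (width, height))
        f.write(bytes(p >> 2 for p in pixels))

raw_to_pgm('frame-00.raw', 'frame-00.pgm')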

After adjusting the brightness while converting to 8-bit greyscale, I get a usable image. Compare the frame misinterpreted as YUYV with the decoded version:

I suppose this logic could be wrapped up in a GStreamer element to get usable infrared video capture.

I’m still not clear why the camera would lie about the pixel format it produces. My best guess is that they wanted to use the standard USB Video Class driver on Windows, and this let them get at the raw data to process in user space.

Jussi Pakkanen

People often wonder why even the simplest of things seem to take a long time to implement. Often this is accompanied by the phrase made famous by Jeremy Clarkson: how hard can it be?

Well, let’s find out. As an example, let’s look into a very simple case of creating a shared library that grabs a screenshot from a video file. The problem description is simplicity itself: open the file with GStreamer, seek to a random location and grab the pixels from the buffer. All in all, ten lines of code that should take a few hours to implement, including unit tests.
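
Something like this sketch, even (GStreamer 1.0 via PyGObject; the element choices and the appsink approach are mine, for illustration):

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

Gst.init(None)

def grab_frame(uri, position_ns):
    # Decode the file, convert to RGB and hand frames to an appsink.
    pipeline = Gst.parse_launch(
        'uridecodebin uri=%s ! videoconvert ! '
        'video/x-raw,format=RGB ! appsink name=sink' % uri)
    sink = pipeline.get_by_name('sink')
    # Preroll paused, seek to the target, grab the prerolled frame.
    pipeline.set_state(Gst.State.PAUSED)
    pipeline.get_state(Gst.CLOCK_TIME_NONE)
    pipeline.seek_simple(Gst.Format.TIME,
                         Gst.SeekFlags.FLUSH | Gst.SeekFlags.KEY_UNIT,
                         position_ns)
    sample = sink.emit('pull-preroll')
    buf = sample.get_buffer()
    pipeline.set_state(Gst.State.NULL)
    return buf.extract_dup(0, buf.get_size())  # raw RGB pixels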

Right?

Well, no. The very first problem is selecting a proper screenshot location. It can’t be in the latter half of the video, for instance, for the simple reason that it may contain spoilers and the mere act of displaying the image might ruin the video file for viewers. So let’s instead select some suitable point, like 2/7ths of the way into the video clip.

But in order to do that you need to first determine the length of the clip. Fortunately GStreamer provides functionality for this. Less fortunately, some codec/muxer/platform/whatever combinations do not implement it. So now we have the problem of trying to determine a proper clip location for a file whose duration we don’t know. In order to save time and effort, let’s just grab the screenshot at ten seconds in these cases.
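
Continuing the sketch above, the fallback might look like this (query_duration returns a success flag along with the duration):

ok, duration = pipeline.query_duration(Gst.Format.TIME)
if ok:
    position = duration * 2 // 7
else:
    # Duration query not implemented for this combination:
    # fall back to a fixed ten-second offset.
    position = 10 * Gst.SECOND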

The question now becomes: what happens if the clip is less than ten seconds long? Then GStreamer would (probably) seek to the end of the file and grab a screenshot there. Videos often end in black, so this might lead to black thumbnails every now and then. Come to think of it, that 2/7ths location might accidentally land on a fade, so it might be all black, too. What we need is an image analyzer that detects whether the chosen frame is “interesting” or not.

This rabbit hole goes down quite deep so let’s not go there and instead focus on the other part of the problem.

There are two mutually incompatible versions of GStreamer currently in use: 0.10 and 1.0. These two cannot be in the same process at the same time due to interesting technical issues. No matter which we pick, some client application might be using the other one. So we can’t actually link against GStreamer; instead we need to factor this functionality out into a separate executable. We also need to change the system’s global security profile so that every app is allowed to execute this binary.

Having all this functionality we can just fork/exec the binary and wait for it to finish, right?

In theory yes, but multimedia codecs are tricky beasts, especially hardware accelerated ones on mobile platforms. They have a tendency to freeze at any time. So we need to write functionality that spawns the process, monitors its progress and then kills it if it is not making progress.
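
A crude version of that watchdog using Python’s subprocess module (the helper name is hypothetical, and a fixed deadline stands in for real progress monitoring):

import subprocess

def run_helper(uri, timeout=15):
    # Spawn the extractor helper and kill it if it hasn't finished
    # within the deadline.
    proc = subprocess.Popen(['video-thumbnail-helper', uri])
    try:
        if proc.wait(timeout=timeout) != 0:
            raise RuntimeError('helper failed')
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()
        raise RuntimeError('helper hung and was killed')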

A question we have not asked is how does the helper process provide its output to the library? The simple solution is to write the image to a file in the file system. But the question then becomes where should it go? Different applications have different security policies and can access different parts of the file system, so we need a system state parser for that. Or we can do something fancier such as creating a socket pair connection between the library and the client executable and have the client push the results through that. Which means that process spawning just got more complicated and you need to define the serialization protocol for this ad-hoc network transfer.
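
The socket pair variant might be sketched like this (the helper name, its --fd argument and the write-then-EOF wire format are all made up for illustration):

import socket
import subprocess

def fetch_thumbnail(uri):
    ours, theirs = socket.socketpair()
    # The helper inherits one end of the pair, writes the image to it
    # and closes it; EOF marks the end of the transfer.
    proc = subprocess.Popen(
        ['video-thumbnail-helper', uri, '--fd', str(theirs.fileno())],
        pass_fds=(theirs.fileno(),))
    theirs.close()
    chunks = []
    while True:
        data = ours.recv(65536)
        if not data:
            break
        chunks.append(data)
    proc.wait()
    return b''.join(chunks)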

I could go on but I think the point has been made abundantly clear.

David

Ubuntu App Developer Week – Day 2 Summary

Wow, what a great follow-up to the first day! The second day of Ubuntu App Developer Week brought lots of awesome: great speakers and sessions, great participation, improvisation, Python, GTK, KDE, Qt, PyGI, Zeitgeist, GStreamer, Introspection, Thunderbird, Unity, API integration, hacking, fun… all the buzzwords you can associate with developing on your favourite Free Software platform.

PyGTK is dead, long live PyGI! Using gobject-introspection in Python

By Martin Pitt

Martin’s session, complementary to Monday’s GObject Introspection (GI) one, was very popular. He started off with a recap of what GI is and the importance of having bindings for several programming languages in any modern development platform. He provided an overview of how GI works in practice, and then delved into how it works in Python through PyGObject and the gi.repository module, with lots of coding examples and comparisons with traditional GTK+ C code. After that he described other API differences, in particular the caveats with constructors, passing arrays, output arguments, GDestroyNotify, and what to do with non-introspectable functions or methods. The next topic was overrides: how to provide custom code on top of the introspected library’s objects. The second part of the session focused on explaining in detail how to migrate old PyGTK code to GTK 3 and PyGI, following a series of easy guidelines: renaming, checking and repeating, plus packaging changes. He wrapped up with a series of pointers on where to learn more and a Q+A session with lots of interesting questions from the audience.
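
As a flavour of the kind of migration he described, here’s a hypothetical before-and-after (not taken from the session):

# Old static PyGTK bindings:
#     import gtk
#     window = gtk.Window(gtk.WINDOW_TOPLEVEL)
# The same thing through gobject-introspection:
import gi
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk

window = Gtk.Window(type=Gtk.WindowType.TOPLEVEL)
window.set_title('Hello from PyGI')
window.connect('destroy', Gtk.main_quit)
window.show_all()
Gtk.main()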

Check out the session log here.

Zeitgeist API & Zeitgeist Application Integration

By Manish Sinha (मनीष सिन्हा) and Seif Lotfy

For this session we had the luxury of having two key members of the Zeitgeist project explain to us all the details of how to integrate it into your own projects. Manish, one of the Zeitgeist developers, kicked off with an introduction to what Zeitgeist is: an automatic event logger which logs the events that happen on your computer. He then went through the details of the Zeitgeist terminology (events, manifestations, actors, timestamps…), its architecture, and its interaction with D-Bus, with an overview of the API interface and the existing bindings: Python, C/Vala and C#. The session went on with examples of how real-world applications and data providers use Zeitgeist, such as EOG plugins or Tomboy. Seif then chipped in with an example of how Zeitgeist support was integrated into a gedit plugin. Throughout the session lots of interesting questions were raised by the audience.
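
To give an idea of what logging an event looks like, here’s a sketch against the Zeitgeist Python bindings of the era (the symbol names are from memory and may differ between releases):

from zeitgeist.client import ZeitgeistClient
from zeitgeist.datamodel import (Event, Subject,
                                 Interpretation, Manifestation)

client = ZeitgeistClient()

# The subject describes the thing acted upon...
subject = Subject.new_for_values(
    uri='file:///home/user/notes.txt',
    interpretation=Interpretation.DOCUMENT,
    manifestation=Manifestation.FILE_DATA_OBJECT)

# ...and the event records which actor did what to which subjects.
event = Event.new_for_values(
    interpretation=Interpretation.ACCESS_EVENT,
    manifestation=Manifestation.USER_ACTIVITY,
    actor='application://gedit.desktop',
    subjects=[subject])

client.insert_event(event)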

Check out the session log here.

GStreamer+Python: Multimedia Swiss Army Machete

By Jason DeRose

A very interesting session indeed. In it, Jason explained why GStreamer is the multimedia framework, thanks to its economy of scale, and why Python is the perfect complement, with its simplicity and language clarity. According to him, together they provide the ultimate multimedia development tool, which is why he chose to use them in his own project: Novacut, the distributed video editor. From this point on, it was “learning by doing”: he walked through the code examples he’d set up for the session, showcasing how simple it is to work with multimedia streams with his swiss army machete :)
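
For a taste of that simplicity, here’s a minimal GStreamer 0.10-era pipeline of the kind such sessions typically start from (not Jason’s actual example code):

import pygst
pygst.require('0.10')
import gst
import gobject

# Play a test tone until interrupted.
pipeline = gst.parse_launch('audiotestsrc ! autoaudiosink')
pipeline.set_state(gst.STATE_PLAYING)
gobject.MainLoop().run()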

Check out the session log here.

KDE Development Intro: Q+A

By Harald Sitter and Jonathan Riddell

I’d especially like to mention this session due to a change of schedule. The original speaker, KDE/Kubuntu ninja Jonathan Thomas, could not make it due to last-minute commitments. But no worries, KDE/Kubuntu friends are always there to lend a hand, and in no time Harald and Jonathan stepped up to fill the gap and do an impromptu KDE Development Intro and Q+A session. There they gave an overview of the essentials every prospective KDE developer should know and answered the audience’s questions in detail. All in all, a great insight into how to get started developing KDE apps.

Check out the session log here.

Thunderbird + Unity = Awesome, and how JSCtypes lets you get to the candy

By Mike Conley

Mike has been working over the last 3 months at Mozilla on ways in which Thunderbird can integrate nicely into Ubuntu, in particular with Unity. He started by explaining the main points he’s been focusing on: the messaging menu, the Unity launcher and Ubuntu One; for the rest of the session he covered the first two. Going straight to the subject, he explained what a Thunderbird extension is and how extensions are written, using a mixture of JavaScript, the XUL mark-up language and CSS, all executed by the Gecko engine. He then introduced JS-ctypes, which let developers access C libraries directly from chrome-level JavaScript code, and described how he used them to write a Unity launcher add-on. The rest of the session focused on this subject, with plenty of code examples.

Check out the session log here.

STORY: Unity, hacking on a real-world app

By Marco Trevisan

The last session of the day was one of my favourite ones: an inspiring personal story. Marco is a community contributor to Unity who told us about his journey from finding an application itch to scratch to getting his own feature landed. He started with a very easy-to-understand overview of the Unity architecture and how all the pieces fit together, followed by the story of how he found something that needed improvement and how he went about fixing it: indicator-sound not being precise when setting the volume with the mouse wheel. Do read it, as it is going to be a great help to all of you who are looking to get started contributing to Ubuntu development.

Check out the session log here.

The Day Ahead: Upcoming Sessions for Day 3

A quick look at today’s session lineup for your development pleasure:

16:00 UTC
Qt Quick: QML the Language – Jürgen Bocklage-Ryannel
Here’s a special treat for anyone interested in Qt development: Jürgen Bocklage-Ryannel, from Nokia, the maker of Qt, will be introducing Qt Quick and QML, the language used in Qt Quick. He’ll be showing some elements of the UI and the general process, and telling you the right places to go for more information.

17:00 UTC
Make your applications work in the cloud with Ubuntu One – Stuart Langridge
Who better than the Ubuntu One mastermind himself to tell you about supercharging your apps with cloud functionality? Join Stuart in this talk, where he’ll describe how to integrate Ubuntu One into your applications and bring your users to cloud 9 ;)

18:00 UTC
Take control of your desktop easily with DBus – Alejandro J. Cura
D-Bus, the cross-desktop message bus system, is becoming more and more ubiquitous in any Free Software distribution. You can bring your applications to a whole new level by letting them talk to other ones in a desktop session, and Alejandro can tell you exactly how to do that.

19:00 UTC
Touchégg: Bringing Multitouch Gestures to your Desktop – José Expósito
It’s always great to see real-world examples of how the newest and coolest technologies are being used. José will be showcasing his multitouch-based application, Touchégg, introducing its features, describing how to add new multitouch gestures, the technologies used to develop it, and how it uses the uTouch-GEIS API. Check out the summary and the logs from the other Multitouch session on Monday to learn more.

20:00 UTC
Unity: Integrating with Launcher and Places – Mikkel Kamstrup Erlandsen
Do you want your application to seamlessly blend into the new Ubuntu user interface experience? Do you want it to provide all interaction capabilities that Unity provides? Then join Unity developer Mikkel Kamstrup in his walkthrough with examples on how to plug your app into the Launcher and Places API.

21:00 UTC
Tracking Source Code History with Bazaar – Jelmer Vernooij
Learn how to control the history of your source code with a modern, distributed revision control system. Bazaar is powerful, fast and, most importantly, easy and fun to use. Jelmer has had a lot to do with developing Bazaar, so he knows well what he’s talking about. Join him in this session, where he’ll cover the basics and the more sophisticated uses of the revision control system used to develop Ubuntu and thousands of other projects in Launchpad.

Looking forward to seeing you all there in a few hours!
