Archive for the ‘Multitouch’ Category

Chase Douglas

Smooth scrolling in Chromium through uTouch

Thomas Voss and Jussi Pakkanen have been working on a project to bring gesture support to the Chromium browser. Very cool and promising stuff! See all the details and a video here.

admin

Input Action Event Sequences

There is an unsolved problem in modern toolkits. The solutions provided so far are merely hacks that solve 90% of problems while leaving corner cases. The problem revolves around how to handle input events.

Traditionally, when a toolkit receives an event from an input device, the event is delivered to widgets and applications for handling. When you move your mouse, pointer motion events are sent in this fashion.

Now let’s make things a little more complicated: What happens when the user presses and releases a mouse button? A button press event is delivered followed by a button release. What if the user presses a mouse button while the cursor is over a button widget, then moves the cursor outside the button? The button widget still needs to receive the button release event so it can perform the action, or release itself without performing the action because the cursor moved.

We’ll make things even more complicated: What happens when the user presses and releases a widget button within a scrollable view using a touchscreen? If the touch hasn’t moved much, the widget button should be tapped. If the touch has moved, should the scrollable view scroll, or should the button be tapped, or should the button be released without performing an action? One real-life example we can look to for an answer is buttons in a scrollable pane on a mobile phone. If the user drags with a touch that began over a button, the scroll should be performed instead. However, the button is often drawn as pressed and then released in order to give the user immediate feedback.

In traditional toolkits, events are generally delivered to the inner-most (child) widget first and then propagate up the widget hierarchy until a widget consumes the event. This is called event “bubbling”. In the scroll and button scenario, the touch begin event would be consumed by the button widget because it may perform a tap. However, this will cause the scroll widget container to never receive the touch begin event. If we only send touch events for a touch sequence to the widget receiving the touch begin, then the scroll widget container will never receive any events for touches beginning over the button widget. If we send touch events to the scroll widget container once the button determines the touch is really a drag to scroll, only the end of the event stream will be acted on by the scroll widget. It would be broken in other ways as well, but this is enough for our purposes.

Many toolkits have attempted to resolve this issue by adding an event “capture” phase. Events propagate from the top level widget to the child widget first, and then from the child widget to the top level widget. In some implementations, the capture phase may inhibit the bubbling phase. For the button in scroll-view issue, a capture phase would allow for the scroll view to receive touch events. However, if it inhibits the bubbling phase, the button will never see the touch events. If it does not inhibit the bubbling phase, a touch drag will simultaneously scroll and tap the button.
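To make the dilemma concrete, here is a minimal sketch of a capture phase followed by a bubbling phase. It is not taken from any real toolkit: the `Widget` class, the `dispatch` function, and the `capture_inhibits_bubble` flag are all hypothetical names invented for illustration.

```python
# Hypothetical widget tree: a scroll view containing a button.
class Widget:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def path_to_root(widget):
    """Return [widget, parent, ..., top-level widget]."""
    path = []
    while widget is not None:
        path.append(widget)
        widget = widget.parent
    return path


def dispatch(target, handlers, capture_inhibits_bubble):
    """Deliver one event with a capture phase, then a bubbling phase.

    `handlers` maps (widget name, phase) to a function returning True when
    the widget consumes the event. Returns the names of the widgets that
    acted on the event, in order.
    """
    handled = []
    bubble_path = path_to_root(target)
    # Capture phase: top-level widget down to the target.
    for widget in reversed(bubble_path):
        handler = handlers.get((widget.name, "capture"))
        if handler and handler():
            handled.append(widget.name)
            if capture_inhibits_bubble:
                return handled
    # Bubbling phase: target up to the top level, stopping at first consumer.
    for widget in bubble_path:
        handler = handlers.get((widget.name, "bubble"))
        if handler and handler():
            handled.append(widget.name)
            break
    return handled


scroll_view = Widget("scroll_view")
button = Widget("button", parent=scroll_view)
handlers = {
    ("scroll_view", "capture"): lambda: True,  # scroll view wants the drag
    ("button", "bubble"): lambda: True,        # button wants the tap
}

inhibited = dispatch(button, handlers, capture_inhibits_bubble=True)
both = dispatch(button, handlers, capture_inhibits_bubble=False)
print(inhibited)  # ['scroll_view']
print(both)       # ['scroll_view', 'button']
```

Neither result is what we want: in the first case the button never sees the touch, and in the second case both widgets act on the same touch.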

This isn’t a new issue. Mouse button press, motion, and then release events have the same issues. However, typical UI design guidelines have not resulted in problems for common use cases. Now that we have multitouch gestures this issue is going to be very common.

I believe we need to transition from the traditional toolkit model of sending individual events separately towards a model where event actions are sent. An action is composed of a sequence of events. For mice, an action may be a button press/release cycle with pointer motion events in between. For touchscreens, an action is a gesture, which may be a one touch drag or a three touch tap or a two touch rotate.

Event actions are then sent as a sequence of events to widgets using the bubbling propagation flow. Any widget subscribed for the action receives the action events. The widget must make a decision on whether to accept or reject the action. However, acceptance or rejection may occur at any time, even after the sequence has physically ended. An example use case is a double tap where two touch taps occur in sequence within a short period of time. A widget wanting to handle a double tap may receive one action sequence for a tap and then set a timeout. If another tap action sequence occurs before the timeout expires, a double tap is recognized and both actions are accepted. If the timeout expires first, the original tap action is rejected so parent widgets may handle it instead.
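The double tap case can be sketched in a few lines. This is only an illustration of the idea under assumed names: `Action`, `DoubleTapButton`, `TapPanel`, `deliver`, and `tick` are hypothetical, the 0.3 second timeout is arbitrary, and timestamps are passed in by hand instead of coming from a real event loop.

```python
class Action:
    """A finished sequence of events (for example one tap)."""
    def __init__(self, kind, timestamp):
        self.kind = kind
        self.timestamp = timestamp
        self.owner = None  # name of the widget that ended up handling it


class DoubleTapButton:
    """Child widget that accepts two taps arriving within `timeout` seconds."""
    name = "button"

    def __init__(self, timeout=0.3):
        self.timeout = timeout
        self.pending = None  # first tap, not yet accepted or rejected

    def expire(self, now):
        """Reject the pending tap if its timeout has passed."""
        if self.pending is not None and now - self.pending.timestamp > self.timeout:
            late, self.pending = self.pending, None
            return [late]
        return []

    def offer(self, action):
        """Return (accepted, rejected) actions decided as a result of this offer."""
        rejected = self.expire(action.timestamp)
        if action.kind != "tap":
            return [], rejected + [action]
        if self.pending is not None:
            first, self.pending = self.pending, None
            return [first, action], rejected  # double tap recognized
        self.pending = action  # hold the decision until the timeout
        return [], rejected


class TapPanel:
    """Parent widget that handles whatever the child rejects."""
    name = "panel"

    def __init__(self):
        self.handled = []

    def offer(self, action):
        self.handled.append(action.kind)
        return [action], []


def deliver(child, parent, action):
    """Offer an action to the child first; rejected actions bubble to the parent."""
    accepted, rejected = child.offer(action)
    for item in accepted:
        item.owner = child.name
    for item in rejected:
        parent.offer(item)
        item.owner = parent.name


def tick(child, parent, now):
    """Called when time passes so a late rejection can still bubble up."""
    for item in child.expire(now):
        parent.offer(item)
        item.owner = parent.name


button, panel = DoubleTapButton(), TapPanel()
t1, t2, t3 = Action("tap", 0.0), Action("tap", 0.2), Action("tap", 1.0)
deliver(button, panel, t1)
deliver(button, panel, t2)   # within the timeout: both taps go to the button
deliver(button, panel, t3)
tick(button, panel, now=1.5)  # timeout expired: the lone tap goes to the panel
print(t1.owner, t2.owner, t3.owner)
```

Note that the lone tap reaches the panel only after the timeout has expired, which is exactly the latency for parent widgets discussed next.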

Action handling and bubbling will cause latency for parent widgets. It may be necessary to add the ability to send events to all subscribed widgets and transfer ownership from child to parent when the child rejects an action, or to cancel an action for parent widgets if the child accepts the action.

I believe this event handling and propagation mechanism has the ability to resolve all possible user interface scenarios without adding “capture” or “grab” semantics. Although it is complex and different from how toolkits operate today, it has a better logical mapping between what our brains expect an action to do, and what happens in the toolkit.

Chase Douglas

Bringing uTouch to QML

We have a version of Unity that runs on systems without 3D graphics acceleration. It is built on top of Qt QML. QML is great because it allows one to develop a new UI and the basic logic behind it very quickly using a JSON-like declarative syntax to describe objects and JavaScript to manipulate them. I attended the Qt Contributors Summit a few months ago and saw all kinds of great software using QML. We thought it would be great to develop a QML plugin not only for our own use in Unity but for other application developers as well.

The result is uTouch-qml. QML has an object hierarchy (apparently called a scene graph, but I’ll leave terminology up to people who really know) that describes how visual objects are laid out on screen. The uTouch-qml plugin provides some invisible gesture objects that may receive gesture events. For example, the following code from our eventprinter.qml example listens for two finger pinch gestures:

 

UTouchPinchArea {
    width: 500
    height: 500
    subscription {
        touches.start: 2
    }

    onGestureStart:
        printGestureEvent(gesture, "Start")

    onGestureUpdate:
        printGestureEvent(gesture, "Update")

    onGestureEnd:
        printGestureEvent(gesture, "End")

    onFocusChanged:
        console.log("Pinch Focus: (" + focus.x + ", " + focus.y + ")\n")

    centroid.onInitialChanged:
        console.log("Pinch Initial Centroid: (" + centroid.initial.x + ", " + centroid.initial.y + ")\n")

    centroid.onCurrentChanged:
        console.log("Pinch Centroid: (" + centroid.current.x + ", " + centroid.current.y + ")\n")

    radius.onInitialChanged:
        console.log("Pinch Initial Radius: " + radius.initial + "\n")

    radius.onCurrentChanged:
        console.log("Pinch Radius: " + radius.current + "\n")
}

 

There are callbacks for imperative style programming like onGestureStart, but there are also declarative signals like radius.onCurrentChanged that allow for simple property bindings between QML objects. This should provide a great deal of flexibility in how gestures may be used in applications.

Best of all, uTouch-qml is available in Ubuntu Oneiric and will ship in Ubuntu 11.10 when it is released this Fall! Simply install the libutouch-qml package to get started. The library is also fully documented; after installing libutouch-qml-doc you can browse the documentation at file:///usr/share/doc/utouch-qml/html/index.html or in the Help section of Qt Creator. Also, check out the examples in the source tree for hints on how to get started.

I hope to see many new QML applications with multitouch gesture support in the future!

Chase Douglas

Multitouch in Ubuntu 11.04

One of my key goals for Ubuntu 11.04 has been to introduce full multitouch support through X.org. In technical terms, this means adding touch support to the XInput protocol. You may see others refer to multitouch in X.org as simply XInput 2.1. We hatched our plan back at UDS-N to push hard on developing the XInput 2.1 protocol and implementing it as best as possible in 11.04. The idea was that Ubuntu would be a test bed for the protocol before it is adopted by X.org upstream.

We’re now past feature freeze for Ubuntu 11.04 and nearing the beta release. How well has the plan worked? I believe we’ve been mostly successful. 11.04 includes a pre-release version of XInput 2.1, and we’ve even got support for multitouch through Qt! However, working around issues in the existing X protocol has provided many challenges that became visible only after the initial implementation was developed. In 11.04 we have support for the major pieces of XInput 2.1, but we have since encountered a few corner cases that require a bit more work to get right. I will be writing another post about these challenges to give a better overview of the issues we are facing.

That said, what we have in Ubuntu 11.04 works pretty well. However, we need to set some boundaries so people aren’t caught off guard by changes from the XInput 2.1 implementation in Ubuntu 11.04 and what ends up in X.org’s X Server 1.11 and Ubuntu 11.10 later this year.

First, boundaries for developers. The XInput 2.1 protocol is still under review and has changed quite a bit since the version that has landed in Ubuntu 11.04. Most of the changes are cosmetic, such as renaming the macro XITouchOwnerReject to XITouchReject. However, there may be one or two major changes that affect protocol event handling. We suggest that developers eager to begin developing multitouch applications use Qt, as its touch API is stable and will not change. Developers may use the XInput 2.1 implementation in 11.04, but they need to be aware that the protocol will change before it is officially released by X.org.

Second, boundaries for users. The multitouch implementation in 11.04 covers most of the major use cases for multitouch. However, there are some corner cases that are not handled properly yet. As an example, the behavior in Ubuntu 11.04 is incorrect if you have multiple touches on a touchpad and then use a second mouse to move the cursor to a different window. That said, we believe that all typical multitouch usage will work fine. As a side note, OS X doesn’t handle atypical multitouch input in a reasonable or even predictable way, so we’re at least up to their standard!

I encourage anyone with multitouch hardware to play with the Qt touch examples. I find fingerpainting to be rather fun! You can run the examples by installing qt4-demos and then manually running them from the terminal:

$ sudo apt-get install qt4-demos
$ /usr/lib/qt4/examples/touch/fingerpaint/fingerpaint

There are three other demos: knobs, pinchzoom, and dials.

Thank you to all those who have provided feedback for XInput 2.1! I am looking forward to new touch support in Ubuntu applications!

Chase Douglas

News on the Magic Trackpad Driver

After I added Magic Trackpad functionality to the current hid_magicmouse driver, I began to notice three, four, and five finger gestures would sometimes stall out. I used all our test tools to check events through our gesture stack, and I realized that events were getting stalled from the kernel itself. I then poked inside the hid_magicmouse driver and saw that it wasn’t receiving events at all during the stalls!

What to do now? I peeked at the hid layer of the kernel and found a nice debugging interface in /sys/kernel/debug/hid/<device>/events. If you cat the file as you use the device, you’ll get a bunch of data. While this worked wonderfully for my N-Trig touchscreen, nothing was printed out for the trackpad. After some debugging, I found a bug in hid-debug.c where debug events could be added to the debug buffer in the kernel, but processes waiting to read the events were not woken up. After fixing that issue I started to get a stream of event data from the kernel.

Aha, I thought as I looked at the stream. Events were coming through even when the device seemed stalled. It took me a bit to decipher what was going on, but I figured it out after some more code browsing. The trackpad likes to send three, four, and five finger events coalesced into a packet with two events. To give a general idea of the protocol:

0xF7 0xXX <Trackpad packet 1> <Trackpad packet 2>

The first byte (0xF7) is a new HID report type that we weren’t listening for in the driver. The second byte is the length of the first trackpad packet. We have the full size of this double touch packet, so the length of the second packet can be calculated without any more information.

The change to the driver is rather small. The driver now listens for this 0xF7 report type, and we just split the double touch packet into the two touch packets and reprocess them using the code that’s already there. Voilà! A driver that works well with up to 10 fingers!
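The splitting step is simple enough to sketch outside the kernel. The real change lives in the C driver; the Python below only illustrates the layout described above. The function name `split_double_report` is made up, and the payload bytes in the example are placeholders, not real trackpad data.

```python
DOUBLE_REPORT_ID = 0xF7  # report type described in this post


def split_double_report(data: bytes):
    """Split a 0xF7 report into the two trackpad packets it carries.

    Layout assumed from the description above:
    byte 0 = 0xF7, byte 1 = length of the first packet,
    then the first packet, then the second packet to the end of the report.
    """
    if len(data) < 2 or data[0] != DOUBLE_REPORT_ID:
        raise ValueError("not a 0xF7 double report")
    first_len = data[1]
    if 2 + first_len > len(data):
        raise ValueError("first packet length exceeds report size")
    first = data[2:2 + first_len]
    # The second length is implied by the total size of the report.
    second = data[2 + first_len:]
    return first, second


# Placeholder payload bytes, for illustration only.
report = bytes([0xF7, 0x03, 0xA1, 0xA2, 0xA3, 0xB1, 0xB2])
first, second = split_double_report(report)
print(first.hex(), second.hex())  # a1a2a3 b1b2
```

Each of the two returned packets would then be fed through the normal single-packet handling.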

Now I just need to figure out why powering off the trackpad without disconnecting it first causes the kernel to panic…

Chase Douglas

Decoding Apple’s Magic Trackpad

Notes on the meaning of protocol bits

We’ve received a lot of questions on whether Apple multitouch products will be supported in Ubuntu 10.10. I’m pleased to say that all of the currently available products will have some support:

  • Magic Mouse
  • Magic Trackpad
  • iPhone
  • iPod
  • iPad
  • MacBook, MacBook Air, and MacBook Pro with multitouch touchpads

I’ve written an app for iOS, Remotux, that sends mouse and keyboard input to Ubuntu, and I will be bringing full multitouch support to it. But this post is about the trackpad.

The Magic Trackpad has been out a few weeks now, and it just so happens that I received one last week. I started hacking on it yesterday by analyzing the protocol using the PacketLogger utility that ships with Apple’s OS X SDK. I was able to decode the protocol by comparing reams of data generated by the device and looking for patterns.

I found that the protocol is similar to the Magic Mouse protocol, but a little different. The Magic Mouse protocol is six bytes of data followed by N x 8 bytes of touch data for N touches active on the surface of the mouse. The Magic Trackpad protocol is four bytes of data followed by N x 9 bytes of touch data. The difference is that the mouse has two relative axes due to its mouse nature, while the trackpad has a higher range of values across its surface in both the X and Y axes.

I’m currently testing patches to the hid-magicmouse driver. For those interested, the patches can be found in git://kernel.ubuntu.com/cndougla/ubuntu-maverick.git in the magictrackpad branch. With these patches, I’ve been able to use the trackpad with both synaptics and Ubuntu’s hacked up gevdev X input modules.
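As a small worked example of the two packet layouts described in this post, the number of active touches can be recovered from the packet length alone. The constant and function names below are my own, and the sketch says nothing about what the individual bytes mean.

```python
# Header size and per-touch size in bytes, as described in this post.
MOUSE_HEADER, MOUSE_TOUCH = 6, 8
TRACKPAD_HEADER, TRACKPAD_TOUCH = 4, 9


def touch_count(packet_len, header, per_touch):
    """Return N for a packet of `header + N * per_touch` bytes."""
    payload = packet_len - header
    if payload < 0 or payload % per_touch != 0:
        raise ValueError("length %d does not fit the layout" % packet_len)
    return payload // per_touch


# A 22 byte Magic Mouse packet carries (22 - 6) / 8 = 2 touches.
print(touch_count(22, MOUSE_HEADER, MOUSE_TOUCH))        # 2
# A 31 byte Magic Trackpad packet carries (31 - 4) / 9 = 3 touches.
print(touch_count(31, TRACKPAD_HEADER, TRACKPAD_TOUCH))  # 3
```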

Chase Douglas

Thoughts on the Architecture of Multitouch in Ubuntu

Introduction

We have been working to introduce a full multitouch and gesture solution for Ubuntu, and we’ve targeted Ubuntu 10.10 (Maverick) Netbook Edition as an initial test and integration milestone for our efforts. As we are close to showing off our work, I would like to give an overview of the technical approaches we have taken for 10.10 and highlight our future architectural directions.

When we started a few months ago, multitouch support was really only integrated and enabled in the Linux kernel. There are a few families of devices with multitouch support, and we are targeting N-Trig touchscreens specifically. We also plan to support the Apple Magic Mouse and touchpads utilizing the BCM5974 chip, which can be found in Apple and other mainstream laptops. There are a few other multitouch devices that have drivers in Ubuntu 10.10, but we have not been able to fully test them.

Multitouch Slots

The first bit of good news was the addition of the multitouch slots protocol between the kernel and userspace. Henrik Rydberg, a member of the Canonical Multitouch team, wrote the slots protocol that is now available in Linux 2.6.36. Further, he has created a library named mtdev to convert legacy kernel drivers to the slots protocol in userspace. What’s so great about this slots protocol? The slots protocol provides touch tracking, which ensures that applications don’t become confused when two or more fingers are on the device at once. With mtdev, the touch events representing your index finger and your thumb will be kept separated so an application can easily keep track of them.
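To give a feel for what tracked touches look like, here is a rough sketch that folds a slot-style event stream into per-touch state. The event names (`ABS_MT_SLOT`, `ABS_MT_TRACKING_ID`, `ABS_MT_POSITION_X`, `ABS_MT_POSITION_Y`, `SYN_REPORT`) are the kernel's, and a tracking ID of -1 marks a lifted touch. Everything else is an assumption for illustration: the `track_touches` function is made up, events are plain (name, value) tuples instead of real input events, and this is not how mtdev itself is implemented.

```python
def track_touches(events):
    """Fold (code, value) pairs into a list of per-frame snapshots.

    Each snapshot maps slot number -> {"id", "x", "y"} and is taken
    whenever a SYN_REPORT closes a frame.
    """
    slots = {}
    current = 0  # events apply to the most recently selected slot
    snapshots = []
    for code, value in events:
        if code == "ABS_MT_SLOT":
            current = value
        elif code == "ABS_MT_TRACKING_ID":
            if value == -1:
                slots.pop(current, None)  # touch lifted
            else:
                slots[current] = {"id": value, "x": None, "y": None}
        elif code == "ABS_MT_POSITION_X" and current in slots:
            slots[current]["x"] = value
        elif code == "ABS_MT_POSITION_Y" and current in slots:
            slots[current]["y"] = value
        elif code == "SYN_REPORT":
            snapshots.append({s: dict(t) for s, t in slots.items()})
    return snapshots


# Hand-written example stream: two fingers land, one moves, one lifts.
stream = [
    ("ABS_MT_SLOT", 0), ("ABS_MT_TRACKING_ID", 45),
    ("ABS_MT_POSITION_X", 100), ("ABS_MT_POSITION_Y", 200),
    ("ABS_MT_SLOT", 1), ("ABS_MT_TRACKING_ID", 46),
    ("ABS_MT_POSITION_X", 300), ("ABS_MT_POSITION_Y", 400),
    ("SYN_REPORT", 0),
    ("ABS_MT_SLOT", 0), ("ABS_MT_POSITION_X", 110),
    ("SYN_REPORT", 0),
    ("ABS_MT_SLOT", 1), ("ABS_MT_TRACKING_ID", -1),
    ("SYN_REPORT", 0),
]
frames = track_touches(stream)
for frame in frames:
    print(frame)
```

Because each update names the slot it belongs to, the thumb and index finger never get mixed up, even when only one of them moves in a given frame.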

Multitouch Application Support

Now that we have tracked touches, we need some way to send them to applications. Applications built for Linux generally run through the X Window System. This system provides for windows and mice and keyboard input, among many other features. The most logical way to send multitouch events to applications is by sending them through the X server. This way, a touch on top of one application’s window will be sent only to that application. Other applications with windows underneath the top window should not be aware of the touch. The X server already handles this input propagation for mice, so it just needs to be extended for multitouch devices.

Unfortunately, there are a lot of hairy issues with X input that aren’t readily apparent to users. For example, if I hold the mouse button down over one window and drag outside of the window, the window still receives all the mouse events even though the cursor may physically be located above a different window. This functionality is implemented through a concept called “grabbing”. Without getting into the details here, supporting multitouch grabbing means we have to develop an extension to the current X server input architecture. Work is underway on this, as Peter Hutterer has proposed an XInput protocol extension for multitouch. Although we have a proposed protocol, the implementation will not be ready in time for Ubuntu 10.10. We are looking forward to our continued work with Peter and other X developers to help develop the implementation and provide it in a future Ubuntu release.

Gesture Support

So what multitouch features will be ready for Ubuntu 10.10? Without the ability to send multitouch events directly to applications, we can still send some data. We can listen to the multitouch events that are sent from the kernel to the X server and make decisions about whether they are useful and what they mean in a given context. We can group touches into a set of predefined meanings: gestures. For example, two touches that are moving towards each other can represent a pinch-to-zoom gesture. To facilitate the recognition of gestures, we have created a library named grail (Gesture Recognition and Instantiation Library). Grail takes tracked touches from mtdev or the kernel and attempts to recognize gestures from a predefined list:

* Swipe (moving fingers in a uniform direction)
* Pinch (moving fingers closer or farther apart)
* Rotate (moving fingers around each other)
* Tapping

Further, grail can recognize each of the above gestures with one to five fingers. Lastly, grail uses a callback function to ask whether any client wants a recognized gesture. If a client requests the gesture, the gesture event is passed to the client. If no client requests the gesture, the touches are translated into single-touch pointer motion to support general mouse input control.
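For a rough idea of how tracked touches can be sorted into the gesture types listed above, here is a toy two-finger classifier. To be clear, this is not grail's algorithm: the function name `classify_two_touch`, the thresholds, and the order of the checks are all assumptions picked to keep the example short.

```python
import math


def classify_two_touch(start, end, move_threshold=20.0, angle_threshold=0.3):
    """Classify a two-finger movement from its start and end positions.

    `start` and `end` are each a pair of (x, y) points, one per finger.
    Thresholds are arbitrary: device units for distances, radians for angles.
    """
    (a0, b0), (a1, b1) = start, end
    # Change in distance between the fingers suggests a pinch.
    spread0 = math.hypot(b0[0] - a0[0], b0[1] - a0[1])
    spread1 = math.hypot(b1[0] - a1[0], b1[1] - a1[1])
    # Change in the angle of the line between the fingers suggests a rotate.
    angle0 = math.atan2(b0[1] - a0[1], b0[0] - a0[0])
    angle1 = math.atan2(b1[1] - a1[1], b1[0] - a1[0])
    turn = math.atan2(math.sin(angle1 - angle0), math.cos(angle1 - angle0))
    # Movement of the midpoint suggests a swipe.
    mid0 = ((a0[0] + b0[0]) / 2, (a0[1] + b0[1]) / 2)
    mid1 = ((a1[0] + b1[0]) / 2, (a1[1] + b1[1]) / 2)
    shift = math.hypot(mid1[0] - mid0[0], mid1[1] - mid0[1])

    if abs(spread1 - spread0) > move_threshold:
        return "pinch"
    if abs(turn) > angle_threshold:
        return "rotate"
    if shift > move_threshold:
        return "swipe"
    return "tap"


two_fingers = ((0, 0), (100, 0))
print(classify_two_touch(two_fingers, ((30, 0), (70, 0))))      # pinch
print(classify_two_touch(two_fingers, ((0, 50), (100, 50))))    # swipe
print(classify_two_touch(two_fingers, ((50, -50), (50, 50))))   # rotate
print(classify_two_touch(two_fingers, two_fingers))             # tap
```

A real recognizer has to work incrementally on live touch updates and cope with noisy input, which this sketch ignores entirely.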

X Hacking

Here’s where things become interesting! Applications in Linux work through the X server as noted above, so it would be nice if we could pass gestures through X to the applications. In Ubuntu 10.10 we have taken the xserver-xorg-input-evdev module, an X input module/driver that translates kernel keyboard and mouse input into events passed to applications, and added a bit of code to pass kernel multitouch events through grail. When grail recognizes gestures, it uses a callback into the module to determine if any clients, X applications in this case, are listening for the gestures. If so, then the gesture event is passed to the correct client through X.

How do we pass gestures through X? We’ve written a new X Gesture extension. X clients can listen for a set of gestures on any X window. If a gesture occurs in the window, the gesture event is passed to the client. Note that gesture event propagation occurs similarly to input event propagation. A gesture occurs in a child window, and all windows up the X window hierarchy from the child window to the root window are tested for any clients listening for the gesture type. The first window with a client listening for the gesture type receives a gesture event, and propagation stops. The difference between normal mouse input and gesture propagation is in determining the child window. In normal input event propagation, the child window is the top-most window under the cursor when the event occurred. In the gesture case, the child window is the top-most window that contains all the touches that make up the gesture.
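The propagation rule can be sketched as follows. This is a simplified model rather than the X Gesture extension's code: the `Window` class, `gesture_child`, and `deliver_gesture` are hypothetical names, windows are plain rectangles in root coordinates, and clipping by overlapping sibling windows is ignored.

```python
class Window:
    def __init__(self, name, rect, parent=None):
        self.name = name
        self.rect = rect        # (x, y, width, height) in root coordinates
        self.parent = parent
        self.children = []      # bottom-to-top stacking order
        self.listeners = {}     # gesture type -> client name
        if parent is not None:
            parent.children.append(self)

    def contains(self, point):
        x, y, w, h = self.rect
        return x <= point[0] < x + w and y <= point[1] < y + h


def gesture_child(window, touches):
    """Deepest, top-most window that contains every touch, or None."""
    if not all(window.contains(t) for t in touches):
        return None
    for child in reversed(window.children):  # check top-most children first
        hit = gesture_child(child, touches)
        if hit is not None:
            return hit
    return window


def deliver_gesture(root, touches, gesture_type):
    """Walk from the gesture's child window up to the root.

    Returns (window name, client name) for the first listener found,
    or None if nobody is listening for this gesture type.
    """
    window = gesture_child(root, touches)
    while window is not None:
        client = window.listeners.get(gesture_type)
        if client is not None:
            return window.name, client
        window = window.parent
    return None


root = Window("root", (0, 0, 1000, 1000))
app = Window("app", (100, 100, 500, 500), parent=root)
canvas = Window("canvas", (150, 150, 200, 200), parent=app)
app.listeners["pinch"] = "viewer"

# Both touches inside the canvas: starts at canvas, stops at app.
print(deliver_gesture(root, [(160, 160), (300, 300)], "pinch"))
# One touch outside the canvas: the child window is app itself.
print(deliver_gesture(root, [(160, 160), (500, 500)], "pinch"))
# One touch outside the app: only root contains both, nobody listens.
print(deliver_gesture(root, [(160, 160), (800, 800)], "pinch"))
```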

(Note that the above leaves out some details for clarity. If you would like more details on X event propagation, see the X11R7.5 documentation.)

So we have multitouch gestures through X, we’re done now, right? Wrong. Our approach of embedding all of this can be termed a “hack”, a suboptimal solution. We fully recognize this and include it in Ubuntu 10.10 only as a stop-gap measure. We will be reaching out to the X developer community to open a dialog over the potential inclusion of a more optimal solution in X, if that is desirable for everyone. We have taken great pains to create a new X extension without publishing an official API for it in an effort to make clear that the solution is still in its infancy and interfaces may change. However, we invite developers to poke around and play if they are so inclined. Stephen Webb, another member of the Canonical Multitouch team, has created a higher-level C library called geis that will be included in Ubuntu 10.10. We hope that the library is flexible and extensible such that we will not have to break backwards compatibility once all the X implementation details are worked out, but we can’t make any guarantees at this time.

To give an idea of where we think the X gesture implementation is headed, I believe the first step is implementing true multitouch support through X as discussed above. Then, we can look into extending our small X Gesture extension to abstract out the gesture recognizer such that grail or any other gesture recognizer can be plugged in. We could then siphon multitouch events from DIX, a component of the X server that processes event propagation, instead of siphoning off events in an input module.

Again, I invite you to check out all our work at http://launchpad.net/canonical-multitouch/, feel free to poke us in #ubuntu-touch, and we look forward to your comments and suggestions!

Chase Douglas

The Canonical Multitouch And Gestures Project

A little over two months ago, a “skunkworks” type project inside Canonical was started. The goal is to enable gestures and multitouch support throughout Ubuntu and the software that comprises it. The landscape of multitouch-capable devices has exploded recently. Today, we have many different types of devices, including touchpads, touchscreens, and Apple’s Magic Mouse (sort of a category unto itself). Tomorrow will yield devices like multitouch tables such as the Microsoft Surface.

The best multitouch solutions available today consist of two different system-level paradigms. First, we have mobile systems, like Google’s Android and Apple’s iOS. These systems are special in that they only present one application to the user at a time. Though it may not be obvious, this greatly simplifies the task of providing an immersive multitouch solution. Essentially, it is up to each individual application to be multitouch-aware, and they know that no other application can wrest control away from them. Beyond mobile systems, some multitouch support exists in traditional multi-application systems, such as OS X and Windows. While they enable applications to process multitouch input, they don’t offer very much at the system level.

We want to take the best ideas from these current systems and bring them to Ubuntu, while also enabling deep support throughout the desktop environment. For example, we want a user to be able to rotate photos in an image editor and zoom in and out of PDFs in a reader, but we also want to enable greater window management through gestures. Think simple window placement and sizing in the desktop. What if you could reach out and grab the window, move it around the screen, and resize it just with the tips of your fingers in an intuitive fashion? What if you performed a simple gesture to show you a view of all your open windows at once? Many great possibilities open up both to users and application developers with the proper support from the system.

Here’s where the new Canonical Multitouch team gets involved. We want to make all the above possible, and we want to work with the community to ensure that application developers can create fantastic products. Over the past two months, we have been researching and developing solutions all through the stack from the kernel to the X server to application libraries. Now that we’ve moved out of chaos-mode, we will be sharing all our work with developers and users alike. Ubuntu 10.10 (Maverick) Netbook Edition will be our proving grounds as we create multitouch solutions that push the boundaries of what is currently possible in a Linux system. The Unity window manager will have gesture support to enable greater window management, and we hope to be able to offer some multitouch support to legacy applications as well. For those interested, you can find all our work at http://launchpad.net/canonical-multitouch and you can find our team at #ubuntu-touch on the Freenode IRC service.


