Canonical Voices

Posts tagged with 'debugging'

Colin Ian King

health-check revisited

Earlier this month I wrote about health-check, a tool to sanity check application resource usage.  Since then I've re-worked and simplified the system call tracing and added some new features.

Originally, health-check was designed to attach itself to running processes. To make the tool even easier to use it can now start applications and follow any subsequent new processes spawned off from fork() or clone(). For example:

sudo health-check -u joeuser -f thunderbird
..will start thunderbird (running as user "joeuser") and follow any new processes it creates.

Sometimes applications will force flushing of data or metadata to disk by excessive use of fsync(), fdatasync() and sync().  Health-check will now keep track of these system calls and provide feedback on their use.

Health-check already has some memory checking techniques - however, it has now been extended to check explicitly for heap changes by examining the brk() system call and also by keeping track of mmap() and munmap() mappings.  This allows better tracking of potential memory leaks.

Source code can be found at: git://kernel.ubuntu.com/cking/health-check

Packages can be found in my White PPA (ppa:colin-king/white), so to install on Ubuntu systems use:
 sudo add-apt-repository ppa:colin-king/white  
sudo apt-get update
sudo apt-get install health-check

Colin Ian King

Kernel tracing using lttng

LTTng (Linux Trace Toolkit - next generation) is a highly efficient system tracer that allows tracing of the kernel and userspace. It also provides tools to view and analyse the gathered trace data.  So let's see how to install and use LTTng kernel tracing in Ubuntu. First, one has to install the LTTng userspace tools:

 sudo apt-get update  
sudo apt-get install lttng-tools babeltrace
LTTng has recently been added to the Ubuntu 13.10 Saucy kernel; however, on earlier releases one needs to install the LTTng kernel driver using lttng-modules-dkms as follows:

 sudo apt-get install lttng-modules-dkms  
It is a good idea to sanity check that the tools and driver are installed correctly, so first list the available kernel events on your machine:

 sudo lttng list -k  
And you should get a list similar to the following:
 Kernel events:  
-------------
mm_vmscan_kswapd_sleep (loglevel: TRACE_EMERG (0)) (type: tracepoint)
mm_vmscan_kswapd_wake (loglevel: TRACE_EMERG (0)) (type: tracepoint)
mm_vmscan_wakeup_kswapd (loglevel: TRACE_EMERG (0)) (type: tracepoint)
mm_vmscan_direct_reclaim_begin (loglevel: TRACE_EMERG (0)) (type: tracepoint)
mm_vmscan_memcg_reclaim_begin (loglevel: TRACE_EMERG (0)) (type: tracepoint)
..
Next, we need to create a tracing session:
 sudo lttng create examplesession  
..and enable events to be traced using:
 sudo lttng enable-event sched_process_exec -k  
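One can also specify multiple events as a comma separated list; a purely illustrative example (pick event names that actually appear in the output of lttng list -k on your kernel) would be:
 sudo lttng enable-event sched_switch,sched_process_fork -k  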
Next, start the tracing using:
 sudo lttng start  
and to stop and complete the tracing use:
 sudo lttng stop  
sudo lttng destroy
and the trace data will be saved in the directory ~/lttng-traces/examplesession-[date]-[time]/.  One can examine the trace data using the babeltrace tool, for example:
 sudo babeltrace ~/lttng-traces/examplesession-20130517-125533  
And you should get a list similar to the following:
 [12:56:04.490960303] (+?.?????????) x220i sched_process_exec: { cpu_id = 2 }, { filename = "/usr/bin/firefox", tid = 4892, old_tid = 4892 }  
[12:56:04.493116594] (+0.002156291) x220i sched_process_exec: { cpu_id = 0 }, { filename = "/usr/bin/which", tid = 4895, old_tid = 4895 }
[12:56:04.496291224] (+0.003174630) x220i sched_process_exec: { cpu_id = 2 }, { filename = "/usr/lib/firefox/firefox", tid = 4892, old_tid = 4892 }
[12:56:05.472770438] (+0.976479214) x220i sched_process_exec: { cpu_id = 2 }, { filename = "/usr/lib/libunity-webapps/unity-webapps-service", tid = 4910, old_tid = 4910 }
[12:56:05.478117340] (+0.005346902) x220i sched_process_exec: { cpu_id = 2 }, { filename = "/usr/bin/ubuntu-webapps-update-index", tid = 4912, old_tid = 4912 }
[12:56:10.834043409] (+5.355926069) x220i sched_process_exec: { cpu_id = 3 }, { filename = "/usr/bin/top", tid = 4937, old_tid = 4937 }
[12:56:13.668306764] (+2.834263355) x220i sched_process_exec: { cpu_id = 3 }, { filename = "/bin/ps", tid = 4938, old_tid = 4938 }
[12:56:16.047191671] (+2.378884907) x220i sched_process_exec: { cpu_id = 3 }, { filename = "/usr/bin/sudo", tid = 4939, old_tid = 4939 }
[12:56:16.059363974] (+0.012172303) x220i sched_process_exec: { cpu_id = 3 }, { filename = "/usr/bin/lttng", tid = 4940, old_tid = 4940 }
The LTTng wiki contains many useful worked examples and is well worth exploring.

As it stands, LTTng is relatively lightweight.   Research by Romik Guha Anjoy and Soumya Kanti Chakraborty shows that the CPU overhead of LTTng is ~1.6% on an Intel® Core™ 2 Quad with four 64 bit Q9550 cores.  Measurements I've made with oprofile on a Nexus 4 with a 1.5 GHz quad-core Snapdragon S4 Pro processor show a CPU overhead of < 1% for kernel tracing.  In flight recorder mode, one can generate a lot of trace data. For example, with all tracing enabled and running multiple stress tests I was able to generate ~850K of trace data per second, so this will obviously impact disk I/O.

Colin Ian King

Valgrind stack traces

Sometimes when debugging an application it is useful to generate a stack dump when a specific code path is being executed.  The valgrind tool provides a very useful and easy to use mechanism to do this:

1. Add in the following to the source file:

 #include <valgrind/valgrind.h>  
2. Generate the stack trace at the point you desire (and print a specific message) using VALGRIND_PRINTF_BACKTRACE(), for example:
 VALGRIND_PRINTF_BACKTRACE("Stack trace @ %s(), %d", __func__, __LINE__);  
3. Run the program with valgrind.  You may wish to use the --tool=none option to make valgrind run a little faster:
  valgrind --tool=none ./generate/unix/bin64/acpiexec *.dat  
4. Observe the stack trace. For example, I added this to the ACPICA acpiexec in AcpiDsInitOneObject() and got stack traces such as:
 ACPI: SSDT 0x563a480 00249 (v01 LENOVO TP-SSDT2 00000200 INTL 20061109)  
**7129** Stack trace @ AcpiDsInitOneObject(), 174 at 0x416041: VALGRIND_PRINTF_BACKTRACE (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)
==7129== by 0x4160A6: AcpiDsInitOneObject (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)
==7129== by 0x441F76: AcpiNsWalkNamespace (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)
==7129== by 0x416312: AcpiDsInitializeObjects (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)
==7129== by 0x43D84D: AcpiNsLoadTable (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)
==7129== by 0x450448: AcpiTbLoadNamespace (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)
==7129== by 0x4502F6: AcpiLoadTables (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)
==7129== by 0x405D1A: AeInstallTables (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)
==7129== by 0x4052E8: main (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)

There is a collection of very useful tricks to be found in the Valgrind online manual, which I recommend perusing at your leisure.

Colin Ian King

Striving for better code quality.

Software is complex and is never bug free, but fortunately there are many different tools and techniques available to help to identify and catch a large class of common and obscure bugs.

Compilers provide build options that can help drive up code quality by being particularly strict in detecting questionable code constructs, for example gcc's -Wall and -pedantic flags.  The gcc -Werror flag is useful during code development to ensure compilation halts with an error on any warning, ensuring the developer stops and fixes the code.
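As a hypothetical example (the file names are made up), a strict build using these flags might look like:
 gcc -Wall -pedantic -Werror -o example example.c  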

Static analysis during compilation is also a very useful technique; tools such as smatch and Coccinelle can identify bugs such as dereferencing of NULL pointers, missing checks on return values and ranges, incorrect use of && and ||, bad use of unsigned or signed values and many more besides.  These tools were originally aimed at the Linux kernel source code, but can be used on C application source too.  Let's take a moment to see how to use smatch when building an application.

Download the dependencies:

 sudo apt-get install libxml2-dev llvm-dev libsqlite3-dev

Download and build smatch:
 mkdir ~/src  
cd ~/src
git clone git://repo.or.cz/smatch
cd smatch
make

Now build your application using smatch:
 cd ~/your_source_code  
make clean
make CHECK="~/src/smatch/smatch --full-path" \
CC=~/src/smatch/cgcc | tee warnings.log

..and inspect the warnings and errors in the file warnings.log.  Smatch will produce false-positives, so not every warning or error is necessarily buggy code.

Of course, run time profiling of programs also can catch errors.  Valgrind is an excellent run time profiler that I regularly use when developing applications to catch bugs such as memory leaks and incorrect memory read/writes. I recommend starting off using the following valgrind options:
 --leak-check=full --show-possibly-lost=yes --show-reachable=yes --malloc-fill=ff  

For example:
 valgrind --leak-check=full --show-possibly-lost=yes --show-reachable=yes \
--malloc-fill=ff your-program

Since the application is being run on a synthetic software CPU, execution can be slow; however, it is amazingly thorough and produces detailed output that is extremely helpful in cornering buggy code.

The gcc compiler also provides mechanism to instrument code for run-time analysis.  The -fmudflap family of options instruments risky pointer and array dereferencing operations, some standard library string and heap functions as well as some other range + validity tests.   For threaded applications use -fmudflapth instead of -fmudflap.   The application also needs to be linked with libmudflap.

Here is a simple example:
 int main(int argc, char **argv)  
 {
     static int x[100];
     return x[100];
 }

Compile with:
 gcc example.c -o example -fmudflap -lmudflap  

..and mudflap detects the error:
 ./example   
*******
mudflap violation 1 (check/read): time=1347817180.586313 ptr=0x701080 size=404
pc=0x7f98d3d17f01 location=`example.c:5:2 (main)'
/usr/lib/x86_64-linux-gnu/libmudflap.so.0(__mf_check+0x41) [0x7f98d3d17f01]
./example(main+0x7a) [0x4009c6]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f98d397276d]
Nearby object 1: checked region begins 0B into and ends 4B after
mudflap object 0x190a370: name=`example.c:3:13 x'
bounds=[0x701080,0x70120f] size=400 area=static check=3r/0w liveness=3
alloc time=1347817180.586261 pc=0x7f98d3d175f1
number of nearby objects: 1

These are just a few examples; there are many other options too. Electric Fence is a useful malloc debugger, and gcc's -fstack-protector option produces extra code to check for buffer overflows, for example stack smashing. Tools like bfbtester allow us to brute force check command line handling for overflows - this is useful as I don't know many developers who try to thoroughly validate all the options in their command line utilities.
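A rough sketch of how two of these might be used (the file names here are hypothetical):
 gcc -g -fstack-protector -o example example.c  
 gcc -g -o example example.c -lefence
..the first build adds gcc's stack smashing protection, the second links against Electric Fence's debugging malloc.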

No doubt there are many more tools and techniques available.  If we use these wisely and regularly we can reduce bugs and drive up code quality.

Colin Ian King

Debugging using Visualisation.

There are many different classes of bugs and hence many different debugging techniques can be used.   Sometimes there is a lot of complexity in a problem and it is hard to make a mental model of what exactly is going on between multiple interdependent components, especially when non-deterministic behaviour is occurring.

Unfortunately the human mind is limited; there is only so much debugging state that it can follow.  The problem is compounded when a bug manifests itself after several hours or days of continuous execution time - there just seems to be too much state to track and make sense of.

Looking at thousands of lines of debug trace and trying to spot any lurking evidence that may offer some hint to why code is failing is not trivial. However the brain is highly developed at making sense of visual input, so it makes sense to visualise copious amounts of debug data to spot anomalies.

The kernel offers many ways to gather data, be it via different trace mechanisms or just by using plain old printk().   The steps to debug are thus:

1.  Try and find a way to reliably reproduce the problem to be debugged.
2.  Where possible, try to remove complexity from the problem and still end up with a reliable way of reproducing the problem.
3.  Rig up a way of collecting internal state or data on system activity.   Obviously the type of data to be collected is dependent on the type of problem being worked on.   Be careful that any instrumentation does not affect the behaviour of the system.
4.  Start collecting data and try to reproduce the bug.  You may have to do this multiple times to collect enough data to allow one to easily spot trends over different runs.
5.  Visualise the data.

Iterating on steps 2 to 5 allow one to keep on refining a problem down to the bare minimum required to corner a bug.

Now step 5 is the fun part.  Sometimes one has to lightly parse the output to collect specific items of data. I generally use tools such as awk to extract specific fields or to re-munge data into a format that can be easily processed by graphing tools.  It can be useful to also collect time stamps with the data as some bugs are timing dependent and seeing interactions between components is key to understanding why issues occur.  If one gathers multiple sets of data from different sources then being able to correlate the data on a timestamp can be helpful.
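As a purely illustrative example (the field positions and file names are made up), a trace with the timestamp in the first field and the value of interest in the fifth can be reduced to a plottable two-column form with:
 awk '{ print $1 "," $5 }' debug.log > data.csv  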

If I have just a few tens or hundreds of items of data to visualise I generally just collate my data into tab or comma separated output, import it into LibreOffice Calc and then generate graphs from the raw data.   However, for more demanding graphing I normally resort to using gnuplot.   Gnuplot is an incredibly powerful plotting tool - however, for more complex graphs one often needs to delve deep into the manual and perhaps crib from the very useful worked examples.
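Sticking with the hypothetical data.csv above, a minimal gnuplot one-liner for a quick look at such data might be:
 gnuplot -persist -e 'set datafile separator ","; plot "data.csv" using 1:2 with lines'  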

Graphing data allows one to easily spot trends or correlate between seemingly unrelated parts of a system. What was originally an overwhelmingly huge mass of debug data turns into a powerful resource.   Sometimes I find it useful to run multiple tests over a range of tweaked kernel tuneables to see if bugs change behaviour - often this aids understanding when there is significant amounts of inter-component complexity occurring.

Perhaps it is just the way I like to think about issues, but I do recommend experimenting with collecting large data sets and creatively transforming the data into visual representations to allow one to easily spot issues.  It can be surprising just how much one can glean from thousands of seemingly unrelated traces of debug data.

Colin Ian King

A new Ubuntu portal http://odm.ubuntu.com is a jump-start page containing links to pages and documents useful for Original Design Manufacturers (ODMs), Original Equipment Manufacturers (OEMs) and Independent BIOS vendors.

Some of the highlights include:

  • A BIOS/UEFI requirements document containing recommendations to ensure firmware is compatible with the Linux kernel.
  • Getting started links describing how to download, install, configure and debug Ubuntu.
  • Links to certified hardware, debugging tools, SystemTap guides, packaging guides, kernel building notes.
  • Debugging tips, covering: hotkeys, suspend/resume, sound, X and wireless and an A5 sized Ubuntu Debugging booklet.
  • Link to fwts-live, the Firmware Test Suite live image.
 ..so lots of useful technical resources to call upon.

Kudos to Chris Van Hoof for organizing this useful portal.

Jussi Pakkanen

I played around with btrfs snapshots and discovered two new interesting uses for them. The first one deals with unreliable operations. Suppose you want to update a largish SVN checkout but your net connection is slightly flaky. The reason can be anything, bad wires, overloaded server, electrical outages, and so on.

If SVN is interrupted mid-transfer, it will most likely leave your checkout in a non-consistent state that can’t be fixed even with ‘svn cleanup’. The common wisdom on the Internet is that the way to fix this is to delete or rename the erroneous directory and do a ‘svn update’, which will either work or not. With btrfs snapshots you can just do a snapshot of your source tree before the update. If it fails, just nuke the broken directory and restore your snapshot. Then try again. If it works, just get rid of the snapshot dir.
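A minimal sketch of that workflow, assuming the checkout lives in its own btrfs subvolume (the names and paths here are illustrative):
 sudo btrfs subvolume snapshot checkout checkout.snap   # cheap, instant backup
 svn update checkout
 # if the update was interrupted and the tree is broken:
 sudo btrfs subvolume delete checkout
 sudo mv checkout.snap checkout                         # roll back and try again
 # if the update succeeded:
 sudo btrfs subvolume delete checkout.snap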

What you essentially gain are atomic operations on non-atomic tasks (such as svn update). This has been possible before with ‘cp -r’ or similar hacks, but they are slow. Btrfs snapshots can be done in the blink of an eye and they don’t take extra disk space.

The other use case is erroneous state preservation. Suppose you hack on your stuff and encounter a crashing bug in your tools (such as bzr or git). You file a bug on it and then get back to doing your own thing. A day or two later you get a reply on your bug report saying “what is the output of command X”. Since you don’t have the given directory tree state around any more, you can’t run the command.

But if you snapshot your broken tree and store it somewhere safe, you can run any analysis scripts on it any time in the future. Even possibly destructive ones, because you can always run the analysis scripts in a fresh snapshot. Earlier these things were not feasible because making copies took time and possibly lots of space. With snapshots they don’t.

Colin Ian King

Monitoring /proc/timer_stats

The /proc/timer_stats interface allows one to check on timer usage in a Linux system and hence detect any misuse of timers that can cause excessive wake up events (and also waste power).  /proc/timer_stats reports the process id (pid) of a task that initialised the timer, the name of the task, the name of the function that initialised the timer and the name of the timer callback function.  To enable timer sampling, write "1\n" to /proc/timer_stats and to disable write "0\n".
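For example, a quick manual sample over ten seconds could be taken along these lines:
 echo 1 | sudo tee /proc/timer_stats   # enable sampling
 sleep 10
 cat /proc/timer_stats                 # inspect the collected statistics
 echo 0 | sudo tee /proc/timer_stats   # disable sampling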

While this interface is simple to use, collecting multiple samples over a long period of time to monitor overall system behaviour takes a little more effort.   To help with this, I've written a very simple tool called eventstat that calculates the rate of events per second and can dump the data in a .csv (comma separated values) format for importing into a spreadsheet such as LibreOffice for further analysis (such as graphing).

In its basic form, eventstat will run ad infinitum and can be halted by control-C. One can also specify the sample period and number of samples to gather, for example:

 sudo eventstat 10 60  

.. this gathers samples every 10 seconds for 60 samples (which equates to 10 minutes).

The -t option specifies an events/second threshold below which events are discarded; for example, sudo eventstat -t 10 will show only events occurring at 10Hz or higher.

To dump the samples into a .csv file, use the -r option followed by the name of the .csv file.  If you just want to collect the samples into a .csv file and not see the statistics during the run, also use the -q option, e.g.

 sudo eventstat -q -r event-report.csv


With eventstat you can quickly identify rogue processes that cause a high frequency of wake ups.   Arguably one can do this with tools such as PowerTop, but eventstat was written to allow one to collect the event statistics over a very long period of time and then help to analyse or graph the data in tools such as the LibreOffice spreadsheet.


The source is available in the following git repository:  git://kernel.ubuntu.com/cking/eventstat.git and in my power management tools PPA: https://launchpad.net/~colin-king/+archive/powermanagement

In an ideal world, application developers should check their code with tools like eventstat or PowerTop to ensure that the application is not misbehaving and causing excessive wake ups especially because abuse of timers could be happening in the supporting libraries that applications may be using.

Colin Ian King

Dumping the contents of the Embedded Controller (EC) can be useful when debugging some x86 BIOS/kernel related issues.  At a hardware level, access to the EC memory goes via the EC command/status and data ports.   As a side note, one can determine these ports as follows:

cat /proc/ioports  | grep EC

The preferred way to access these is via the ACPI EC driver in drivers/acpi/ec.c which is used by the ACPI driver to handle read/write operations to the EC memory region.

In addition to this driver, there is the ec_sys module which provides a useful debugfs interface allowing one to read and write the EC memory.  Write support is enabled with the ec_sys module parameter 'write_support', but it is generally discouraged since poking data into EC memory may break things in an unpredictable manner; hence by default write support is disabled.
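If write access really is needed, the module can be loaded with the parameter enabled, for example:
 sudo modprobe ec_sys write_support=1
..but, as noted above, this is best avoided unless you know exactly what you are poking.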

So, to dump the contents of the first EC (assuming debugfs is mounted), do:

sudo modprobe ec_sys
sudo od -t x1 /sys/kernel/debug/ec/ec0/io

Simple!

As a bonus, the General Purpose Event bits are also readable from /sys/kernel/debug/ec/ec0/gpe.
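For example (the output format may vary between kernel versions):
 sudo cat /sys/kernel/debug/ec/ec0/gpe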

Before I stumbled upon ec_sys.c I used a SystemTap script to execute ec_read() in ec.c to do the reading directly.  Yes it's ugly and stupid, but it does prove SystemTap is a very useful tool.


Colin Ian King

SystemTap is a very useful and powerful tool that enables one to insert kernel debug code into a running kernel.  Today I wanted to inspect the I/O read/write operations occurring when running some ACPI AML, so it was a case of hacking up a few lines of SystemTap to dump out the relevant state (e.g. which port is being accessed, the width of the I/O operation in bits and the value being written or read).

So instead of spinning a bespoke kernel with a few lines of debug in it, I used SystemTap to quickly write a re-usable script. Simple and easy!

I've put the SystemTap script in my git repository for anyone who is interested.


Colin Ian King

The ACPI engine in the kernel can be debugged by building with CONFIG_ACPI_DEBUG and configuring /sys/module/acpi/parameters/debug_layer and /sys/module/acpi/parameters/debug_level appropriately.   This can provide a wealth of data and is generally a very powerful debug state tracing mechanism.  However, there are times when one wants to get a little more debug data out or perhaps just drill down on a specific core area of functionality without being swamped by too much ACPI debug. This is where tools like SystemTap are useful.

SystemTap is a very powerful tool that allows one to add extra debug instrumentation into a running kernel without the hassle and overhead of rebuilding a kernel with debug printk() statements in it. It allows very quick turnaround in writing debug and one does not have to reboot a machine to load a new kernel since the debug is loaded and unloaded dynamically.

SystemTap has its own scripting language for writing debug scripts, but for specialised hackery it provides a mechanism ('guru mode') to embed C directly, which can be called from the SystemTap script.   The SystemTap language is fairly small and easy to understand, and one can easily become proficient with the language in a day.
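Scripts that embed C need guru mode enabled on the command line; a hypothetical invocation (the script name is made up) would be:
 sudo stap -g acpi-io-trace.stp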

The only downside is that one requires a .ddeb kernel package, which is huge since it contains all the necessary kernel debug information.
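On Ubuntu, assuming the ddebs.ubuntu.com debug symbol archive has been added to apt, the matching kernel debug symbols can usually be pulled in with:
 sudo apt-get install linux-image-$(uname -r)-dbgsym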

Over the past week I have been looking at debugging various aspects of the ACPI core, such as fully tracing suspend/resume and dumping out executed AML code at run time.   I was able to quickly prototype a SystemTap script that dumps out AML opcodes on the Oneiric kernel - this saved me the usual build of a debug kernel with CONFIG_ACPI_DEBUG enabled followed by capturing the appropriate debug and wading through copious amounts of debug data.

Conclusion: Some initial investment in time and effort is required to understand SystemTap (and to get to grips with the more useful features in 'guru mode'). However, one can be far more productive because the debug cycle is made far more efficient. Also, SystemTap provides plenty of functionality to allow very detailed and targeted debugging scripts.


Colin Ian King

PC Speaker weirdnesses

Today we were trying to do some debugging by getting some tones out of a laptop speaker by frobbing bit 1 of port 0x61 on the keyboard controller.  Rather unexpectedly I got no sound whatsoever out of the speaker, yet I had managed to do so the day before.   So I double checked what had changed since the day before:

1. Was it because I upgraded my kernel?
2. Did I unexpectedly disable the speaker when tweaking BIOS settings?
3. Was it something interfering with my port 0x61 bit twiddling?
4. Was the hardware now broken?

As per usual, I first assumed that the most complex parts of the system were to blame as they normally can go wrong in the most subtle way.  After a lot of fiddling around I discovered that the PC speaker only worked when I plugged the AC power into the laptop.  Now that wasn't obvious.

I suspect I should have applied Occam's Razor to this problem to begin with. We live and learn...


Colin Ian King

Back in February I wrote about turning off a PC using the Intel I/O Controller Hub and had some example code to do this, and it was a dirty hack.  I've revised this code to be more aligned with how the Linux kernel does this, namely:

1. Cater for the possible existence of PM1b_EVT_BLK and PM1b_CNT_BLK registers.
2. Clear WAKE_STATUS before transitioning to S5.
3. Instead of setting SLP_TYP and SLP_EN in one go, one sets SLP_TYP and then finally sets SLP_EN using separate writes.

The refined program requires 4 arguments, namely the port addresses of PM1a_EVT_BLK, PM1b_EVT_BLK, PM1a_CNT_BLK and PM1b_CNT_BLK.  If the PM1b_* ports are not defined, then these should be given as 0.

To find these ports run either:

cat /proc/ioports | grep PM1

or do:

sudo fwts acpidump - | grep PM1 | grep Block:

For example, on a Sandybridge laptop, I have PM1a_EVT_BLK = 0x400, PM1a_CNT_BLK = 0x404 and the PM1b_* ports are not defined, so I use:

sudo ./halt 0x400 0 0x404 0

..and this will transition the laptop to the S5 state very quickly. Needless to say, make sure you have sync'd and/or unmounted your filesystems before doing this.

The PM1_* port addresses from /proc/ioports and the fwts acpidump come from the PM1* configuration data in the ACPI FACP, so if this is wrong then powering down the machine won't work.  So, if you can't shut down your machine using this example code then it's possible the FACP is wrong. At this point, one should sanity check the port addresses using the appropriate Southbridge data sheet for your machine - generally speaking look for:

PM1_STS—Power Management 1 Status Register (aka PM1a_EVT_BLK)
PM1_CNT—Power Management 1 Control  (aka PM1a_CNT_BLK)

..however these are offsets from PMBASE which are defined in the LPC Interface PCI Register Address Map so you may require a little bit of work to figure out the addresses of these registers on your hardware.


Barry Warsaw

For the last couple of days I've been debugging a fun problem in the Ubuntu tool called Jockey. Jockey is a tool for managing device drivers on Ubuntu. It actually contains both a command-line and a graphical front-end, and a dbus backend service that does all the work (with proper authentication, since it modifies your system). None of that is terribly relevant to the problem, although the dbus bit will come back to haunt us later.


What is important is that Jockey is a Python application, written using many Python modules interfacing to low-level tools such as apt and dbus. The original bug report was mighty confusing. Aside from not being reproducible by myself and others, the actual exception made no fricken sense! Basically, it was code like this that was throwing a TypeError:

_actions = []
# _actions gets appended to at various times and later...
for item in _actions[:]:
    # do something

Everyone who reported the problem said the TypeError was getting thrown on the for-statement line. The exception message indicated that Python was getting some object that it was trying to convert to an integer, but was failing. How could you possibly get that exception when either making a copy of a list or iterating over that copy? Was the list corrupted? Was it not actually a list but some list-like object that was somehow returning non-integers for its min and max indexes?

To make matters worse, this little code snippet was in Python's standard library, in the subprocess module. A quick search of Python's bug database did reveal some recent threads about changes here, made to ensure that popen objects got properly cleaned up by the garbage collector if they weren't cleaned up explicitly by the program. Note that we're using Python 2.7 here, and after some reading of the tracker issues and the Python subprocess.py code, I just couldn't see any problem here, or at least none that could possibly lead to the error being seen.

When I originally looked at the Launchpad bug report, that was about as far as I got. I couldn't see any possible way this bug could happen, and I couldn't reproduce it, so I set the bug to Incomplete. Unfortunately, it kept hitting Ubuntu Natty beta testers so it wasn't going to go away. Fortunately, Martin Pitt found a test recipe, with which I could reproduce the bug 100% of the time. Yay! At least it probably wasn't going to be a race condition.

How to debug this though? Normally, I'd just attach gdb to the thing and start tracing, but the problem was that when I set the Jockey dbus backend to use the debug version of Python, the error went away (or rather morphed into something that was really unrelated). Here's where I started to sharpen up my favorite yak shaving blade.

The parable about shaving yaks is so entirely appropriate to problems like this that I'll take a quick detour. One of my (and my brother's) favorite shows in the 90's was Ren and Stimpy. That show was groundbreaking, and you can see elements of it in just about every NickToons cartoon on cable today. Some of the episodes are brilliant and others are horrible but as a whole, Ren and Stimpy are an undeniable classic of American animation. In one particular awesome episode, Ren and Stimpy celebrate Kilted Yaks-eve where a Yak comes up from the tub drain and shaves his stubble, leaving behind a present of the highly desired shaving scum for Stimpy in the morning. Based on this episode (most likely!) the programming term "yak shaving" typically means having to take detour after meaningless detour before you can actually solve the problem you're having.

In the context of this bug, the attempt to use the debug build of Python in the Jockey dbus backend was the first yak to shave. Because I wanted to get more information out of the process, I tried to attach to a running backend process, but this proved to be quite difficult. I was debugging this in a 64bit virtual machine, and gdb+debug-python was just not cooperating.

Now the question is: do you debug that problem (or at least get to a place where you can decide to file a bug or not), or do you punt and try a different approach? The former is most definitely shaving the yak; it gets you no closer to fixing the original problem, but certainly wastes enough of your time to make you seem productive. Of course, it's inevitable that if you do follow that secondary bug, it'll lead to a third one, and a fourth one, and so on until you're deep in the shaving scum and no closer to a solution to the original problem. You need to be constantly on guard against these types of yak, er, rabbit holes.

I'll spare you the gory details about hitting bugs in my text editor, bugs in building Python on a multiarch system, and other paths that all lead to smooth chinned yaks, but not to happy Pythons. When I punted on trying to attach to the Jockey backend, I actually decided to try to figure out exactly where the TypeError was occurring by using a twist on old-fashioned print statements. By grepping the Python source I found about a dozen occurrences of the error message "an integer is required". Which one was getting tripped? I added a little marker text to every such occurrence and re-built a Python package to aid in debugging.

Here's where I did detour to yak shaving. The Ubuntu source package for Python itself runs the full test suite on every build. That's great for ensuring a high quality Python package, but it's terrible for turnaround time when experimentally hacking Python. Building the package with DEB_BUILD_OPTIONS="nocheck nobench" is supposed to work, but for some reason did not with my sbuild environment. I guess debugging that would have been like shaving the yak's left cheek, but instead it was so much easier to shave his right cheek. So I spent some time with my razor chopping out huge sections of the debian/rules file so that Python 2.7 would build as fast as possible. While a seemingly meaningless task, this did actually help quite a bit since I was able to try out ideas with a much shorter turnaround.

Anyway, with many iterations on the exception marking idea, I finally nailed down the culprit. It was in the Python C API function PyInt_AsLong(), but even here I was unsure which arm of the conditional was getting triggered. Another round of hack-build-scp-reset-test and I found what I suspected: PyInt_AsLong() was getting passed an object that could not be turned into an integer. But what was that object?

So now back to the original problem with gdb. To solve this I downloaded and built a 32bit virtual machine, and was able to reproduce the bug there. Fortunately, in the 32bit environment I was much more successful in attaching to a running backend Jockey dbus process, and even though I didn't have the Python source available (and no, I won't talk about where that particular yak shaving detour led), I could fairly easily print the objects in the debugger at the offending code, where I learned that PyInt_AsLong() was getting called with None as its argument. And yeah, you can't turn None into an integer in Python!

But, how was this function getting called with None? Popping up the call stack led me eventually to PyArg_Parse(), an older C API function that parses a Python tuple into a set of objects based on some format flags. This is used when implementing Python functions in C, to parse an argument list. Popping up the stack again led me into some python-apt code. python-apt is a C++ library that exposes the APT system to Python programs. It's fairly mature and robust, but I wasn't as familiar with all its dark corners as I was with Python.

Now, my first instinct is never that I've found a bug in Python. That's not to say they don't exist, but just that Python has been around so long, is so well tested, and used so extensively that I'm always suspicious in cases like this (i.e. strange, inexplicable bugs that make no sense) of extension modules and third party code. And indeed, my sleuthing has led me to python-apt, exactly the kind of complex Python extension module that can have odd lurking bugs in it. Still, the problem I was now faced with was this: the call stack led me into a code path that had nothing to do with iterating over a list, or copying that list. So, what gives?

Well, it's useful to know how Python exceptions work at the C level. In general terms, when some C code raises an exception, it basically sets some global state, and then returns error codes up the stack until either something catches it and deals with it, or it percolates up to Python's top eval loop. The key thing here is that there are usually two states that exist: the global exception value currently in effect, and an error code that gets returned up the C call stack. Usually this return code is a zero or one, but it can also be NULL or -1 in certain cases. The Python C API documentation is very good at describing these.

So now that I looked closely at the python-apt code, I could see what was going on and it all began to make sense! The python-apt package was pulsing a progress meter, set up as a callback by the client of the python-apt code. Meaning, python-apt couldn't really control what this callback would return. python-apt expected the callback to return either Python's True or False, but it could return anything, including nothing! In Python, True can be coerced into the integer 1 and False into the integer 0, and python-apt wanted an integer, so indeed the call stack led right to a call to PyArg_Parse() to turn the callback's return object into an integer. What happens if the callback didn't return something that could be turned into an integer, or worse, didn't return anything at all?

In Python, a function always returns something, even if there is no explicit return statement. In those cases, None is implicitly returned. Yes, you see it now. And if not, there was this clue in the python-apt code: "most of the time the user who subclasses the pulse() method forgot to add a return {True,False} so we just assume he wants a True." Translated: Hey guy! You forgot to add a "return True" or "return False" to your pulse() method, and it probably fell off the end, giving us None, which we dutifully passed to PyArg_Parse().

PyArg_Parse() did its thing when getting None, by correctly setting the global exception state to the TypeError, and returning a zero code to indicate an error occurred. But looking at the python-apt code, it recognizes the error code, but forgets that some global exception state was set! Meaning, even though python-apt was ignoring the exception, Python still knew about it. But because control wasn't being passed on up to the Python eval loop, the exception was just lurking there like an angry unshaven yak, waiting to be discovered. And in fact, the next time Python itself checked the exception state was in, yep, the for-loop iterating over the perfectly fine list object. Python hits the for-loop, finds this orphaned TypeError lurking there and raises it, in a place that has literally nothing to do with the original exception.

The fix is a one-liner. In python-apt, where it's ignoring any exception returned by PyArg_Parse(), it must both swallow the error code (which it was doing), and clear the global exception state (which it was not doing). By adding a call to PyErr_Clear(), python-apt was keeping the interpreter's state consistent, and properly ignoring the argument parsing error, thus fixing the bug.

As I mentioned to my colleague Colin Watson, this did turn out to be a fun one to debug, though not as "fun" as one he recently worked on.

Happy Yak Shaving.

Colin Ian King

Poking around the HD-Audio Configuration

HD-Audio can be reconfigured without having to re-load the driver using the special sysfs files - this enables one to twiddle and debug HD-audio in a relatively pain free way.

Each codec-hwdep device has a sysfs directory in /sys/class/sound populated with a bunch of files which can be read. Some of these files can be written to, enabling one to over-ride the default. For example on my Lenovo I have /sys/class/sound/hwC0D0 which contains:

vendor_id
32 bit codec vendor id (hex) (read-write)

subsystem_id
32 bit subsystem id (hex) (read-write)

revision_id
32 bit revision id (hex) (read-write)

chip_name
name of chipset

afg
AFG id (read-only)

mfg
MFG id (read-only)

name
code name string (read-write)

modelname
current model option (read-write)

init_verbs
verbs to be executed at initialisation time. Extra verbs can be added by writing to init_verbs as numbers in the form: nid verb parameter

hints
hint strings in format 'key= value', e.g. eapd_switch = yes (for example, this particular hint is picked up by a call to snd_hda_get_bool_hint(codec, "eapd_switch") in the patch_sigmatel.c source)

init_pin_configs
show the default initial pin configurations as set up by the BIOS at boot time

driver_pin_configs
show the default pin configurations as set by the codec parser. Only pin configurations changed by the parser are shown.

user_pin_configs
show pin configurations used to override the BIOS set up configurations. One can append new values by writing a nid and value tuple to this file.

reconfig
triggers codec reconfiguration by writing any value to this file

clear
codec reset, remove mixer elements, clear all init verbs and hints

For example to see the BIOS pin configurations on my Lenovo laptop:

$ cat /sys/class/sound/hwC0D0/init_pin_configs
0x14 0x99130110
0x15 0x411111f0
0x16 0x411111f0
0x17 0x411111f0
0x18 0x03a19820
0x19 0x99a3012f
0x1a 0x411111f0
0x1b 0x0321401f
0x1c 0x411111f0
0x1d 0x40178e2d
0x1e 0x411111f0

And to identify my audio hardware, I can use:

$ cat /sys/class/sound/hwC0D0/\
{vendor_name,chip_name,vendor_id,subsystem_id}
Realtek
ALC861-VD
0x10ec0862
0x17aa3867

..showing that my Lenovo has a Realtek ALC861-VD codec, along with the vendor and subsystem ids.
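As a purely illustrative example (the nid and pin configuration value below are made up and will differ on real hardware), one could override a pin configuration and then trigger a codec reconfigure with:

echo 0x1b 0x0321101f | sudo tee /sys/class/sound/hwC0D0/user_pin_configs
echo 1 | sudo tee /sys/class/sound/hwC0D0/reconfig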

Enjoy!


Colin Ian King

Kernel Oops page fault error codes

The x86 Linux kernel Oops messages normally provide just enough information to help a kernel developer corner and fix critical bugs. The start of a typical Oops message may look like the following:

kernel BUG at kernel/signal.c:1599!
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pc = 84427f6a
*pde = 00000000
Oops: 0001 [#1]

The 4 digit value after the "Oops:" message is the page fault error code in hexadecimal, which in turn can help one deduce what caused the oops. The page fault error code is encoded as follows:

bit 0 - 0 = no page found, 1 = protection fault
bit 1 - 0 = read access, 1 = write access
bit 2 - 0 = kernel-mode access, 1 = user mode access
bit 3 - 0 = n/a, 1 = use of reserved bit detected
bit 4 - 0 = n/a, 1 = fault was an instruction fetch

So, in the above example, the Oops error code was 0x0001 which means it was a page protection fault, read access in kernel mode.

A lot of Oops error codes are 0x0000, which means a page was not found by a read access in kernel mode.

For more information, consult arch/x86/mm/fault.c


Colin Ian King

Kernel Early Printk Messages

I've been messing around with the earlyprintk kernel options to allow me to get some form of debug out before the console drivers start later on in the kernel init phase. The earlyprintk kernel option supports debug output via the VGA, serial port and USB debug port.

The USB debug port is of interest - most modern systems seem to provide a debug port capability which allows one to send debug over USB to another machine. To check if your USB controller has this capability, use:

sudo lspci -vvv | grep "Debug port"

and look for a string such as "Capabilities: [58] Debug port: BAR=1 offset=00a0". You may have more than one of these on your system, so make sure you use the correct one.

One selects this mode of earlyprintk debugging using:


earlyprintk=dbgp

for the default first port, or select the Nth debug enabled port using:

earlyprintk=dbgpN

One also needs to build a kernel with the following config option enabled:

CONFIG_EARLY_PRINTK_DBGP=y

On my debug set-up I used a NET20DC-USB Hi-Speed USB 2.0 Host-to-Host Debug Device connecting the target machine and a host on which I capture the USB debug using /dev/ttyUSB0 with minicom. So that I won't bore you with the details, this is all explained in the kernel documentation in Documentation/x86/earlyprintk.txt
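For reference, capturing on the host side can be done with something along these lines (the device name and settings may differ on your set-up):

minicom -D /dev/ttyUSB0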

As it was, I needed to tweak the earlyprintk driver to put in some delays in the EHCI probing and reset code to get it working on my fairly fast target laptop.
My experience with this approach wasn't great - I had to plug/unplug the debug device quite frequently for the earlyprintk EHCI reset and probe to work. Also, the EHCI USB driver initialisation later on in the kernel initialisation hung, which wasn't useful.

Overall, I may have had problems with the host/target and/or the NET20DC-USB host-to-host device, but it did allow me to get some debug out, be it rather unreliably.

Probably an easier way to get debug out is just using the boot option:

earlyprintk=vga

however this has the problem that the messages are eventually overwritten by the real console.

Finally, for anyone with old legacy serial ports on their machine (which is quite unlikely nowadays with newer hardware), one can use:

earlyprintk=serial,ttySn,baudrate

where ttySn is the nth tty serial port.

One can also append the ",keep" option to not disable the earlyprintk once the real console is up and running.

So, with earlyprintk, there is some chance of being able to get some form of debug out to a device to allow one to debug kernel problems that occur early in the initialisation phase.


Colin Ian King

Debugging grub2 with gdb

Ľubomír Rintel has written a detailed and very helpful debugging guide for grub2. The guide covers how to debug with gdb using emulators such as QEMU and Bochs as well as traditional serial line debugging using a null-modem.

The tricky part is to be able to pull in the debug and symbol files for dynamically loaded modules, however this has been solved by Ľubomír with a gdb and perl script.

The guide gives some useful tricks which can be used not just with grub2 but other boot loaders too. It's well worth a look just to learn some useful gdb hacks.


Colin Ian King

Debugging with QEMU and gdb

QEMU is one very powerful tool - and combined with gdb this has allowed me to debug Intel based boot loaders. Here is a quick run down of the way I drive this:

Firstly, I recommend removing KVM as this has caused me some grief catching breakpoints. This means QEMU will run slower, but I want to remove any kind of grief I can to simplify my debugging environment.

Start QEMU and use the -s -S options to enable gdb debugging and halt the CPU to wait for gdb to attach:

$ qemu -s -S -bios bios-efi.bin -m 1024 karmic-efi-qcow2.img -serial stdio

..then in another terminal, attach gdb:

$ gdb
GNU gdb (GDB) 7.0-ubuntu
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.

(gdb) target remote localhost:1234
Remote debugging using localhost:1234
0x0000fff0 in ?? ()
(gdb)
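At this point one can set any breakpoints of interest before letting the guest run; a purely illustrative example (0x7c00 is the classic legacy boot sector entry point, so pick an address appropriate to the boot loader actually being debugged) would be:

(gdb) break *0x7c00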


..and start the boot process..

(gdb) c


..and get debugging!


Colin Ian King

QEMU and EFI BIOS hackery

Earlier this week I blogged about QEMU and EFI BIOS. Trying to debug a problem with grub2-efi-ia32 has given me a few little headaches but I'm finding ways to work around them all.

The first issue is getting a system installed with an EFI BIOS. My quick hack was to create a 4GB QEMU qcow2 disk image and then inside this create a small EFI FAT12 boot partition using fdisk (partition type 0xef in the 1st primary partition). I then installed Ubuntu Karmic Desktop with ext4 and swap in primary partitions 2 and 3 by booting with the conventional BIOS. I then installed grub2-efi-ia32 in the EFI boot partition and then booted QEMU using the TianoCore EFI BIOS that has been ported to QEMU.
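For reference, the empty disk image itself can be created beforehand with something like:

qemu-img create -f qcow2 karmic-efi-qcow2.img 4G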

One problem is that the EFI BIOS does not scroll the screen, hence once the output reaches the end of the screen it just keeps overwriting the last line, making debugging with printf() style prints nearly impossible. Then I found that the BIOS emits characters over a serial port, which QEMU can emulate. Unfortunately, the output contains VT control characters to do cursor positioning and pretty console colours, which makes reading the output a little painful. So I hacked up a simple tool to take the output from QEMU and strip out the VT control chars to make the text easier to read.
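A rough shell equivalent of such a filter (not the actual parse-output tool) is a sed substitution that strips the ANSI/VT escape sequences:

sed -e 's/\x1b\[[0-9;]*[a-zA-Z]//g'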

Now the QEMU boot line is as follows:

qemu -bios bios-efi.bin -m 1024 karmic-efi-qcow2.img -serial stdio 2>&1 | ./parse-output

..and this dumps the output from the BIOS and grub2 to stdout in a more readable form.

The parse-output tool is a little hacky - but does the job. For reference, I've put it in my debug repository here.

Grub2 allows one to enable some level of debugging output by issuing the following command:

set debug=all

..which gives me some idea of what's working or broken at a fairly low level before I start attaching gdb. Fortunately debugging using gdb has been fairly well documented here - I just now need to shoe-horn in a small patch to allow me to attach gdb to grub2 from outside QEMU - but that's for another debug session...

