I’ve been working on the bug below on and off for ~4 weeks now. As it’s taken so long, I felt obliged to write a few words detailing the processes and methods used to debug this most horrific bug. The problem occurs when a laptop with an internal SD card reader is put to sleep (suspend or hibernate) when a card is inserted and mounted. Yesterday I discovered that this bug only affects cards which support an ext2 or ext4 filesystem (vfat and ext3 filesystems do not succumb).

LP Bug #569882
  Suspend Lenovo Thinkpad X61 with card in SD card reader

This bug and others relating to it have been reported to Ubuntu over 130 times and ~20 bugs are related. A couple of those have an importance rating of ‘High’. Annoyingly, while I was working on this bug someone marked it as a duplicate of LP Bug #477106, so I hopped over and used that as the reference bug instead. It is there you can see the true range of this bug and where all other related bugs point. A list of related bugs can be found at the bottom of this page.

LP Bug #477106
  [regression] lucid alpha-2 and earlier freeze upon suspend with sd card plugged
in with some hardware

The first debugging phase was to decipher which device was causing the issue. As the most common factor, I decided the best course of action was to hunt down the offending SD/MMC card reader driver and disable it. An ‘lsmod’ informed me that the sdhci and sdhci_pci drivers were in current operation. I ‘rmmod’ed them and attempted to sleep – good night ThinkPad. I had a dig around in the Power Management Utilities (pm-utils) and found the variable SUSPEND_MODULES which is sourced from the pm configs. I posted this on the bug report which seemed to please most:

Momentary workaround:
  echo "SUSPEND_MODULES=\"sdhci sdhci_pci\"" | sudo tee -a /etc/pm/config.d/suspend_modules

Naturally, we’d like this bug to be eradicated forever and for Ubuntu to work seamlessly ‘out-of-the-box’. So I endeavored to find a more permanent solution within the kernel. After turning debug on within the sdhci driver and adding a few of my own DBGs, it was apparent that the system was coming to a screaming halt within the interrupt handler. My initial theory was that the SD card was being accessed following a full filesystem sync and free, thus causing a kernel oops or panic before the system could be fully suspended. I placed in some extra debug code to try and follow the thread of execution, but further attempts were seemingly futile.

(Thanks to cnd) I enlisted the help of ftrace to aid further kernel prodding, by enabling function_graph (Kernel hacking -> Tracers -> Kernel function tracer & Kernel function graph tracer). Once enabled it can be controlled from within sysfs. To enable and disable, echo ‘function_graph’ and ‘nop’ to /sys/kernel/debug/tracing/current_tracer respectively. Once enabled simply cat /sys/kernel/debug/tracing/trace_pipe, either to stdout or a file. This produces output similar to this:

 0)               |  schedule() {
 0)   0.334 us    |    rcu_sched_qs();
 0)   0.319 us    |    _spin_lock_irq();
 0)               |    put_prev_task_fair() {
 0)               |      update_curr() {
 0)   0.296 us    |        task_of();
 0)   0.951 us    |      }
 0)   1.584 us    |    }

The function tracer almost always ended in a spinlock, which lead me to my next theory, spinlock corruption. I enabled every configuration related to spinlocks and other locking mechanisms within the kernel. These included; Detect soft lockups, Detect hung tasks, Spinlock and rw-lock debugging, Mutex debugging, Lock debugging: detect incorrect freeing of live locks/prove locking and correctness and finally, Spinlock debugging. None of this helped.

Next I tried enabling the system console. Oddly, GDM decided it wanted to jump on the system console instead of it’s own tty. This made things a little more tricky. I killed it and tried to suspend manually using pm-suspend. A message informing me of the suspend would appear, then the console would go to sleep, rendering me blind as to what was happening. (thanks to cking for suggesting) CONFIG_PM_DISABLE_CONSOLE was the solution to this. It prevents the console from being allowed to sleep and subsequently turning off. After enabling this feature, it was plain to see, well… nothing! Even though I was on the system console, no kernel messages were produced. I knew they were there, as many of them had been saved in /var/log/kern.log and /var/log/messages, but they weren’t being displayed. I believe this has something to do with gdm and the differences in suspend regimes. I tried displaying pm-suspend using set +x. The script was displayed and nothing else.

Next on the agenda was to enable system console via serial. As I don’t own a ThinkPad dock, I had to use a serial->USB converter. Once up and running it was clear that the thread of execution was going a great deal further than the sdhci interrupt handler. As it turns out, the debug messages were finishing there because the final sync to disk was already performed. The drives (and therefore the logging) were no longer available. It is worth mentioning that suspend/resume logs are written on the resume, hence if suspend fails, the logs are never written – as with this case.

Now we have serial working, I figured it might be worth giving kgdb a shot. I enabled it in Kernel hacking and tried to configure it over serial. The arguments kgdbwait and kgdboc=ttyUSB0,115200 were placed onto the kernel commandline, but kgdb just spat out “kgdb: Unregistered I/O driver kgdbts, debugger disabled.” I then tried to place them in manually via sysfs, giving: “echo ttyUSB0 | sudo tee -a /sys/module/kgdboc/parameters/kgdboc”, only to receive “No such device”. Supplementing ttyUSB0 for ttyS0 froze my computer solid and presumably passed all control to the serial port (which I don’t have the expansion/dock for). Result, kgdb only supports true serial devices.

I found that it was possible to write the ftrace results out over serial, however, all messages stopped when the USB->Serial and USB went to sleep. I tried to keep them awake by enabling CONFIG_USB_SUSPEND_OFF and hacking early returns into usb_serial_suspend and uart_suspend_port, but it didn’t seem to make any difference.

More debugging options were then added over successive builds, including; PM verbose, ACPI debug, ThinkPad ACPI debug, and Kobject debugging. The latter was a bad idea, as it swamped the console and logs making them unreadable. No further useful data was produced by enabling them.

Finally I decided to put an early return to the sdhci suspend routine, just in case the thread of execution was reaching it – this worked, but resume was not possible, therefore neither was access to the logs. An early return in the resume routine allowed a successful resume of the system, providing me with access to log functionality. Now I had printk I would follow the execution path with an advancing return statement, placing breadcrumbs (printks) behind it. This is where debugging became _slow_ but steady.

Due to the fact that sdhci functionality is plugged in as a module, debugging pointers to functions was a little tricky. In the end I compiled all MMC/SD code into the kernel as built-ins. This allowed me to decode them by looking them up in the System.map. It then seemed as though the MMC remove routine was being called recursively. I placed in a static variable and an “if (count++ > 10) return;” to unwind any accidental recursion.

I followed the thread of execution though the following path:

(Remember: I discovered this by advancing a return statement and commenting out functions as I went. This kind of debugging is very _slow_ progress.)

sdhci_suspend_host    - drivers/mmc/host/sdhci.c
mmc_suspend_host      - drivers/mmc/core/core.c
mmc_sd_remove         - drivers/mmc/core/sd.c
mmc_remove_card       - drivers/mmc/core/bus.c
device_del            - drivers/base/core.c
bus_remove_device     - drivers/base/bus.c
device_release_driver - drivers/base/dd.c
mmc_bus_remove        - drivers/mmc/core/bus.c
mmc_blk_remove        - mmc/card/block.c
del_gendisk           - fs/partitions/check.c
invalidate_partition  - block/genhd.c
fsync_bdev            - fs/block_dev.c
sync_filesystem       - fs/sync.c
sync_quota_sb         - fs/quota/quota.c
ext2_sync_fs          - fs/ext2/super.c
ext2_sync_super       - fs/ext2/super.c
mark_buffer_dirty     - fs/buffer.c
set_page_dirty        - fs/buffer.c
radix_tree_tag_set    - lib/radix-tree.c
root_tag_set          - lib/radix-tree.c

I eventually found myself in an inline function called root_tag_set, which contained the following line of code.

root->gfp_mask |= (__force gfp_t)(1 << (tag + __GFP_BITS_SHIFT));

I tested this line to oblivion, trying to find out why it was failing. It turned out that it wasn’t failing at all, but it was making a difference. When this line was commented out suspend/resume would work perfectly, with the code left in, the system would come to a crashing halt. Another piece of code must be using the tag bit later down the line. The tag bit is used to mark memory pages which are dirty, or under write back [1]. Result, false alarm.

So I backed up to sync_filesystem and followed through another failing route I found. This went as follows:

sdhci_suspend_host        - drivers/mmc/host/sdhci.c
mmc_suspend_host          - drivers/mmc/core/core.c
mmc_sd_remove             - drivers/mmc/core/sd.c
mmc_remove_card           - drivers/mmc/core/bus.c
device_del                - drivers/base/core.c
bus_remove_device         - drivers/base/bus.c
device_release_driver     - drivers/base/dd.c
mmc_bus_remove            - drivers/mmc/core/bus.c
mmc_blk_remove            - mmc/card/block.c
del_gendisk               - fs/partitions/check.c
invalidate_partition      - block/genhd.c
fsync_bdev                - fs/block_dev.c
sync_filesystem           - fs/sync.c
sync_blockdev             - fs/block_dev.c
filemap_write_and_wait    - mm/filemap.c
filemap_fdata_write       - mm/filemap.c
filemap_fdata_write_range - mm/filemap.c
do_writepages             - mm/page-writeback.c
generic_writepages        - mm/page-writeback.c
write_cashe_pages         - mm/page-writeback.c
writepage                 - mm/page-writeback.c
blkdev_writepage          - fs/block_dev.c
block_write_full_page     - fs/buffer.c

I’m sure you can imagine the kind of debug vomit you’d expect from placing printks in here. Result, another dead end.

Unsure if it was happening during SD unload or something was referencing it after in the common code areas, I decided to insert debug in the tail-end of sdhci_suspend_host. But what? printks were no good. I had to run the code with the error residing, in which case I wouldn’t receive any output unless there was a successful resume. I ended up stealing a small for-loop (from cking) which flashed the CapsLk LED situated on the monitor several times. Result, the error was occurring somewhere during the SD unload.

I found another piece of information today which may prove to be useful:

- This problem only occurs when _both_ the HDD and SD drives are mounted. If either one of them are unmounted, the problem does not surface. I uncovered this by booting a live CD and suspending. Under normal live CD operation the system suspends without issue. When I manually mount the HDD, then try to suspend the computer crumbles.

From here I had two possible routes to follow up on:

- Working backwards, up the thread of execution from the tail end of sdhci_suspend_host (LED flashing).

- Change the suspend/resume sequence, so the sdhci suspends _before_ some of the useful debugging tools.

But before I had chance to act on them I met an Israeli gentleman named Maxim Levitsky. This wasn’t quite a chance meeting, I actually emailed him because it looked like he was toying with the same symptoms as me. It turned out has he is a bit of a subject matter expert in suspend/resume bugs within the Linux kernel (although, he denies it). He has written a patch which allow system logs to be printed into an area of RAM specified on the kernel commandline. Whenever a deadlock or hang occurs the system reboots automatically allowing power to be applied throughout and RAM to be persistent over said reboot. When the device reboots, the specified area of RAM can be read via sysfs and piped though strings for clarity. This one’s going in lag’s toolbox for sure.

So, what had been alluding us for all this time? The answer lies in the MMC subsystem. During a suspend operation, multiple areas of the kernel are being bedded down simultaneously. When the MMC subsystem is removed, a del_gendisk is attempted which eventually calls prepare_to_wait, but because the kernel and userspace threads have been frozen due to the suspend they will never respond. Result, OS fail!

To rectify situations such as these there is a feature called pm_notify. Once registered, pm_notify routines will be called _prior_ to any imminent sleep action (suspend or hibernate). It is here we must remove the SD card and sync its filesystem. This will ensure that the other kernel and user threads will respond when waited on.

For anyone interested the working patch written by Maxim Levitsky can be found here:

  - http://kernel.ubuntu.com/git?p=lag/ubuntu-lucid.git;a=commitdiff;h=
    6e99723a4fcd62cfe480f46e1f0ea0ab7359a72c;hp=3dd3d3a8be6cf0039af2ff94cf91c5c01b69b782

Don’t you just love suspend/resume bugs?!

=================    Duplicate bugs    =================

LP Bug #330671
  [Dell Mini-9] suspend/resume problem

LP Bug #345354
  [Dell Inc. Latitude D420] suspend/resume failure

LP Bug #353387
  [Dell Inc. Inspiron 910] suspend/resume failure

LP Bug #353596
  [Dell Inc. Inspiron 910] suspend/resume failure

LP Bug #436729
  [Dell Inc. Inspiron 910] suspend/resume failure when SD card is mounted

LP Bug #442314
  [Dell Inc. Inspiron 910] suspend/resume failure [non-free: wl]

LP Bug #444051
  Dell mini 9. suspend/resume failure with Karmic Beta

LP Bug #449688
  [Dell Inc. Inspiron 910] suspend/resume failure [non-free: wl]

LP Bug #450317
  [Dell Inc. Inspiron 910] suspend/resume failure [non-free: wl]

LP Bug #451254
  [Dell Inc. Inspiron 910] suspend/resume failure [non-free: wl]

LP Bug #462722
  Karmic RC fails to suspend (freezes) on Thinkpad X61s

LP Bug #464712
  Black screen + freeze when suspending or hibernating with mounted SD Card

LP Bug #468243
  [Thinkpad X61 tablet] suspend/resume failure

LP Bug #469087
  Suspend doesn't work on Dell Vostro A90/Dell Mini 9 with SD card inserted

LP Bug #523957
  If SD card inserted on Dell Mini 9 suspend locks up machine. This is fixed if
the SD card is locked. Version 9.10 has this bug. Version 10.04 also has this bug.

LP Bug #530432
  suspend with sd card inserted fails to suspend, hangs, and loses USB

LP Bug #561677
  [STAGING] Freeze on suspend if SD card is mounted with Acer Extensa 5630Z

LP Bug #562274
  Suspend/Resume now fails on 10.04 on Dell Mini 9

==================    References    ==================

[1] Trees I: Radix trees [Posted March 13, 2006 by corbet]. http://lwn.net/Articles/175432/