Canonical Voices

Posts tagged with 'kernel'

Colin Ian King

Static analysis on the Linux kernel

There is a wealth of powerful static analysis tools available nowadays for analyzing C source code. These tools help to find bugs in code by analyzing the source code without actually having to execute it.  Over the past year or so I have been running the following static analysis tools on linux-next every weekday to find kernel bugs:

  • smatch
  • sparse
  • cppcheck
  • CoverityScan
  • clang's scan-build
  • gcc

Typically each tool can take 10-25+ hours of compute time to analyze the kernel source; fortunately I have a large server at hand to do this.  The automated analysis creates an Ubuntu server VM, installs the required static analysis tools, clones linux-next and then runs the analysis.  The VMs are configured to minimize write activity to the host and run with 48 threads and plenty of memory to try to speed up the analysis process.

At the end of each run, the output from the previous run is diff'd against the new output, generating a list of new and fixed issues.  I then manually wade through these and try to fix some of the low-hanging fruit when I can find free time to do so.
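
The delta itself is just a textual comparison of the two reports; a minimal sketch of the idea (the log file names here are hypothetical):

 # Lines only in the new report are new issues, lines only in the old report are fixed
 diff old-report.log new-report.log | grep '^>' > new-issues.log
 diff old-report.log new-report.log | grep '^<' > fixed-issues.log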

I've been gathering statistics from the CoverityScan builds for the past 12 months tracking the number of defects found, outstanding issues and number of defects eliminated:

As one can see, there are a lot of defects getting fixed by the Linux developers and the overall trend of outstanding issues is downwards, which is good to see.  The defect rate in linux-next is currently 0.46 issues per 1000 lines; with over 13 million lines being scanned, that works out at roughly 6,000 outstanding issues. A typical defect rate for a project this size is 0.5 issues per 1000 lines.  Some of these issues are false positives or very minor / insignificant issues that will not cause any run time issues at all, so don't be too alarmed by the statistics.

Using a range of static analysis tools is useful because each one has its own strengths and weaknesses.  For example, smatch and sparse are designed for sanity checking the kernel source, so they have some smarts that detect kernel specific semantic issues.  CoverityScan is a commercial product; however, open source projects the size of the Linux kernel are allowed to be built daily, the web based bug tracking tool is very easy to use, and CoverityScan does manage to reliably find bugs that other tools can't reach.  Cppcheck is useful as it scans all the code paths by forcibly trying all the #ifdef'd variations of code - which is useful on the more obscure CONFIG mixes.
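
For reference, sparse and smatch both hook into the kernel build via the C= make option; a minimal sketch, assuming both tools are installed and in the PATH:

 # Run sparse over all source files (C=2), or just the re-compiled ones (C=1)
 make C=2
 # Run smatch in kernel checking mode via the same build hook
 make CHECK="smatch -p=kernel" C=1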

Finally, I use clang's scan-build and the latest version of gcc to try to find the more typical warnings found by the static analysis built into modern open source compilers.
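
Both are easy to drive by hand; a minimal sketch:

 # Analyze the kernel build with clang's static analyzer
 scan-build make
 # Rebuild with gcc's extra warning checks enabled
 make W=1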

The more typical issues being found by static analysis are ones that don't generally appear at run time: corner cases such as error handling code paths, resource leaks, resource failure conditions, uninitialized variables and dead code paths.

My intention is to continue this process of daily checking and I hope to report back next September to review the CoverityScan trends for another year.

Read more
Colin Ian King

The BPF Compiler Collection (BCC) is a toolkit for building kernel tracing tools that leverage the functionality provided by the Linux extended Berkeley Packet Filters (BPF).

BCC allows one to write BPF programs with front-ends in Python or Lua, with kernel instrumentation written in C.  The instrumentation code is compiled into sandboxed eBPF bytecode and is executed in the kernel.

The BCC GitHub project README file provides an excellent overview and description of BCC and the various available BCC tools.  Building BCC from scratch can be a bit time consuming; however, the good news is that the BCC tools are now available as a snap and so BCC can be quickly and easily installed just using:

 sudo snap install --devmode bcc  

There are currently over 50 BCC tools in the snap, so let's have a quick look at a few:

cachetop allows one to view the top page cache hit/miss statistics. To run this use:

 sudo bcc.cachetop  

The funccount tool allows one to count the number of times specific functions get called.  For example, to see how many kernel functions with names starting with "do_" get called per second one can use:

 sudo bcc.funccount "do_*" -i 1  

To see how to use all the options in this tool, use the -h option:

 sudo bcc.funccount -h  

I've found the funccount tool to be especially useful to check on kernel activity by checking on hits on specific function names.
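
For example, a quick (hypothetical) check on ext4 activity just counts hits on the ext4 functions:

 # Count calls to kernel functions starting with "ext4_", reported every second
 sudo bcc.funccount "ext4_*" -i 1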

The slabratetop tool is useful to see the active kernel SLAB/SLUB memory allocation rates:

 sudo bcc.slabratetop  

If you want to see which process is opening specific files, one can snoop on open system calls using the opensnoop tool:

 sudo bcc.opensnoop -T

Hopefully this will give you a taste of the useful tools that are available in BCC (I have barely scratched the surface in this article).  I recommend installing the snap and giving it a try.

As it stands, BCC provides a useful mechanism to develop BPF tracing tools and I look forward to regularly updating the BCC snap as more tools are added to BCC. Kudos to Brendan Gregg for BCC!

Read more
Colin Ian King

Finding kernel bugs with cppcheck

For the past year I have been running the cppcheck static analyzer against the linux kernel sources to see if it can detect any bugs introduced by new commits. Most of the bugs being found are minor thinkos, null pointer de-referencing, uninitialized variables, memory leaks and mistakes in error handling paths.

A useful feature of cppcheck is the --force option that will check against all the configurations in the source (and the kernel does have many!).  This allows us to check for code that may not be exercised much (because it is normally not built in with most config options) or even find dead code.
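
A minimal sketch of such an invocation (the real runs check the whole tree and use more options):

 # Check all the #ifdef combinations in each source file under drivers/
 cppcheck --force drivers/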

The downside of using the --force option is that each source file may need to be checked multiple times, once for each configuration.  For ~20800 source files this can take a 24 processor server several hours to process.  Errors and warnings are then compared to previous runs (a delta), making it relatively easy to spot new issues on each run.

We also use the latest sources from the cppcheck git repository.  The upside of this is that new static analysis features are used early and this can result in finding existing bugs that previous versions of cppcheck missed.
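
Building cppcheck from git is straightforward; a minimal sketch, assuming the usual GitHub repository:

 git clone https://github.com/danmar/cppcheck.git
 cd cppcheck
 make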

A typical cppcheck run against the linux kernel source finds about 600 potential errors and 1700 warnings; however a lot of these are false positives.  These need to be individually eyeballed to sort the wheat from the chaff.

Finally, the data is passed through a gnuplot script to generate a trend graph so I can see how errors (red) and warnings (green) are progressing over time:

..note that the large changes in the graph are mostly due to features being enabled (or fixed) in cppcheck.

I have been running the same experiment with smatch too, however I am finding that cppcheck seems to have better code coverage because of the --force option and seems to have fewer false positives.   As it stands, I am finding that the most productive time for finding issues is around the -rc1 and -rc2 merge times (obviously when most of the major changes land in the kernel).  The outcome of this work has been a bunch of small fixes landing in the kernel to address bugs that cppcheck has found.

Anyhow, cppcheck is an excellent open source static analyzer for C and C++ that I'd heartily recommend as it does seem to catch useful bugs.

Read more
Colin Ian King

During idle moments in the final few weeks of 2014 I have been adding some more stressors and features to stress-ng as well as tidying up the code and fixing some small bugs that have crept in during the last development spin.   Stress-ng aims to stress a machine with various simple torture tests to trip overheating and kernel race conditions.

The mmap stressor now has the '--mmap-file' option to use synchronous file backed memory mapping instead of the default anonymous mapping, and the '--mmap-async' option enables asynchronous file mapping if desired.
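
For example (the instance count here is arbitrary):

 # Run 4 mmap stressor instances using synchronous file backed mappings
 stress-ng --mmap 4 --mmap-file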

For socket stressing, the '--sock-domain unix' option now allows AF_UNIX (aka AF_LOCAL) sockets to be used. This complements the existing AF_INET and AF_INET6 IPv4 and IPv6 protocols available with this stress test.

The CPU stressor now includes mixed integer and floating point stressors, covering 32 and 64 bit integer mixes with the float, double and long double floating point types. The generated object code contains a nice mix of operations which should exercise various functional units in the CPU.  For example, when running on a hyper-threaded CPU one notices a performance hit because these cpu stressor methods heavily contend on the CPU math functional blocks.

File based locking has been extended with the new lockf stressor; this stresses multiple locking and unlocking on portions of a small file, and the default blocking mode can be turned into a CPU consuming rapid polling retry with the '--lockf-nonblock' option.
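
For example (again, an arbitrary instance count):

 # Run 2 lockf stressor instances with non-blocking polling rather than blocking locks
 stress-ng --lockf 2 --lockf-nonblock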

The dup(2) system call is also now stressed with the new dup stressor. This just repeatedly dup's a file opened on /dev/zero until all the free file slots are full, and then closes these. It is very much like the open stressor.

The fcntl(2) F_SETLEASE command is stress tested with the new lease stressor. This has a parent process that rapidly locks and unlocks a file based lease and 1 or more child processes try to open this file and cause lease breaking signal notifications to the parent.

For x86 CPUs, the cache stressor includes two new cache specific options. The '--cache-fence' option forces write serialization on each store operation, while the '--cache-flush' option forces flush cache on each store operation. The code has been specifically written to not incur any overhead if these options are not enabled or are not available on non-x86 CPUs.
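
For example (arbitrary instance count; on non-x86 CPUs these options are simply no-ops):

 # Cache stressing with serialized and flushed stores
 stress-ng --cache 2 --cache-fence --cache-flush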

This release also includes the stress-ng project mascot too; a humble CPU being tortured and stressed by a rather angry flame.

For more information, visit the stress-ng project page, or check out the stress-ng manual.

Read more
Colin Ian King

Before I started some analysis on benchmarking various popular file systems on Linux I was recommended to read "Systems Performance: Enterprise and the Cloud" by Brendan Gregg.

In today's modern server and cloud based systems the multi-layered complexity can make it hard to pinpoint performance issues and bottlenecks. This book is packed full of useful analysis techniques covering tracing, kernel internals, tools and benchmarking.

Getting a well balanced and tuned system depends on all the different components working well, and the book has chapters covering CPU optimisation (cores, threading, caching and interconnects), memory optimisation (virtual memory, paging, swapping, allocators, busses), file system I/O, storage, networking (protocols, sockets, physical connections) and typical issues facing cloud computing.

The book is full of very useful examples and practical instructions on how to drill down and discover performance issues in a system and also includes some real-world case studies too.

It has helped me become even more focused on how to analyse performance issues and consider how to do deep system instrumentation to be able to understand where and why performance regressions occur.

All-in-all, a most systematic and well written book that I'd recommend to anyone running large complex servers and cloud computing environments.

Read more
Colin Ian King

even more stress in stress-ng

Over the past few weeks in spare moments I've been adding more stress methods to stress-ng  ready for Ubuntu 15.04 Vivid Vervet.   My intention is to produce a rich set of stress methods that can stress and exercise many facets of a system to force out bugs, catch thermal over-runs and generally torture a kernel in a controlled repeatable manner.

I've also re-structured the tool in several ways to enhance the features and make it easier to maintain.  The cpu stress method has been re-worked to include nearly 40 different ways to stress a processor, covering:

  • Bit manipulation: bitops, crc16, hamming
  • Integer operations: int8, int16, int32, int64, rand
  • Floating point:  long double, double,  float, ln2, hyperbolic, trig
  • Recursion: ackermann, hanoi
  • Computation: correlate, euler, explog, fibonacci, gcd, gray, idct, matrixprod, nsqrt, omega, phi, prime, psi, rgb, sieve, sqrt, zeta
  • Hashing: jenkin, pjw
  • Control flow: jmp, loop
..the intention was to have an eclectic enough mix of CPU exercising tests to cover the typical operations found in computationally intense software.   Use the new --cpu-method option to select the specific CPU stressor, or --cpu-method all to exercise all of them sequentially.
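
For example (the instance count and run time are arbitrary):

 # Run 4 CPU stressor instances using just the fibonacci method for 60 seconds
 stress-ng --cpu 4 --cpu-method fibonacci -t 60
 # ..or cycle through all of the CPU stress methods
 stress-ng --cpu 4 --cpu-method all -t 60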

I've also added more generic system stress methods too:
  • bigheap - re-allocs to force OOM killing
  • rename - rename files rapidly
  • utime - update file modification times to create lots of dirty file metadata
  • fstat - rapid fstat'ing of large quantities of files
  • qsort - sorting of large quantities of random data
  • msg - System V message sending/receiving
  • nice - rapid re-nicing processes
  • sigfpe - catch rapid division by zero errors using SIGFPE
  • rdrand - rapid reading of Intel random number generator using the rdrand instruction (Ivybridge and later CPUs only)
Other new options:
  • --metrics-brief - this dumps out only the bogo-op metrics that are relevant for just the tests that were run.
  • --verify - this will sanity check the stress results per iteration to ensure memory operations and CPU computations are working as expected. Hopefully this will catch any errors on a hot machine that has errors in the hardware. 
  • --sequential - this will run all the stress methods one by one (for a default of 60 seconds each) rather than all in parallel.   Use this with the --timeout option to run all the stress methods sequentially each for a specified amount of time (see the example below). 
  • Specifying 0 instances of any stress method will run an instance of the stress method on all online CPUs. 
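
Putting several of these together, a typical invocation (with an arbitrary 30 second per-stressor run time) would be:

 # Run every stress method one by one on all online CPUs, 30 seconds each,
 # verifying results and dumping just the relevant metrics at the end
 stress-ng --sequential 0 --timeout 30 --verify --metrics-brief
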
The tool also builds and runs on Debian kFreeBSD and GNU Hurd kernels, although some stress methods or stress options are not included due to lack of support on these other kernels.
The stress-ng man page gives far more explanation of each stress method and more detailed examples of how to use the tool.

For more details, visit here or read the manual.

Read more
Colin Ian King

Testing eCryptfs

Over the past several months I've been occasionally back-porting a bunch of eCryptfs patches onto older Ubuntu releases.  Each back-ported fix needs to be properly sanity checked and so I've been writing test cases for each one and adding them to the eCryptfs test suite.

To get hold of the test suite, check it out using bzr:

 bzr checkout lp:ecryptfs  
and install the dependencies so one can build the test suite:
 sudo apt-get install debhelper autotools-dev autoconf automake \
intltool libtool libgcrypt11-dev libglib2.0-dev libkeyutils-dev \
libnss3-dev libpam0g-dev pkg-config python-dev swig acl
If you want to test eCryptfs with xfs and btrfs as the lower file system onto which eCryptfs is mounted, then one needs to also install the tools for these:
 sudo apt-get install xfsprogs btrfs-tools  
And then build the test programs:
 cd ecryptfs  
autoreconf -ivf
intltoolize -c -f
./configure --enable-tests --disable-pywrap
To run the tests, one needs to create lower and upper mount points. The tests allow one to create ext2, ext3, ext4, xfs or btrfs loop-back mounted file systems on the lower mount point, and then eCryptfs is mounted on the upper mount point on top.   To create these, use something like:
 sudo mkdir /lower /upper  
The loop-back file system image needs to be placed somewhere too, I generally place mine in a directory /tmp/image, so this needs creating too:
 mkdir /tmp/image  
There are two categories of tests, "safe" and "destructive".  Safe tests should run in such a way as to not lock up the machine.  Destructive tests try hard to force bugs that can cause kernel oopses or panics. One specifies the test category with the -c option.  Now to run the tests, use:
 sudo ./tests/ -K -c safe -b 1000000 -D /tmp/image -l /lower -u /upper  
The -K option tells the test suite to run the kernel specific tests. These are the ones I am generally interested in since I'm testing kernel patches.

The -b option specifies the size in 1K blocks of the loop-back mounted /lower file system size.  I generally use 1000000 blocks as a minimum.

The -D option specifies the path where the temporary loop-back mounted image is kept and the -l and -u options specified the paths of the lower and upper mount points.

By default the tests will use an ext4 lower filesystem. One can also specify which file systems to run the tests on using the -f option; this can be a comma separated list of one or more file systems, for example:
 sudo ./tests/ -K -c safe -b 1000000 -D /tmp/image -l /lower -u /upper \
-f ext2,ext3,ext4,xfs
And also, instead of running a bunch of tests, one can run just a particular test using the -t option:
 sudo ./tests/ -K -c safe -b 1000000 -D /tmp/image -l /lower -u /upper \
-f ext2,ext3,ext4,xfs -t
..which tests the fix for Launchpad bug 926292.
We also run these tests regularly on new kernel images to ensure we don't introduce any regressions.   As it stands, I'm currently adding in tests for each bug fix that we back-port and for most new bugs that require a thorough test. I hope to expand the breadth of the tests to ensure we get better general test coverage.

And finally, thanks to Tyler Hicks for writing the test framework and for his valuable help in describing how to construct a bunch of these tests.

Read more
Colin Ian King

A new Ubuntu portal is a jump-start page containing links to pages and documents useful for Original Design Manufacturers (ODMs), Original Equipment Manufacturers (OEMs) and Independent BIOS Vendors.

Some of the highlights include:

  • A BIOS/UEFI requirements document containing recommendations to ensure firmware is compatible with the Linux kernel.
  • Getting started links describing how to download, install, configure and debug Ubuntu.
  • Links to certified hardware, debugging tools, SystemTap guides, packaging guides, kernel building notes.
  • Debugging tips, covering: hotkeys, suspend/resume, sound, X and wireless, plus an A5 sized Ubuntu Debugging booklet.
  • A link to fwts-live, the Firmware Test Suite live image.

..all in all, lots of useful technical resources to call upon.

Kudos to Chris Van Hoof for organizing this useful portal.

Read more
Colin Ian King

Kernel Early Printk Messages

I've been messing around with the earlyprintk kernel options to allow me to get some form of debug out before the console drivers start later on in the kernel init phase. The earlyprintk kernel option supports debug output via the VGA, serial port and USB debug port.

The USB debug port is of interest - most modern systems seem to provide a debug port capability which allows one to send debug over USB to another machine. To check if your USB controller has this capability, use:

sudo lspci -vvv | grep "Debug port"

and look for a string such as "Capabilities: [58] Debug port: BAR=1 offset=00a0". You may have more than one of these on your system, so make sure you use the correct one.

One selects this mode of earlyprintk debugging using:

 earlyprintk=dbgp
for the default first port, or select the Nth debug enabled port using:

 earlyprintk=dbgpN
One also needs to build a kernel with the following config option enabled:

 CONFIG_EARLY_PRINTK_DBGP=y
On my debug set-up I used a NET20DC-USB Hi-Speed USB 2.0 Host-to-Host Debug Device connecting the target machine and a host with which I capture the USB debug using /dev/ttyUSB0 with minicom. So that I won't bore you with the details, this is all explained in the kernel documentation in Documentation/x86/earlyprintk.txt

As it was, I needed to tweak the earlyprintk driver to put in some delays in the EHCI probing and reset code to get it working on my fairly fast target laptop.
My experience with this approach wasn't great - I had to plug/unplug the debug device quite frequently for the earlyprintk EHCI reset and probe to work. Also, the EHCI USB driver initialisation later on in the kernel initialisation hung, which wasn't useful.

Overall, I may have had problems with the host/target and/or the NET20DC-USB host-to-host device, but it did allow me to get some debug out, albeit rather unreliably.

Probably an easier way to get debug out is just using the boot option:

 earlyprintk=vga
however this has the problem that the messages are eventually overwritten by the real console.

Finally, for anyone with old legacy serial ports on their machine (which is quite unlikely nowadays with newer hardware), one can use:

 earlyprintk=serial,ttySn,baudrate
where ttySn is the nth tty serial port.

One can also append the ",keep" option to not disable the earlyprintk once the real console is up and running.

So, with earlyprintk, there is some chance of being able to get some form of debug out to a device to allow one to debug kernel problems that occur early in the initialisation phase.

Read more
Colin Ian King

Ubuntu Kernel Team Bug Policies

Got an Ubuntu kernel related bug and need help reporting it? Or got an audio bug that's kernel related and want to log the bug? Need advice or hints on how to gather kernel oops messages into a bug report? Or need to figure out how to report a bug upstream?

Well if you need this kind of help, then look no further than the Ubuntu Kernel Bug Policies wiki page. It's got a load of helpful information on kernel bug reporting and also how bug states are recorded from being initially reported, triaged, processed by a kernel developer and fixed.

We hope this will take the pain out of reporting bugs and helping you understand the bug fixing process. Kudos to Leann Ogasawara for this wiki page!

Read more