Canonical Voices

Posts tagged with 'tools'

Colin Ian King

The hwloc (hardware locality) package contains the useful tool lstopo. To install use:

sudo apt-get install hwloc

By default, lstopo will display a logical view of the system caches and CPU cores, for example:


To get a non-graphical output use:

lstopo -

Machine (1820MB) + Socket #0 + L3 #0 (3072KB)
  L2 #0 (256KB) + L1 #0 (32KB) + Core #0
    PU #0 (phys=0)
    PU #1 (phys=2)
  L2 #1 (256KB) + L1 #1 (32KB) + Core #1
    PU #2 (phys=1)
    PU #3 (phys=3)
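
The text output is easy to post-process. As a rough sketch (not part of hwloc), the following Python groups the processing units under their cores, assuming the layout shown above. The function name cores_to_pus is made up for illustration:

```python
import re

# Sample taken from the "lstopo -" output above.
SAMPLE = """\
Machine (1820MB) + Socket #0 + L3 #0 (3072KB)
  L2 #0 (256KB) + L1 #0 (32KB) + Core #0
    PU #0 (phys=0)
    PU #1 (phys=2)
  L2 #1 (256KB) + L1 #1 (32KB) + Core #1
    PU #2 (phys=1)
    PU #3 (phys=3)
"""


def cores_to_pus(text):
    """Map each core number to a list of (logical PU, physical PU) pairs."""
    mapping = {}
    current = None
    for line in text.splitlines():
        core = re.search(r"Core #(\d+)", line)
        if core:
            # A line naming a core starts a new group of PUs.
            current = int(core.group(1))
            mapping[current] = []
            continue
        pu = re.search(r"PU #(\d+) \(phys=(\d+)\)", line)
        if pu and current is not None:
            mapping[current].append((int(pu.group(1)), int(pu.group(2))))
    return mapping


if __name__ == "__main__":
    for core, pus in cores_to_pus(SAMPLE).items():
        print("core", core, "->", pus)
```

This only understands the simple layout above; for anything more elaborate the XML output format is probably a better starting point.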

lstopo is also able to output the topology image in a variety of formats (Xfig, PDF, PostScript, PNG, SVG and XML) by specifying the output filename and extension, e.g.

lstopo topology.pdf

For more information, consult the manual for hwloc and lstopo.


Colin Ian King

hardinfo - look at hardware information in a GUI

Leandro Pereira has created hardinfo, a user-friendly GTK+ utility for browsing the hardware information on one's machine. It also includes a set of benchmarking tools.

To install, simply use:

sudo apt-get install hardinfo

and run with Applications->System Tools->System Profiler and Benchmark or run from the command line using:

hardinfo

The tool presents information from /proc and /sys in a clean and very easy to use GUI. The view is broken into four categories:

Computer: OS, kernel modules, filesystem, display, environment variables, users.
Devices: CPU, memory, PCI, USB, printers, battery, sensors, input, storage, DMI, resources.
Network: Interfaces, IP connections, routing tables, ARP table, DNS servers, statistics, shares.
Benchmarks: Blowfish, CryptoHash, Fibonacci, CPU N-Queens, CPU FFT, FPU Raytracing.

Below are some sample views of my machine using hardinfo:




Quite a handy tool.


Colin Ian King

Perf tool now in Ubuntu Lucid.

A few months ago I blogged about the perf tool and how useful it is to drill down into kernel and user space applications to analyse performance bottlenecks. Well, thanks to Andy Whitcroft, the perf tool is now available in Ubuntu Lucid. Perf needs to be in lock-step with the kernel, which complicated the packaging of this tool.

To install, use:

sudo apt-get install linux-tools

and this installs the perf, perf-stat, perf-top, perf-record, perf-report and perf-list tools.

The author of perf, Ingo Molnar, has written some basic instructions on driving perf in the tools/perf/Documentation/example.txt file. They should give one a feel for how to drive these tools.

For example, to examine google-chrome:

perf record google-chrome

... run a test and then exit, and then generate a summary of activity:


perf report

One can examine specific events in the system too. To get a list of available events use:

perf list

For example, to measure CPU cycles, number of instructions, context switches and kmallocs on google-chrome one uses:

perf stat -e cpu-cycles -e instructions -e context-switches -e kmem:kmalloc google-chrome

Performance counter stats for 'google-chrome':

15576472832 cycles # 0.000 M/sec
13466330277 instructions # 0.865 IPC
40791 context-switches # 0.000 M/sec
38602 kmem:kmalloc # 0.000 M/sec

18.062301718 seconds time elapsed

One more quick example, this time we record a call-graph (stack chain/backtrace) of the dd command:

perf record -g dd if=/dev/zero of=/dev/null bs=1M count=4096

and get the call graph using:

# Samples: 12584
#
# Overhead  Command          Shared Object                    Symbol
# ........  ...............  ...............................  ......
#
    95.96%  dd  [kernel]  [k] __clear_user
            |
            |--99.69%-- read_zero
            |          vfs_read
            |          sys_read
            |          system_call_fastpath
            |          __read
             --0.31%-- [...]

     1.37%  dd  [kernel]  [k] read_zero
     0.87%  dd  [kernel]  [k] _cond_resched
            |
            |--50.91%-- read_zero
            |          vfs_read
            |          sys_read
            |          system_call_fastpath
            |          __read
..etc.

I recommend reading Ingo's example.txt and playing around with this tool. It is very powerful and allows one to drill down and examine system performance right down to the instruction level.


Colin Ian King

Saving power on my HPMini (revisited on Lucid)

A while ago I blogged about power saving on my HPMini. Well, today I'm revisiting this using today's Lucid daily and I'm pleased to see that I can save a small amount of power with Lucid compared to Karmic. However, the difference is small - the inaccuracies of estimating power consumption using ACPI may be more significant.

My steps were as follows:

0. Run powertop for about 15 minutes on battery power and note down the recommended power saving tricks.

1. Blacklist Bluetooth, since I don't use this and it really sucks a load of power. To do so, add the following to /etc/modprobe.d/blacklist.conf

blacklist btusb

2. Enable HDA audio powersaving:

echo 1 > /sys/module/snd_hda_intel/parameters/power_save

3. Increase dirty page writeback time:

echo 1500 > /proc/sys/vm/dirty_writeback_centisecs

4. Disable the webcam driver (not sure exactly if this saves much power), add the following to /etc/modprobe.d/blacklist.conf

blacklist uvcvideo

5. Turn down the screen brightness (this saves 0.3 Watts)

6. Disable desktop effects to save some power used by compositing.

7. Disable cursor blinking on the gnome terminal to save 2 wakeups a second, using:

gconftool-2 --type string --set /apps/gnome-terminal/profiles/Default/cursor_blink_mode off

I managed to push the power consumption down to ~6.5 Watts which is an improvement of nearly 1 Watt from my tests on Karmic. Not bad, but I don't fully trust the data from the battery/ACPI and I really need to see how much extra battery life these tweaks give me when I'm using the machine on my travels.

On an idle machine I'm seeing roughly 99.0% or more C4 residency, which is quite acceptable. Running powertop always leads to insights into saving power, so kudos to Intel for this wonderful utility.


Colin Ian King

HDA Analyzer

The HDA Analyzer tool allows one to look at the raw HD-audio control data in an easy to use GUI. The instructions on how to download and use this tool are described at the HDA Analyzer page - they are as follows:

1. Fetch:


2. Run:

python run.py

And browse...


This certainly takes the pain out of looking at the control information.


Colin Ian King

The FirmWare Test Suite (fwts) is a tool I've been working on to do automatic testing of a PC's firmware. There can be a lot of subtle or vexing Linux kernel/firmware issues caused when firmware is buggy, so it's useful to have a tool that can automatically check for common BIOS and ACPI errors. Where possible the tool will give some form of advice on how to fix or work around firmware issues.

It's packaged in Maverick universe and can be installed using:

sudo apt-get install fwts

To see the tests available in the tool use:

fwts --show-tests

There are over 30 tests and I hope to expand this every time I find new firmware issues which can be diagnosed automatically in a tool.

To run a test, for example the ACPI AML syntax checking test, use:

sudo fwts syntaxcheck

There are categories of tests. For example, by default fwts will run batch mode tests, which run without the need for user intervention. Some tests, such as checking that the laptop lid or hotkeys work, require user intervention - these are interactive tests and can be invoked using:

sudo fwts --interactive

By default the tool will append the test results into a log file called results.log. This logs the date/time the test was run, the name of the test and the test results and hopefully some useful advice if a test fails.

I suggest checking out the manual page to see some examples of how to fully drive this tool.

Quite a lot of the tests have been picked up from the core of linuxfirmwarekit.org, but I've added a bunch more tests, and expanded the types of errors it checks for and the feedback advice it reports. I've targeted fwts to run with the Maverick 2.6.35 kernel but it should work fine on Lucid kernels too. I've written fwts with a command line driven test framework, mainly to allow fwts to plug easily into more powerful test frameworks.

If you want to run the tool from a bootable USB flash key, you can download an i386 or amd64 image and dd it to a USB flash key.

For example:

wget http://kernel.ubuntu.com/~kernel-ppa/testing/maverick-desktop-i386-fwts.img
sudo dd if=maverick-desktop-i386-fwts.img of=/dev/sdX

where /dev/sdX is the block device of your USB flash key

then boot off this USB flash key and let it run the tests. At the end it will automatically shut down the PC and you can then remove the key. The key has a FAT formatted partition containing the results of the test in a file named fwts/ddmmyyyy/hhmm/results.log, where ddmmyyyy are the digits of the date and hhmm the time the test was started.
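
To make the naming scheme concrete, here is a small Python sketch that builds the results path for a given start time. It is only an illustration of the ddmmyyyy/hhmm layout described above, not code from fwts, and the helper name results_path is hypothetical:

```python
from datetime import datetime


def results_path(started):
    """Build fwts/ddmmyyyy/hhmm/results.log for the time a run started."""
    # %d%m%Y gives ddmmyyyy and %H%M gives hhmm (24 hour clock assumed).
    return started.strftime("fwts/%d%m%Y/%H%M/results.log")


if __name__ == "__main__":
    print(results_path(datetime(2010, 8, 5, 14, 7)))
```

Whether the image uses a 24 hour clock is an assumption here; the post only says hhmm.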

The fwts PPA can be found in the Firmware Testing Team project and the source code is available in a git repository here.

I've also written a short OpenOffice presentation on the tool which also may prove instructive.


Colin Ian King

Looking at a PC's option ROMs

An Option ROM is firmware called by the BIOS, for example the Video BIOS or the firmware on a SCSI controller card. The BIOS Boot Specification requires that option ROMs are aligned to 2K boundaries and begin with the bytes 0x55 0xaa. After Power On Self Test (POST) the BIOS scans for option ROMs and, if it finds a valid header, executes the initialisation code at byte offset 0x3.

The Option ROM header is as follows:

Byte 0: 0x55
Byte 1: 0xaa
Byte 2: size of ROM in 512 byte pages
Byte 3: Option ROM entry point
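
The header layout above can be checked with a few lines of Python. This is a minimal sketch that takes the start of a ROM image as bytes; the function name parse_option_rom_header is invented for this example:

```python
def parse_option_rom_header(data):
    """Return (size_in_bytes, entry_offset), or None if the signature is wrong."""
    if len(data) < 4 or data[0] != 0x55 or data[1] != 0xAA:
        return None
    size_bytes = data[2] * 512  # byte 2 counts 512 byte pages
    entry_offset = 3            # initialisation code starts at byte 3
    return size_bytes, entry_offset


if __name__ == "__main__":
    # A made-up header: signature, 0x80 pages (64K), then the first entry byte.
    print(parse_option_rom_header(bytes([0x55, 0xAA, 0x80, 0xEB])))
```

To try it on a real image, read the first few bytes of a dumped ROM file in binary mode and pass them in.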

To dump out the option ROMs, use the ree utility:

sudo apt-get install ree

..and run as follows:

sudo ree

..this will dump out all the option ROMs in the form: hexaddress.rom where hexaddress is the memory segment where the ROM is located. e.g. for my Video BIOS ROM, I get the file c0000.rom.

To disassemble this use ndisasm:

sudo apt-get install nasm

ndisasm -k 0,3 c0000.rom | less

..or just use strings on the ROM image to get an idea what the Option ROM is, e.g.

strings c0000.rom
000000000000
00IBM VGA Compatible BIOS.
PCIR
(00`
*@0p
H?@0b
..

..in this case it is a VGA BIOS, and this makes sense as VGA BIOS ROMs normally start from segment c0000. On my Lenovo laptop I observe that the ROM contains the string "DECOMPILATION OR DISASSEMBLY PROHIBITED" which spoils our fun in finding out what the ROM is doing...

Anyhow, ree + ndisasm are useful tools for poking around your PC's option ROMs on Linux.


Colin Ian King

Debugging ACPI using acpiexec

The acpiexec tool is an AML emulator that allows one to execute and interactively debug ACPI AML code from your BIOS. The tarball can be downloaded from the ACPICA website and built as follows:

1. Unzip and untar the acpica-unix-20100304.tar.gz tarball.
2. cd into tools/acpiexec
3. run make

This should build acpiexec. Now for the fun part - executing your ACPI inside the emulator. To do this grab your ACPI tables and extract them using:

sudo acpidump > acpi.info && acpixtract -a acpi.info

Now load these tables into the emulator and run with verbose mode:

./acpiexec -v *.dat

Inside the emulator you can type help to navigate around the help system.  It may take a little bit of work to get familiar with all the commands available.

As a quick introduction, here is how to execute the battery information _BIF method.

1. Get a list of all the available methods, type:

methods

on my Lenovo laptop the battery information method is labelled \_SB_.PCI0.LPCB.BAT1._BIF, so to execute this method I use:

execute \_SB_.PCI0.LPCB.BAT1._BIF
Executing \_SB_.PCI0.LPCB.BAT1._BIF
Execution of \_SB_.PCI0.LPCB.BAT1._BIF returned object 0x19669d0 Buflen 178
  [Package] Contains 13 Elements:
    [Integer] = 0000000000000001
    [Integer] = 0000000000000FA0
    [Integer] = 0000000000000FA0
    [Integer] = 0000000000000001
    [Integer] = 0000000000002B5C
    [Integer] = 00000000000001A4
    [Integer] = 000000000000009C
    [Integer] = 0000000000000108
    [Integer] = 0000000000000EC4
    [String] Length 08 = PA3465U
    [String] Length 05 = 3658Q
    [String] Length 06 = Li-Ion
    [String] Length 07 = COMPAL
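
The 13 elements are easier to read with names attached. The sketch below labels them using the field order of the _BIF package as I understand it from the ACPI specification; that ordering, and the helper name decode_bif, are assumptions on my part rather than something acpiexec provides:

```python
import re

# Field order of the _BIF package, per the ACPI specification (assumed).
BIF_FIELDS = [
    "Power Unit", "Design Capacity", "Last Full Charge Capacity",
    "Battery Technology", "Design Voltage", "Design Capacity of Warning",
    "Design Capacity of Low", "Battery Capacity Granularity 1",
    "Battery Capacity Granularity 2", "Model Number", "Serial Number",
    "Battery Type", "OEM Information",
]

# Package contents copied from the acpiexec output above.
SAMPLE = """\
  [Package] Contains 13 Elements:
    [Integer] = 0000000000000001
    [Integer] = 0000000000000FA0
    [Integer] = 0000000000000FA0
    [Integer] = 0000000000000001
    [Integer] = 0000000000002B5C
    [Integer] = 00000000000001A4
    [Integer] = 000000000000009C
    [Integer] = 0000000000000108
    [Integer] = 0000000000000EC4
    [String] Length 08 = PA3465U
    [String] Length 05 = 3658Q
    [String] Length 06 = Li-Ion
    [String] Length 07 = COMPAL
"""


def decode_bif(output):
    """Pair each package element with its _BIF field name."""
    values = []
    for line in output.splitlines():
        m = re.match(r"\s*\[(Integer|String)\].*= (.*)$", line)
        if not m:
            continue  # skips the "[Package] Contains ..." line
        kind, raw = m.groups()
        # Integers are printed in hex; strings are kept as they are.
        values.append(int(raw, 16) if kind == "Integer" else raw.strip())
    return dict(zip(BIF_FIELDS, values))


if __name__ == "__main__":
    for name, value in decode_bif(SAMPLE).items():
        print("%-32s %s" % (name, value))
```

If the ordering assumption holds, the battery above reports a design capacity of 4000 and a design voltage of 11100; with a Power Unit of 1 the specification gives these in mAh and mV.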

So far so good. I single stepped through the code using the debug command on the method as follows:

debug \_SB_.PCI0.LPCB.BAT1._BIF

At each % prompt, one can press enter to step to the next instruction. If the method requires arguments, these can be passed into the method by specifying them after the method name in the debug command.

To see any local variables used during execution, use the locals command. The list command lists the current AML instructions. The set command allows one to set method data and interact with the debugging process.

Hopefully this gives one a taste of what the emulator can do. The internal help is enough to get one up and running, and one does generally require the current ACPI specification to figure out what's happening in your ACPI tables.


Colin Ian King

inotail - tail clone using inotify

I regularly use the tail command to look at the end of log files - generally I use the -f flag to follow (monitor and display) new entries that are appended to the log.

Tail will poll every second to see if the file size is changed and will act if data has been appended or the file has been truncated, and the poll interval is adjustable.  However, polling a file is plain ugly, it causes extra wakeup events and if one wants very fast updates then polling is not a good solution.

Hence, a better tool to use is inotail when following a file. Inotail uses inotify to only wake up inotail when a file has been modified, deleted or moved - hence no polling is required.  This also means inotail will output the changes to a file almost immediately - unlike tail which by default may wait almost a second before detecting a change.
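
For contrast, here is roughly what the polling approach looks like. This is a simplified Python sketch of the idea, not how tail is actually implemented: each poll compares the file size with the last offset, reads any appended data and restarts from the top if the file shrank. The names poll_once and follow are made up for this example:

```python
import os
import sys
import time


def poll_once(path, offset):
    """Return (new_bytes, new_offset) for a single poll of the file."""
    size = os.path.getsize(path)
    if size < offset:
        # The file shrank, so assume it was truncated and start from the top.
        offset = 0
    if size == offset:
        return b"", offset
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
        return data, f.tell()


def follow(path, interval=1.0):
    """Poll the file forever, printing appended data (like tail -f)."""
    offset = os.path.getsize(path)
    while True:
        data, offset = poll_once(path, offset)
        if data:
            sys.stdout.write(data.decode(errors="replace"))
            sys.stdout.flush()
        # The process wakes up every interval even when nothing has changed.
        time.sleep(interval)
```

The sleep at the bottom of follow is the wakeup that inotify avoids: an inotify based tool blocks until the kernel reports a change.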

To install inotail use:


sudo apt-get install inotail

..and use it the same way as tail.

UPDATE: Pete Graner has informed me that the tailf command is also an inotify based replacement for tail -f. Just goes to show there are many ways of doing the same thing...


Colin Ian King

A little-used tool from the coreboot project is inteltool. This tool is useful for dumping the configuration space of Intel CPUs and for examining the Northbridge and Southbridge settings.

The Debian package can be found in universe, and can be installed using:

sudo apt-get install inteltool

..and has to be run using sudo.

At the simplest level, it can be run to get CPU and Northbridge/Southbridge version information:

$ sudo inteltool
Intel CPU: Family 6, Model f
Intel Northbridge: 8086:2a00 (PM965)
Intel Southbridge: 8086:2815 (ICH8-M)

However, one can drill down a bit, for example to get a dump of the GPIO settings, use the -g flag, e.g.:

$ sudo inteltool -g
Intel CPU: Family 6, Model f
Intel Northbridge: 8086:2a00 (PM965)
Intel Southbridge: 8086:2815 (ICH8-M)

============= GPIOS =============

GPIOBASE = 0x1180 (IO)

gpiobase+0x0000: 0x99541d02 (GPIO_USE_SEL)
gpiobase+0x0004: 0xe0fa7fc2 (GP_IO_SEL)
gpiobase+0x0008: 0x00000000 (RESERVED)
gpiobase+0x000c: 0xe1aa5fc7 (GP_LVL)
gpiobase+0x0010: 0x00000000 (GPIO_USE_SEL Override (LOW))
gpiobase+0x0014: 0x00000000 (RESERVED)
gpiobase+0x0018: 0x00000000 (GPO_BLINK)
gpiobase+0x001c: 0x00000000 (GP_SER_BLINK)
gpiobase+0x0020: 0x00080000 (GP_SB_CMDSTS)
gpiobase+0x0024: 0x00000000 (GP_SB_DATA)
gpiobase+0x0028: 0x00000000 (RESERVED)
gpiobase+0x002c: 0x00001900 (GPI_INV)
gpiobase+0x0030: 0x00000146 (GPIO_USE_SEL2)
gpiobase+0x0034: 0x00540f70 (GP_IO_SEL2)
gpiobase+0x0038: 0x00540f74 (GP_LVL2)
gpiobase+0x003c: 0x00000000 (GPIO_USE_SEL Override (HIGH))
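
The register values are bitmasks with one bit per GPIO pin. The Python sketch below picks the first three registers apart, assuming the convention I remember from the Intel ICH datasheets: a set bit in GPIO_USE_SEL selects GPIO (rather than native) mode, a set bit in GP_IO_SEL marks the pin as an input, and GP_LVL holds the pin level. Check the datasheet for your chipset before relying on this; the helper names are invented:

```python
def set_bits(value, width=32):
    """Return the positions of the bits that are set in value."""
    return [bit for bit in range(width) if value & (1 << bit)]


def describe_gpio(use_sel, io_sel, lvl):
    """Map each pin in GPIO mode to (direction, level).

    Assumes: GPIO_USE_SEL bit set = GPIO mode, GP_IO_SEL bit set = input.
    """
    pins = {}
    for pin in set_bits(use_sel):
        direction = "input" if io_sel & (1 << pin) else "output"
        level = 1 if lvl & (1 << pin) else 0
        pins[pin] = (direction, level)
    return pins


if __name__ == "__main__":
    # GPIO_USE_SEL, GP_IO_SEL and GP_LVL values from the dump above.
    for pin, (direction, level) in describe_gpio(
            0x99541d02, 0xe0fa7fc2, 0xe1aa5fc7).items():
        print("GPIO%-2d %-6s level %d" % (pin, direction, level))
```

This covers only the first bank of 32 pins; the GPIO_USE_SEL2, GP_IO_SEL2 and GP_LVL2 registers would need the same treatment for the second bank.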

There are other options to dump out the RCBA, Power Management, Memory Controller, EPBAR, DMIBAR and PCIEXBAR registers as well as the CPU MSRs.

It's an excellent utility for digging into the configuration of any Intel PC.


Colin Ian King

itop - look at top interrupt activity

When I want to see interrupt activity on a Linux box I normally use the following rune:


watch -n 1 cat /proc/interrupts

..which simply dumps out the interrupt information every second.  Another way is to use itop, which has the benefit that it outputs the interrupt rate which is lacking from my rune above.

To install, use:

sudo apt-get install itop

and to run, use:

itop

The output is refreshed every second, and outputs something like the following:

INT                NAME          RATE             MAX
  0 [PIC-edge      time]   154 Ints/s     (max:   154)
 14 [PIC-edge      ata_]    16 Ints/s     (max:    16)
 18 [PIC-fasteoi   ehci]     4 Ints/s     (max:     4)
 22 [PIC-fasteoi   ohci]     2 Ints/s     (max:     2)
 29 [MSI-edge      i915]    38 Ints/s     (max:    38)
 30 [MSI-edge      iwl3]     6 Ints/s     (max:     6)

Unfortunately itop does truncate the interrupt names, but I'm not so worried about this - I generally want to see very quickly if a machine is suffering from interrupt saturation or is missing interrupts, which I can get from itop easily.

To see all interrupts, run itop with:

itop -a

And to run for a number of iterations, run with the -n flag, e.g.

itop -n 10


Colin Ian King

blktrace graphs revisited

I blogged about seekwatcher a couple of days ago - it's great for producing charts from blktrace - however, I wasn't totally happy with the graph output. I wanted to be able to graph the data based on a tunable sample rate. So I've hacked up a script using awk and gnuplot to do this.

To run, suppose I've generated blktrace files readtest.blktrace.0 and readtest.blktrace.1 and I want a graph generated with 0.25 seconds per point, then do:

./generate-read-graph blktrace readtest.blktrace. 0.25

This generates a gnuplot file graph-reads.png, for example:



Now I'm happier with these results.
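
The core of the idea is just bucketing events by time. My script does this in awk; the Python below is a stand-in sketch of the same technique, not the script itself, and it assumes the events have already been reduced to (timestamp, number of blocks) pairs:

```python
from collections import defaultdict


def bucket_blocks(events, interval):
    """Sum block counts into buckets of interval seconds.

    events is an iterable of (timestamp_in_seconds, block_count) pairs.
    Returns a sorted list of (bucket_start_time, total_blocks).
    """
    buckets = defaultdict(int)
    for timestamp, nblocks in events:
        buckets[int(timestamp // interval)] += nblocks
    return [(index * interval, buckets[index]) for index in sorted(buckets)]


if __name__ == "__main__":
    # Hypothetical read events, bucketed at 0.25 seconds per point.
    events = [(0.0, 8), (0.1, 8), (0.3, 16), (0.9, 8)]
    for start, total in bucket_blocks(events, 0.25):
        print(start, total)
```

Each output pair is one point on the graph, so a smaller interval gives a finer but noisier plot.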


Colin Ian King

The perf tool by Ingo Molnar allows one to do some deep performance analysis using Linux performance counters. It covers a broad range of performance monitoring at the hardware level and software level. In this blog posting I just want to give you a taste of some of the ways to use this powerful tool.

One needs to build this from the kernel source, but it's fairly easy to do:

1) Install libelf-dev, on an Ubuntu system use:

sudo apt-get install libelf-dev

2) Get the kernel source

either from kernel.org or from Ubuntu kernel source package:

apt-get source linux-image-2.6.31-14-generic

3) ..and build the tool..

in the kernel source:

cd tools/perf
make

There is plenty of documentation on this tool in the tools/perf/Documentation directory and I recommend reading this to get a full appreciation of what the tool can do and how to drive it.

My first example is a trivial performance counter example on the dd command:

./perf stat dd if=/dev/zero of=/dev/null bs=1M count=4096
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 0.353498 s, 12.1 GB/s

Performance counter stats for 'dd if=/dev/zero of=/dev/null bs=1M count=4096':

355.148424 task-clock-msecs # 0.998 CPUs
18 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
501 page-faults # 0.001 M/sec
899141721 cycles # 2531.735 M/sec
2212730050 instructions # 2.461 IPC
67433134 cache-references # 189.873 M/sec
6374 cache-misses # 0.018 M/sec

0.355829317 seconds time elapsed
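
The figures after the # signs can be reproduced from the raw counters. This Python sketch assumes that IPC is instructions divided by cycles, that the M/sec column is the count divided by the task-clock time, and that the CPUs figure is task-clock time over elapsed time. The numbers do work out that way, but the helper itself is hypothetical and not part of perf:

```python
def derived_metrics(task_clock_msecs, elapsed_secs, cycles, instructions):
    """Recompute the derived columns of perf stat from raw counters."""
    ipc = instructions / cycles
    # cycles per millisecond divided by 1000 gives millions per second
    mcycles_per_sec = cycles / task_clock_msecs / 1000.0
    cpus_used = task_clock_msecs / (elapsed_secs * 1000.0)
    return ipc, mcycles_per_sec, cpus_used


if __name__ == "__main__":
    # Raw counters from the dd run above.
    ipc, mhz, cpus = derived_metrics(355.148424, 0.355829317,
                                     899141721, 2212730050)
    print("IPC %.3f, %.3f M cycles/sec, %.3f CPUs" % (ipc, mhz, cpus))
```

An IPC well above 1, as here, means the CPU is retiring several instructions per cycle on this workload.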


But we can dig deeper than this. How about seeing what's really going on in the application and the kernel? The next command records stats into a file perf.data and we can then examine these stats using perf report:

./perf record -f dd if=/dev/urandom of=/dev/null bs=1M count=16
16+0 records in
16+0 records out
16777216 bytes (17 MB) copied, 2.39751 s, 7.0 MB/s
[ perf record: Captured and wrote 1.417 MB perf.data (~61900 samples) ]

..and generate a report on the significant CPU consuming functions:

./perf report --sort comm,dso,symbol | grep -v "0.00%"
# Samples: 61859
#
# Overhead Command Shared Object Symbol
# ........ ....... ......................... ......
#
75.52% dd [kernel] [k] sha_transform
14.07% dd [kernel] [k] mix_pool_bytes_extract
3.38% dd [kernel] [k] extract_buf
2.33% dd [kernel] [k] copy_user_generic_string
1.36% dd [kernel] [k] __ticket_spin_lock
0.90% dd [kernel] [k] _spin_lock_irqsave
0.72% dd [kernel] [k] _spin_unlock_irqrestore
0.67% dd [kernel] [k] extract_entropy_user
0.27% dd [kernel] [k] default_spin_lock_flags
0.22% dd [kernel] [k] sha_init
0.11% dd [kernel] [k] __ticket_spin_unlock
0.08% dd [kernel] [k] copy_to_user
0.04% perf [kernel] [k] copy_user_generic_string
0.02% dd [kernel] [k] clear_page_c
0.01% perf [kernel] [k] memset_c
0.01% dd [kernel] [k] page_fault
0.01% dd /lib/libc-2.10.1.so [.] 0x000000000773f6
0.01% perf [kernel] [k] __ticket_spin_lock
0.01% dd [kernel] [k] native_read_tsc
0.01% dd /lib/libc-2.10.1.so [.] strcmp
0.01% perf [kernel] [k] kmem_cache_alloc
0.01% perf [kernel] [k] __block_commit_write
0.01% perf [kernel] [k] ext4_do_update_inode

..showing us where most of the CPU time is being consumed, down to the function names in the kernel, application and shared libraries.

One can drill down deeper: in the previous example strcmp() was using 0.01% of the CPU, and perf annotate shows where:

./perf annotate strcmp
objdump: 'vmlinux': No such file

------------------------------------------------
Percent | Source code & Disassembly of vmlinux
------------------------------------------------
------------------------------------------------
Percent | Source code & Disassembly of libc-2.10.1.so
------------------------------------------------
:
:
:
: Disassembly of section .text:
:
: 000000000007ee20 :
50.00 : 7ee20: 8a 07 mov (%rdi),%al
0.00 : 7ee22: 3a 06 cmp (%rsi),%al
25.00 : 7ee24: 75 0d jne 7ee33
25.00 : 7ee26: 48 ff c7 inc %rdi
0.00 : 7ee29: 48 ff c6 inc %rsi
0.00 : 7ee2c: 84 c0 test %al,%al
0.00 : 7ee2e: 75 f0 jne 7ee20
0.00 : 7ee30: 31 c0 xor %eax,%eax
0.00 : 7ee32: c3 retq
0.00 : 7ee33: b8 01 00 00 00 mov $0x1,%eax
0.00 : 7ee38: b9 ff ff ff ff mov $0xffffffff,%ecx
0.00 : 7ee3d: 0f 42 c1 cmovb %ecx,%eax
0.00 : 7ee40: c3 retq

Without the debug info in the object code, just the annotated assembler is displayed.

To see which events one can trace with, use the perf list command:

./perf list

List of pre-defined events (to be used in -e):

cpu-cycles OR cycles [Hardware event]
instructions [Hardware event]
cache-references [Hardware event]
cache-misses [Hardware event]
branch-instructions OR branches [Hardware event]
branch-misses [Hardware event]
bus-cycles [Hardware event]

cpu-clock [Software event]
task-clock [Software event]
page-faults OR faults [Software event]
minor-faults [Software event]
major-faults [Software event]
context-switches OR cs [Software event]
cpu-migrations OR migrations [Software event]

L1-dcache-loads [Hardware cache event]
L1-dcache-load-misses [Hardware cache event]
...
...
sched:sched_migrate_task [Tracepoint event]
sched:sched_process_free [Tracepoint event]
sched:sched_process_exit [Tracepoint event]
sched:sched_process_wait [Tracepoint event]
sched:sched_process_fork [Tracepoint event]
sched:sched_signal_send [Tracepoint event]


On my system I have over 120 different types of events that can be monitored. One can select events to monitor using the -e event option, e.g.:

./perf stat -e L1-dcache-loads -e instructions dd if=/dev/zero of=/dev/null bs=1M count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.000784806 s, 1.3 GB/s

Performance counter stats for 'dd if=/dev/zero of=/dev/null bs=1M count=1':

1166059 L1-dcache-loads # inf M/sec
4283970 instructions # inf IPC

0.003090599 seconds time elapsed


This is one powerful tool! I recommend reading the documentation and trying it out for yourself on a 2.6.31 kernel.

References: http://lkml.org/lkml/2009/8/4/346


Martin Pool

mcrepogen


Neil Martinsen-Burrell just announced MCREPOGEN, a tool to generate random version control histories for Bazaar or anything that can read the fastimport format.

It uses a Markov Chain model where the states are directory trees and various changes to the tree have associated probabilities. The intent is that by giving complete control over the characteristics of the history, performance testing of different aspects of VCS can be improved.

Colin Ian King

Today I stumbled on the wonders of e2freefrag, a tool for reporting the free space fragmentation on ext[2-4] filesystems. It scans the block bitmap data and reports the amount of free blocks present in terms of free contiguous blocks and also aligned free space.

$ sudo e2freefrag  /dev/sda1
Device: /dev/sda1
Blocksize: 1024 bytes
Total blocks: 489951
Free blocks: 242944 (49.6%)

Min. free extent: 1 KB
Max. free extent: 7676 KB
Avg. free extent: 2729 KB

HISTOGRAM OF FREE EXTENT SIZES:
Extent Size Range :  Free extents   Free Blocks  Percent
    1K...    2K-  :             4             4    0.00%
    2K...    4K-  :            30            90    0.04%
    4K...    8K-  :             5            26    0.01%
  128K...  256K-  :             2           301    0.12%
  256K...  512K-  :             1           340    0.14%
  512K... 1024K-  :             6          4540    1.87%
    1M...    2M-  :             5          8141    3.35%
    2M...    4M-  :             5         14751    6.07%
    4M...    8M-  :            31        214751   88.40%


From this one can get an idea of the level of free space fragmentation in your filesystem.
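
The histogram can be cross-checked against the summary lines. The Python sketch below parses the rows and recomputes the totals; it is only an illustration, the function name is made up, and the reading that the average extent is free blocks divided by the number of free extents is my assumption (it does match the figures above):

```python
# Histogram rows copied from the e2freefrag output above.
HISTOGRAM = """\
    1K...    2K-  :             4             4    0.00%
    2K...    4K-  :            30            90    0.04%
    4K...    8K-  :             5            26    0.01%
  128K...  256K-  :             2           301    0.12%
  256K...  512K-  :             1           340    0.14%
  512K... 1024K-  :             6          4540    1.87%
    1M...    2M-  :             5          8141    3.35%
    2M...    4M-  :             5         14751    6.07%
    4M...    8M-  :            31        214751   88.40%
"""


def parse_histogram(text):
    """Return a list of (size range, free extents, free blocks, percent)."""
    rows = []
    for line in text.splitlines():
        if ":" not in line or "%" not in line:
            continue
        size_range, _, rest = line.partition(":")
        fields = rest.split()
        if len(fields) != 3:
            continue
        rows.append((size_range.strip(), int(fields[0]), int(fields[1]),
                     float(fields[2].rstrip("%"))))
    return rows


if __name__ == "__main__":
    rows = parse_histogram(HISTOGRAM)
    extents = sum(r[1] for r in rows)
    blocks = sum(r[2] for r in rows)
    print("free extents:", extents)
    print("free blocks:", blocks)
    # Blocksize is 1024 bytes, so blocks and KB are the same here.
    print("average extent (KB):", blocks // extents)
```

The free blocks in the histogram add up to the "Free blocks" figure in the summary, and nearly all of the free space sits in the largest size range, so this filesystem is not badly fragmented.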


Colin Ian King

Previously I blogged about blktrace and how it can be used to analyse block I/O operations - however, it can generate a lot of data that can be overwhelming. This is where Chris Mason's Seekwatcher tool comes to the rescue. Seekwatcher uses blktrace data to generate graphs to help one visualise and understand I/O patterns. It allows one to plot multiple blktrace runs together to enable easy comparison between benchmarking test runs.

It requires matplotlib, python and the numpy module - on Ubuntu download and install these packages using:

sudo apt-get install python python-matplotlib python-numpy

and then get the seekwatcher source and extract seekwatcher from the source package and you are ready to run the seekwatcher python script.

Seekwatcher can also generate animations of I/O patterns, which further improves visualisation and understanding of I/O operations over time.

To use seekwatcher, first start a blktrace capture:

blktrace -o trace -d /dev/sda

next kick off the test you want to analyse and when that's complete, kill blktrace. Next run seekwatcher on the blktrace output:

seekwatcher -t trace.blktrace -o output.png

..and this generates a png file output.png. Easy!

Attached is the output from a test I just ran on my HP Mini 1000 starting up the Open Office word processor:

One can generate a movie from the same data using:

seekwatcher -t trace.blktrace -o open-office.mpg --movie

The generated movie is below:





There are more instructions on other ways to use seekwatcher on the seekwatcher webpage. All in all, a very handy tool - kudos to Chris Mason.


Colin Ian King

blktrace is a really useful tool to see what I/O operations are going on inside the Linux block I/O layer. So what does blktrace provide?

  • plenty of block layer information on I/O operations
  • very low level (2%) overhead when tracing
  • highly configurable - trace I/O on one or several devices, selectable filter events
  • live and playback tracing

There are many different types of events that can be captured, for example, I/O merges, request re-queues, requests to the underlying block device, I/O split/bounces, request completions and more besides.

On the user-space side, two tools are used: blktrace, which is the event extraction utility, and blkparse, which takes the event data and turns it into human readable output.

Typically, one uses the tools as follows:

sudo blktrace -d /dev/sda -o - | blkparse -i -

And this will dump out data as follows:

8,0 0 1 0.000000000 1245 A W 25453095 + 8 <- (8,1) 25453032
8,0 0 2 0.000002374 1245 Q W 25453095 + 8 [postgres]
8,0 0 3 0.000010616 1245 G W 25453095 + 8 [postgres]
8,0 0 4 0.000018228 1245 P N [postgres]
8,0 0 5 0.000023397 1245 I W 25453095 + 8 [postgres]
8,0 0 6 0.000034222 1245 A W 25453103 + 8 <- (8,1) 25453040
8,0 0 7 0.000035968 1245 Q W 25453103 + 8 [postgres]
8,0 0 8 0.000040368 1245 M W 25453103 + 8 [postgres]


The 1st column shows the device major,minor tuple, e.g. (8,0). The 2nd column shows the CPU number. The 3rd column shows the sequence number. The 4th column is the time stamp, which as you can see has a fairly high resolution. The 5th column is the PID of the process issuing the I/O request (in this example, 1245, the PID of postgres). The 6th column shows the event type, e.g. 'A' means a remapping from device (8,1) /dev/sda1 to device (8,0) /dev/sda; refer to the "ACTION IDENTIFIERS" section in the blkparse man page for more details on this field. The 7th column is R for Read, W for Write, D for block discard, B for Barrier operation. The next field is the block number and the following + number is the number of blocks requested. The final field between the [ ] brackets is the name of the process issuing the request.
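
The column layout can be turned into a small parser. This is a rough Python sketch for the default output format shown above only; the function name and dictionary keys are invented, and blkparse lines in other formats will not fit it:

```python
def parse_blkparse_line(line):
    """Split one default-format blkparse line into named fields."""
    parts = line.split()
    major, minor = parts[0].split(",")
    record = {
        "device": (int(major), int(minor)),
        "cpu": int(parts[1]),
        "seq": int(parts[2]),
        "time": float(parts[3]),
        "pid": int(parts[4]),
        "action": parts[5],
        "rwbs": parts[6],
        "sector": None,
        "nblocks": None,
        "process": None,
    }
    rest = parts[7:]
    # "<block> + <count>" is present for most events, but not for all.
    if len(rest) >= 3 and rest[1] == "+":
        record["sector"] = int(rest[0])
        record["nblocks"] = int(rest[2])
    # The process name, when present, is the last field in [ ] brackets.
    if rest and rest[-1].startswith("[") and rest[-1].endswith("]"):
        record["process"] = rest[-1][1:-1]
    return record


if __name__ == "__main__":
    line = "8,0 0 2 0.000002374 1245 Q W 25453095 + 8 [postgres]"
    print(parse_blkparse_line(line))
```

Records like these are a convenient starting point for filtering by action or summing blocks per process.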

When one wants to stop tracing, hit control-C and a summary of the I/O operations is provided, e.g.:

CPU0 (8,0):
Reads Queued: 0, 0KiB Writes Queued: 128, 504KiB
Read Dispatches: 0, 0KiB Write Dispatches: 22, 508KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 32, 576KiB
Read Merges: 0, 0KiB Write Merges: 105, 420KiB
Read depth: 12 Write depth: 2
PC Reads Queued: 0, 0KiB PC Writes Queued: 0, 0KiB
PC Read Disp.: 9, 0KiB PC Write Disp.: 0, 0KiB
PC Reads Req.: 0 PC Writes Req.: 0
PC Reads Compl.: 0 PC Writes Compl.: 32
IO unplugs: 18 Timer unplugs: 0
CPU1 (8,0):
Reads Queued: 0, 0KiB Writes Queued: 16, 56KiB
Read Dispatches: 0, 0KiB Write Dispatches: 2, 52KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB
Read Merges: 0, 0KiB Write Merges: 11, 44KiB
Read depth: 12 Write depth: 2
PC Reads Queued: 0, 0KiB PC Writes Queued: 0, 0KiB
PC Read Disp.: 3, 0KiB PC Write Disp.: 0, 0KiB
PC Reads Req.: 0 PC Writes Req.: 0
PC Reads Compl.: 0 PC Writes Compl.: 0
IO unplugs: 6 Timer unplugs: 0

Total (8,0):
Reads Queued: 0, 0KiB Writes Queued: 144, 560KiB
Read Dispatches: 0, 0KiB Write Dispatches: 24, 560KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 32, 576KiB
Read Merges: 0, 0KiB Write Merges: 116, 464KiB
PC Reads Queued: 0, 0KiB PC Writes Queued: 0, 0KiB
PC Read Disp.: 12, 0KiB PC Write Disp.: 0, 0KiB
PC Reads Req.: 0 PC Writes Req.: 0
PC Reads Compl.: 0 PC Writes Compl.: 32
IO unplugs: 24 Timer unplugs: 0

Throughput (R/W): 0KiB/s / 537KiB/s
Events (8,0): 576 entries
Skips: 0 forward (0 - 0.0%)


As tools go, this one is excellent for in-depth understanding of block I/O operations inside the kernel. I am sure it has many different applications and it's well worth playing with this tool to get familiar with all of the features provided. The ability to filter specific events allows one to focus and drill down on specific types of I/O operations without being buried by tracing output overload.

Jens Axboe, the block layer maintainer, developed and maintains blktrace. Alan D. Brunelle has contributed a lot of extra functionality - I recommend reading Brunelle's user guide to get started and also the blktrace paper, which contains a lot more in-depth instruction on how to use this tool. The blktrace and blkparse manual pages provide more details on how to use the tools, but I'd recommend eyeballing Brunelle's user guide first.


Colin Ian King

There are times when I'm looking at wifi problems and I use iwconfig to check out link quality, signal level and noise level information. I did get into the habit of using the following rune:

watch -n 1 iwconfig wlan1

..to check out the status on my wifi interface wlan1.

There's always a tool out there that does the same kind of thing, but in a more presentable form - in this case it's wavemon.

To install use:

sudo apt-get install wavemon

and using it is as simple as:

wavemon

or one can specify the interface using:

wavemon -i interfacename

It's a minimally interactive, screen based curses application, and one can switch between display modes using the function keys. Pretty straightforward and easy to use.


One can configure the refresh rate to poll the stats at whatever frequency one desires. Not bad at all.


Colin Ian King

Paul Warren's iftop tool provides a quick-n-easy way of seeing the top bandwidth hogging connections on a specified network interface. Just like other top-like tools, it's a text console based tool.

To install, use:

sudo apt-get install iftop

To monitor traffic on my wifi interface wlan1, I used:

sudo iftop -i wlan1

..note that specifying more than one interface causes it to segfault :-(

It has a few interactive commands, press 'h' to get help, and 'h' again to toggle back out of the help screen. For example, the 't' key allows one to cycle through send/receive display modes, and the 'p' key toggles on/off the port number on the connection display. Naturally, the 'q' key quits the application.


The bottom of the screen displays a summary of the TX/RX statistics including cumulative totals and peak transfer rates. All in all quite a helpful little tool.
