Canonical Voices

Posts tagged with 'disk'

Colin Ian King

I've been fortunate to get my hands on an Intel® 520 2.5" 240GB Solid State Drive, so I thought I'd put it through some relatively simple tests to see how well it performs.

Power Consumption


My first round of tests involved seeing how well it performs in terms of power consumption compared to a typical laptop spinny Hard Disk Drive. I rigged up a Lenovo X220i (i3-2350M @ 2.30GHz) running Ubuntu Precise 12.04 LTS (x86-64) to a Fluke 8846A precision digital multimeter and then compared the SSD with a 320GB Seagate ST320LT020-9YG142 HDD using some simple I/O tests. Each test scenario was run 5 times and I based my results on the average of these 5 runs.

The Intel® 520 2.5" SSD fits into conventional drive bays but comes with a black plastic shim attached to one side that first has to be removed to reduce the height so that it can be inserted into the Lenovo X220i low profile drive bay. This is a trivial exercise and takes just a few moments with a suitable Phillips screwdriver. (As a bonus, the SSD also comes with a 3.5" adapter bracket and SATA 6.0 signal and power cables, allowing it to be easily added to a desktop too.)

In an idle state, the HDD pulled ~25mA more than the SSD, so in overall power consumption terms the SSD saves ~5% (e.g. it adds ~24 minutes of life to an 8 hour battery).
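The battery figure follows from simple proportion. A minimal sketch of the arithmetic, assuming the ~5% saving applies uniformly over the whole discharge (the function name is purely illustrative):

```shell
# Extra runtime in minutes gained from a given percentage power saving.
# Assumption: the saving applies uniformly over the whole battery life.
extra_minutes() {
    hours=$1      # nominal battery life in hours
    percent=$2    # power saving in percent
    echo $(( hours * 60 * percent / 100 ))
}

extra_minutes 8 5    # prints 24
```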

I then exercised the ext4 file system with Bonnie++, measured the average current drawn during the run and, using the idle "baseline", calculated the energy consumed for the duration of the test. The SSD draws more current than the HDD; however, it ran the Bonnie++ test ~4.5 times faster, so the total energy consumed to get the same task completed was less, typically 1/3 of that of the HDD.
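To see why drawing more current can still cost less overall, note that the energy used above idle is roughly the extra current multiplied by the run time (at a fixed supply voltage). A rough sketch of that ratio, where the 1.5x current ratio is a made-up illustrative value and not one of the measurements:

```shell
# Ratio of SSD energy to HDD energy for the same task.
# $1: SSD extra current / HDD extra current (hypothetical value)
# $2: how many times faster the SSD completes the task
energy_ratio() {
    awk -v c="$1" -v s="$2" 'BEGIN { printf "%.2f\n", c / s }'
}

energy_ratio 1.5 4.5    # prints 0.33, i.e. about 1/3 of the HDD's energy
```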

Using dd, I next wrote 16GB to the devices and found the SSD was ~5.3 times faster than the HDD and consumed ~1/3 the energy of the HDD. For a 16GB read, the SSD was ~5.6 times faster than the HDD and used about 1/4 the energy of the HDD.
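For reference, a write test along these lines can be sketched with dd as below. This is an assumption about the general approach, not the exact command behind the results above: it writes zeros to an ordinary file (the path and size are placeholders) instead of to the raw device, and it relies on GNU dd's conv=fdatasync so that the timing includes flushing the data to the drive.

```shell
# Write size_mb MiB of zeros to a file and flush it to disk.
# dd reports the elapsed time and transfer rate on stderr when it finishes.
dd_write_test() {
    file=$1       # target file on the file system under test (placeholder)
    size_mb=$2    # amount to write in MiB; 16384 would be 16GB
    dd if=/dev/zero of="$file" bs=1M count="$size_mb" conv=fdatasync
}
```

A run such as dd_write_test /mnt/test/ddfile 16384 would then write 16GB. Bear in mind that zeros are highly compressible, which may flatter any drive that compresses data internally.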

Finally, using tiobench I calculated that the SSD was ~7.6 times faster than the HDD and again used about 1/4 the energy of the HDD.

So, overall, very good power savings. The caveat is that since the SSD draws more power than the HDD while it is busy (but gets way more I/O completed), one can use more energy with the SSD if one is doing continuous I/O all the time. You do more, and it costs more; but you get it done faster, so like for like the SSD wins in terms of reducing power consumption.

 

Boot Speed


Although ureadahead tries hard to optimize the inode and data reads during boot, the HDD is always going to perform badly because of seek latency and slow data transfer rates compared to any reasonable SSD. Using bootchart over five runs, the average time to boot was ~7.9 seconds for the SSD and ~25.8 seconds for the HDD, so the SSD improved boot times by a factor of about 3.2. Read rates were topping ~420 MB/sec, which was good, but for some (as yet unknown) reason they did not go any higher.

 

Palimpsest Performance Test


Palimpsest (aka "Disk Utility") has a quick and easy to use drive benchmarking facility that I used to measure the SSD read/write rates and access times.  Since writing to the drive destroys the file system I rigged the SSD up in a SATA3 capable desktop as a 2nd drive and then ran the tests.  Results are very impressive:

Average Read Rate: 535.8 MB/sec
Average Write Rate: 539.5 MB/sec
Average Access Time: sub 0.1 milliseconds.

This is ~7 x faster in read/write speed and ~200-300 x faster in access time compared to the Seagate HDD.

File System Benchmarks


So which file system performs best on the SSD? Well, it depends on the use case. There are many different file system benchmarking tools available and each one addresses different types of file system behaviour. Whichever test I use, it most probably won't match your use case(!) Since SSDs have very small latency overhead it is worth exercising various file systems with multi-threaded I/O reads and writes to see how well these perform. I rigged up the threaded I/O benchmarking tool tiobench to exercise ext2, ext3, ext4, xfs and btrfs while varying the number of threads from 1 to 128 in powers of 2. In theory the SSD can do multiple random seeks very efficiently, so this type of testing should show the point where the SSD has optimal performance with multiple I/O requests.
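The thread sweep can be sketched as a simple loop. The snippet below only prints the commands and does not run them; the tiobench.pl name, its --threads and --dir options and the /mnt/test mount point are assumptions, so check tiobench.pl --help before relying on them.

```shell
# Print the thread counts 1, 2, 4, ... 128 (powers of 2), one per line.
thread_counts() {
    n=1
    while [ "$n" -le 128 ]; do
        echo "$n"
        n=$(( n * 2 ))
    done
}

# Print one tiobench command per thread count.
# Assumed: tiobench.pl is on the PATH and the file system under test
# is mounted on /mnt/test.
for t in $(thread_counts); do
    echo "tiobench.pl --threads $t --dir /mnt/test"
done
```

Re-creating and mounting each file system (ext2, ext3, ext4, xfs, btrfs) on the test partition before each sweep is left out of the sketch.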

 

Sequential Read Rates

Throughput peaks at 32-64 threads and xfs performs best, followed by ext4; both are fairly close to the maximum device read rate. Interestingly, btrfs performance stays almost level regardless of the thread count.

Sequential Write Rates


xfs is consistently best, whereas btrfs performs badly at low thread counts.

 

Sequential Read Latencies



These scale linearly with the number of threads and all file systems follow the same trend.

 

Sequential Write Latencies



Again, linear scaling of latencies with number of threads.

Random Read Rates


Again, the best transfer rates seem to occur at 32-64 threads, and btrfs does not seem to perform that well compared to ext2, ext3, ext4 and xfs.

Random Write Rates



Interestingly, ext2 and ext3 fare well, with ext4 and xfs performing very similarly and btrfs performing worst again.

 

Random Read Latencies



Again, latency scales linearly as the thread count increases, with very similar performance between all file systems. In this case, btrfs performs best.

Random Write Latencies


With random writes the latency is consistently flat, apart from the final data point for ext4 at 128 threads which could be just due to an anomaly.

Which I/O scheduler should I use?

 

Anecdotal evidence suggests using the noop scheduler should be best for an SSD.  In this test I exercised ext4, xfs and btrfs with Bonnie++ using the CFQ, Noop and Deadline schedulers.   The tests were run 5 times and below are the averages of the 5 test runs.
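For reference, the scheduler is selected per block device through sysfs. The kernel lists the available schedulers on one line with the active one in square brackets, so a small helper can pull it out. The device name sda below is an assumption; substitute your own.

```shell
# Extract the active scheduler from a line such as "noop deadline [cfq]",
# which is the format of /sys/block/<dev>/queue/scheduler.
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p'
}

echo "noop deadline [cfq]" | active_scheduler    # prints cfq

# On a real system (sda is an assumed device name; writing needs root):
#   active_scheduler < /sys/block/sda/queue/scheduler
#   echo noop > /sys/block/sda/queue/scheduler
```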

ext4:




                                    CFQ      Noop     Deadline
Sequential Block Write (K/sec):     506046   513349   509893
Sequential Block Re-Write (K/sec):  213714   231265   217430
Sequential Block Read (K/sec):      523525   551009   508774


So for ext4 on this SSD, Noop is a clear winner for sequential I/O.
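To put a number on "clear winner", the gain of Noop over CFQ can be worked out from the ext4 figures above. A small sketch (the helper name is illustrative):

```shell
# Percentage gain of a new throughput figure over a baseline.
percent_gain() {
    awk -v b="$1" -v n="$2" 'BEGIN { printf "%.1f\n", (n - b) * 100 / b }'
}

# ext4 figures in K/sec, CFQ versus Noop
percent_gain 506046 513349    # block write
percent_gain 213714 231265    # block re-write
percent_gain 523525 551009    # block read
```

This works out at roughly 1.4%, 8.2% and 5.2% respectively, so the gain is clearest on re-writes and reads.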

xfs:




                                    CFQ      Noop     Deadline
Sequential Block Write (K/sec):     514219   514367   514815
Sequential Block Re-Write (K/sec):  229455   230845   252210
Sequential Block Read (K/sec):      526971   550393   553543


For xfs, Deadline appears to perform best for sequential I/O.

 

btrfs:




                                    CFQ      Noop     Deadline
Sequential Block Write (K/sec):     511799   431700   430780
Sequential Block Re-Write (K/sec):  252210   253656   242291
Sequential Block Read (K/sec):      629640   655361   659538


And for btrfs, Noop is marginally better for sequential writes and re-writes but Deadline is best for reads.

So it appears that for sequential I/O operations CFQ is the least optimal choice, with Noop being a good choice for ext4, Deadline for xfs and either for btrfs. However, this is just based on sequential I/O testing and we should explore random I/O testing before drawing any firm conclusions.

Conclusion

 

As can be seen from the data, SSDs provide excellent transfer rates and incredibly short latencies, as well as reduced power consumption. At the time of writing the cost of an SSD is typically slightly more than £1 per GB, which is around 5-7 times more expensive than an HDD. Since I travel quite frequently and have damaged a couple of HDDs in the last few years, the shock resistance, performance and power savings of the SSD are worth paying for.

Colin Ian King

The ext4 file system has a bunch of per-device /sys entries in /sys/fs/ext4/ that can be used to inspect and change ext4 tuning parameters.   One of the read-only values available is lifetime_write_kbytes which shows the number of kilobytes of data written to the file system since it was created.   

For example, to see how much data was written on an ext4 filesystem on /dev/sda5, one uses:

cat /sys/fs/ext4/sda5/lifetime_write_kbytes

To see how much data has been written in kilobytes since mount time, read session_write_kbytes, for example:

cat /sys/fs/ext4/sda5/session_write_kbytes

For a full description of all the tunables, consult Documentation/ABI/testing/sysfs-fs-ext4 in the Linux source.
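Raw kilobyte counts get unwieldy on a well used file system, so it can help to convert them. The sketch below reports lifetime writes in GiB for every mounted ext4 file system it finds; the helper name is illustrative and the loop simply does nothing if no such entries exist.

```shell
# Convert a kilobyte count, as reported by lifetime_write_kbytes, to GiB.
kbytes_to_gib() {
    awk -v kb="$1" 'BEGIN { printf "%.2f\n", kb / (1024 * 1024) }'
}

# Report lifetime writes for each mounted ext4 file system, if any.
for f in /sys/fs/ext4/*/lifetime_write_kbytes; do
    [ -r "$f" ] || continue
    dev=$(basename "$(dirname "$f")")
    echo "$dev: $(kbytes_to_gib "$(cat "$f")") GiB written"
done
```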


Colin Ian King

Figuring Out SATA SErr codes

Today I had to cast my eye over a SATA error message:

ata1: exception Emask 0x10 SAct 0x0 SErr 0xd0000 action 0xe frozen
ata1: irq_stat 0x00400000, PHY RDY changed
ata1: SError: { PHYRdyChg CommWake 10B8B }
ata1: hard resetting link

So how does one interpret the cryptic SErr magic? Pages 269-270 of the Serial ATA spec (serialata10a.zip) explain the SError register fields in fine detail.

Well, "SErr 0xd0000" relates to the SError bits 16, 18 and 19 which are the SERR_PHYRDY_CHG, SERR_COMM_WAKE and SERR_10B_8B_ERR bits as defined in include/linux/ata.h. The kernel decodes these and dumps out the error state in the "ata1: SError: { PHYRdyChg CommWake 10B8B }" line above - so that's helpful.

Anyhow, the spec describes these error codes in the DIAG field (page 270); they are the top 16 bits of the SError register. Armed with the spec one can then decode these error bits. It's not rocket science; one just needs to know where to look this information up.
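Working out which bits are set in an SErr value is easy to do with shell arithmetic. A minimal sketch (the function name is illustrative) that lists the set bit numbers, which can then be looked up in the spec or in include/linux/ata.h:

```shell
# Print the numbers of the bits that are set in a 32-bit SError value.
serr_bits() {
    val=$(( $1 ))
    bit=0
    while [ "$bit" -lt 32 ]; do
        if [ $(( (val >> bit) & 1 )) -eq 1 ]; then
            echo "$bit"
        fi
        bit=$(( bit + 1 ))
    done
}

serr_bits 0xd0000    # prints 16, 18 and 19, one per line
```

Bits 16 and above fall in the DIAG field; anything below bit 16 would belong to the lower half of the register.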


Colin Ian King

loop devices, device mapper and kpartx

Recently I copied a hard disk image onto a backup drive using dd and later on needed to mount the third partition from this image. Usually I figure out the partition offset and mount using mount -oloop,offset=xxxx but I just could not be bothered this time around to figure out the offsets especially as I have several partitions in the image. Instead, I used the following runes:
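For comparison, the manual route boils down to multiplying the partition's start sector by the sector size. A small sketch, assuming 512-byte sectors and a hypothetical start sector of 2048 (the real value comes from the partition table, e.g. as listed by fdisk -l on the image):

```shell
# Byte offset of a partition inside a disk image.
# $1: start sector of the partition (from the partition table)
# $2: sector size in bytes, defaulting to 512 (an assumption)
partition_offset() {
    start_sector=$1
    sector_size=${2:-512}
    echo $(( start_sector * sector_size ))
}

partition_offset 2048    # prints 1048576
# e.g. mount -o loop,offset=$(partition_offset 2048) image-of-drive.img /mnt
```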

1. Associate a loop device with the drive image:

losetup --show --find image-of-drive.img

..the --show option prints out the name of the loop device being used. In my case it was /dev/loop0

2. Create a device mapper device associated with the loop device:

echo "0 `blockdev --getsize /dev/loop0` linear /dev/loop0 0" | dmsetup create sdX

..this will create /dev/mapper/sdX

3. Use kpartx to create device maps from the partition tables on the mapper device:

kpartx -a /dev/mapper/sdX

4. And lo and behold this creates device maps over the detected partition segments. You can then mount the partitions as usual, e.g. partition #3:

mount /dev/mapper/sdX3 /mnt

..I should really put this now into a bash script for next time I need this...
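A first stab at such a script might look like the sketch below. It wraps steps 1 to 3 in a function and is untested against real images; it must be run as root, and it uses blockdev --getsz (size in 512-byte sectors) in place of the older --getsize. The function name is illustrative.

```shell
# Map the partitions of a disk image to /dev/mapper/<name>N devices.
# Needs root and the losetup, blockdev, dmsetup and kpartx tools.
map_image_partitions() {
    if [ $# -ne 2 ]; then
        echo "usage: map_image_partitions image name" >&2
        return 2
    fi
    image=$1
    name=$2

    # Step 1: attach the image to the first free loop device
    loopdev=$(losetup --show --find "$image") || return 1

    # Step 2: create a linear device mapper device over the whole loop device
    sectors=$(blockdev --getsz "$loopdev") || return 1
    echo "0 $sectors linear $loopdev 0" | dmsetup create "$name" || return 1

    # Step 3: create device maps for each partition that is found
    kpartx -a "/dev/mapper/$name"
}
```

With this in place, map_image_partitions image-of-drive.img sdX followed by mount /dev/mapper/sdX3 /mnt should be equivalent to the steps above. Tearing the mappings down again afterwards is not covered here.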

