Canonical Voices

Posts tagged with 'cpu'

Colin Ian King

The latest release of stress-ng V0.08.09 incorporates new stressors and a handful of bug fixes. So what is new in this release?

  • memrate stressor to exercise and measure memory read/write throughput
  • matrix yx option to swap order of matrix operations
  • matrix stressor size can now be 8192 x 8192 in size
  • radixsort stressor (using the BSD library radixsort) to exercise CPU and memory
  • improved job script parsing and error reporting
  • faster termination of rmap stressor (this was slow inside VMs)
  • icache stressor now calls cacheflush()
  • anonymous memory mappings are now private allowing hugepage madvise
  • fcntl stressor exercises the 4.13 kernel F_GET_FILE_RW_HINT and F_SET_FILE_RW_HINT
  • stream and vm stressors have new mdavise options
The new memrate stressor performs 64/32/16/8 bit reads and writes to a large memory region.  It will attempt to get some statistics on the memory bandwidth for these simple reads and writes.  One can also specify the read/write rates in terms of MB/sec using the --memrate-rd-mbs and --memrate-wr-mbs options, for example:

 stress-ng --memrate 1 --memrate-bytes 1G \  
--memrate-rd-mbs 1000 --memrate-wr-mbs 2000 -t 60
stress-ng: info: [22880] dispatching hogs: 1 memrate
stress-ng: info: [22881] stress-ng-memrate: write64: 1998.96 MB/sec
stress-ng: info: [22881] stress-ng-memrate: read64: 998.61 MB/sec
stress-ng: info: [22881] stress-ng-memrate: write32: 1999.68 MB/sec
stress-ng: info: [22881] stress-ng-memrate: read32: 998.80 MB/sec
stress-ng: info: [22881] stress-ng-memrate: write16: 1999.39 MB/sec
stress-ng: info: [22881] stress-ng-memrate: read16: 999.66 MB/sec
stress-ng: info: [22881] stress-ng-memrate: write8: 1841.04 MB/sec
stress-ng: info: [22881] stress-ng-memrate: read8: 999.94 MB/sec
stress-ng: info: [22880] successful run completed in 60.00s (1 min, 0.00 secs)

...the memrate stressor will attempt to limit the memory rates but due to scheduling jitter and other memory activity it may not be 100% accurate.  By careful setting of the size of the memory being exercised with the --memrate-bytes option one can exercise the L1/L2/L3 caches and/or the entire memory.

By default, matrix stressor will perform matrix operations with optimal memory access to memory.  The new --matrix-yx option will instead perform matrix operations in a y, x rather than an x, y matrix order, causing more cache stalls on larger matrices.  This can be useful for exercising cache misses.

To complement the heapsort, mergesort and qsort memory/CPU exercising sort stressors I've added the BSD library radixsort stressor to exercise sorting of hundreds of thousands of small text strings.

Finally, while exercising various hugepage kernel configuration options I was inspired to make stress-ng mmap's to work better with hugepage madvise hints, so where possible all anonymous memory mappings are now private to allow hugepage madvise to work.  The stream and vm stressors also have new madvise options to allow one to chose hugepage, nohugepage or normal hints.

No big changes as per normal, just small incremental improvements to this all purpose stress tool.

Read more
Colin Ian King

Over the past month I've been hitting excessive thermal heating on my laptop and kidle_inject has been kicking in to try and stop the CPU from overheating (melting!).  A quick double-check with older kernels showed me that this issue was not thermal/performance regression caused by software - instead it was time to clean my laptop and renew the thermal paste.

After some quick research, I found that Artic MX-4 Thermal Compound provided an excellent thermal conductivity rating of 8.5W/mK so I ordered a 4g sample as well as a can of pressurized gas cleaner to clean out dust.

The X230 has an excellent hardware maintenance manual, and following the instructions I stripped the laptop right down so I could pop the heat pipe contacts off the CPU and GPU.  I carefully cleaned off the old dry and cracked thermal paste and applied about 0.2g of MX-4 thermal compound to the CPU and GPU and re-seated the heat pipe.  With the pressurized gas I cleaned out the fan and airways to maximize airflow over the heatpipe.   The entire procedure took about an hour to complete and for once I didn't have any screws left over after re-assembly!

I normally take photos of the position of components during the strip down of a laptop for reference in case I cannot figure out exactly how parts are meant to fix on the re-assembly phase.  In this case, the X230 maintenance manual is sufficiently detailed so I didn't take any photos this time.

I'm glad to report that my X230 is now no-longer overheating. Heat is being effectively pumped away from the CPU and GPU and one can feel the additional heat being pushed out of the laptop.  Once again I can fully max out the CPU and GPU without passive thermal cooling mechanisms being kicked into action, so I've now got 100% of my CPU performance back again; as good as new!

Now and again I see laptop overheating bugs being filed in LaunchPad.  While some are legitimate issues with broken software, I do wonder if the majority of issues with the older laptops is simply due to accumulation of dust and/or old and damaged thermal paste.

Read more
Colin Ian King

even more stress in stress-ng

Over the past few weeks in spare moments I've been adding more stress methods to stress-ng  ready for Ubuntu 15.04 Vivid Vervet.   My intention is to produce a rich set of stress methods that can stress and exercise many facets of a system to force out bugs, catch thermal over-runs and generally torture a kernel in a controlled repeatable manner.

I've also re-structured the tool in several ways to enhance the features and make it easier to maintain.  The cpu stress method has been re-worked to include nearly 40 different ways to stress a processor, covering:

  • Bit manipulation: bitops, crc16, hamming
  • Integer operations: int8, int16, int32, int64, rand
  • Floating point:  long double, double,  float, ln2, hyperbolic, trig
  • Recursion: ackermann, hanoi
  • Computation: correlate, euler, explog, fibonacci, gcd, gray, idct, matrixprod, nsqrt, omega, phi, prime, psi, rgb, sieve, sqrt, zeta
  • Hashing: jenkin, pjw
  • Control flow: jmp, loop
..the intention was to have a wide enough eclectic mix of CPU exercising tests that cover a wide range of typical operations found in computationally intense software.   Use the new --cpu-method option to select the specific CPU stressor, or --cpu-method all to exercise all of them sequentially.

I've also added more generic system stress methods too:
  • bigheap - re-allocs to force OOM killing
  • rename - rename files rapidly
  • utime - update file modification times to create lots of dirty file metadata
  • fstat - rapid fstat'ing of large quantities of files
  • qsort - sorting of large quantities of random data
  • msg - System V message sending/receiving
  • nice - rapid re-nicing processes
  • sigfpe - catch rapid division by zero errors using SIGFPE
  • rdrand - rapid reading of Intel random number generator using the rdrand instruction (Ivybridge and later CPUs only)
Other new options:
  • metrics-brief - this dumps out only the bogo-op metrics that are relevant for just the tests that were run.
  • verify - this will sanity check the stress results per iteration to ensure memory operations and CPU computations are working as expected. Hopefully this will catch any errors on a hot machine that has errors in the hardware. 
  • sequential - this will run all the stress methods one by one (for a default of 60 seconds each) rather than all in parallel.   Use this with the --timeout option to run all the stress methods sequentially each for a specified amount of time. 
  • Specifying 0 instances of any stress method will run an instance of the stress method on all online CPUs. 
The tool also builds and runs on Debian kFreeBSD and GNU HURD kernels although some stress methods or stress options are not included due to lack of support on these other kernels.
The stress-ng man page gives far more explanation of each stress method and more detailed examples of how to use the tool.

For more details, visit here or read the manual.

Read more
Colin Ian King

Linaro's idlestat is another useful tool in the arsenal of CPU monitoring utilities.  Idlestat monitors and captures CPU C-state and P-state transitions using the kernel Ftrace tracer and outputs statistics based on entering/exiting each state for each CPU.  Idlestat  also captures IRQ activity as well which ones caused a CPU to exit an idle state -  knowing why a processor came out of a deep C state is always very useful way to help diagnose power consumption issues.

Using idlestat is easy, to capture 20 seconds of activity into a log file called example.log, run:

 sudo idlestat --trace -f example.log -t 20    
..and this will display the per CPU C-state and P-state and IRQ statistics for that run.

One can also take the saved log file and parse it again to calculate the statistics again using:
 idlestat --import -f example.log  

One can get the source from here and I've packaged version 0.3 (plus a bunch of minor fixes that will land in 0.4) for Ubuntu 14.10 Utopic Unicorn.

Read more
Colin Ian King

Keeping cool with thermald

The push for higher performance desktops and laptops has inevitably lead to higher power dissipation.  Laptops have also shrunk in size leading to increasing problems with removing excess heat and thermal overrun on heavily loaded high end machines.

Intel's thermald prevents machines from overheating and has been recently introduced in the Ubuntu Trusty 14.04 LTS release.  Thermald actively monitors thermal sensors and will attempt to keep the hardware cool by modifying a variety of cooling controls:

* Active or passive cooling devices as presented in sysfs
* The Running Average Power Limit (RAPL) driver (Sandybridge upwards)
* The Intel P-state CPU frequency driver (Sandybridge upwards)
* The Intel PowerClamp driver

Thermald has been found to be especially useful when using the Intel P-state CPU frequency scaling driver since this can push the CPU harder than other CPU frequency scaling drivers.

Over the past several weeks I've been working with Intel to shake out some final bugs and get thermald included into Ubuntu 14.04 LTS, so kudos to Srinivas Pandruvada for handling my patches and also providing a lot of timely fixes too.

By default, thermald works without any need for configuration, however, if one has incorrect thermal trip settings or other firmware related thermal zone bugs one can write one's own thermald configuration. 

For further details, consult the Ubuntu thermald wiki page.

Read more
Colin Ian King

Simple performance test of rdrand

My new Lenovo X230 laptop is equipped with a Intel(R) i5-3210M CPU (2.5 GHz, with 3.1 GHz Turbo) which supports the new Digital Random Number Generator (DRNG) - a high performance entropy and random number generator.  The DNRG is read using the new Intel rdrand instruction which can return 64, 32 or 16 bit random numbers.

The DRNG is described in detail in this article and provides very useful code examples in assembler and C which I used to write a simple and naive test to see how well the rdrand performs on my i5-3210M.

For my test, I simply read 100 million 64 bit random numbers on a single thread. The Intel literature states one can get up to about 70 million rdrand invocations per second on 8 threads, so my simple test is rather naive as it only exercises rdrand on one thread.  For a set of 10 iterations on my test, I'm getting around 40-45 nanoseconds per rdrand, or about 22-25 million rdrands per second, which is really impressive.   The test is a mix of assembler and C, and is not totally optimal, so I am sure I can squeeze a little more performance out with some extra work.

The next test I suspect is to see just random the data is and to see how well it compares to other software random number generators... but I will tinker with that after my vacation.

Anyhow, for reference, the test can be found here in my git repository.

Read more
Colin Ian King

The hwloc (hardware locality) package contains the useful tool lstopo. To install use:

sudo apt-get install hwloc

By default, lstopo will display a logical view of the system caches and CPU cores, for example:

To get a non-graphical output use:

lstopo -

Machine (1820MB) + Socket #0 + L3 #0 (3072KB)
  L2 #0 (256KB) + L1 #0 (32KB) + Core #0
    PU #0 (phys=0)
    PU #1 (phys=2)
  L2 #1 (256KB) + L1 #1 (32KB) + Core #1
    PU #2 (phys=1)
    PU #3 (phys=3)

lstopo is also able to output the toplogy image in a variety of formats (Xfig, PDF, Postscript, PNG, SVG and XML) by specifying the output filename and extension, e.g.

lstopo topology.pdf

For more information, consult the manual for hwloc and lstopo.

Read more
Colin Ian King

Reading the TSC from userspace

Recently I've been poking around looking at some Time Stamp Counter (TSC) anomalies when coming out of suspend. So what is the TSC? It's a 64 bit high resolution tick counter found on X86 processors (since Pentiums) and can be read using the rdtsc instruction.

It is intended to be a fast method of getting a high resolution timer. However it is known to problematic on multi-core and hyperthreaded CPUs - one needs to be locked to one CPU to get reliable results since the TSC may be different on each CPU. It is also known to reset when coming out of resume which means time can look like it goes backwards in a huge jump.

If the CPU speed is changed then the TSC rate can change too. If you have a more recent Intel CPU where the constant_tsc flag is set (see /proc/cpuinfo) then the TSC will run at a constant rate no matter the CPU speed - but this means that benchmarking with the constant TSC may make programs look like they use more CPU cycles than in reality!

Anyhow, getting the 64 bit TSC value is a simple case of using the rdtsc instruction. I've got some example code to do this here with the necessary inlined assembler magic to handle this correctly for 32 and 64 bit builds.

Read more
Colin Ian King

Atom Z530 identity crisis

Last week I peeked at /proc/cpuinfo on a Atom Z530 netbook and got the following model name information:

model name : Intel(R) Core (TM) CPU Z530 @ 1.60GHz

Is the kernel mistaken? It's not a Core CPU, it's an Atom! In fact the Z530 is mistaken. If one examines page 29 of you will see errata AAE29:

"AAE29 CPUID Instruction Returns Incorrect Brand String

When a CPUID instruction is executed with EAX = 80000002H, 80000003H and 80000004H on an Intel® Atom(TM) processor, the return value contains the brand string Intel(R) Core(TM)2 CPU when it should have Intel(R) Atom(TM) CPU."

Doh! That's a rather poor mistake in the silicon.

Apparently this affects Intel® Atom(TM) processors Z550, Z540, Z530, Z520, Z515, Z510, and Z500 on 45-nm process technology. It is fixable with a microcode fix, which normally involves getting a BIOS upgrade.

The errata makes interesting reading - especially errata AAE44 and AAE46 - for older kernels I suggest booting with kernel boot option mem=nopentium to work around any bizarre kernel oopses caused by these particular processor bugs. In fact, I recommend this for any Atom processor as these bugs seem to also apply to the Atom Nxxx series to.

Read more
Colin Ian King

Installing Intel Microcode Updates

It is not unknown for personal computers to be shipped with subtle and rarely occurring bugs in the x86 processor. Normally these bugs can be fixed with microcode updates that get loaded by the BIOS at boot time. However, re-flashing a BIOS may be deemed to risky (since it can brick a machine) or perhaps one can no longer get BIOS updates to an older machine.

This is where the Intel microcode updates come in useful. To install these on Ubuntu use:

sudo apt-get install intel-microcode

These may then fix subtle bugs, so it's always worth a try when you see strange processor related issues such inexplicable memory related oopses.

The caveat is that the microcode is loaded late in boot time, so you may not be able to workaround bugs in the early boot phase. For example, when coming out of hibernate you may hit a processor related bug that's fixed with the microcode update - however, the microcode is loaded late into the resume from hibernate phase, so it cannot be fixed this way.

Read more
Colin Ian King

Sensors reporting hot CPU in Maverick

My clunky old Lenovo has and Intel(R) Core(TM) Duo CPU (model 15) which coretemp in Lucid 10.04 assumed TjMax was 85 degrees C but now in Maverick 10.10 believes TjMax is 100 degrees C.

Kernel commit a321cedb12904114e2ba5041a3673ca24deb09c9 attempts to get TjMax from msr 0x1a2. If it fails to read this msr it defaults TjMax to 100 degrees C for CPU models 14, 15, 22 and 26, and one will see the following warning message:

[ 9.650025] coretemp coretemp.0: TjMax is assumed as 100 C!
[ 9.650322] coretemp coretemp.1: TjMax is assumed as 100 C!

For CPU models 23 and 28 (Atoms) TjMax will be 90 or 100 depending if it's a nettop or a netbook. Otherwise the patch will default TjMax to 100 degrees C.

One can check the value of TjMax using:

cat devices/platform/coretemp*/temp1_crit

Coretemp calculates the core temperature of the CPU by subtracting the thermal status from TjMax. Since the default has been increased from 85 to 100 degrees between Lucid and Maverick, the apparent core temperature now reads 15 degrees higher.

Now, if my machine really was running 15 degrees hotter between Lucid and Maverick I would see more power consumption. I checked the power consumption for Lucid and Maverick kernels on my Lenovo in idle and fully loaded CPU states with a power meter and observed that Maverick uses less power, so that's encouraging.

As for the correct value, why did the default change? Well, from what I can understand from several forums that discuss the setting of TjMax is that this is not well documented and not disclosed by Intel, hence the values are rule-of-thumb guesswork.

So, the bottom line is that if your CPU appears to run hot from the core temp readings between Lucid 10.04 and Maverick 10.10 first check to see if TjMax has changed on your hardware.

Read more
Colin Ian King

Hot Laptop

My Lenovo 3000N200 laptop has been playing me up. When I've been fully loading the processor or driving video hard it's been shutting down because of overheating. I suspect periodic SMIs are detecting an overheated CPU and the BIOS just stops the machine to avoid it turning into toast.

Suspecting that the latest 2.6.35 Maverick kernel was the cause I booted with a 2.6.32 Lucid kernel and that didn't help, so it didn't look like an obvious kernel regression.

Well, perhaps it's getting old and cranky - it's nearly 3 years old. Perhaps the thermal paste between the CPU and the heatsink is not working like it should. Since it was most probably a hardware issue I downloaded the service manual and got out the trusty screwdriver and opened it up. Lo and behold 5mm of dust had accumulated over the fan grill which wasn't going to help the poor machine offload all that heat out of the laptop case. I removed the fan, gave it a good clean and removed all the dust from the fan outlet grill.

After reassembly the laptop was good as new. Instead of rebooting at 95+ degrees Celsius the Lenovo now runs happily.

The moral of the story is that I should regularly service the fans on my machines. Cooking the CPU is something I would like to avoid in the future.

Read more
Colin Ian King

There are times when I drive my laptop CPU really hard, for example compressing Gigs of data or running QEMU, and it would be useful to see how hot my processor is actually getting. This is where sensors-applet is useful - it has the ability to show the core temperature of the CPU and HDD if one has the appropriate hardware sensors and drivers installed. However, getting it configured requires a little bit of hand-holding to get it working.

Firstly, install sensors-applet using:

sudo apt-get install sensors-applet

..this will also install the lm-sensors tools.

Next, one needs to probe the H/W to find the appropriate drivers required to be able to sense CPU and HDD temperatures. To do this use:

sudo sensors-detect

This will ask you if you want to probe and scan various I2C, PCI and SMBus adaptors, so answer the probing questions with respect to the hardware you have in your machine. On my machine I answered "YES" to every question, your mileage may vary.

At the end of the probing, sensors-detect will print out some lines that you need to add to /etc/modules. Using sudo, edit /etc/modules and add these lines. Then reboot your machine.

Once you are logged in again, right click on the top Gnome panel and select "Add to Panel.." and scroll down and select the "Hardware Sensors Monitor". Once it's added to the panel, right click on it and select "Preferences". On the Sensors Applet Preferences panel, select the "Sensors" tab and then select the appropriate CPU and HDD devices to monitor.

Once this is done, you will hopefully be able to see your CPU and HDD temperatures rise and fall as you work on your machine:

From the command line one can also get the current sensor data using the sensors command, e.g.:

$ sensors
Adapter: ISA adapter
Core 0: +58.0°C (high = +85.0°C, crit = +85.0°C)

Adapter: ISA adapter
Core 1: +57.0°C (high = +85.0°C, crit = +85.0°C)

Hopefully I won't see my CPU get to 85 degrees C, but now at least I can keep my eye on how hot it's getting :-)

Read more