We discovered an I/O-related performance problem when using the default SLES11 SP2 kernel. The same application on the same hardware had no issue with the (admittedly ancient) SLES9 SP3.
We suspected the kernel, and have run the following experiments so far:
1. Downloaded and unpacked the kernel.org `3.0.13` source tree.
2. Installed Novell's `kernel-source-3.0.13-0.27.1` package. This is the `3.0.13` source with all of Novell's patches/tweaks already applied.
3. In the `3.0.13` tree, did a `make x86_64_defconfig` to generate a default config file, then used `make menuconfig` to turn on the drive controller and network card drivers our hardware needed, and built the kernel. This kernel had *no* performance problem.
4. In the `3.0.13-0.27.1` tree, we got the config file that Novell ships with their compiled kernel, built with it, and had the same performance problem we had with the Novell-compiled kernel.
5. In the `3.0.13-0.27.1` tree, took the config file used in (3), built with it (accepting the additional config defaults that Novell's version of the configurator wanted to add). There was no performance problem.
6. In the `3.0.13` tree, took the config file used in (4), built with it (losing the Novell-only config options in the file). There was no performance problem.
7. In the `3.0.13-0.27.1` tree, took the config file used in (4), and made the device drivers (but nothing else) match the config file in (3) (i.e. turned off lots and lots of device drivers that weren't being used anyways). There was no performance problem.
So the kernel.org 3.0.13 tree had no performance problem either way. Novell's 3.0.13-0.27.1 tree only had the problem when we built with Novell's config file or a file that was identical to Novell's but with most of the device drivers de-configured.
Given that, it would seem that there's either some Novell-only config option (and implicitly the Novell code it activates) causing the problem, or some bad interaction between a Novell patch and some standard option (given that the as-close-as-possible-to-Novell's-config build did not have the problem when built against the virgin kernel.org source).
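One way to enumerate the Novell-only options is to compare option names between the config from (4) and the `.config` that the kernel.org tree left behind in (6), since that build dropped the options it did not know about. The following is only a rough sketch: the function names `option_names` and `only_in_first` are made up for illustration, and it assumes both files use the usual `CONFIG_FOO=value` / `# CONFIG_FOO is not set` line format.

```sh
# Print the name of every option mentioned in a config file, one per line.
# Both "CONFIG_FOO=y" and "# CONFIG_FOO is not set" count as a mention.
option_names() {
    awk '
        /^# CONFIG_[A-Za-z0-9_]+ is not set$/ { print $2; next }
        /^CONFIG_[A-Za-z0-9_]+=/ { print substr($0, 1, index($0, "=") - 1) }
    ' "$1" | LC_ALL=C sort -u
}

# Print option names that appear in the first file but not in the second.
# Usage (file names are placeholders): only_in_first config-novell config-vanilla
only_in_first() {
    first_names=$(mktemp)
    second_names=$(mktemp)
    option_names "$1" > "$first_names"
    option_names "$2" > "$second_names"
    # comm -23 keeps only the lines unique to the first sorted input.
    LC_ALL=C comm -23 "$first_names" "$second_names"
    rm -f "$first_names" "$second_names"
}
```

The resulting list should be short enough to read through by hand, and anything in it that touches block I/O, filesystems, or memory management would be a reasonable first suspect.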
While we continue to investigate, since I am no kernel hacker (I know how to build the kernel but that's about it) I was wondering what "families" of config options would be the best ones to focus attention on. The idea would be to change some options from the "Novell" setting to the "kernel.org" setting and see if the problem goes away, then set them back and try another set, and so on.
But I'd like to narrow that down -- hence the question about which config options would be good bets to play with first.
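The swap-and-retest loop described above could be partly scripted. This is a minimal sketch, not a tested workflow: `override_options` is a hypothetical helper that prints a base config with a chosen set of options replaced by their settings from another config. It only edits text, so the result would presumably still need a `make oldconfig` pass in the target tree to resolve dependencies before building.

```sh
# Print BASE with each named option replaced by its setting in OTHER.
# Options named but absent from OTHER are simply dropped from the output,
# so "make oldconfig" would prompt for them afterwards.
# Usage: override_options BASE OTHER CONFIG_FOO CONFIG_BAR ...
override_options() {
    base=$1
    other=$2
    shift 2
    names=" $* "
    awk -v names="$names" '
        # Return the option name a line refers to, or "" for other lines.
        function optname(line) {
            if (line ~ /^# CONFIG_[A-Za-z0-9_]+ is not set$/) {
                split(line, f, " ")
                return f[2]
            }
            if (line ~ /^CONFIG_[A-Za-z0-9_]+=/)
                return substr(line, 1, index(line, "=") - 1)
            return ""
        }
        function wanted(n) { return n != "" && index(names, " " n " ") > 0 }
        # First file (OTHER): remember the lines for the requested options.
        FNR == NR { n = optname($0); if (wanted(n)) repl[n] = $0; next }
        # Second file (BASE): pass through everything except those options.
        { n = optname($0); if (!wanted(n)) print }
        # Append the replacement lines taken from OTHER.
        END { for (n in repl) print repl[n] }
    ' "$other" "$base"
}
```

Swapping half of the differing options at a time, rather than one family at a time, would turn this into a bisection and cut the number of rebuilds considerably.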
Run `diff` on the generated .config files to see what the differences are.
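A raw `diff` of two .config files can be noisy when the options appear in a different order or when one side only has a `# CONFIG_FOO is not set` comment. As a sketch under those assumptions, the hypothetical helpers below normalize both files first (the names `normalize_config` and `config_diff` are not from any kernel tool). The kernel tree may also ship a `scripts/diffconfig` helper that does something similar.

```sh
# Reduce a config file to sorted "CONFIG_FOO=value" lines.
# "# CONFIG_FOO is not set" is rewritten as "CONFIG_FOO=n"; other comments
# and blank lines are dropped.
normalize_config() {
    awk '
        /^# CONFIG_[A-Za-z0-9_]+ is not set$/ { print $2 "=n"; next }
        /^CONFIG_[A-Za-z0-9_]+=/              { print }
    ' "$1" | LC_ALL=C sort
}

# Diff two config files after normalizing them.
# Returns the exit status of diff (0 = identical, 1 = different).
config_diff() {
    left=$(mktemp)
    right=$(mktemp)
    normalize_config "$1" > "$left"
    normalize_config "$2" > "$right"
    diff "$left" "$right"
    status=$?
    rm -f "$left" "$right"
    return $status
}
```

Lines starting with `<` come from the first file and lines starting with `>` from the second, so each changed option shows up as an adjacent pair.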