Profiling Kernel Modules using Oprofile

On July 10, 2014, in How To, Linux, by erik

Author: Dennis Mantz

Description

oprofile is a system-wide profiler which can be used to profile any user-space application as well as the kernel itself and its modules. It has the capability to produce annotated source from binaries which are compiled with debug symbols (gcc -g).

System Used

I used Ubuntu 12.04 LTS, but the steps should be the same on other systems. To set up a bridge you’ll need two physical network interfaces. I used the laptop’s built-in NIC and a USB-to-NIC adapter. The built-in NIC (eth0) is connected to the network, and the USB-to-NIC adapter (eth1) is connected to a second laptop, which reaches the network via the bridge.

Preparations

In order to profile the bridge module we need to recompile the kernel with debug symbols and profiling support. We also need to compile and install oprofile itself if it’s not already installed.

Recompile the Kernel

Download the kernel sources for your kernel version (or a newer version if you like).
I used git (if git is not yet installed on your system use: $ sudo apt-get install git)

$ git clone git://kernel.ubuntu.com/ubuntu/ubuntu-precise.git
$ cd ubuntu-precise
$ git tag -l

With the last command we printed out a list of tags we can choose from. I chose the one which is closest to my running kernel to keep complications to a minimum:

$ git checkout -b mybranch Ubuntu-lts-3.8.0-34.49_precise1

To compile the kernel we need some additional packages (this list may be incomplete):

$ sudo apt-get install build-essential binutils libncurses5-dev

Then we copy the config of our running kernel into the source root:

$ cp /boot/config-`uname -r` .config

Now we can activate the config options we need for profiling to work:

$ make menuconfig

Set the following options:

  • General setup -> Profiling support = y
  • General setup -> OProfile system profiling = y
  • Kernel hacking -> Strip assembler-generated symbols during link = n
  • Kernel hacking -> Compile the kernel with debug info = y
  • Networking support -> Networking options -> 802.1d Ethernet Bridging = m
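These menu entries correspond to symbols in .config, so once the configuration has been saved the file can be checked from the shell. The following is a minimal sketch, assuming the usual symbol names for these entries (CONFIG_PROFILING, CONFIG_OPROFILE, CONFIG_STRIP_ASM_SYMS, CONFIG_DEBUG_INFO, CONFIG_BRIDGE); check_config is a hypothetical helper name, not part of the kernel build system.

```shell
# check_config FILE: report whether FILE has the options from the list above.
# (check_config is a made-up helper name used only for this sketch.)
check_config() {
    cfg="$1"
    status=0
    # Options that must be set to a specific value.
    for want in CONFIG_PROFILING=y CONFIG_OPROFILE=y CONFIG_DEBUG_INFO=y CONFIG_BRIDGE=m; do
        if grep -qx "$want" "$cfg"; then
            echo "ok:       $want"
        else
            echo "missing:  $want"
            status=1
        fi
    done
    # This option must NOT be enabled.
    if grep -qx 'CONFIG_STRIP_ASM_SYMS=y' "$cfg"; then
        echo "unwanted: CONFIG_STRIP_ASM_SYMS=y"
        status=1
    fi
    return "$status"
}

# Usage (from the kernel source root):
#   check_config .config
```

The function returns a non-zero status if any option differs from the list, so it can also be used as a guard before starting the long build.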

Exit and save the changes. Then we build and install the new kernel (this will take a while).
If you have a CPU with multiple cores, you can pass -j <number of cores + 1> to speed up the build. I have a quad-core CPU, so I’ll build with -j 5:

$ make -j 5
$ sudo make deb-pkg
$ cd .. 
$ sudo dpkg -i linux-image-3.8.13.12_3.8.13.12-2_amd64.deb 
$ sudo dpkg -i linux-headers-3.8.13.12_3.8.13.12-2_amd64.deb 
$ sudo dpkg -i linux-firmware-image_3.8.13.12-2_amd64.deb 
$ sudo dpkg -i linux-libc-dev_3.8.13.12-2_amd64.deb
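The job count does not have to be hard-coded. As a small sketch of the "cores + 1" rule of thumb used above, the value can be derived with nproc (part of GNU coreutils); the variable name njobs is just an illustration, and the command is only printed here rather than run.

```shell
# Number of online CPUs plus one, following the rule of thumb above.
njobs=$(( $(nproc) + 1 ))
# Print the resulting build command instead of starting a build.
echo "make -j ${njobs}"
```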

Finally we can reboot our system and start the new kernel!

Install and Configure the Bridge

First we need to install the bridge utilities:

$ sudo apt-get install bridge-utils

Then we add a new logical bridge interface (br0) and add the two physical interfaces (eth0 and eth1) to it. By the way: I strongly recommend deactivating NetworkManager before setting up the bridge!

$ sudo ip addr flush eth0
$ sudo ip addr flush eth1
$ sudo brctl addbr br0
$ sudo brctl addif br0 eth0 eth1
$ sudo ip link set dev br0 up

Now the bridge should be up and running. If you have a dhcp-server in your network, you can run dhclient on the bridge-interface to assign it an IP address (otherwise do a static IP configuration on the bridge):

$ sudo dhclient br0
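To double-check that both interfaces really ended up in the bridge, the kernel exposes the ports of a bridge as entries under /sys/class/net/<bridge>/brif. The sketch below lists them; bridge_ports is a hypothetical helper name, and its optional second argument (an alternative sysfs root) exists only so the function can be tried against a test directory.

```shell
# bridge_ports BRIDGE [SYSFS_ROOT]: list the interfaces attached to BRIDGE.
# (bridge_ports is a made-up helper name used only for this sketch.)
bridge_ports() {
    dir="${2:-/sys/class/net}/$1/brif"
    if [ -d "$dir" ]; then
        ls "$dir"          # one entry per bridge port
    else
        echo "no such bridge: $1" >&2
        return 1
    fi
}

# Usage:
#   bridge_ports br0
```

With the setup described above, the output should contain eth0 and eth1.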

Install oprofile and start profiling

To install oprofile we will download the sources of the newest version from the website (at the time this howto was written this was version 0.9.9) and compile them. Before we do so, we have to install some more dependencies.

$ sudo apt-get install libpopt-dev binutils-dev
$ wget http://prdownloads.sourceforge.net/oprofile/oprofile-0.9.9.tar.gz
$ tar -xvf oprofile-0.9.9.tar.gz
$ cd oprofile-0.9.9
$ ./configure
$ make
$ sudo make install

Now we’re finally ready to start the profiling:

$ sudo opcontrol --init
$ sudo opcontrol --start --vmlinux=/home/user/ubuntu-precise/vmlinux

If the watchdog service is using the NMI on the machine, oprofile will exit with an error and tell you to deactivate watchdog.
If this happens do the following:

$ sudo opcontrol --deinit
$ su root
$ echo 0 > /proc/sys/kernel/nmi_watchdog
$ exit
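To confirm that the watchdog is really off before retrying, the same /proc file can simply be read back. A minimal sketch follows; nmi_watchdog_state is a hypothetical helper name, and the optional path argument is there only so it can be tried against a test file.

```shell
# nmi_watchdog_state [FILE]: print "on", "off" or "unknown".
# (nmi_watchdog_state is a made-up helper name used only for this sketch.)
nmi_watchdog_state() {
    f="${1:-/proc/sys/kernel/nmi_watchdog}"
    if [ ! -r "$f" ]; then
        echo "unknown"     # file missing or unreadable on this system
    elif [ "$(cat "$f")" = "0" ]; then
        echo "off"
    else
        echo "on"
    fi
}

nmi_watchdog_state
```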

Then do the init and start commands again. The profiling is now running in the background. In order to output the results you should run these two commands:

$ sudo opcontrol --dump   	# This dumps all collected profiling data to the hard drive.
$ opreport			# This generates a profiling report of the whole system.
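When only a specific time window is of interest (for example while traffic is flowing over the bridge), the start and dump steps can be wrapped in a small function. This is a sketch under assumptions, not the author's procedure: profile_window is a hypothetical name, and it additionally uses opcontrol --reset to clear old samples and opcontrol --stop to end collection. It assumes opcontrol --init has already been run.

```shell
# profile_window SECONDS VMLINUX: collect samples for a fixed time window.
# (profile_window is a made-up helper name used only for this sketch.)
profile_window() {
    seconds="$1"
    vmlinux="$2"
    sudo opcontrol --reset                        # discard samples from earlier runs
    sudo opcontrol --start --vmlinux="$vmlinux"   # begin sampling
    sleep "$seconds"                              # generate the load of interest meanwhile
    sudo opcontrol --dump                         # flush collected samples to disk
    sudo opcontrol --stop                         # stop collecting samples
}

# Usage:
#   profile_window 30 /home/user/ubuntu-precise/vmlinux
#   opreport
```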

Running the opreport command with -l generates a more detailed report that lists every symbol separately. To get only the information about the binary you care about, specify the name of the binary:

$ opreport -l /usr/bin/firefox

or, in the case of a kernel module (you have to give the path to the kernel modules):

$ opreport -l --image-path=/lib/modules/`uname -r`/kernel  /lib/modules/`uname -r`/kernel/net/bridge/bridge.ko

There is also the possibility to generate annotated source code with the opannotate command (the --source option selects source annotation):

$ opannotate --source --image-path=/lib/modules/`uname -r`/kernel --output-dir=$HOME/profiling-output

After this completes, the profiling-output directory contains all source files with annotations.

Examples and Tips for getting started

OProfile can be configured in many different ways to profile exactly the things you want to see.
The next few lines generate some interesting basic output.

Callgraph
To generate a callgraph, run opcontrol with the --callgraph=<depth> option and opreport with the --callgraph option (the depth of 6 used here is only an example value):

$ sudo opcontrol --start --callgraph=6 --vmlinux=/home/user/ubuntu-precise/vmlinux
$ opreport --callgraph /lib/modules/3.8.13.12/kernel/net/bridge/bridge.ko --merge all --image-path=/lib/modules/3.8.13.12/kernel/

This will give you a callgraph that looks like this:

-------------------------------------------------------------------------------
  90       100.000  bridge.ko                br_handle_frame
  90       14.2631  bridge.ko                br_handle_frame
  3680     95.3121  vmlinux                  nf_hook_slow
  90        2.3310  bridge.ko                br_handle_frame
  90        2.3310  bridge.ko                br_handle_frame [self]
  1         0.0259  vmlinux                  nf_iterate
-------------------------------------------------------------------------------

Notice that there is one line that isn’t indented. That is the function in focus. All lines above it are functions calling it, and all lines beneath it are functions called by it.

For whatever reason my VM suddenly crashed in Workstation 9. Quite annoying, as I was in the middle of something. It eventually shut down, and when I attempted to start it again I was blessed with a wonderful error message: Transport (VMDB) error -14: Pipe connection has been broken.

Well, how nice! The culprit is actually some of the kernel modules that VMware uses to interface with your network, drives, etc. This means a full restart of VMware is required, including removal of those kernel modules. These are the steps that ended up working for me:

Step 1 – Stop the Vmware processes

I am old school and use the sysV start/stop commands. You could probably do a service stop if you are on CentOS or Fedora…a non Debian based distro (how terrible).

$ sudo /etc/init.d/vmware stop

The next command lists all processes, greps for anything VM-related, and passes the matching process IDs to kill. Look at the list first, because the pattern vm may also match unrelated processes:

$ sudo kill -9 `ps -ef | grep '[v]m' | awk '{print $2;}'`

If you do not do this first, you will be unable to remove the kernel modules.

Step 2 – Remove the vmware kernel objects

Let’s first get an idea of which modules we are dealing with:

$ lsmod | grep vm
vmnet                  51830  0 
vmci                   82382  0 
vmmon                  76214  0

These all must go! So, as long as all of the VMware processes have stopped, we should be able to remove them.

$ sudo rmmod vmnet vmci vmmon
$ lsmod | grep vm

If the ‘lsmod’ command now displays no lines, no VMware-specific kernel modules are loaded any more. You may have other modules whose names contain “vm”, but it is unlikely.
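Because a plain grep for vm also matches module names that merely contain those letters, it can help to filter on the first column of the lsmod output only. The sketch below does that with awk; vm_modules is a hypothetical helper name, and the prefix match on "vm" is an assumption based on the three modules shown above.

```shell
# vm_modules: read lsmod-style output on stdin and print the names of
# modules whose name starts with "vm".
# (vm_modules is a made-up helper name used only for this sketch.)
vm_modules() {
    awk '$1 ~ /^vm/ { print $1 }'
}

# Usage:
#   lsmod | vm_modules
```

An empty result means none of the modules are loaded any more.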

Step 3 – Restart VM and Go

Now that no VMware processes are running and no VMware kernel modules are loaded, it’s time to restart VMware, which in turn reloads the pertinent kernel modules.

$ sudo /etc/init.d/vmware start
Starting VMware services:
   Virtual machine monitor                                             done
   Virtual machine communication interface                             done
   VM communication interface socket family                            done
   Blocking file system                                                done
   Virtual ethernet                                                    done
   VMware Authentication Daemon                                        done
   Shared Memory Available                                             done

I then went back to the Workstation GUI and started my VM – voila! Success. Hopefully this has helped get you out of the woods too!