14

Is it possible with perf to collect hardware counter statistics for only part of a program's execution? If so, how?

likwid offers the feature of being able to define named regions, but it would be great if this was possible on systems with just perf installed.

Some previous questions have returned relevant answers, but there are still some shortcomings:

  • Using probe I get the same error and I'm using a slightly newer kernel (3.13). Are these fixes available in a newer version?
  • Using perf_event_open I would like to maintain the flexibility to define events on the command line. I also took a peek at the code for perf stat itself, but it seems it doesn't set things up by calling perf_event_open.
4
  • 1
    Yes, you could do it with perf_event_open. perf stat does call it (run_perf_stat → __run_perf_stat → create_perf_stat_counter → perf_evsel__open_per_thread → __perf_evsel__open). Commented Nov 9, 2014 at 1:42
  • 1
    perf has no library for integrating counters into a program or defining regions within it (it only recently got a proper JIT agent interface, lwn.net/Articles/633846, which is better than /tmp/perf-$pid.map files). You can try a library like libpfm4 or PAPI (which may use libpfm4) to do hardware performance counting from your program. They will call perf_event_open for you, libpfm4 also has tables of event names, and there are programmatic ways to use env vars/cmdline args to specify the event names. Commented May 30, 2017 at 3:38
  • 3
    More recent perf has a feature to let you start/stop measurement by writing to a pipe: Enable/disable perf event collection programmatically Commented Nov 21, 2022 at 8:47
  • There are plenty of libraries that wrap the perf_event_open system call, e.g., PAPI, PerfEvent, and perf-cpp. Commented Nov 20, 2024 at 16:43
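
To make the perf_event_open route from the comments concrete, here is a minimal sketch in C of counting one event around a region of code. The "TYPE:CONFIG" spec format and the function names (parse_event_spec, region_open, region_begin, region_end) are made up for this illustration and are not part of perf; the numeric values are the ones from linux/perf_event.h (for example "0:1" is PERF_TYPE_HARDWARE with PERF_COUNT_HW_INSTRUCTIONS). It assumes user-space-only counting of the calling thread, which is usually what a restrictive perf_event_paranoid setting still allows.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Parse a hypothetical "TYPE:CONFIG" spec (e.g. "0:1" is
 * PERF_TYPE_HARDWARE / PERF_COUNT_HW_INSTRUCTIONS) into attr.
 * Returns 0 on success, -1 on a malformed spec. */
int parse_event_spec(const char *spec, struct perf_event_attr *attr)
{
    char *end;
    unsigned long type = strtoul(spec, &end, 0);
    if (end == spec || *end != ':')
        return -1;
    const char *cfg = end + 1;
    unsigned long long config = strtoull(cfg, &end, 0);
    if (end == cfg || *end != '\0')
        return -1;

    memset(attr, 0, sizeof *attr);
    attr->type = (uint32_t)type;
    attr->size = sizeof *attr;
    attr->config = config;
    attr->disabled = 1;       /* start stopped; region_begin() enables it */
    attr->exclude_kernel = 1; /* user-space only */
    attr->exclude_hv = 1;
    return 0;
}

/* Open the counter for the calling thread on any CPU. glibc has no
 * perf_event_open wrapper, so go through syscall(2). Returns fd or -1. */
int region_open(struct perf_event_attr *attr)
{
    return (int)syscall(SYS_perf_event_open, attr, 0, -1, -1, 0UL);
}

/* Zero the counter and start counting. */
void region_begin(int fd)
{
    assert(fd >= 0);
    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
}

/* Stop counting and store the value in *count. Returns 0 or -1. */
int region_end(int fd, uint64_t *count)
{
    assert(fd >= 0);
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    return read(fd, count, sizeof *count) == (ssize_t)sizeof *count ? 0 : -1;
}
```

A caller would pass argv[1] to parse_event_spec, open the counter once with region_open, and wrap the code of interest in region_begin/region_end. Because the counter is opened with pid 0 and without inherit, this only covers the calling thread. A library such as libpfm4 could replace the numeric spec with real event names.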

2 Answers

14

Spawn a child process to run perf stat.
Attach perf stat to the parent.
Kill the child process from parent as and when required.

    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = getpid();
        pid_t cpid = fork();
        if (cpid < 0) {
            perror("fork");
            return 1;
        }
        if (cpid == 0) {
            // child process: become the leader of a new process group,
            // then run perf stat attached to the parent
            setpgid(0, 0);
            char buf[64];
            snprintf(buf, sizeof buf, "perf stat -p %d > stat.log 2>&1", (int)pid);
            execl("/bin/sh", "sh", "-c", buf, (char *)NULL);
            _exit(127); // only reached if execl fails
        }

        // also set the group from the parent, so it exists before kill() below
        // (this fails harmlessly if the child has already called exec)
        setpgid(cpid, cpid);

        // part of program you wanted to perf stat
        sleep(3);

        // stop perf stat by signalling the child process and all its
        // descendants (sh, perf stat etc.)
        kill(-cpid, SIGINT);
        waitpid(cpid, NULL, 0); // let perf finish writing stat.log

        // rest of the program
        sleep(2);
        return 0;
    }
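
A variation on the same idea, for perf versions that have the control-pipe feature mentioned in the comments on the question: start perf stat around the program with events disabled, and let the program write "enable" and "disable" to a control FIFO. The sketch below is only an illustration under a few assumptions. The PERF_CTL_FIFO environment variable and the helper names are made up here, and the --control fifo: and -D -1 options should be checked against perf stat --help on your system, since older versions lack them.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Open the control FIFO named by the (hypothetical) PERF_CTL_FIFO
 * environment variable. O_NONBLOCK makes open() fail instead of hanging
 * when nothing is reading the FIFO. Returns an fd, or -1 if unavailable. */
int perf_ctl_open(void)
{
    const char *path = getenv("PERF_CTL_FIFO");
    if (path == NULL)
        return -1;
    return open(path, O_WRONLY | O_NONBLOCK);
}

/* Send one newline-terminated command ("enable" or "disable") in a
 * single write. Returns 0 on success, -1 on failure. */
int perf_ctl_send(int ctl_fd, const char *cmd)
{
    assert(cmd != NULL);
    char line[32];
    int len = snprintf(line, sizeof line, "%s\n", cmd);
    if (len < 0 || (size_t)len >= sizeof line)
        return -1;
    return write(ctl_fd, line, (size_t)len) == (ssize_t)len ? 0 : -1;
}
```

With this, the program would call perf_ctl_open() once, then perf_ctl_send(fd, "enable") before the region and perf_ctl_send(fd, "disable") after it, and be launched roughly as: mkfifo ctl.fifo; PERF_CTL_FIFO=ctl.fifo perf stat -D -1 --control fifo:ctl.fifo -e instructions ./prog. Events stay selectable on the perf command line. Without the optional ack FIFO the switch is asynchronous, so the region boundaries are only approximate.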

1

You could use libpfc or jevents, both of which are Linux-compatible libraries that allow programming and reading of performance counters via rdpmc at arbitrary points in the userland program.

This won't help directly with your request to specify events on the command line, but you could perhaps hack something together based on the ocperf.py code or libpfm4.