I have an application which normally runs in (as reported by the `time` command):

    real 1.59 user 1.42 sys 4.73

But when I load a shared library and run it, the time goes up dramatically (`time` reports):

    real 28.51 user 106.22 sys 5.23

While some increase (2 to 4 times, as observed on CentOS and Ubuntu) is expected due to my shared library's work, the timing reported above on Fedora 24 is far too high.
I attempted to use `perf`, which reported:

       255352.948615      task-clock:u (msec)       #    3.895 CPUs utilized
                    0      context-switches:u        #    0.000 K/sec
                    0      cpu-migrations:u          #    0.000 K/sec
               18,127      page-faults:u             #    0.071 K/sec
      664,852,184,198      cycles:u                  #    2.604 GHz                      (50.03%)
       19,323,811,463      stalled-cycles-frontend:u #    2.91% frontend cycles idle     (50.02%)
      578,178,881,331      stalled-cycles-backend:u  #   86.96% backend cycles idle      (50.02%)
      110,595,196,687      instructions:u            #    0.17  insn per cycle
                                                     #    5.23  stalled cycles per insn  (50.00%)
       28,361,633,658      branches:u                #  111.068 M/sec                    (50.01%)
          777,249,031      branch-misses:u           #    2.74% of all branches          (50.01%)

         65.564158710 seconds time elapsed

This appears to say that the CPU is idle much of the time, but I am trying to find where in the code that happens (I have access to the full source of both my application and the shared library). I have also looked at `perf report`, which shows the percentage of time spent in each function/system call. But I am interested in an even finer level, i.e. which line(s) within those functions, so that I can understand why.
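In other words, I am looking for something along the lines of the following `perf` workflow (sketched here with `./myapp` as a placeholder for my binary; I understand the binary must be built with `-g` for source-line annotation to work):

```shell
# Record samples with call stacks; DWARF unwinding gives usable
# stacks even for code compiled without frame pointers
perf record -g --call-graph dwarf -- ./myapp

# Summarize time spent per function
perf report

# Drill down from a hot function to individual source lines /
# instructions
perf annotate --stdio
```

but I don't know whether this is the right approach, or whether another tool would attribute the stalled/idle cycles more precisely.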
I appreciate that it's not easy to give concrete advice given how little information I have provided about my application and shared library. I am only looking for suggestions/tools/ideas for figuring out where in the code the CPU is spending most of its time (or sitting idle).
It's Fedora 24 on Linux/x86_64 with glibc 2.23 (both my application and the shared library are compiled with gcc 6.1.1 against glibc 2.23).