I am working on an OpenCV application. The main process uses 3 threads. How can I find out the percentage of CPU consumed by each thread in the process?
- One of the best questions in a while. – user4580220, Oct 6, 2015 at 4:18
- Yeah, and there is still no satisfactory answer I can find that does NOT suggest parsing data from /proc ... – RJVB, Jan 1, 2019 at 9:24
2 Answers
Read time(7), then clock_gettime(2), notably with CLOCK_THREAD_CPUTIME_ID and CLOCK_REALTIME. You probably want to compute the variation of each of these clocks (e.g. since the start of the thread) and then the ratio of the two variations, i.e. thread CPU time divided by elapsed real time. It is convenient to convert the result of clock_gettime (or the deltas) into a double, since a struct timespec is often larger than a long long or any integral type on your machine. See also pthread_getcpuclockid(3).
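The ratio described above could be sketched roughly as follows. This is only an illustration under assumptions: the names `cpu_sample`, `cpu_sample_now` and `cpu_percent` are hypothetical helpers, not part of any library, and each thread is assumed to sample its own clocks (CLOCK_THREAD_CPUTIME_ID always refers to the calling thread). CLOCK_REALTIME is used as suggested above; CLOCK_MONOTONIC may be a safer choice for intervals because it is not affected by wall-clock adjustments.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <time.h>

/* Hypothetical helper type: one snapshot of the calling thread's clocks. */
struct cpu_sample {
    double cpu_sec;   /* CPU time consumed by the calling thread */
    double wall_sec;  /* real (wall-clock) time */
};

/* Convert a struct timespec into seconds as a double. */
static double timespec_to_sec(struct timespec ts) {
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Take a snapshot; must be called from the thread being measured. */
static struct cpu_sample cpu_sample_now(void) {
    struct timespec cpu = {0, 0}, wall = {0, 0};
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &cpu);
    clock_gettime(CLOCK_REALTIME, &wall);
    struct cpu_sample s = { timespec_to_sec(cpu), timespec_to_sec(wall) };
    return s;
}

/* Percentage of one CPU used by the thread between two snapshots. */
static double cpu_percent(struct cpu_sample start, struct cpu_sample end) {
    double wall = end.wall_sec - start.wall_sec;
    if (wall <= 0.0)
        return 0.0;  /* no measurable interval */
    return 100.0 * (end.cpu_sec - start.cpu_sec) / wall;
}
```

A thread would call `cpu_sample_now()` when it starts, call it again later, and pass both snapshots to `cpu_percent()`.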
Notice that a thread can be migrated by the kernel scheduler from one core to another one. See however sched_setaffinity(2) used by pthread_setaffinity_np(3).
See also proc(5). You might be tempted to parse /proc/self/stat and /proc/self/status etc...
Look also into times(2) & getrusage(2) & pthreads(7)
For a quick and possibly dirty (but meaningful) estimate at the process level, I get the user+system time (from getrusage()) and the actual real time elapsed, then divide the former by the latter.
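A minimal sketch of that process-level estimate, assuming Linux or another POSIX system. The function names are hypothetical, and CLOCK_MONOTONIC is my choice for the real-time side rather than something stated above. The value is CPU time divided by elapsed real time, so it can exceed 100% when several threads run on different cores.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <sys/resource.h>
#include <sys/time.h>
#include <time.h>

/* Convert a struct timeval into seconds as a double. */
static double timeval_to_sec(struct timeval tv) {
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* User + system CPU seconds consumed so far by the whole process,
   or -1.0 if getrusage() fails. */
static double process_cpu_seconds(void) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1.0;
    return timeval_to_sec(ru.ru_utime) + timeval_to_sec(ru.ru_stime);
}

/* Current reading of a monotonic clock, in seconds. */
static double wall_seconds(void) {
    struct timespec ts = {0, 0};
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* CPU usage in percent: CPU-time delta divided by real-time delta. */
static double usage_percent(double cpu_delta, double wall_delta) {
    if (wall_delta <= 0.0)
        return 0.0;  /* no measurable interval */
    return 100.0 * cpu_delta / wall_delta;
}
```

Sampling `process_cpu_seconds()` and `wall_seconds()` before and after the work, then passing the two differences to `usage_percent()`, gives a figure comparable to what the shell's time command reports.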
Sadly, I haven't been able to figure out how to do this at the thread level, and that's exactly how I came across this question. There is getrusage(RUSAGE_THREAD,...) and clock_gettime(CLOCK_THREAD_CPUTIME_ID,..) on Linux, but I'm getting strange results when I use the results from those functions as outlined above. The estimate does approach 100% CPU for threads that ought to all be busy doing simple things (like reading 1-file-per-thread contents into 2 buffers, then comparing them). However, this does not depend on the number of threads I spawn, so I can get 8 × 100% CPU on a machine with 4 cores, which shouldn't be possible. Indeed, top and the shell's time command show I'm far from full CPU load, more at about 250% on average (for that example of 8 threads on a 4-core machine).
If I were to guess, I'd say that the per-thread, in-thread measurement functions only measure the time actually spent running the thread, excluding most if not all overhead such as context switches. That should apply at least to the result returned by CLOCK_THREAD_CPUTIME_ID (including thread overhead in getrusage's system time estimate seems justifiable).
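For reference, the two thread-level calls mentioned above can be read side by side for the same thread, which makes it easier to compare what each one reports. This is only a sketch: `thread_times` and `thread_times_now` are hypothetical names, RUSAGE_THREAD is Linux-specific and needs _GNU_SOURCE, and the code does not by itself explain the discrepancy described above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* needed for RUSAGE_THREAD (Linux-specific) */
#endif
#include <sys/resource.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper type: CPU accounting for the calling thread. */
struct thread_times {
    double user_sec;       /* user time from getrusage(RUSAGE_THREAD) */
    double sys_sec;        /* system time from getrusage(RUSAGE_THREAD) */
    double cpu_clock_sec;  /* reading of CLOCK_THREAD_CPUTIME_ID */
};

/* Fill *out for the calling thread; returns 0 on success, -1 on error. */
static int thread_times_now(struct thread_times *out) {
    struct rusage ru;
    struct timespec ts;

    if (getrusage(RUSAGE_THREAD, &ru) != 0)
        return -1;
    if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0)
        return -1;

    out->user_sec = (double)ru.ru_utime.tv_sec + (double)ru.ru_utime.tv_usec / 1e6;
    out->sys_sec = (double)ru.ru_stime.tv_sec + (double)ru.ru_stime.tv_usec / 1e6;
    out->cpu_clock_sec = (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
    return 0;
}
```

Calling `thread_times_now()` at the start and end of a thread's work and differencing the fields gives per-thread CPU seconds, which can then be divided by the elapsed real time measured over the same interval.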