I am writing a program in C to measure the time required to perform a task, in CPU cycles. I am deliberately not converting cycles to seconds (time in seconds = cycles / frequency), because the CPU frequency changes: the server lowers it under light load to save power.
Program 1:

    ///////////////////////// RDTSC Functions /////////////////////////
    inline void start_rdtsc_rdtscp_ia64() {
        asm volatile ("CPUID\n\t"
                      "RDTSC\n\t"
                      "mov %%edx, %0\n\t"
                      "mov %%eax, %1\n\t"
                      : "=r" (cycles_high), "=r" (cycles_low)
                      :: "%rax", "%rbx", "%rcx", "%rdx");
    }

    inline void end_rdtsc_rdtscp_ia64() {
        asm volatile ("RDTSCP\n\t"
                      "mov %%edx, %0\n\t"
                      "mov %%eax, %1\n\t"
                      "CPUID\n\t"
                      : "=r" (cycles_high1), "=r" (cycles_low1)
                      :: "%rax", "%rbx", "%rcx", "%rdx");
    }

    inline void warmup_rdtsc_rdtscp_ia64() {
        start_rdtsc_rdtscp_ia64();
        end_rdtsc_rdtscp_ia64();
        start_rdtsc_rdtscp_ia64();
        end_rdtsc_rdtscp_ia64();
        start_rdtsc_rdtscp_ia64();
        end_rdtsc_rdtscp_ia64();
    }

    inline uint64_t get_start_ia64() {
        return (((uint64_t) cycles_high << 32) | cycles_low);
    }

    inline uint64_t get_end_ia64() {
        return (((uint64_t) cycles_high1 << 32) | cycles_low1);
    }

    ///////////////////////// RDTSC Timer Functions /////////////////////////
    inline void start_timer() {
        warmup_rdtsc_rdtscp_ia64();
        start_rdtsc_rdtscp_ia64();
    }

    inline void end_timer() {
        end_rdtsc_rdtscp_ia64();
        start = get_start_ia64();
        end = get_end_ia64();
    }

    inline uint64_t get_cycles_count() {
        return end - start;
    }

    // measuring time here
    start_timer();
    // perform a task of length K (larger K means more computation)
    end_timer();
    uint64_t ticks = get_cycles_count();  // time in ticks

Program 2:

    int main() {
        while (1);
    }

I call warmup_rdtsc_rdtscp_ia64() so that the RDTSC and CPUID instructions are warmed up before the real measurement; according to the Intel document, this is required to get a correct reading.
Without Program 2 running, I get higher cycle counts, and I cannot find any clear relationship between execution time and the length K.
With Program 2 running, I get the expected result: I can correlate execution time with the length K, and a larger K gives a higher cycle count.
My only understanding is that Program 2 prevents the CPU from entering power-saving mode, so the CPU always runs at its highest frequency, whereas without Program 2 the CPU goes into power-saving mode and runs at the lowest possible frequency.
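To check whether the frequency really drops when Program 2 is absent, the governor and current frequency can be read from sysfs before and after a measurement. This is a minimal sketch, assuming a Linux system that exposes the cpufreq interface under `/sys/devices/system/cpu/cpu0/cpufreq/` (it may be missing in VMs or containers); the helper names `read_sysfs_line` and `print_cpu0_freq_state` are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Read the first line of a file into buf, with the trailing newline removed.
 * Returns 0 on success, -1 if the file cannot be opened or read
 * (for example when no cpufreq driver is loaded). */
static int read_sysfs_line(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Print the scaling governor and current frequency of cpu0.
 * The sysfs location is an assumption about the running system. */
static void print_cpu0_freq_state(void)
{
    const char *base = "/sys/devices/system/cpu/cpu0/cpufreq/";
    char path[128];
    char value[64];

    snprintf(path, sizeof path, "%sscaling_governor", base);
    if (read_sysfs_line(path, value, sizeof value) == 0)
        printf("governor: %s\n", value);
    else
        printf("governor: unavailable\n");

    snprintf(path, sizeof path, "%sscaling_cur_freq", base);
    if (read_sysfs_line(path, value, sizeof value) == 0)
        printf("current frequency: %s kHz\n", value);
    else
        printf("current frequency: unavailable\n");
}
```

Calling `print_cpu0_freq_state()` right before `start_timer()` and right after `end_timer()`, once with and once without Program 2 running, should show whether the two runs really happen at different frequencies. Note that cpu0 may not be the core the measurement runs on.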
So my questions are as follows:
1. Without Program 2, the CPU goes into power-saving mode (lower CPU frequency). Even though the CPU runs at a lower frequency, I still expect a similar number of clock cycles; that is exactly why I avoid the conversion time in seconds = cycles / frequency. Why am I getting higher cycle counts?
2. Can anyone explain the relationship between the time required to complete a task, measured in clock cycles, and the different frequency levels (power-save mode, on-demand mode, performance mode)?
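One hypothesis worth testing is that the TSC does not count core clock cycles at all, but ticks at a fixed reference rate regardless of the current core frequency. Under that assumption, a compute-bound task needing a fixed number of core cycles would take longer in wall time at a lower frequency and therefore accumulate more TSC ticks. The sketch below only illustrates that arithmetic; the function name and all three inputs are hypothetical, and it ignores memory-bound effects.

```c
#include <stdint.h>

/* Expected TSC ticks for a task that needs `core_cycles` core clock cycles,
 * ASSUMING the TSC counts at a fixed rate `tsc_hz` while the core itself
 * runs at `core_hz`. All parameters are illustrative inputs. */
static uint64_t expected_tsc_ticks(uint64_t core_cycles, double tsc_hz, double core_hz)
{
    double seconds = (double)core_cycles / core_hz; /* wall time of the task */
    return (uint64_t)(seconds * tsc_hz + 0.5);      /* ticks elapsed, rounded */
}
```

With made-up numbers, a task of 1,000,000,000 core cycles on a machine whose TSC runs at 2.4 GHz would read about 1,000,000,000 ticks with the core at 2.4 GHz, and about 2,000,000,000 ticks with the core at 1.2 GHz. If the measurements with and without Program 2 differ by roughly the ratio of the two frequencies, that would support the hypothesis.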
I am using Linux and both gcc and g++.
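On Linux, whether the TSC runs at a constant rate is reported as the `constant_tsc` flag (and `nonstop_tsc` for deep idle states) in `/proc/cpuinfo`. The following is a minimal sketch of checking for such a flag; the function name `cpu_has_flag` is hypothetical, and only the first `flags` line in the file is inspected.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if `flag` appears as a whole word on the first "flags" line of
 * the file at `cpuinfo_path`, 0 if it does not, and -1 if the file cannot
 * be opened. */
static int cpu_has_flag(const char *cpuinfo_path, const char *flag)
{
    char line[8192];
    int found = 0;
    FILE *f = fopen(cpuinfo_path, "r");

    if (f == NULL)
        return -1;

    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "flags", 5) != 0)
            continue;
        /* Split "flags : fpu vme ..." into whitespace-separated words. */
        for (char *tok = strtok(line, " \t\n:"); tok != NULL;
             tok = strtok(NULL, " \t\n:")) {
            if (strcmp(tok, flag) == 0) {
                found = 1;
                break;
            }
        }
        break; /* the first CPU's flags line is enough for this check */
    }
    fclose(f);
    return found;
}
```

A call such as `cpu_has_flag("/proc/cpuinfo", "constant_tsc")` returning 1 would mean the cycle counts above are in units of a fixed reference clock rather than actual core cycles.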
I need help understanding how the number of clock cycles required to complete a task relates to the power mode (power-save mode, on-demand mode, performance mode).
Thanks in advance.