
Background

I have been trying to write a reliable timer with a resolution of at least microseconds in Python (3.7). The purpose is to run a specific task every few microseconds, continuously over a long period of time. After some research I settled on perf_counter_ns because of its higher consistency and tested resolution compared to the alternatives (monotonic_ns, time_ns, process_time_ns, and thread_time_ns); details can be found in the time module documentation and PEP 564.
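For reference, the candidate clocks can be compared on any platform with time.get_clock_info(); a small sketch (clock names as the time module spells them, resolution reported in seconds):

```python
import time

# Compare the candidate clocks on the current platform.
for name in ("perf_counter", "monotonic", "time", "process_time", "thread_time"):
    try:
        info = time.get_clock_info(name)
    except ValueError:
        # not every clock exists on every platform
        print(f"{name}: not available on this platform")
        continue
    print(f"{name}: implementation={info.implementation}, "
          f"resolution={info.resolution}s, monotonic={info.monotonic}")
```
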

Test

To verify the precision (and accuracy) of perf_counter_ns, I set up a test that collects the delays between consecutive timestamps, as shown below.

import time
import statistics as stats
# import resource

def practical_res_test(clock_timer_ns, count, expected_res):
    counter = 0
    diff = 0
    timestamp = clock_timer_ns()  # initial timestamp
    diffs = []
    while counter < count:
        new_timestamp = clock_timer_ns()
        diff = new_timestamp - timestamp
        if diff > 0:
            diffs.append(diff)
            timestamp = new_timestamp
            counter += 1
    print('Mean: ', stats.mean(diffs))
    print('Mode: ', stats.mode(diffs))
    print('Min: ', min(diffs))
    print('Max: ', max(diffs))
    outliers = list(filter(lambda diff: diff >= expected_res, diffs))
    print('Outliers Total: ', len(outliers))

if __name__ == '__main__':
    count = 10000000
    # ideally, resolution of at least 1 us is expected
    # but let's just do 10 us for the sake of this test
    expected_res = 10000
    practical_res_test(time.perf_counter_ns, count, expected_res)
    # other method benchmarks
    # practical_res_test(time.time_ns, count, expected_res)
    # practical_res_test(time.process_time_ns, count, expected_res)
    # practical_res_test(time.thread_time_ns, count, expected_res)
    # practical_res_test(
    #     lambda: int(resource.getrusage(resource.RUSAGE_SELF).ru_stime * 10**9),
    #     count,
    #     expected_res
    # )

Problem and Question

Question: Why are there occasional significant skips in time between timestamps? Multiple tests with a count of 10,000,000 on my Raspberry Pi 3 Model B V1.2 yielded similar results, one of which is as follows (time is, of course, in nanoseconds):

Mean:  2440.1013097
Mode:  2396
Min:  1771
Max:  1450832         # huge skip as I mentioned
Outliers Total:  8724  # delays that are more than 10 us

Another test on my Windows desktop:

Mean:  271.05812       # higher-end machine - better resolution
Mode:  200
Min:  200
Max:  30835600         # but there are still skips, even more significant
Outliers Total:  49021

Although I am aware that resolution differs between systems, the resolution in my test is noticeably lower than what is rated in PEP 564. Most importantly, occasional skips are observed.

Please let me know if you have any insight into why this is happening. Does it have anything to do with my test, or is perf_counter_ns bound to fail in such use cases? If so, do you have any suggestions for a better solution? Do let me know if there is any other info I need to provide.

Additional Info

For completeness, here is the clock info from time.get_clock_info().

On my raspberry pi:

Clock: perf_counter
Adjustable: False
Implementation: clock_gettime(CLOCK_MONOTONIC)
Monotonic: True
Resolution (ns): 1

On my Windows desktop:

Clock: perf_counter
Adjustable: False
Implementation: QueryPerformanceCounter()
Monotonic: True
Resolution (ns): 100

It is also worth mentioning that I am aware of time.sleep(), but from my tests and use case it is not particularly reliable, as others have discussed here.
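A common workaround for sleep()'s coarse wake-ups (not from the original post) is a hybrid wait: sleep for most of the interval, then busy-wait on perf_counter_ns for the tail. A sketch, where wait_until_ns and the 1 ms spin threshold are my own illustrative names and values:

```python
import time

def wait_until_ns(deadline_ns, spin_threshold_ns=1_000_000):
    """Sleep coarsely, then busy-wait on perf_counter_ns for the final
    stretch. spin_threshold_ns (1 ms here) is an assumed tuning knob,
    not a measured value."""
    while True:
        remaining = deadline_ns - time.perf_counter_ns()
        if remaining <= 0:
            return
        if remaining > spin_threshold_ns:
            # sleep most of the interval, waking early for the spin phase
            time.sleep((remaining - spin_threshold_ns) / 1e9)
        # otherwise spin: loop again and re-check the clock

# wait roughly 5 ms from now
start = time.perf_counter_ns()
wait_until_ns(start + 5_000_000)
```

The busy-wait phase burns CPU, which is the usual trade-off for sub-millisecond accuracy in user space.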

  • I am not a Python dev, but could it have something to do with GC, scheduling, or simply the fact that Python is not designed as a real-time language? Commented Jul 29, 2019 at 7:27
  • @munHunger thanks for the input. By GC you mean garbage collection? I agree that python might not be efficient for some real-time applications. Commented Jul 29, 2019 at 7:38
  • Yes, I mean garbage collection. As said, I am not a Python dev, so I am not 100% sure how it is actually done, but if we assume it implements a "stop-the-world" GC, that might explain the issue. Commented Jul 29, 2019 at 7:39
  • You can disable cycle collection by importing gc at the top of your module, then calling gc.disable() just before beginning your tests (and gc.enable() after). Commented Jul 29, 2019 at 15:57
  • Also, if you want to get the most precision possible in pure Python, I would consider taking out all code except calls to perf_counter_ns from the timer loop, because at the nanosecond scale, everything you do takes up noticeable time. Commented Jul 29, 2019 at 16:13
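The gc.disable() suggestion above can be sketched as a minimal harness around a timing loop (the 100,000-iteration count is arbitrary; try/finally guarantees collection is re-enabled even if the loop raises):

```python
import gc
import time

gc.disable()  # suspend automatic cycle collection during the timing loop
try:
    deltas = []
    prev = time.perf_counter_ns()
    for _ in range(100_000):
        now = time.perf_counter_ns()
        deltas.append(now - prev)
        prev = now
finally:
    gc.enable()  # restore normal collection afterwards

print("max delta (ns):", max(deltas))
```
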

1 Answer


If you plot the list of time differences, you will see a rather low baseline with peaks that increase over time.

This is caused by the append() operation, which occasionally has to reallocate the underlying array (a Python list is implemented as a dynamic array). By pre-allocating the array, the result improves:

import time
import statistics as stats
import gc
import matplotlib.pyplot as plt

def practical_res_test(clock_timer_ns, count, expected_res):
    counter = 0
    diffs = [0] * count
    gc.disable()
    timestamp = clock_timer_ns()  # initial timestamp
    while counter < count:
        new_timestamp = clock_timer_ns()
        diff = new_timestamp - timestamp
        if diff > 0:
            diffs[counter] = diff
            timestamp = new_timestamp
            counter += 1
    gc.enable()
    print('Mean: ', stats.mean(diffs))
    print('Mode: ', stats.mode(diffs))
    print('Min: ', min(diffs))
    print('Max: ', max(diffs))
    outliers = list(filter(lambda diff: diff >= expected_res, diffs))
    print('Outliers Total: ', len(outliers))
    plt.plot(diffs)
    plt.show()

if __name__ == '__main__':
    count = 10000000
    # ideally, resolution of at least 1 us is expected
    # but let's just do 10 us for the sake of this test
    expected_res = 10000
    practical_res_test(time.perf_counter_ns, count, expected_res)

These are the results I get:

Mean:  278.6002
Mode:  200
Min:  200
Max:  1097700
Outliers Total:  3985

In comparison, these are the results on my system with the original code:

Mean:  333.92254
Mode:  300
Min:  200
Max:  50507300
Outliers Total:  2590
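The reallocation behavior behind the append() spikes can be observed directly: sys.getsizeof shows the allocated size of a list jumping as it grows, and each jump is a reallocation (and possible copy) that can surface as a latency spike. A small sketch, independent of the timing code:

```python
import sys

lst = []
last = sys.getsizeof(lst)
growth_points = []  # (index at which the allocation grew, new size in bytes)
for i in range(1000):
    lst.append(i)
    size = sys.getsizeof(lst)
    if size != last:
        growth_points.append((i, size))
        last = size

# the allocated size grows in increasingly large steps, not on every append
print(growth_points[:5])
```
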

To get even better performance, you might want to run on Linux and use SCHED_FIFO. But always remember that real-time tasks with microsecond precision are not done in Python. If your problem is soft real-time, you can get away with it, but it all depends on the penalty for missing a deadline and your understanding of the time complexities of both your code and the Python interpreter.
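Requesting SCHED_FIFO from Python can be sketched with the os module's scheduler interface (Linux-only; it needs root or CAP_SYS_NICE, and the priority value 10 is an arbitrary example):

```python
import os

try:
    # clamp the example priority to what the platform allows
    prio_max = os.sched_get_priority_max(os.SCHED_FIFO)
    param = os.sched_param(min(10, prio_max))
    os.sched_setscheduler(0, os.SCHED_FIFO, param)  # 0 = calling process
    print("now running under SCHED_FIFO")
except AttributeError:
    print("scheduler interface not available on this platform")
except PermissionError as exc:
    print("could not switch scheduler (needs root/CAP_SYS_NICE):", exc)
```

Under SCHED_FIFO the process is not time-sliced against normal tasks, which reduces (but does not eliminate) the scheduling-induced skips discussed above.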


1 Comment

Hey there, thanks for the feedback. Allocating the array beforehand is definitely an improvement; I did not think of that. And yes, I understand Python is by no means best for this particular real-time task. However, in this project, Python has really surprised me with what it can achieve if you carefully think about the performance of every line of code you write. Anyway, my original question here was more out of curiosity about the inconsistency in these time functions. Cheers!
