12

I've been building an error logging app recently and was after a way of accurately timestamping the incoming data. When I say accurately I mean each timestamp should be accurate relative to each other (no need to sync to an atomic clock or anything like that).

I've been using datetime.now() as a first stab, but this isn't perfect:

    >>> for i in range(0,1000):
    ...     datetime.datetime.now()
    ...
    datetime.datetime(2008, 10, 1, 13, 17, 27, 562000)
    datetime.datetime(2008, 10, 1, 13, 17, 27, 562000)
    datetime.datetime(2008, 10, 1, 13, 17, 27, 562000)
    datetime.datetime(2008, 10, 1, 13, 17, 27, 562000)
    datetime.datetime(2008, 10, 1, 13, 17, 27, 578000)
    datetime.datetime(2008, 10, 1, 13, 17, 27, 578000)
    datetime.datetime(2008, 10, 1, 13, 17, 27, 578000)
    datetime.datetime(2008, 10, 1, 13, 17, 27, 578000)
    datetime.datetime(2008, 10, 1, 13, 17, 27, 578000)
    datetime.datetime(2008, 10, 1, 13, 17, 27, 609000)
    datetime.datetime(2008, 10, 1, 13, 17, 27, 609000)
    datetime.datetime(2008, 10, 1, 13, 17, 27, 609000)
    etc.

The changes between clock readings for the first second of samples look like this:

    uSecs     difference
    562000
    578000    16000
    609000    31000
    625000    16000
    640000    15000
    656000    16000
    687000    31000
    703000    16000
    718000    15000
    750000    32000
    765000    15000
    781000    16000
    796000    15000
    828000    32000
    843000    15000
    859000    16000
    890000    31000
    906000    16000
    921000    15000
    937000    16000
    968000    31000
    984000    16000

So it looks like the timer data is only updated every ~15-32 ms on my machine. The problem comes when we analyse the data, because sorting by something other than the timestamp and then sorting by timestamp again can leave the data in the wrong order (chronologically). It would be nice to have timestamps accurate enough that any call to the timestamp generator gives a unique timestamp.
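To make that reordering concrete, here is a minimal sketch (with made-up example records, not real log data) showing how two entries that share a coarse timestamp can lose their arrival order after a round trip through another sort key:

    from datetime import datetime

    # Hypothetical records: (timestamp, source). The first two share a
    # timestamp because the clock only ticks every ~15 ms; B arrived first.
    records = [
        (datetime(2008, 10, 1, 13, 17, 27, 562000), "B"),
        (datetime(2008, 10, 1, 13, 17, 27, 562000), "A"),
        (datetime(2008, 10, 1, 13, 17, 27, 578000), "C"),
    ]

    by_source = sorted(records, key=lambda r: r[1])       # sort by another field
    back_to_time = sorted(by_source, key=lambda r: r[0])  # re-sort by timestamp

    # The two records with equal timestamps now come back as A then B,
    # so the original arrival order (B then A) is lost.
    print(back_to_time)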

I had been considering some methods involving adding a time.clock() offset to a starting datetime, but would appreciate a solution that works accurately across threads on the same machine. Any suggestions would be very gratefully received.

3
  • I just posted a new answer: on Windows at least, using Python, you can get sub-microsecond-resolution (NOT accuracy) timestamps using the Windows QPC clock, as I demonstrate in the code linked in my answer. Commented Aug 9, 2016 at 2:01
  • Why on earth are you building your own logging framework? There are plenty already and timestamps are a solved issue (down to a certain level of accuracy). In the unlikely event you have a use case that no existing logging framework solves, can you pick the closest one and raise an issue on it and submit your code to it? Commented Jul 10, 2017 at 15:17
  • Because ~8.5 years ago (when I posted this) the options were somewhat more limited. I wasn't building an error logging framework, I was writing something to receive UDP data and log information from that. If there was a library available (and that I'd found) that would have done that I'd have been entirely open to making use of it ;-) Commented Jul 13, 2017 at 15:43

8 Answers

12

time.clock() only measures wallclock time on Windows. On other systems, time.clock() actually measures CPU time. On those systems time.time() is more suitable for wallclock time, and it has as high a resolution as Python can manage -- which is as high as the OS can manage; usually using gettimeofday(3) (microsecond resolution) or ftime(3) (millisecond resolution). Other OS restrictions can actually make the effective resolution a lot coarser than that. datetime.datetime.now() uses time.time(), so using time.time() directly won't be any better.

For the record, if I use datetime.datetime.now() in a loop, I see about a 1/10000 second resolution. From looking at your data, you have much, much coarser resolution than that. I'm not sure if there's anything Python as such can do, although you may be able to convince the OS to do better through other means.

I seem to recall that on Windows, time.clock() is actually (slightly) more accurate than time.time(), but it measures wallclock time since the first call to time.clock(), so you have to remember to 'initialize' it first.
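For anyone who wants to check the effective tick size on their own machine, here is a small sketch (not from the original answer; measure_granularity is just an illustrative helper) that spins on a clock and reports the smallest jump it observes between distinct readings:

    import time
    from datetime import datetime

    def measure_granularity(clock, samples=100000):
        """Spin on clock() and record the jumps between distinct readings,
        as a rough estimate of the effective resolution."""
        last = clock()
        steps = []
        for _ in range(samples):
            now = clock()
            if now != last:
                steps.append(now - last)
                last = now
        return min(steps) if steps else None

    print("time.time() smallest step:", measure_granularity(time.time))
    print("datetime.now() smallest step:", measure_granularity(datetime.now))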


2 Comments

Indeed, here is what it looks like on Debian/Linux: datetime.datetime(2008, 10, 1, 17, 11, 31, 875190), datetime.datetime(2008, 10, 1, 17, 11, 31, 875199), datetime.datetime(2008, 10, 1, 17, 11, 31, 875207)
I can confirm that clock is indeed more accurate on all the Windows machines I've tried it on.
7

You're unlikely to get sufficiently fine-grained control that you can completely eliminate the possibility of duplicate timestamps - you'd need resolution smaller than the time it takes to generate a datetime object. There are a couple of other approaches you might take to deal with it:

  1. Deal with it. Leave your timestamps non-unique as they are, but rely on Python's sort being stable to deal with reordering problems. Sorting on timestamp first, then on something else, will retain the timestamp ordering - you just have to be careful to always start from the timestamp-ordered list every time, rather than doing multiple sorts on the same list.

  2. Append your own value to enforce uniqueness, e.g. include an incrementing integer value as part of the key, or append such a value only when consecutive timestamps would otherwise collide.

The following will guarantee unique timestamp values:

    import threading
    from datetime import datetime

    class TimeStamper(object):
        def __init__(self):
            self.lock = threading.Lock()
            self.prev = None
            self.count = 0

        def getTimestamp(self):
            with self.lock:
                ts = str(datetime.now())
                if ts == self.prev:
                    ts += '.%04d' % self.count
                    self.count += 1
                else:
                    self.prev = ts
                    self.count = 1
                return ts

For multiple processes (rather than threads), it gets a bit trickier though.
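A quick way to exercise the class above from several threads might look like this sketch (the final print simply checks that no duplicate strings were produced):

    import threading

    stamper = TimeStamper()
    results = []

    def worker():
        for _ in range(1000):
            results.append(stamper.getTimestamp())

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Both numbers should match: every timestamp string is unique.
    print(len(results), len(set(results)))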

2 Comments

I realize this is a bit nitpicky, but you mean "strictly increasing integer" not "monotonically increasing integer". A monotonically increasing set means that it doesn't ever decrease, but could still have equal values.
All nitpicks gratefully accepted. You're absolutely right - I've fixed the sloppy wording.
5

Thank you all for your contributions - they've all been very useful. Brian's answer seems closest to what I eventually went with (i.e. deal with it but use a sort of unique identifier - see below), so I've accepted his answer. I managed to consolidate all the various data receivers into a single thread, which is where the timestamping is now done using my new AccurateTimeStamp class. What I've done works as long as this timestamping code is the first thing to use the clock.

As S.Lott stipulates, without a realtime OS they're never going to be absolutely perfect. I really only wanted something that would let me see, relative to each incoming chunk of data, when things were being received, so what I've got below works well.

Thanks again everyone!

    import time

    class AccurateTimeStamp():
        """
        A simple class to provide a very accurate means of time stamping
        some data
        """
        # Do the class-wide initial time stamp to synchronise calls to
        # time.clock() to a single time stamp
        initialTimeStamp = time.time() + time.clock()

        def __init__(self):
            """
            Constructor for the AccurateTimeStamp class.
            This makes a stamp based on the current time which should be
            more accurate than anything you can get out of time.time().
            NOTE: This time stamp will only work if nothing has called
            clock() in this instance of the Python interpreter.
            """
            # Get the time since the first call to time.clock()
            offset = time.clock()
            # Get the current (accurate) time
            currentTime = AccurateTimeStamp.initialTimeStamp + offset
            # Split the time into whole seconds and the portion after the fraction
            self.accurateSeconds = int(currentTime)
            self.accuratePastSecond = currentTime - self.accurateSeconds

    def GetAccurateTimeStampString(timestamp):
        """
        Function to produce a timestamp of the form "13:48:01.87123"
        representing the time stamp 'timestamp'
        """
        # Get a struct_time representing the number of whole seconds since
        # the epoch that we can use to format the time stamp
        wholeSecondsInTimeStamp = time.localtime(timestamp.accurateSeconds)
        # Convert the whole seconds and whatever fraction of a second comes
        # after into a couple of strings
        wholeSecondsString = time.strftime("%H:%M:%S", wholeSecondsInTimeStamp)
        fractionAfterSecondString = str(int(timestamp.accuratePastSecond * 1000000))
        # Return our shiny new accurate time stamp
        return wholeSecondsString + "." + fractionAfterSecondString

    if __name__ == '__main__':
        for i in range(0, 500):
            timestamp = AccurateTimeStamp()
            print GetAccurateTimeStampString(timestamp)
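Worth noting for readers today: time.clock() was deprecated in Python 3.3 and removed in Python 3.8, so the class above no longer runs on current interpreters. A rough sketch of the same idea on Python 3 (anchor a high-resolution relative counter to a single wall-clock reading) might look like this; it is not the original author's code, and PerfCounterTimeStamp is just an illustrative name:

    import time
    from datetime import datetime, timedelta

    class PerfCounterTimeStamp:
        """Sketch: anchor time.perf_counter() (high resolution, relative)
        to a single wall-clock reading taken when the class is defined."""
        _anchor_wall = datetime.now()
        _anchor_perf = time.perf_counter()

        def __init__(self):
            offset = time.perf_counter() - PerfCounterTimeStamp._anchor_perf
            self.when = PerfCounterTimeStamp._anchor_wall + timedelta(seconds=offset)

        def __str__(self):
            # Produces e.g. "13:48:01.871230"
            return self.when.strftime("%H:%M:%S.%f")

    if __name__ == '__main__':
        for _ in range(5):
            print(PerfCounterTimeStamp())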

Comments

3

"timestamp should be accurate relative to each other "

Why time? Why not a sequence number? If it's a client in a client-server application, network latency makes timestamps somewhat random.

Are you matching some external source of information? Say a log on another application? Again, if there's a network, those times won't be too close.

If you must match things between separate apps, consider passing GUIDs around so that both apps log the GUID value. Then you can be absolutely sure they match, irrespective of timing differences.

If you want the relative order to be exactly right, maybe it's enough for your logger to assign a sequence number to each message in the order they were received.
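If a sequence number is enough, something like this sketch would do it (SequenceLogger is just an illustrative name, not an existing library):

    import itertools
    import threading

    class SequenceLogger:
        """Sketch: tag each message with an increasing sequence number
        instead of (or as well as) a timestamp."""
        def __init__(self):
            self._counter = itertools.count(1)
            self._lock = threading.Lock()
            self.records = []

        def log(self, message):
            with self._lock:
                self.records.append((next(self._counter), message))

    logger = SequenceLogger()
    logger.log("first packet")
    logger.log("second packet")
    print(logger.records)   # [(1, 'first packet'), (2, 'second packet')]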

2 Comments

I needed time stamps because I need to know when the data is collected and to see when there are gaps in the data being produced.
If your solution depends on clock accuracy, you'll have to find an OS that guarantees that your process is always the first thing that happens when the collected data arrives. Otherwise OS scheduling will bollix this up.
2

Here is a thread about Python timing accuracy:

Python - time.clock() vs. time.time() - accuracy?

1 Comment

Yeah, I'd already seen that one, but those are relative to a process starting or the call to clock rather than an absolute(ish) time.
2

A few years have passed since the question was asked and answered, and this has been dealt with, at least for CPython on Windows. Using the script below on both Win7 64-bit and Windows Server 2008 R2, I got the same results:

  • datetime.now() gives a resolution of 1ms and a jitter smaller than 1ms
  • time.clock() gives a resolution of better than 1us and a jitter much smaller than 1ms

The script:

    import time
    import datetime

    t1_0 = time.clock()
    t2_0 = datetime.datetime.now()

    with open('output.csv', 'w') as f:
        for i in xrange(100000):
            t1 = time.clock()
            t2 = datetime.datetime.now()
            td1 = t1 - t1_0
            td2 = (t2 - t2_0).total_seconds()
            f.write('%.6f,%.6f\n' % (td1, td2))

The results were visualized as a plot of the recorded time deltas (image not reproduced here).

Comments

0

I wanted to thank J. Cage for this last post.

For my work, "reasonable" timing of events across processes and platforms is essential. There are obviously lots of places where things can go askew (clock drift, context switching, etc.), however this accurate timing solution will, I think, help to ensure that the time stamps recorded are sufficiently accurate to see the other sources of error.

That said, there are a couple of details I wonder about that are explained in When MicroSeconds Matter. For example, I think time.clock() will eventually wrap. I think for this to work for a long running process, you might have to handle that.
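If the counter did wrap (or otherwise jump backwards) in a long-running process, one defensive option is to re-anchor whenever the relative counter appears to go backwards. This is purely a sketch built on that assumption, using time.perf_counter() only as a stand-in for whatever relative counter is being anchored:

    import time

    class ReanchoringClock:
        """Sketch: derive wall-clock readings from a relative counter and
        re-anchor if the counter ever appears to jump backwards (e.g. after
        a wrap). Assumes a backwards jump is the only symptom to handle."""
        def __init__(self, counter=time.perf_counter):
            self._counter = counter
            self._anchor_wall = time.time()
            self._anchor_rel = counter()
            self._last_rel = self._anchor_rel

        def now(self):
            rel = self._counter()
            if rel < self._last_rel:
                # Counter went backwards: take a fresh wall-clock anchor.
                self._anchor_wall = time.time()
                self._anchor_rel = rel
            self._last_rel = rel
            return self._anchor_wall + (rel - self._anchor_rel)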

Comments

0

If you want microsecond-resolution (NOT accuracy) timestamps in Python, in Windows, you can use Windows's QPC timer, as demonstrated in my answer here: How to get millisecond and microsecond-resolution timestamps in Python. I'm not sure how to do this in Linux yet, so if anyone knows, please comment or answer in the link above.
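For readers on Python 3.3+ (including Linux), the standard library now exposes high-resolution clocks directly, which may be enough depending on what the OS clock source provides; a quick sketch:

    import time

    # Nanosecond-resolution wall-clock timestamp (Python 3.7+); the actual
    # granularity still depends on the OS clock source.
    print(time.time_ns())

    # High-resolution relative counter, good for ordering and intervals.
    print(time.perf_counter_ns())

    # On Linux/Unix the reported resolution of a clock can be queried.
    if hasattr(time, "clock_getres"):
        print(time.clock_getres(time.CLOCK_MONOTONIC))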

Comments
