1

Let's start with an example. I've got an application that logs a point every minute. When I aggregate this data into a query where I'll have the count of logs per day, I get the following result. Keep in mind, that the query and the data is in UTC.

SELECT count(int_3) FROM data WHERE time >= '2023-10-26T00:00:00Z' AND time < '2023-10-31T23:59:59Z' GROUP BY time(1d) fill(none) 
2023-10-26T00:00:00Z 1440 2023-10-27T00:00:00Z 1440 2023-10-28T00:00:00Z 1440 2023-10-29T00:00:00Z 1440 2023-10-30T00:00:00Z 1440 2023-10-31T00:00:00Z 1440 

This is the expected, result. Also when I use a timezone in the query I get the expected result. Keep in mind that this query includes daylight saving, which results in one day having 1 bucket more.

SELECT count(int_3) FROM data WHERE time >= '2023-10-26T00:00:00+02:00' AND time < '2023-10-31T23:59:59+01:00' GROUP BY time(1d) fill(none) TZ('Europe/Amsterdam') 
2023-10-26T00:00:00+02:00 1320 2023-10-27T00:00:00+02:00 1440 2023-10-28T00:00:00+02:00 1440 2023-10-29T00:00:00+02:00 1500 2023-10-30T00:00:00+01:00 1440 2023-10-31T00:00:00+01:00 1440 

So far so good right. Let's focus on the issue I'm facing. When ever somebody want to move the start time of the query, in the example (2023-10-26T00:00:00Z), we need to add an offset to the GROUP BY time() clause, as InfluxDB normally start bucketing from the unixtimestamp 0.

According to their documentation page you can use a offset_interval to fix the start time of the query results

The offset_interval is a duration literal. It shifts forward or back the InfluxDB database’s preset time boundaries. The offset_interval can be positive or negative.

For example let's use the first query (UTC) again and shift the start time and endtime one hour back.

SELECT count(int_3) FROM data WHERE time >= '2023-10-25T23:00:00Z' AND time < '2023-10-31T22:59:59Z' GROUP BY time(1d) fill(none) 

Without using a offset_interval, the results are still timestamped on midnight.

2023-10-26T00:00:00Z 1440 2023-10-27T00:00:00Z 1440 2023-10-28T00:00:00Z 1440 2023-10-29T00:00:00Z 1440 2023-10-30T00:00:00Z 1440 2023-10-31T00:00:00Z 1380 

Because the starttime of the query is 2023-10-25T23:00:00Z I also want the results to reflect that. I need to use an offset of 23 hours (calculated with this post https://stackoverflow.com/a/54807316/5686835)

SELECT count(int_3) FROM data WHERE time >= '2023-10-25T23:00:00Z' AND time < '2023-10-31T22:59:59Z' GROUP BY time(1d,23h) fill(none) 
2023-10-25T23:00:00Z 1380 2023-10-26T23:00:00Z 1440 2023-10-27T23:00:00Z 1440 2023-10-28T23:00:00Z 1440 2023-10-29T23:00:00Z 1440 2023-10-30T23:00:00Z 1440 

Perfect, now with the Timezone I want.

SELECT count(int_3) FROM data WHERE time >= '2023-10-25T23:00:00+02:00' AND time < '2023-10-31T22:59:59+01:00' GROUP BY time(1d,23h) fill(none) TZ('Europe/Amsterdam') 
2023-10-25T23:00:00+02:00 1260 2023-10-26T23:00:00+02:00 1440 2023-10-27T23:00:00+02:00 1440 2023-10-28T23:00:00+02:00 1440 2023-10-29T23:00:00+01:00 1320 2023-10-30T23:00:00+01:00 1440 

And there is the problem. 2023-10-29 should have one hour of point (60 points) more than the other days, but actually got 2 hours (120 points) less, which I find difficult to explain.

Does somebody have an explanation or a solution for this?

I'm using InfluxDB v1.8.10 and v1.11.1.

0

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.