
I use Kafka Streams for data processing.

Kafka 1.1

I am producing data into Kafka at about 35k records per second, and the Kafka consumer metrics show the streams application consuming at the same rate. But very often I see partition-reset errors saying that the fetch offset is out of range. This basically means my consumer is slower than log deletion, which is happening very aggressively.

My log.retention.hours is 168 and log.retention.bytes is 10G. Here are the log messages I get in my consumer very often.
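For reference, the retention settings above correspond to these broker properties (a sketch of server.properties; the exact byte value assumes 10G means 10 GiB):

```properties
log.retention.hours=168          # time-based limit: delete segments older than 7 days
log.retention.bytes=10737418240  # size-based limit per partition (~10 GiB)
```

With cleanup.policy=delete, a segment becomes eligible for deletion when either limit is exceeded; the time check compares the broker clock against the largest record timestamp in the segment.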

I tried to search for this problem online, but I could not find anything related.

[sample-app-deploy-8c4fd5697-4xxbk sample-app] 09:28:24.291 [sample-app-0cf78aad-5faa-4197-853b-bfc08bb38f66-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=sample-app-0cf78aad-5faa-4197-853b-bfc08bb38f66-StreamThread-1-consumer, groupId=sample-app] Fetch offset 116411050 is out of range for partition sample-topic-4, resetting offset
[sample-app-deploy-8c4fd5697-4xxbk sample-app] 09:28:24.292 [sample-app-0cf78aad-5faa-4197-853b-bfc08bb38f66-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=sample-app-0cf78aad-5faa-4197-853b-bfc08bb38f66-StreamThread-1-consumer, groupId=sample-app] Resetting offset for partition sample-topic-4 to offset 116411058.
[sample-app-deploy-8c4fd5697-qmjnd sample-app] 09:28:24.306 [sample-app-e10caa03-b881-47f2-b1ce-e9513c12a98c-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=sample-app-e10caa03-b881-47f2-b1ce-e9513c12a98c-StreamThread-1-consumer, groupId=sample-app] Fetch offset 237000869 is out of range for partition sample-topic-7, resetting offset
[sample-app-deploy-8c4fd5697-qmjnd sample-app] 09:28:24.307 [sample-app-e10caa03-b881-47f2-b1ce-e9513c12a98c-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=sample-app-e10caa03-b881-47f2-b1ce-e9513c12a98c-StreamThread-1-consumer, groupId=sample-app] Resetting offset for partition sample-topic-7 to offset 237000871.
[sample-app-deploy-8c4fd5697-n5pw8 sample-app] 09:29:56.808 [sample-app-1db56df6-1dab-40d2-94c2-e412eff0ee09-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=sample-app-1db56df6-1dab-40d2-94c2-e412eff0ee09-StreamThread-1-consumer, groupId=sample-app] Fetch offset 471945398 is out of range for partition sample-topic-0, resetting offset
[sample-app-deploy-8c4fd5697-n5pw8 sample-app] 09:29:56.810 [sample-app-1db56df6-1dab-40d2-94c2-e412eff0ee09-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=sample-app-1db56df6-1dab-40d2-94c2-e412eff0ee09-StreamThread-1-consumer, groupId=sample-app] Resetting offset for partition sample-topic-0 to offset 471945403.
[sample-app-deploy-8c4fd5697-n5pw8 sample-app] 09:34:56.804 [sample-app-1db56df6-1dab-40d2-94c2-e412eff0ee09-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=sample-app-1db56df6-1dab-40d2-94c2-e412eff0ee09-StreamThread-1-consumer, groupId=sample-app] Fetch offset 474036996 is out of range for partition sample-topic-0, resetting offset
[sample-app-deploy-8c4fd5697-n5pw8 sample-app] 09:34:56.805 [sample-app-1db56df6-1dab-40d2-94c2-e412eff0ee09-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=sample-app-1db56df6-1dab-40d2-94c2-e412eff0ee09-StreamThread-1-consumer, groupId=sample-app] Resetting offset for partition sample-topic-0 to offset 474036997.
[sample-app-deploy-8c4fd5697-cjccm sample-app] 09:39:10.659 [sample-app-a7e7c388-0dd4-45e8-8d5a-3a84effb7dfd-StreamThread-1] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=sample-app-a7e7c388-0dd4-45e8-8d5a-3a84effb7dfd-StreamThread-1-consumer, groupId=sample-app] Fetch offset 236702790 is out of range for partition sample-topic-5, resetting offset

Can anyone help point out where the problem is happening?

  • What is your clean up policy? Commented Sep 12, 2019 at 14:22
  • @SantoshTulasiram It looks like there is a similar issue, issues.apache.org/jira/browse/KAFKA-6189 — could you check if it's related? Commented Sep 12, 2019 at 14:42
  • The cleanup policy is delete. I found that cleanup was happening aggressively due to the retention time, not the retention size as I expected. It seems to be an issue with the event times we are pushing. I will update here once the problem gets fixed. Commented Sep 13, 2019 at 9:20
  • The problem was due to the timestamp. We were sending the event times in seconds instead of milliseconds, which caused the time-based retention to be aggressive. Commented Sep 13, 2019 at 10:35
  • @SantoshTulasiram Can you answer your question and accept then? Thanks. Commented Sep 13, 2019 at 18:11

1 Answer


The problem was due to the timestamp: we were sending the event times in seconds instead of milliseconds, which caused the time-based retention to be aggressive.
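The fix can be sketched as follows (a minimal illustration; the class name EventTimeFix, the helper toEpochMillis, and the sample epoch value are hypothetical). Kafka record timestamps are epoch milliseconds, so an event time produced in epoch seconds makes every record look roughly 49 years old, and time-based retention deletes the segment almost immediately:

```java
import java.time.Instant;

public class EventTimeFix {
    // Kafka record timestamps (and time-based retention checks) use epoch
    // MILLIseconds. If a producer supplies epoch seconds, each record appears
    // decades in the past, so log.retention.hours=168 expires it right away
    // and consumers hit "Fetch offset ... is out of range".
    static long toEpochMillis(long epochSeconds) {
        return epochSeconds * 1000L;
    }

    public static void main(String[] args) {
        long eventTimeSeconds = 1568280504L;                     // what we were sending
        long eventTimeMillis = toEpochMillis(eventTimeSeconds);  // what Kafka expects
        System.out.println(Instant.ofEpochMilli(eventTimeMillis)); // 2019-09-12T09:28:24Z
    }
}
```

The corrected millisecond value would then be passed as the record's explicit timestamp, e.g. via the producer constructor overload new ProducerRecord<>(topic, partition, eventTimeMillis, key, value), or produced by whatever timestamp extractor the application uses.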
