If I seek back to the start of my topic, I may have millions of messages to read. I want to process them in batches rather than all at once, committing the offset after each batch. How can I do this, given that poll seems to fetch everything after the current offset and commit commits the offset at the end of whatever poll returned?

2 Comments
  • What client are you using? Commented Jan 26, 2016 at 13:43
  • The new Java KafkaConsumer. Commented Jan 26, 2016 at 14:38

1 Answer

You can put an upper bound on the amount of data fetched from each partition with max.partition.fetch.bytes. The downside is that no single record can be larger than this limit, so if you don't know how big your records can get, this may not be the best solution.
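As a minimal sketch of the setting described above, the following builds consumer properties with a per-partition fetch cap. The class name `BatchConsumerConfig`, the broker address, the group id, and the 1 MiB limit are all illustrative assumptions, not values from the question. The block uses only `java.util.Properties`, so it runs without the Kafka client on the classpath; in a real application the resulting object would be passed to the `KafkaConsumer` constructor.

```java
import java.util.Properties;

final class BatchConsumerConfig {

    // Hypothetical helper: returns consumer settings that cap the bytes
    // fetched per partition and leave offset commits to the application.
    static Properties build(String bootstrapServers, String groupId, int maxPartitionFetchBytes) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Commit manually after each processed batch instead of on a timer.
        props.setProperty("enable.auto.commit", "false");
        // Upper bound on data fetched per partition; it should be at least
        // as large as the biggest record you expect to read.
        props.setProperty("max.partition.fetch.bytes", Integer.toString(maxPartitionFetchBytes));
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // Assumed values for illustration only.
        Properties props = build("localhost:9092", "batch-reader", 1024 * 1024);
        props.list(System.out);
    }
}
```

Disabling `enable.auto.commit` is a design choice that goes with batch-wise commits: it keeps the committed position under the application's control instead of advancing it in the background.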

Each record returned by Kafka carries its topic, its partition, and its offset within that partition. Once you have processed the entire batch, you can commit the offset synchronously or asynchronously. Alternatively, commit after processing each message, so that if your consumer goes down you don't process messages twice.
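The bookkeeping behind committing per batch can be sketched without the Kafka client. In the sketch below, `Rec` is a hypothetical stand-in for the topic, partition, and offset carried by each consumed record, and `batches` and `offsetsToCommit` are illustrative helper names. The committed position for a partition is the offset of the last processed record plus one, because it marks the next record to read. With the real client, the equivalent map would be keyed by `TopicPartition` with `OffsetAndMetadata` values and passed to `commitSync` or `commitAsync`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BatchOffsets {

    /** Hypothetical stand-in for the coordinates carried by each Kafka record. */
    static final class Rec {
        final String topic;
        final int partition;
        final long offset;

        Rec(String topic, int partition, long offset) {
            this.topic = topic;
            this.partition = partition;
            this.offset = offset;
        }
    }

    /** Splits one poll result into batches of at most batchSize records, preserving order. */
    static List<List<Rec>> batches(List<Rec> polled, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<Rec>> out = new ArrayList<>();
        for (int i = 0; i < polled.size(); i += batchSize) {
            int end = Math.min(i + batchSize, polled.size());
            out.add(new ArrayList<>(polled.subList(i, end)));
        }
        return out;
    }

    /** For each "topic-partition" key, returns the highest processed offset plus one. */
    static Map<String, Long> offsetsToCommit(List<Rec> processed) {
        Map<String, Long> next = new LinkedHashMap<>();
        for (Rec r : processed) {
            next.merge(r.topic + "-" + r.partition, r.offset + 1, Long::max);
        }
        return next;
    }

    public static void main(String[] args) {
        List<Rec> polled = new ArrayList<>();
        for (long offset = 0; offset < 5; offset++) {
            polled.add(new Rec("events", 0, offset));
        }
        for (List<Rec> batch : batches(polled, 2)) {
            // Process the batch here, then commit the positions printed below.
            System.out.println(offsetsToCommit(batch));
        }
    }
}
```

Committing after every batch bounds the amount of reprocessing after a crash to one batch; committing after every record shrinks that window further at the cost of more commit requests.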

4 Comments

This sounds cool. What happens to a message that falls halfway across this limit? I.e., if the 10th message in the poll batch pushes the total size over the limit, is it cleanly left out of what is fetched? Will you get any sort of warning if you have a massive message that won't fit? Presumably there's a max message size setting for the producer somewhere?
RecordTooLargeException is what you get if you push something too big. max.request.size is the relevant producer property. I didn't understand the other questions.
So say I have max.partition.fetch.bytes set to 1000 and messages that are all 110 bytes. The 10th message would take the total over 1000 bytes, so does it just get left behind correctly? I.e., I get 9 messages in the .poll result, and when I commit offsets and .poll again I get the 10th message? It seems obvious, but I wanted to be sure.
Just found confluent.io/blog/… which answers all my questions.
