Revision af2d62be-ae3f-4b6f-8895-383b59567ac6 - Software Engineering Stack Exchange

When processing a data set in batches, I can usually think of the following three implementations.

Which one do you consider better than the others, and why?

Notes:
 - The implementation is in C#, but the question is about the algorithm.
 - `GetBatchedData` works with a fixed batch size.
 - The `Process` method can take an empty batch as its argument, which means nothing has to be processed.
 - In the case of `EmptyBatch`, `Items` is empty and `HasMoreData` returns `true`.
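
To make the notes above concrete, here is a minimal sketch of the types they imply. The names `Data` and `Batch`, the constructor, and the inheritance relationship are assumptions made for illustration; the question only mentions `EmptyBatch`, `Items`, and `HasMoreData`.

```csharp
using System;

// Hypothetical element type; the question does not say what a batch contains.
public class Data { }

// Hypothetical batch type exposing the two members used in the question.
public class Batch
{
    public Data[] Items { get; }
    public bool HasMoreData { get; }

    public Batch(Data[] items, bool hasMoreData)
    {
        Items = items;
        HasMoreData = hasMoreData;
    }
}

// Per the notes: Items is empty and HasMoreData returns true.
public sealed class EmptyBatch : Batch
{
    public EmptyBatch() : base(Array.Empty<Data>(), hasMoreData: true) { }
}

public static class Program
{
    public static void Main()
    {
        Batch batch = new EmptyBatch();
        Console.WriteLine($"{batch.Items.Length} items, HasMoreData = {batch.HasMoreData}");
    }
}
```

Under these assumptions, a variable declared with `var batch = new EmptyBatch();` is typed as `EmptyBatch`, so it would have to be declared as `Batch batch = new EmptyBatch();` before a plain `Batch` could be assigned to it later.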

**Option A**

As @Flater pointed out, this approach has a bug!

    var batchIndex = 0;
    var batch = GetBatchedData(batchIndex++);
    while (batch.HasMoreData)
    {
        Process(batch.Items);
        batch = GetBatchedData(batchIndex++);
    }

**Option B**

    var batchIndex = 0;
    var batch = GetBatchedData(batchIndex++);
    do
    {
        Process(batch.Items);
        batch = GetBatchedData(batchIndex++);
    } while (batch.HasMoreData);

**Option C**

    var batchIndex = 0;
    var batch = new EmptyBatch();
    do
    {
        Process(batch.Items);
        batch = GetBatchedData(batchIndex++);
    } while (batch.HasMoreData);

Additional approaches that were suggested in comments, but not in answers to the question:

**Suggestion A**

The `GetBatchedData` method returns `IEnumerable<Data[]>`, where every `Data[]` batch is produced with `yield return`.

    foreach (var batch in GetBatchedData())
    {
        Process(batch);
    }
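
For illustration, an iterator of this shape could look roughly like the sketch below. The `source` and `batchSize` parameters, the `Data.Id` property, and the `Batching` class are assumptions added to keep the example self-contained; the suggestion itself shows `GetBatchedData()` without arguments and does not say where the data comes from.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical element type with an Id so batches can be told apart.
public class Data
{
    public int Id { get; }
    public Data(int id) { Id = id; }
}

public static class Batching
{
    // Hypothetical signature: slices `source` into arrays of at most `batchSize` items.
    public static IEnumerable<Data[]> GetBatchedData(IEnumerable<Data> source, int batchSize)
    {
        // Iterator methods are lazy, so this check runs when enumeration starts.
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var buffer = new List<Data>(batchSize);
        foreach (var item in source)
        {
            buffer.Add(item);
            if (buffer.Count == batchSize)
            {
                yield return buffer.ToArray(); // hand out a full batch
                buffer.Clear();
            }
        }

        if (buffer.Count > 0)
        {
            yield return buffer.ToArray(); // final, possibly smaller batch
        }
    }

    public static void Process(Data[] batch)
    {
        Console.WriteLine($"Processing {batch.Length} item(s)");
    }

    public static void Main()
    {
        var source = new List<Data>();
        for (var i = 0; i < 5; i++) source.Add(new Data(i));

        foreach (var batch in GetBatchedData(source, batchSize: 2))
        {
            Process(batch);
        }
    }
}
```

With five items and a batch size of 2, the loop receives batches of 2, 2, and 1 items. With this shape the caller needs neither a `batchIndex` nor a `HasMoreData` check, because the enumerator simply ends when the data runs out.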