
When processing data sets in batches, I can usually think of the following three implementations.

Which one do you consider better than the others, and why?

Notes:

  • The implementation is in C# but the question is about the algorithm.
  • GetBatchedData works with a fixed batch size.
  • The Process method can take an empty batch as an argument, in which case there is nothing to process.
  • For EmptyBatch, Items is empty and HasMoreData returns true.

Option A

As @Flater pointed out, this approach has a bug!

    var batchIndex = 0;
    var batch = GetBatchedData(batchIndex++);
    while (batch.HasMoreData)
    {
        Process(batch.Items);
        // Reassign the existing variable; redeclaring it with var does not compile.
        batch = GetBatchedData(batchIndex++);
    }

Option B

    var batchIndex = 0;
    var batch = GetBatchedData(batchIndex++);
    do
    {
        Process(batch.Items);
        batch = GetBatchedData(batchIndex++);
    } while (batch.HasMoreData);

Option C

    var batchIndex = 0;
    var batch = new EmptyBatch();
    do
    {
        Process(batch.Items);
        batch = GetBatchedData(batchIndex++);
    } while (batch.HasMoreData);
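Which items each loop actually processes depends on what HasMoreData means on the final batch, which the notes above leave open. The following is a minimal sketch for comparing the three loops against an in-memory source. It assumes that HasMoreData is false on the batch that carries the last items. The Batch class, the GetBatchedData(data, batchIndex) signature, the batch size of 2, and the CountFetchFirst variant are all hypothetical and not part of the question. Process is replaced by counting items.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the batch type in the question.
public sealed class Batch
{
    public Batch(int[] items, bool hasMoreData)
    {
        Items = items;
        HasMoreData = hasMoreData;
    }

    public int[] Items { get; }
    public bool HasMoreData { get; }
}

public static class BatchLoopComparison
{
    private const int BatchSize = 2;

    // ASSUMPTION: HasMoreData is false on the batch that carries the last items.
    private static Batch GetBatchedData(int[] data, int batchIndex)
    {
        var items = data.Skip(batchIndex * BatchSize).Take(BatchSize).ToArray();
        var hasMoreData = (batchIndex + 1) * BatchSize < data.Length;
        return new Batch(items, hasMoreData);
    }

    // Stand-in for `new EmptyBatch()`: no items, HasMoreData is true.
    private static Batch EmptyBatch() => new Batch(Array.Empty<int>(), true);

    // Option A: check, process, fetch.
    public static int CountOptionA(int[] data)
    {
        var processed = 0;
        var batchIndex = 0;
        var batch = GetBatchedData(data, batchIndex++);
        while (batch.HasMoreData)
        {
            processed += batch.Items.Length;
            batch = GetBatchedData(data, batchIndex++);
        }
        return processed;
    }

    // Option B: process the first batch unconditionally, then fetch and check.
    public static int CountOptionB(int[] data)
    {
        var processed = 0;
        var batchIndex = 0;
        var batch = GetBatchedData(data, batchIndex++);
        do
        {
            processed += batch.Items.Length;
            batch = GetBatchedData(data, batchIndex++);
        } while (batch.HasMoreData);
        return processed;
    }

    // Option C: start from an empty batch so the loop body is uniform.
    public static int CountOptionC(int[] data)
    {
        var processed = 0;
        var batchIndex = 0;
        var batch = EmptyBatch();
        do
        {
            processed += batch.Items.Length;
            batch = GetBatchedData(data, batchIndex++);
        } while (batch.HasMoreData);
        return processed;
    }

    // Variant that is NOT one of the question's options: fetch, process, check.
    public static int CountFetchFirst(int[] data)
    {
        var processed = 0;
        var batchIndex = 0;
        Batch batch;
        do
        {
            batch = GetBatchedData(data, batchIndex++);
            processed += batch.Items.Length;
        } while (batch.HasMoreData);
        return processed;
    }
}
```

Under this assumption, calling each method with new[] { 1, 2, 3, 4, 5 } gives 4 for Options A, B and C, because the batch fetched last is never passed to the processing step, and 5 for the fetch-first variant. If GetBatchedData instead signals the end with a trailing empty batch, the three options would all see every item, so the contract is worth pinning down before choosing between them.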

Additional approaches suggested only in comments but not in responses to the question:

Suggestion A

The GetBatchedData method returns IEnumerable<Data[]> and produces each Data[] batch with yield return.

    foreach (var batch in GetBatchedData())
    {
        Process(batch);
    }
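For illustration, an iterator-based GetBatchedData could look roughly like the sketch below. The generic signature, the source and batchSize parameters, and the BatchSource class name are assumptions made for this example. The suggestion itself only says that the method returns IEnumerable<Data[]> and uses yield return.

```csharp
using System;
using System.Collections.Generic;

public static class BatchSource
{
    // Hypothetical signature: splits any sequence into arrays of at most batchSize items.
    public static IEnumerable<T[]> GetBatchedData<T>(IEnumerable<T> source, int batchSize)
    {
        if (source == null)
            throw new ArgumentNullException(nameof(source));
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        // Validation runs eagerly; the iterator below runs lazily.
        return Iterate(source, batchSize);
    }

    private static IEnumerable<T[]> Iterate<T>(IEnumerable<T> source, int batchSize)
    {
        var buffer = new List<T>(batchSize);
        foreach (var item in source)
        {
            buffer.Add(item);
            if (buffer.Count == batchSize)
            {
                // Hand out a full batch and start collecting the next one.
                yield return buffer.ToArray();
                buffer.Clear();
            }
        }

        // The final batch may be smaller than batchSize; it is skipped only if empty.
        if (buffer.Count > 0)
            yield return buffer.ToArray();
    }
}
```

With this shape the caller needs neither a batch index nor a HasMoreData flag, because the end of the sequence is the end of the data. On .NET 6 and later, Enumerable.Chunk provides comparable behavior out of the box.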
