Chunked Data Processing#68

Open
ssh-meister wants to merge 4 commits into main from chunks

@ssh-meister
Collaborator

This PR adds a processor that groups other processors into a pipeline, so that the data can be processed in chunks (batches).
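
To make the idea concrete, here is a minimal sketch of chunked processing, not the code from this PR. It assumes the manifest is a sequence of JSON lines and that each step is a plain function mapping a list of entries to a list of entries. The names `read_chunks`, `run_chunked`, `drop_empty_text` and `add_word_count` are hypothetical and exist only for illustration.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, List

Entry = Dict[str, object]
Step = Callable[[List[Entry]], List[Entry]]


def read_chunks(lines: Iterable[str], chunk_size: int) -> Iterator[List[Entry]]:
    """Yield lists of at most `chunk_size` entries parsed from JSON lines."""
    chunk: List[Entry] = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        chunk.append(json.loads(line))
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk  # final, possibly shorter, chunk


def run_chunked(lines: Iterable[str], steps: List[Step], chunk_size: int) -> List[Entry]:
    """Run every step on each chunk in order and aggregate the outputs."""
    aggregated: List[Entry] = []
    for chunk in read_chunks(lines, chunk_size):
        for step in steps:
            chunk = step(chunk)
        aggregated.extend(chunk)
    return aggregated


# Two hypothetical steps standing in for real processors.
def drop_empty_text(chunk: List[Entry]) -> List[Entry]:
    return [entry for entry in chunk if entry.get("text")]


def add_word_count(chunk: List[Entry]) -> List[Entry]:
    return [{**entry, "num_words": len(str(entry["text"]).split())} for entry in chunk]


manifest_lines = [
    '{"audio_filepath": "a.wav", "text": "hello world"}',
    '{"audio_filepath": "b.wav", "text": ""}',
    '{"audio_filepath": "c.wav", "text": "one two three"}',
]
result = run_chunked(manifest_lines, [drop_empty_text, add_word_count], chunk_size=2)
```

Only one chunk is held in memory while the steps run, which is the usual motivation for this design. In this sketch the aggregated output is kept in a list for simplicity; a real pipeline would more likely append each processed chunk to an output file.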

Signed-off-by: Sasha Meister <ameister@nvidia.com>
from sdp.utils.chunk_processing import ChunkProcessingPipeline


class GroupProcessors(BaseProcessor):
Collaborator

Add a docstring describing this class

def get_last_output_manifest_file_in_group(group_processors_cfg):
    return group_processors_cfg[-1].get("output_manifest_file", None)
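
A quick illustration of how this helper behaves, as a sketch under assumptions: plain dicts stand in for the processor config entries, and the `_target_` values and file names are hypothetical.

```python
def get_last_output_manifest_file_in_group(group_processors_cfg):
    return group_processors_cfg[-1].get("output_manifest_file", None)


# Hypothetical group configs; only the last entry matters to the helper.
cfg_with_output = [
    {"_target_": "StepA", "output_manifest_file": "a.json"},
    {"_target_": "StepB", "output_manifest_file": "b.json"},
]
cfg_without_output = [
    {"_target_": "StepA", "output_manifest_file": "a.json"},
    {"_target_": "StepB"},
]

last = get_last_output_manifest_file_in_group(cfg_with_output)
missing = get_last_output_manifest_file_in_group(cfg_without_output)
```

Here `last` is `"b.json"` and `missing` is `None`, because only the final processor in the group is inspected. An empty group would raise `IndexError`, which the caller may need to guard against.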

class ChunkedProcessor:
Collaborator

Add a docstring explaining what this class is for

self.processor.process()


class ChunkRunner:
Collaborator

add a description to this class

processor.append_chunk_to_agg_output()


class ChunkProcessingPipeline:
Collaborator

add a description to this class
