This repository was archived by the owner on Mar 20, 2026. It is now read-only.

Excess property in audio datasets and confusing argument description in audio pretraining task #3178

@gazay


📚 Documentation

The audio pretraining task has a min_sample_size argument whose description is confusing ("min sample size to crop to for batching"). In practice, the argument is used only to set the dataset's internal min_length property, which filters out examples that are too short. Nothing is cropped to this size.
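
To make the distinction concrete, here is a minimal sketch of the filtering behaviour described above. It is not fairseq's actual code: the function name filter_short_examples and its signature are hypothetical, and only the argument name min_sample_size comes from the task.

```python
from typing import List


def filter_short_examples(sizes: List[int], min_sample_size: int) -> List[int]:
    """Return the indices of examples that are long enough to keep.

    Hypothetical helper for illustration only. Examples shorter than
    min_sample_size are dropped entirely; no example is cropped or
    padded to that size.
    """
    kept_indices = []
    for index, size in enumerate(sizes):
        if size >= min_sample_size:
            kept_indices.append(index)
    return kept_indices


# Three audio examples with lengths given in samples.
kept = filter_short_examples([100, 5000, 32000], min_sample_size=2000)
```

With these inputs the first example is skipped and the other two are kept unchanged, which is why a description along the lines of "minimum sample size; shorter examples are skipped" would match the behaviour more closely than "crop to".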

I propose removing the min_length property from the audio datasets in favor of min_sample_size, and improving the description of this parameter.

I have prepared a branch, which I can submit as a PR if this issue makes sense: https://github.com/pytorch/fairseq/compare/master...gazay:min_max_sample_size?expand=1
