Why do we use multiple recordings from the same set in a dataset?

Question

The description of the following dataset:

https://physionet.org/content/mitdb/1.0.0/

contains this text:

"the remaining 25 recordings were selected from the same set to include less common but clinically significant arrhythmias that would not be well-represented in a small random sample."

I cannot understand the reason for using multiple recordings from the same set. Why was this done?

noe · Accepted Answer · 2022-11-28 10:09:10Z

The whole paragraph is this:

The MIT-BIH Arrhythmia Database contains 48 half-hour excerpts of two-channel ambulatory ECG recordings, obtained from 47 subjects studied by the BIH Arrhythmia Laboratory between 1975 and 1979. Twenty-three recordings were chosen at random from a set of 4000 24-hour ambulatory ECG recordings collected from a mixed population of inpatients (about 60%) and outpatients (about 40%) at Boston's Beth Israel Hospital; the remaining 25 recordings were selected from the same set to include less common but clinically significant arrhythmias that would not be well-represented in a small random sample.

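As we can see, 23 recordings were chosen at random, while the other 25 were picked deliberately, presumably by someone with clinical expertise, to ensure the final dataset contains certain types of arrhythmia. "The same set" here means the same pool of 4000 24-hour recordings: both groups were drawn from that one collection, just by different selection methods.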

If all 48 had been chosen at random, the clinically significant arrhythmias might or might not have appeared among the selected recordings. Combining a randomly selected part with a hand-picked part keeps the data diverse while guaranteeing that the infrequent but very relevant cases are included.
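To make the "might or might not" concrete, here is a minimal sketch of the chance that a purely random draw misses a rare arrhythmia entirely. The pool size (4000) and random sample size (23) come from the database description; the `rare_count` values and the function name `prob_missed` are hypothetical, chosen only for illustration, since the description does not say how common each arrhythmia was in the pool.

```python
from math import comb

POOL_SIZE = 4000      # recordings in the source set (from the database description)
RANDOM_SAMPLE = 23    # recordings drawn at random (from the database description)


def prob_missed(rare_count, pool_size=POOL_SIZE, sample_size=RANDOM_SAMPLE):
    """Probability that a random sample drawn without replacement contains
    none of the `rare_count` recordings that show a given rare arrhythmia."""
    # Ways to pick the whole sample from the recordings WITHOUT the arrhythmia,
    # divided by all ways to pick the sample from the full pool.
    return comb(pool_size - rare_count, sample_size) / comb(pool_size, sample_size)


# Hypothetical prevalence values, for illustration only.
for rare_count in (10, 40, 200):
    p = prob_missed(rare_count)
    print(f"{rare_count:>4} of {POOL_SIZE} recordings: missed with probability {p:.2f}")
```

Under these assumed counts, an arrhythmia present in only a few dozen of the 4000 recordings would be absent from a 23-recording random sample more often than not, which is consistent with the authors adding 25 deliberately selected recordings.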
