Does it really make sense to attribute spurious tags to all types?

I'm looking at the implications of this line:

Line 265 in 36fd20e

    for true in spurious_tags:

In particular, I have a fair amount of data that's really unbalanced, so assigning every spurious prediction to every class completely washes out the precision metrics for rare classes. For example, if I end up with 100 spurious tags out of 2000 true entities, and one of my classes has only 20 examples, the precision for that class is now computed over a denominator of 120, regardless of which predicted classes made up the 100 spurious tags.
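To put numbers on that, here is a small sketch of the arithmetic. It assumes the best case where all 20 rare-class entities are predicted correctly, and it uses a made-up figure of 2 spurious tags actually predicted as the rare class; neither number comes from real data, and `precision` is just a helper for this illustration.

```python
def precision(correct: int, spurious: int) -> float:
    """Precision when the only errors are spurious predictions."""
    predicted = correct + spurious
    return correct / predicted if predicted else 0.0


rare_correct = 20      # assumed: every rare-class entity is found
total_spurious = 100   # spurious tags across all predicted classes
rare_spurious = 2      # hypothetical: spurious tags predicted as the rare class

# Current behaviour: all 100 spurious tags count against the rare class.
all_types = precision(rare_correct, total_spurious)

# Alternative: only spurious tags predicted as the rare class count against it.
predicted_only = precision(rare_correct, rare_spurious)

print(f"{all_types:.3f} {predicted_only:.3f}")  # 0.167 0.909
```

Even with perfect recall on the rare class, its precision can't exceed 20/120 under the current attribution.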

Shouldn't the spurious tag just be a false positive for the predicted class, and a false negative for the "outside" class (which we kind of don't care about)? I could maybe be convinced otherwise, but wanted to suggest the change, because this is how I'm currently using this code privately.
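Roughly what I mean, as a minimal sketch rather than the library's actual code: `count_spurious` and its arguments are hypothetical names, and I'm representing each spurious tag by just its predicted type. The false negative for the "outside" class is simply not tracked.

```python
def count_spurious(spurious_types, all_types, per_predicted_type=True):
    """Count spurious predictions per entity type.

    spurious_types: predicted type of each spurious tag (hypothetical input shape).
    per_predicted_type=True charges each spurious tag only to its predicted type;
    False charges every spurious tag to every type.
    """
    counts = {t: 0 for t in all_types}
    for pred_type in spurious_types:
        if per_predicted_type:
            counts[pred_type] += 1
        else:
            for t in all_types:
                counts[t] += 1
    return counts


types = ["PER", "LOC", "DRUG"]
spurious = ["PER", "PER", "LOC"]

proposed = count_spurious(spurious, types)
every_type = count_spurious(spurious, types, per_predicted_type=False)
print(proposed)    # {'PER': 2, 'LOC': 1, 'DRUG': 0}
print(every_type)  # {'PER': 3, 'LOC': 3, 'DRUG': 3}
```

With the per-predicted-type counts, a class that never receives a spurious prediction (here `DRUG`) keeps its precision untouched.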

Does it really make sense to attribute spurious tags to all types? #66
