Overlapping Ratio
Currently, the result of find_overlap is truthy as soon as the two ranges share even a single position.
nervaluate/src/nervaluate/evaluate.py
Line 330 in df0e695
    if find_overlap(true_range, pred_range) and true not in true_which_overlapped_with_pred:
nervaluate/src/nervaluate/utils.py
Lines 85 to 104 in df0e695
    def find_overlap(true_range: range, pred_range: range) -> set:
        """Find the overlap between two ranges
        Find the overlap between two ranges. Return the overlapping values if
        present, else return an empty set().
        Examples:
        >>> find_overlap((1, 2), (2, 3))
        2
        >>> find_overlap((1, 2), (3, 4))
        set()
        """
        true_set = set(true_range)
        pred_set = set(pred_range)
        overlaps = true_set.intersection(pred_set)
        return overlaps
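To illustrate the behaviour described above, here is a small sketch that copies the set-based logic of the quoted helper and applies it to two long spans that share a single position. The example ranges are made up for illustration and do not come from the nervaluate test suite.

```python
def find_overlap(true_range: range, pred_range: range) -> set:
    # Same set-intersection logic as the quoted nervaluate helper.
    return set(true_range).intersection(set(pred_range))


# Hypothetical spans: 100 and 101 positions long, sharing only position 99.
true_range = range(0, 100)
pred_range = range(99, 200)

overlaps = find_overlap(true_range, pred_range)
is_overlap = bool(overlaps)  # any non-empty set is truthy

print(overlaps, is_overlap)
```

Because the caller only checks truthiness, a one-position intersection is treated the same as a near-complete match.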
In most cases, however, it would be more useful to require a minimum overlap ratio (intersection over union) before two spans count as overlapping.
Something like this:
    pred = {'start': 10, 'end': 15}
    label = {'start': 12, 'end': 18}
    # calculate union and intersection
    union = {'start': 10, 'end': 18}
    intersection = {'start': 12, 'end': 15}
    # calculate ratio
    ratio = (15 - 12) / (18 - 10)
    return ratio > threshold

The current find_overlap uses set operations to find overlaps, which seems time-inefficient. The overlap can be obtained directly from the start and end values.
Here's my implementation:
    def find_overlap(self, true: dict[str, int | str], pred: dict[str, int | str]) -> bool:
        start_max = max(true['start'], pred['start'])
        end_min = min(true['end'], pred['end'])
        if start_max >= end_min:
            return False
        start_min = min(true['start'], pred['start'])
        end_max = max(true['end'], pred['end'])
        overlap_ratio = (end_min - start_max) / (end_max - start_min)
        return overlap_ratio > self.overlap_ratio_threshold

Last Character excluded
I wonder why we include the last token, which is very counter-intuitive. This comes from #32. Maybe @aflueckiger could explain the reasoning? Does the end offset in your data include the last character?
I think that for most data, start and end are offsets into the original text string, used as
text[start:end], which means the character at index end is excluded. Under that convention, text[1:3] and text[3:5] do not overlap at all.
nervaluate/src/nervaluate/evaluate.py
Lines 294 to 296 in df0e695
    # overlapping needs to take into account last token as well
    pred_range = range(pred["start"], pred["end"] + 1)
    true_range = range(true["start"], true["end"] + 1)
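The difference between the two conventions can be shown with a minimal sketch. The helper names `overlaps_inclusive` and `overlaps_half_open` are hypothetical and not part of nervaluate. The first mirrors the `+ 1` logic quoted above, and the second assumes that `end` is an exclusive offset, as in `text[start:end]`.

```python
def overlaps_inclusive(true: dict, pred: dict) -> bool:
    # Mirrors the quoted evaluate.py lines: "end" is treated as inclusive.
    true_range = range(true["start"], true["end"] + 1)
    pred_range = range(pred["start"], pred["end"] + 1)
    return bool(set(true_range) & set(pred_range))


def overlaps_half_open(true: dict, pred: dict) -> bool:
    # Assumes text[start:end] offsets, so "end" is exclusive.
    return max(true["start"], pred["start"]) < min(true["end"], pred["end"])


text = "abcdef"
a = {"start": 1, "end": 3}  # text[1:3] == "bc"
b = {"start": 3, "end": 5}  # text[3:5] == "de"

inclusive_result = overlaps_inclusive(a, b)  # adjacent spans share index 3
half_open_result = overlaps_half_open(a, b)  # adjacent spans share nothing

print(inclusive_result, half_open_result)
```

With slice-style offsets, the two adjacent spans cover disjoint characters, yet the inclusive check still reports an overlap because both ranges contain index 3.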
Any support for huggingface Evaluate?
Would the maintainers consider following the Hugging Face Evaluate convention, that is, inheriting from evaluate.Metric and pushing the metric to the Hugging Face Hub? Users could then load it directly with metric = evaluate.load('{hub_url}')
Example: https://huggingface.co/spaces/evaluate-metric/glue/blob/main/glue.py