Build threshold sweep and cluster analysis on stored distances, and add threshold stability UI #1024

@mossy426-cdc

Description

Problem

MT already stores the pairwise distance data it needs. The performance problem is that whenever a user changes the distance cutoff, MT rebuilds clusters and summary statistics from scratch, repeating work it has already done. There is also no user-facing way to scan automatically across many cutoff values and see where the clustering pattern stays stable.

Scope

  • Build a sorted stored-distance edge cache by active metric.
  • Implement a union-find based threshold sweep summary over stored edges.
  • Replace the current visible cluster-tagging path with a summary built from stored links instead of nested node scans.
  • Cache threshold sweep summaries by analysis version and metric.
  • Update the threshold sparkline/histogram to reuse cached stored-distance analysis.
  • Add a filtering-panel UI that shows current threshold metrics, stable threshold regions, and one-click threshold application.
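
The first two scope items can be sketched roughly as follows. This is a minimal illustration, not MT's actual code: the `Edge` and `SweepPoint` shapes, the `UnionFind` class, and `sweepThresholds` are hypothetical names. It also assumes that nodes are indexed `0..nodeCount-1`, that an edge joins two nodes when its distance is less than or equal to the threshold, and that singletons count as clusters. Edges are sorted once, so each threshold only adds the edges that newly fall under it instead of rebuilding clusters from scratch.

```typescript
// Hypothetical shapes; MT's real node and link types are not shown in this ticket.
interface Edge { source: number; target: number; distance: number; }
interface SweepPoint {
  threshold: number;
  clusterCount: number;    // assumption: singletons are counted as clusters
  singletonCount: number;
  largestCluster: number;
}

class UnionFind {
  private parent: number[];
  private size: number[];
  components: number;
  singletons: number;
  largest: number;

  constructor(n: number) {
    this.parent = Array.from({ length: n }, (_, i) => i);
    this.size = new Array<number>(n).fill(1);
    this.components = n;
    this.singletons = n;
    this.largest = n > 0 ? 1 : 0;
  }

  find(x: number): number {
    while (this.parent[x] !== x) {
      this.parent[x] = this.parent[this.parent[x]]; // path halving
      x = this.parent[x];
    }
    return x;
  }

  union(a: number, b: number): void {
    let ra = this.find(a);
    let rb = this.find(b);
    if (ra === rb) return;                           // already in the same cluster
    if (this.size[ra] < this.size[rb]) [ra, rb] = [rb, ra];
    if (this.size[ra] === 1) this.singletons--;      // each merged singleton stops being one
    if (this.size[rb] === 1) this.singletons--;
    this.parent[rb] = ra;                            // attach smaller root under larger
    this.size[ra] += this.size[rb];
    this.components--;
    this.largest = Math.max(this.largest, this.size[ra]);
  }
}

// One pass over sorted edges yields summary metrics for every threshold.
function sweepThresholds(nodeCount: number, edges: Edge[], thresholds: number[]): SweepPoint[] {
  const sorted = [...edges].sort((a, b) => a.distance - b.distance);
  const cuts = [...thresholds].sort((a, b) => a - b);
  const uf = new UnionFind(nodeCount);
  const out: SweepPoint[] = [];
  let i = 0;
  for (const t of cuts) {
    // Assumption: an edge is active when distance <= threshold.
    while (i < sorted.length && sorted[i].distance <= t) {
      uf.union(sorted[i].source, sorted[i].target);
      i++;
    }
    out.push({
      threshold: t,
      clusterCount: uf.components,
      singletonCount: uf.singletons,
      largestCluster: uf.largest,
    });
  }
  return out;
}
```

Because the structure only ever merges, the whole sweep costs one sort plus near-linear union work, regardless of how many thresholds are sampled. In a real implementation the sorted edge list would presumably be the per-metric cache described above, so the sort would not be repeated on each call.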

Notes

  • Stable regions are heuristic guidance only. They indicate ranges where cluster count stays flat as threshold increases.
  • This ticket is about fast reuse of stored distances plus a threshold-exploration UI, not automatic scientific threshold selection.
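
One way to read "cluster count stays flat" is as a maximal run of consecutive sweep samples that share the same cluster count. The sketch below illustrates that reading only. `SweepSample`, `StableRegion`, `findStableRegions`, and the `minSamples` cutoff are hypothetical names and parameters, and the input is assumed to be sorted by ascending threshold.

```typescript
// Hypothetical shapes for illustration; not MT's actual types.
interface SweepSample { threshold: number; clusterCount: number; }
interface StableRegion {
  start: number;        // first threshold in the flat run
  end: number;          // last threshold in the flat run
  clusterCount: number;
  samples: number;      // number of sweep samples in the run
}

// Returns maximal runs of consecutive samples whose cluster count is unchanged.
// Assumes `sweep` is sorted by ascending threshold.
function findStableRegions(sweep: SweepSample[], minSamples = 3): StableRegion[] {
  const regions: StableRegion[] = [];
  let runStart = 0;
  for (let i = 1; i <= sweep.length; i++) {
    const runEnded =
      i === sweep.length || sweep[i].clusterCount !== sweep[runStart].clusterCount;
    if (!runEnded) continue;
    const samples = i - runStart;
    if (samples >= minSamples) {
      regions.push({
        start: sweep[runStart].threshold,
        end: sweep[i - 1].threshold,
        clusterCount: sweep[runStart].clusterCount,
        samples,
      });
    }
    runStart = i; // next run begins at the sample that broke the current one
  }
  return regions;
}
```

Which threshold within a region the UI offers for one-click application (for example the start, midpoint, or end) is a design choice this ticket does not specify. The result depends on how densely the sweep samples thresholds, which is one reason the regions are guidance rather than a selection rule.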

Acceptance Criteria

  • Threshold sweep metrics are derived from stored distances and reused across UI interactions.
  • Cluster tagging no longer depends on the previous nested node-scan path.
  • Threshold stability data is visible in the filtering UI.
  • Users can apply a suggested threshold directly from the UI.
  • Cluster count, singleton count, and largest-cluster behavior remain consistent on benchmark flows.
  • Angular build and targeted Cypress coverage pass.
