Remove usage of IndexSearcher#search(Query, Collector) from join package by msfroh · Pull Request #13747 · apache/lucene

msfroh · 2024-09-09T23:54:51Z

Description

Relates to #12892

For global ordinal-based join, we can support concurrent search. For numeric and term-based joins, we fail if we're called from a multithreaded searcher.

I can implement concurrent versions of the other join collectors, but wanted to get a first pass that removes the uses of IndexSearcher#search(Query, Collector).

Note that for the cases that still assume a single-threaded searcher, I used a version of CollectorManager#wrap(Collector) from #13735, with my guess for where it will end up based on feedback so far.
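
For readers unfamiliar with the single-threaded fallback described above, here is a minimal sketch of the idea, not the code from this PR or from #13735. It wraps an existing collector in a manager that hands it out once and fails on any further request, which is what happens when a searcher tries to use more than one slice. `SketchCollectorManager` and `wrap` are hypothetical stand-ins defined locally so the example runs without Lucene on the classpath; the real `CollectorManager` methods also declare `IOException`.

```java
import java.util.Collection;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

final class SingleThreadedWrap {
  private SingleThreadedWrap() {}

  // Hypothetical, simplified stand-in for Lucene's CollectorManager<C, T>.
  interface SketchCollectorManager<C, T> {
    C newCollector();

    T reduce(Collection<C> collectors);
  }

  // Wraps one collector; a second newCollector() call means the searcher
  // wants a collector per slice, which this collector cannot support.
  static <C> SketchCollectorManager<C, C> wrap(C collector) {
    AtomicBoolean handedOut = new AtomicBoolean(false);
    return new SketchCollectorManager<C, C>() {
      @Override
      public C newCollector() {
        if (!handedOut.compareAndSet(false, true)) {
          throw new IllegalStateException(
              "This collector does not support concurrent search");
        }
        return collector;
      }

      @Override
      public C reduce(Collection<C> collectors) {
        // Only one collector was ever handed out, so there is nothing to merge.
        return collector;
      }
    };
  }

  public static void main(String[] args) {
    StringBuilder collector = new StringBuilder();
    SketchCollectorManager<StringBuilder, StringBuilder> manager = wrap(collector);
    manager.newCollector().append("hit");
    System.out.println(manager.reduce(List.of(collector)));
    try {
      manager.newCollector(); // simulates a second search slice
    } catch (IllegalStateException e) {
      System.out.println("second slice rejected: " + e.getMessage());
    }
  }
}
```

Failing loudly on the second call, rather than silently sharing one collector across threads, keeps a non-thread-safe collector from producing corrupted results.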

javanna

I think it's a good approach. Let's prioritize removing those leftover usages and add the missing collector managers as a next step.

lucene/join/src/java/org/apache/lucene/search/join/JoinUtil.java

javanna · 2024-09-10T11:36:12Z

@msfroh thanks a lot for picking this up!!!

msfroh · 2024-09-11T01:25:27Z

@javanna -- I've managed to get the remaining numeric / terms collectors in the Join module working with multiple search threads.

I can add them to this PR, but the diff is pretty massive. I'm thinking of holding off for another PR, but I'm happy to go either way.

There is probably value in "atomically" jumping from the current "always single-threaded" mode straight to "everything works with a multithreaded searcher", versus this PR's current state where global ordinal-based joins work with a multithreaded searcher but numeric/term-based joins don't.

Thanks a lot for the review!

javanna

I left a couple more comments. I am fine getting this in and focusing on the missing parallel collector managers as a follow-up. Thanks for the work you put into this!

lucene/join/src/java/org/apache/lucene/search/join/MergeableCollector.java

For global ordinal-based join, we can support concurrent search. For others, we fail if we're called from a multithreaded searcher.

Since I plan to implement numeric and term collectors that support merging all collectors into a single collector, it makes sense to move MergeableCollectorManager into its own top-level class.
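
The merging idea in this commit message can be sketched as follows. This is an illustration under assumptions, not the PR's `MergeableCollectorManager`: the `Mergeable` interface, `TermCollector`, and `reduce` helper are hypothetical names, and plain strings stand in for the join terms a real collector would gather. Each search slice fills its own collector, and the reduce step folds every other collector into the first one.

```java
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class MergeableSketch {
  private MergeableSketch() {}

  // Hypothetical contract: a collector that can absorb another of its kind.
  interface Mergeable<C> {
    void merge(C other);
  }

  // Hypothetical per-slice collector that gathers the join terms it has seen.
  static final class TermCollector implements Mergeable<TermCollector> {
    final Set<String> terms = new TreeSet<>();

    void collect(String term) {
      terms.add(term);
    }

    @Override
    public void merge(TermCollector other) {
      terms.addAll(other.terms);
    }
  }

  // Folds all per-slice collectors into the first one and returns it.
  static <C extends Mergeable<C>> C reduce(Collection<C> collectors) {
    Iterator<C> it = collectors.iterator();
    if (!it.hasNext()) {
      throw new IllegalArgumentException("no collectors to merge");
    }
    C merged = it.next();
    while (it.hasNext()) {
      merged.merge(it.next());
    }
    return merged;
  }

  public static void main(String[] args) {
    TermCollector sliceA = new TermCollector();
    sliceA.collect("x");
    sliceA.collect("y");
    TermCollector sliceB = new TermCollector();
    sliceB.collect("y");
    sliceB.collect("z");
    System.out.println(reduce(List.of(sliceA, sliceB)).terms);
  }
}
```

Merging runs once, after all slices finish, so the per-slice collectors need no synchronization while they collect.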

This change removes MergeableCollector (and MergeableCollectorManager), wraps all of the custom Collector classes in their own CollectorManager, and removes all remaining occurrences of CollectorManager.forSequentialExecution from the tests. This also adds all of the other join collectors, bringing the join module fully into the CollectorManager club.

msfroh · 2024-09-20T02:14:58Z

Okay -- I wrapped all of the Collectors in CollectorManagers, and managed to remove all uses of CollectorManager.forSequentialExecution. I also went ahead and added the remaining Collectors to this PR.

github-actions · 2024-10-05T00:22:04Z

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

msfroh · 2025-01-10T07:54:52Z

@javanna -- obviously we missed the Lucene 10 cutoff for this, but it doesn't break any public APIs. Do you think it makes sense to merge this as an incremental improvement?

javanna · 2025-01-20T09:58:10Z

Hey @msfroh, I need to review this again and regain context on it. I will try to get to it this week.

github-actions · 2025-02-04T00:22:39Z

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

github-actions · 2025-10-16T00:29:01Z

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

msfroh mentioned this pull request Sep 9, 2024

Remove all deprecated IndexSearcher#search(Query, Collector) usage / methods in the next major release #12892

Open

13 tasks

javanna reviewed Sep 10, 2024

lucene/join/src/java/org/apache/lucene/search/join/JoinUtil.java (outdated)

msfroh force-pushed the join_collectormanager branch from 048eb29 to 66457ff on September 10, 2024 23:09

javanna reviewed Sep 12, 2024

lucene/join/src/java/org/apache/lucene/search/join/MergeableCollector.java

msfroh added 3 commits September 19, 2024 18:53

Remove usage of IndexSearcher#search(Query, Collector) from join package

148f5cc

For global ordinal-based join, we can support concurrent search. For others, we fail if we're called from a multithreaded searcher.

Reuse pairwise merging collector logic

483372c

Since I plan to implement numeric and term collectors that support merging all collectors into a single collector, it makes sense to move MergeableCollectorManager into its own top-level class.

msfroh force-pushed the join_collectormanager branch from 66457ff to 7ff4804 on September 20, 2024 02:11

github-actions bot added the Stale label Oct 5, 2024

github-actions bot removed the Stale label Jan 11, 2025

github-actions bot added the Stale label Feb 4, 2025

github-actions bot removed the Stale label Oct 1, 2025

github-actions bot added the Stale label Oct 16, 2025

msfroh wants to merge 3 commits into apache:main from msfroh:join_collectormanager
