@benwtrent
Member

In an effort to improve performance and continue to provide unique seeded scores for documents in the same index, we are switching the default field for seeded random scores from _id to _seq_no.

Requiring the user to supply a field that is "unique" per document just to get seeded random scores is burdensome. So, we should default to a unique field (per index) when the user provides a seed.
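For illustration, a request that relies on this default might look like the sketch below. It builds the search body in Python; the `match_all` query and the seed value `42` are placeholder assumptions, not taken from this PR. The point is that `random_score` carries only a `seed` and no `field`.

```python
import json

# Hypothetical search body: a seeded random_score with no "field" set,
# so the per-index default field is used to derive each document's score.
query_body = {
    "query": {
        "function_score": {
            "query": {"match_all": {}},       # placeholder query (assumption)
            "random_score": {"seed": 42},     # seed value is arbitrary
        }
    }
}

# Serialize to the JSON that would be sent as the request body.
print(json.dumps(query_body, indent=2))
```

A user who wants to control the source of randomness can still set `field` explicitly next to `seed`.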

Using _seq_no should be better because:

  • We don't have to load stored field values
  • The values generally take fewer bytes
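The idea of a seeded, per-document score can be sketched as follows. This is only a conceptual illustration: the function name `seeded_score` and the use of SHA-256 are assumptions made for the example, not Elasticsearch's actual hashing. It shows why a small numeric value that is unique per document, such as a sequence number, is enough to get a repeatable score for a given seed.

```python
import hashlib
import struct


def seeded_score(seed: int, seq_no: int) -> float:
    """Map (seed, seq_no) to a repeatable pseudo-random float in [0, 1).

    Illustrative only: SHA-256 stands in for whatever hash the real
    implementation uses.
    """
    # Pack both values as fixed-width 64-bit signed integers (16 bytes total).
    payload = struct.pack(">qq", seed, seq_no)
    digest = hashlib.sha256(payload).digest()
    # Keep the top 53 bits so the result is exactly representable as a float
    # and strictly less than 1.0.
    value = int.from_bytes(digest[:8], "big") >> 11
    return value / float(1 << 53)


# Same seed and sequence number always give the same score.
first = seeded_score(42, 7)
second = seeded_score(42, 7)
print(first == second)
```

Because the input is a pair of fixed-width integers, nothing has to be read from stored fields to compute the score in this sketch.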

Additionally this removes the deprecation warning.

Marking as "breaking" because it does change the scores and behavior, but the API provided is the same.

@benwtrent benwtrent added >breaking :Search Relevance/Search Catch all for Search Relevance v9.0.0 labels Dec 13, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Dec 13, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine
Collaborator

Hi @benwtrent, I've created a changelog YAML for you. Note that since this PR is labelled >breaking, you need to update the changelog YAML to fill out the extended information sections.

@benwtrent benwtrent requested a review from a team as a code owner December 13, 2024 20:18
Member

@javanna javanna left a comment

LGTM. I just reviewed the leftover deprecation logger usages, and this looked like one that could be addressed. Thanks for taking care of it.

@benwtrent benwtrent added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Dec 17, 2024
@elasticsearchmachine elasticsearchmachine merged commit a5c57ba into elastic:main Dec 17, 2024
16 checks passed
@benwtrent benwtrent deleted the breaking/default_random_score_function_seq_no branch December 17, 2024 18:31
rjernst pushed a commit to rjernst/elasticsearch that referenced this pull request Dec 18, 2024
@leemthompo
Contributor

@benwtrent is this PR relevant to the serverless changelog? [FYI this question is based on 9.0 breaking changes]

Labels

auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >breaking :Search Relevance/Search Catch all for Search Relevance Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v9.0.0

5 participants