I'm getting wildly different ranking/scoring between SQL and Lucene given the following query:
[pseudo code] (statut* within 3 of interpret*) AND contradict
I don't think this is an issue with the parser, because all of the results seem to conform to the query requirements. However, in the top 1000 results from each engine, only 172 are common to both. Since the results from both Lucene and SQL seem to conform to the query requirements, my only remaining guess is that the scoring is somehow radically different. I've had trouble finding any information on how SQL handles scoring, or on how SQL and Lucene scoring compare. I'm not necessarily expecting the same result set from the two engines, but I was expecting more than 10% similarity, and I need to at least be able to explain the huge discrepancy.
How can I explain this significant discrepancy?
What makes you think that two different products should use a similar ranking/scoring algorithm? Just use whichever you like. Or develop your own scoring algorithm and use it in Lucene.Net (I'm not saying it would be easy).
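If you need to document the discrepancy, it may help to measure the overlap at several cutoffs rather than only at 1000; 172 shared results out of 1000 is an overlap of 0.172. Here is a minimal Python sketch of that measurement. The function names and the document IDs are made up for illustration, and it assumes you can export each engine's ranked list of document IDs, best match first.

```python
def overlap_at_k(ranked_a, ranked_b, k):
    """Fraction of the top-k slots whose document IDs appear in both lists.

    ranked_a and ranked_b are lists of document IDs, best match first.
    The result is divided by k, so lists shorter than k count as misses.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_a = set(ranked_a[:k])
    top_b = set(ranked_b[:k])
    return len(top_a & top_b) / k


def overlap_profile(ranked_a, ranked_b, cutoffs=(10, 100, 1000)):
    """Overlap at each cutoff, to see whether the engines agree near the top."""
    return {k: overlap_at_k(ranked_a, ranked_b, k) for k in cutoffs}


if __name__ == "__main__":
    # Hypothetical IDs standing in for the two exported result lists.
    sql_ids = ["d1", "d2", "d3", "d4", "d5", "d6"]
    lucene_ids = ["d3", "d1", "d9", "d8", "d2", "d7"]
    print(overlap_profile(sql_ids, lucene_ids, cutoffs=(2, 4, 6)))
```

If the overlap is high at small cutoffs and only falls off deeper in the list, the two engines broadly agree on the best matches and differ mainly in how they order the long tail. If it is low even at the top, the scoring functions really are weighting the terms very differently.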