Solr Query Syntax exact match

Question

I have a field type configured like this:

<fieldType name="gtext" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <!--Needed for efficient trailing wildcard queries-->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" side="front"/>
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true" maxPosAsterisk="2" maxPosQuestion="1" minTrailing="2" maxFractionAsterisk="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="1" stemEnglishPossessive="1" catenateAll="0" preserveOriginal="1"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="1" stemEnglishPossessive="1" catenateAll="0" preserveOriginal="1"/>
  </analyzer>
</fieldType>

When I search for, say, fun, it also returns funny. How can I avoid this behavior and have only fun match? Is it because of the reversed wildcard filter?

Jayendra · Accepted Answer · 2013-06-26 09:49:28Z

This is caused by the EdgeNGramFilterFactory filter:

<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" side="front"/>

EdgeNGramFilterFactory generates edge n-grams for each token at index time, e.g.

funny would generate -> f, fu, fun, funn, funny

So when you search for fun, documents containing funny also match, because the gram fun was indexed for them.

ReversedWildcardFilterFactory does not cause this issue; it only speeds up leading-wildcard queries.

For example, funny would also be indexed in reversed form, as ynnuf.

A leading-wildcard query such as *nny is then rewritten to ynn*, which performs much better.

Are you sure? The n-gram filter is supposed to make trailing wildcard queries more efficient. Should I get rid of it?
You need the n-grams for trailing wildcards and the reversed tokens for leading-wildcard queries. However, the issue you describe is caused by the n-grams, since they produce partial matches as well. You can use separate fields with and without n-grams; the field without them will not produce partial matches.
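
The two-field approach suggested above could be sketched roughly as follows. This is only a minimal sketch, not a drop-in schema: the names gtext_exact, title, and title_exact are hypothetical, and the analyzer chain is an assumption that simply leaves out EdgeNGramFilterFactory so that no partial tokens are indexed.

```xml
<!-- Hypothetical field type: same tokenizer and lowercasing, but no edge n-grams -->
<fieldType name="gtext_exact" class="solr.TextField" positionIncrementGap="100">
  <!-- A single analyzer without a type attribute is used for both index and query -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical fields: "title" keeps the n-gram analysis, "title_exact" does not -->
<field name="title" type="gtext" indexed="true" stored="true"/>
<field name="title_exact" type="gtext_exact" indexed="true" stored="false"/>

<!-- Copy the raw input of "title" into "title_exact" at index time -->
<copyField source="title" dest="title_exact"/>
```

With something like this in place, a query such as title_exact:fun should match only documents that contain the token fun, while wildcard queries can still be sent to the n-gram field. Existing documents need to be reindexed for the new field to be populated.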
