I have a text field that contains titles of TV series or movies. In several cases I want to run a phrase query against what I'd call a pretty normal text field. This works fine for most phrases, but in some reproducible cases the query simply returns nothing. It seems to be related to certain "special" characters, though not all of the special characters I'd expect are affected.
Title:("Mission: Impossible") works
Title:("Disney A.N.T.") doesn't work
Title:("Stephen King's Shining") doesn't work
Title:("Irgendwie L. A.") works
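For anyone who wants to reproduce this, the four queries can be sent with `debugQuery=true` so the parsed phrase query shows up in the response. This is only a minimal sketch: the host, port and core name `titles` are placeholders and not my real setup, and the script only builds the request URLs rather than sending them.

```python
from urllib.parse import urlencode

# Assumed endpoint: host, port and core name are placeholders, adjust as needed.
SOLR_SELECT_URL = "http://localhost:8983/solr/titles/select"

# The four phrases from the list above.
PHRASES = [
    "Mission: Impossible",
    "Disney A.N.T.",
    "Stephen King's Shining",
    "Irgendwie L. A.",
]


def build_query_url(phrase: str) -> str:
    """Build a select URL that runs a phrase query on the Title field."""
    params = {
        "q": f'Title:("{phrase}")',
        "rows": 0,             # only the hit count matters here
        "wt": "json",
        "debugQuery": "true",  # include the parsed query in the response
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


if __name__ == "__main__":
    for phrase in PHRASES:
        print(build_query_url(phrase))
```

Comparing the parsed query in the debug output for a working and a failing phrase might show where the two differ.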
After trying several other titles, I assume it is somehow related to the dot (.) and the apostrophe ('), and maybe other characters I haven't found yet. I have no idea where to look now.
Relevant part of schema.xml:
<fieldType name="title" class="solr.TextField" sortMissingLast="true" positionIncrementGap="100" autoGeneratePhraseQueries="false">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            preserveOriginal="1"
            splitOnCaseChange="0"
            splitOnNumerics="0"
            stemEnglishPossessive="0"
            generateWordParts="1"
            generateNumberParts="0"
            catenateWords="1"
            catenateNumbers="0"
            catenateAll="0" />
    <filter class="solr.TrimFilterFactory" />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
    <filter class="solr.TrimFilterFactory" />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>