Class Summary |
ArabicLetterTokenizerFactory |
|
ArabicNormalizationFilterFactory |
|
ArabicStemFilterFactory |
|
ASCIIFoldingFilterFactory |
|
BaseCharFilterFactory |
|
BaseTokenFilterFactory |
Simple abstract implementation that handles init arg processing. |
BaseTokenizerFactory |
Simple abstract implementation that handles init arg processing. |
BrazilianStemFilterFactory |
|
BufferedTokenStream |
Handles input and output buffering of a TokenStream. |
CapitalizationFilterFactory |
A filter to apply normal capitalization rules to Tokens. |
ChineseFilterFactory |
|
ChineseTokenizerFactory |
|
CJKTokenizerFactory |
|
CommonGramsFilter |
Construct bigrams for frequently occurring terms while indexing. |
CommonGramsFilterFactory |
Constructs a CommonGramsFilter |
CommonGramsQueryFilter |
Wraps a CommonGramsFilter, optimizing phrase queries by returning single
words only when they are not a member of a bigram. |
CommonGramsQueryFilterFactory |
Constructs a CommonGramsQueryFilter.
This is pretty close to a straight copy from StopFilterFactory. |
DelimitedPayloadTokenFilterFactory |
|
DictionaryCompoundWordTokenFilterFactory |
|
DoubleMetaphoneFilter |
|
DoubleMetaphoneFilterFactory |
|
DutchStemFilterFactory |
|
EdgeNGramFilterFactory |
Creates new instances of EdgeNGramTokenFilter. |
EdgeNGramTokenizerFactory |
Creates new instances of EdgeNGramTokenizer. |
ElisionFilterFactory |
|
EnglishPorterFilterFactory |
Deprecated. Use SnowballPorterFilterFactory with language="English" instead |
FrenchStemFilterFactory |
|
GermanStemFilterFactory |
|
GreekLowerCaseFilterFactory |
|
HTMLStripCharFilter |
A CharFilter that wraps another Reader and attempts to strip out HTML constructs. |
HTMLStripCharFilterFactory |
|
HTMLStripReader |
Deprecated. Use HTMLStripCharFilter |
HTMLStripStandardTokenizerFactory |
Deprecated. Use HTMLStripCharFilterFactory and StandardTokenizerFactory |
HTMLStripWhitespaceTokenizerFactory |
Deprecated. Use HTMLStripCharFilterFactory and WhitespaceTokenizerFactory |
HyphenatedWordsFilter |
When the plain text is extracted from documents, we will often have many words hyphenated and broken into
two lines. |
HyphenatedWordsFilterFactory |
Factory for HyphenatedWordsFilter |
ISOLatin1AccentFilterFactory |
Factory for ISOLatin1AccentFilter |
KeepWordFilter |
A TokenFilter that only keeps tokens with text contained in the
required words. |
KeepWordFilterFactory |
|
KeywordTokenizerFactory |
|
LengthFilter |
Deprecated. Use Lucene's LengthFilter instead. |
LengthFilterFactory |
|
LetterTokenizerFactory |
|
LowerCaseFilterFactory |
|
LowerCaseTokenizerFactory |
|
MappingCharFilterFactory |
|
NGramFilterFactory |
Creates new instances of NGramTokenFilter. |
NGramTokenizerFactory |
Creates new instances of NGramTokenizer. |
NumericPayloadTokenFilterFactory |
|
PatternReplaceFilter |
A TokenFilter which applies a Pattern to each token in the stream,
replacing match occurrences with the specified replacement string. |
PatternReplaceFilterFactory |
|
PatternTokenizer |
This tokenizer uses regex pattern matching to construct distinct tokens
for the input stream. |
PatternTokenizerFactory |
This tokenizer uses regex pattern matching to construct distinct tokens
for the input stream. |
PersianNormalizationFilterFactory |
|
PhoneticFilter |
Create tokens for phonetic matches. |
PhoneticFilterFactory |
Creates tokens based on phonetic encoders; see
http://jakarta.apache.org/commons/codec/api-release/org/apache/commons/codec/language/package-summary.html
This takes two arguments:
"encoder" (required): one of "DoubleMetaphone", "Metaphone", "Soundex", "RefinedSoundex"
"inject" (default=true): add tokens to the stream with offset=0 |
PorterStemFilterFactory |
|
PositionFilterFactory |
Sets the positionIncrement of all tokens to the configured "positionIncrement" value, except the first returned token, which retains its
original positionIncrement value. |
RemoveDuplicatesTokenFilter |
A TokenFilter which filters out Tokens at the same position and Term
text as the previous token in the stream. |
RemoveDuplicatesTokenFilterFactory |
|
ReversedWildcardFilter |
This class produces a special form of reversed tokens, suitable for
better handling of leading wildcards. |
ReversedWildcardFilterFactory |
Factory for ReversedWildcardFilter instances. |
ReverseStringFilterFactory |
A FilterFactory which reverses the input. |
RussianCommon |
Deprecated. |
RussianLetterTokenizerFactory |
|
RussianLowerCaseFilterFactory |
|
RussianStemFilterFactory |
|
ShingleFilterFactory |
|
SnowballPorterFilterFactory |
Factory for SnowballFilters, with configurable language.
Browsing the code, SnowballFilter uses reflection to adapt to Lucene... |
SolrAnalyzer |
|
SolrAnalyzer.TokenStreamInfo |
|
StandardFilterFactory |
|
StandardTokenizerFactory |
|
StopFilterFactory |
|
SynonymFilter |
SynonymFilter handles multi-token synonyms with variable position increment offsets. |
SynonymFilterFactory |
|
SynonymMap |
Mapping rules for use with SynonymFilter |
ThaiWordFilterFactory |
|
TokenizerChain |
|
TokenOffsetPayloadTokenFilterFactory |
|
TrieTokenizerFactory |
Tokenizer for trie fields. |
TrimFilter |
Trims leading and trailing whitespace from Tokens in the stream. |
TrimFilterFactory |
|
TypeAsPayloadTokenFilterFactory |
|
WhitespaceTokenizerFactory |
|
WordDelimiterFilterFactory |
|
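
The CommonGramsFilter entry in the table above describes constructing bigrams for frequently occurring terms while indexing. The following is a minimal plain-Java sketch of that idea, not the Solr implementation: it works on a list of strings rather than a TokenStream, and the class name CommonGramsSketch, the method commonGrams, and the "_" separator used to join the two words are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; it does not use or mirror the Solr/Lucene API.
class CommonGramsSketch {

    // Emits every input word, and additionally emits a bigram for each adjacent
    // pair in which at least one of the two words is in the common-word set.
    static List<String> commonGrams(List<String> tokens, Set<String> commonWords) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            String current = tokens.get(i);
            out.add(current);
            if (i + 1 < tokens.size()) {
                String next = tokens.get(i + 1);
                if (commonWords.contains(current) || commonWords.contains(next)) {
                    // The "_" separator is an assumption of this sketch.
                    out.add(current + "_" + next);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "quick", "brown", "fox");
        Set<String> common = new HashSet<>(Arrays.asList("the"));
        // prints [the, the_quick, quick, brown, fox]
        System.out.println(commonGrams(tokens, common));
    }
}
```

Only the pair containing the common word "the" produces a bigram; the remaining words pass through unchanged. A query-side filter such as CommonGramsQueryFilter would then keep a single word only when it is not part of such a bigram, as its entry describes.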