com.ibm.text
Class StringSearch

java.lang.Object
  |
  +--com.ibm.text.SearchIterator
        |
        +--com.ibm.text.StringSearch

public final class StringSearch
extends SearchIterator

StringSearch is a SearchIterator that provides language-sensitive text searching based on the comparison rules defined in a RuleBasedCollator object. Instances of StringSearch function as iterators: they maintain a current position and scan over text, returning the index of the characters where the pattern occurs and the length of each match.

StringSearch uses a version of the fast Boyer-Moore search algorithm that has been adapted to work with the large character set of Unicode. See "Efficient Text Searching in Java", to be published in Java Report in February, 1999, for further information on the algorithm.
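
The following is a minimal sketch of one way a Boyer-Moore style shift table can be kept small for a large character set: characters are hashed into a fixed-size table, and colliding entries keep the smaller (always safe) shift. It uses the simplified Horspool variant on plain char values rather than collation elements, and the class name HashedHorspool, the table size of 256, and the hash function are all assumptions made for illustration. It is not the actual StringSearch implementation.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of com.ibm.text.
class HashedHorspool {
    // Assumed table size; far smaller than the 65,536 possible char values.
    private static final int TABLE_SIZE = 256;

    // Assumed hash: keep the low 8 bits of the character.
    static int hash(char c) {
        return c & 0xFF;
    }

    // shift[h] = how far the window may safely move when the target character
    // aligned with the last pattern position hashes to h.
    static int[] buildShiftTable(String pattern) {
        int m = pattern.length();
        int[] shift = new int[TABLE_SIZE];
        Arrays.fill(shift, m);
        for (int i = 0; i < m - 1; i++) {
            int h = hash(pattern.charAt(i));
            // On a hash collision keep the smaller shift, which never skips a match.
            shift[h] = Math.min(shift[h], m - 1 - i);
        }
        return shift;
    }

    // Returns the index of the first occurrence at or after start, or -1.
    static int indexOf(String text, String pattern, int start) {
        int m = pattern.length();
        if (m == 0) {
            return start <= text.length() ? start : -1;
        }
        int[] shift = buildShiftTable(pattern);
        int pos = start;
        while (pos + m <= text.length()) {
            if (text.regionMatches(pos, pattern, 0, m)) {
                return pos;
            }
            pos += shift[hash(text.charAt(pos + m - 1))];
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("hello world", "world", 0));
    }
}
```

Because the table depends only on the pattern (and, in a collation-aware search, on the collator), it has to be rebuilt whenever either of them changes.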

Consult the SearchIterator documentation for information on and examples of how to use instances of this class to implement text searching. SearchIterator provides all of the necessary API; this class only provides constructors and internal implementation methods.

Version:
1.0
Author:
Laura Werner
See Also:
SearchIterator, RuleBasedCollator

Fields inherited from class com.ibm.text.SearchIterator
DONE
 
Constructor Summary
StringSearch(java.lang.String pattern, java.text.CharacterIterator target, java.util.Locale loc)
          Construct a StringSearch object using the collator and character boundary detection rules for a given locale.
StringSearch(java.lang.String pattern, java.text.CharacterIterator target, java.text.RuleBasedCollator collator)
          Construct a StringSearch object using a specific collator.
StringSearch(java.lang.String pat, java.text.CharacterIterator target, java.text.RuleBasedCollator coll, java.text.BreakIterator breaker)
          Construct a StringSearch object using a specific collator and set of boundary-detection rules.
StringSearch(java.lang.String pattern, java.lang.String target)
          Construct a StringSearch object using the collator for the default locale.
 
Method Summary
 java.text.RuleBasedCollator getCollator()
          Return the RuleBasedCollator being used for this string search.
 java.lang.String getPattern()
          Returns the pattern for which this object is searching.
 int getStrength()
          Returns this object's strength property, which indicates the level of difference considered significant during a search.
protected  int handleNext(int start)
          Search forward for matching text, starting at a given location.
protected  int handlePrev(int start)
          Search backward for matching text, starting at a given location.
 void setCollator(java.text.RuleBasedCollator coll)
          Set the collator to be used for this string search.
 void setPattern(java.lang.String pat)
          Set the pattern for which to search.
 void setStrength(int newStrength)
          Sets this object's strength property.
 void setTarget(java.text.CharacterIterator target)
          Set the target text which should be searched and reset the iterator's position to point before the start of the new text.
 
Methods inherited from class com.ibm.text.SearchIterator
first, following, getBreakIterator, getIndex, getMatchedText, getMatchLength, getTarget, isOverlapping, last, next, preceding, previous, setBreakIterator, setMatchLength, setOverlapping
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

StringSearch

public StringSearch(java.lang.String pat,
                    java.text.CharacterIterator target,
                    java.text.RuleBasedCollator coll,
                    java.text.BreakIterator breaker)
Construct a StringSearch object using a specific collator and set of boundary-detection rules.

Parameters:
pat - The text for which this object will search.
target - The text in which to search for the pattern.
coll - A RuleBasedCollator object which defines the language-sensitive comparison rules used to determine whether text in the pattern and target matches.
breaker - A BreakIterator object used to constrain the matches that are found. Matches whose start and end indices in the target text are not boundaries as determined by the BreakIterator are ignored. If this behavior is not desired, null can be passed in instead.
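
As a rough illustration of how a BreakIterator can constrain matches, the sketch below keeps a candidate match only when both its start and end indices are boundaries, and accepts every match when null is passed. It uses plain String.indexOf in place of collation-based matching, and the class and method names are hypothetical; it only mirrors the filtering rule described for the breaker parameter.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of com.ibm.text.
class BoundaryFilterDemo {
    // Returns the start index of every occurrence of pattern in target whose
    // start and end both fall on boundaries reported by breaker.
    // A null breaker disables the boundary check.
    static List<Integer> findMatches(String target, String pattern, BreakIterator breaker) {
        List<Integer> starts = new ArrayList<>();
        if (pattern.isEmpty()) {
            return starts;
        }
        if (breaker != null) {
            breaker.setText(target);
        }
        int from = 0;
        int start;
        while ((start = target.indexOf(pattern, from)) >= 0) {
            int end = start + pattern.length();
            if (breaker == null || (breaker.isBoundary(start) && breaker.isBoundary(end))) {
                starts.add(start);
            }
            from = start + 1;
        }
        return starts;
    }

    public static void main(String[] args) {
        String target = "concatenate a cat";
        // No breaker: the occurrence inside "concatenate" is reported too.
        System.out.println(findMatches(target, "cat", null));
        // Word breaker: only the stand-alone word "cat" survives.
        System.out.println(findMatches(target, "cat", BreakIterator.getWordInstance(Locale.US)));
    }
}
```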

StringSearch

public StringSearch(java.lang.String pattern,
                    java.text.CharacterIterator target,
                    java.text.RuleBasedCollator collator)
Construct a StringSearch object using a specific collator.

Parameters:
pattern - The text for which this object will search.
target - The text in which to search for the pattern.
collator - A RuleBasedCollator object which defines the language-sensitive comparison rules used to determine whether text in the pattern and target matches.

StringSearch

public StringSearch(java.lang.String pattern,
                    java.text.CharacterIterator target,
                    java.util.Locale loc)
Construct a StringSearch object using the collator and character boundary detection rules for a given locale.

Parameters:
pattern - The text for which this object will search.
target - The text in which to search for the pattern.
loc - The locale whose collation and break-detection rules should be used.
Throws:
ClassCastException - thrown if the collator for the specified locale is not a RuleBasedCollator.
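
Callers who want to avoid the ClassCastException can check the locale's collator themselves and then use a constructor that takes a RuleBasedCollator. The sketch below shows such a check using only java.text classes; the class and helper names are hypothetical, and returning null for a non-rule-based collator is just one possible convention.

```java
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Locale;

// Hypothetical illustration only; not part of com.ibm.text.
class LocaleCollatorDemo {
    // Returns the locale's collator if it is rule-based, otherwise null,
    // so the caller can decide how to fall back instead of catching an exception.
    static RuleBasedCollator ruleBasedCollatorOrNull(Locale loc) {
        Collator collator = Collator.getInstance(loc);
        return (collator instanceof RuleBasedCollator) ? (RuleBasedCollator) collator : null;
    }

    public static void main(String[] args) {
        RuleBasedCollator collator = ruleBasedCollatorOrNull(Locale.US);
        System.out.println(collator != null
                ? "Rule-based collator available"
                : "Collator for this locale is not rule-based");
    }
}
```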

StringSearch

public StringSearch(java.lang.String pattern,
                    java.lang.String target)
Construct a StringSearch object using the collator for the default locale.

Parameters:
pattern - The text for which this object will search.
target - The text in which to search for the pattern.

Method Detail

setStrength

public void setStrength(int newStrength)
Sets this object's strength property. The strength determines the minimum level of difference considered significant during a search. Generally, Collator.TERTIARY and Collator.IDENTICAL indicate that all differences are considered significant, Collator.SECONDARY indicates that upper/lower case distinctions should be ignored, and Collator.PRIMARY indicates that both case and accents should be ignored. However, the exact meanings of these constants are determined by individual Collator objects.

See Also:
Collator.PRIMARY, Collator.SECONDARY, Collator.TERTIARY, Collator.IDENTICAL
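
The effect of the strength levels can be seen with java.text.Collator alone. The sketch below compares whole strings rather than searching, and assumes the default US English collator with canonical decomposition enabled; the class and method names are hypothetical. Other collators may assign different meanings to the levels, as noted above.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical illustration only; not part of com.ibm.text.
class StrengthDemo {
    // True if the two strings compare as equal at the given strength.
    static boolean matches(String a, String b, int strength) {
        Collator collator = Collator.getInstance(Locale.US);
        // Decompose accented characters so base letter and accent are weighed separately.
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        collator.setStrength(strength);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        String plain = "resume";
        String upper = "RESUME";
        String accented = "r\u00e9sum\u00e9";

        // Case differences: ignored at PRIMARY and SECONDARY, significant at TERTIARY.
        System.out.println(matches(plain, upper, Collator.PRIMARY));
        System.out.println(matches(plain, upper, Collator.SECONDARY));
        System.out.println(matches(plain, upper, Collator.TERTIARY));

        // Accent differences: ignored at PRIMARY, significant at SECONDARY.
        System.out.println(matches(plain, accented, Collator.PRIMARY));
        System.out.println(matches(plain, accented, Collator.SECONDARY));
    }
}
```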

getStrength

public int getStrength()
Returns this object's strength property, which indicates the level of difference considered significant during a search.

See Also:
setStrength(int)

setCollator

public void setCollator(java.text.RuleBasedCollator coll)
Set the collator to be used for this string search. Also changes the search strength to match that of the new collator.

This method causes internal data such as Boyer-Moore shift tables to be recalculated, but the iterator's position is unchanged.

See Also:
getCollator()

getCollator

public java.text.RuleBasedCollator getCollator()
Return the RuleBasedCollator being used for this string search.

setPattern

public void setPattern(java.lang.String pat)
Set the pattern for which to search. This method causes internal data such as Boyer-Moore shift tables to be recalculated, but the iterator's position is unchanged.

getPattern

public java.lang.String getPattern()
Returns the pattern for which this object is searching.

setTarget

public void setTarget(java.text.CharacterIterator target)
Set the target text which should be searched and reset the iterator's position to point before the start of the new text. This method is useful if you want to re-use an iterator to search for the same pattern within a different body of text.
Overrides:
setTarget in class SearchIterator
Following copied from class: com.ibm.text.SearchIterator
See Also:
SearchIterator.getTarget()

handleNext

protected int handleNext(int start)
Search forward for matching text, starting at a given location. Clients should not call this method directly; instead they should call SearchIterator.next().

If a match is found, this method returns the index at which the match starts and calls SearchIterator.setMatchLength(int) with the number of characters in the target text that make up the match. If no match is found, the method returns DONE and does not call setMatchLength.

Overrides:
handleNext in class SearchIterator
Parameters:
start - The index in the target text at which the search starts.
Returns:
The index at which the matched text in the target starts, or DONE if no match was found.

See Also:
SearchIterator.next(), SearchIterator.DONE
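
The division of labor between next() and handleNext(int) can be sketched with a pair of stand-in classes. MiniSearchIterator and PlainSearch below are hypothetical and much simpler than the real SearchIterator and StringSearch: DONE is assumed to be -1, matching is plain String.indexOf, and overlapping matches are not supported. The sketch only mirrors the contract described above, in which the subclass returns the match start and reports the match length, or returns DONE without touching the length.

```java
// Hypothetical stand-in for the base class; not the real SearchIterator.
abstract class MiniSearchIterator {
    static final int DONE = -1; // assumed sentinel value for this sketch

    protected final String target;
    private int nextStart = 0;
    private int matchLength = 0;

    MiniSearchIterator(String target) {
        this.target = target;
    }

    // Subclasses search forward from start and follow the contract above.
    protected abstract int handleNext(int start);

    protected void setMatchLength(int length) {
        matchLength = length;
    }

    int getMatchLength() {
        return matchLength;
    }

    // Public entry point: delegates the search and advances past the match.
    int next() {
        if (nextStart > target.length()) {
            return DONE;
        }
        int found = handleNext(nextStart);
        if (found == DONE) {
            matchLength = 0;
            nextStart = target.length() + 1; // stay exhausted on later calls
            return DONE;
        }
        nextStart = found + Math.max(1, matchLength);
        return found;
    }
}

// Hypothetical subclass using exact matching instead of collation.
class PlainSearch extends MiniSearchIterator {
    private final String pattern;

    PlainSearch(String pattern, String target) {
        super(target);
        this.pattern = pattern;
    }

    @Override
    protected int handleNext(int start) {
        int index = target.indexOf(pattern, start);
        if (index < 0) {
            return DONE; // no match: the match length is not set
        }
        setMatchLength(pattern.length());
        return index;
    }

    public static void main(String[] args) {
        PlainSearch search = new PlainSearch("bc", "abcabc");
        for (int i = search.next(); i != DONE; i = search.next()) {
            System.out.println("match at " + i + ", length " + search.getMatchLength());
        }
    }
}
```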

handlePrev

protected int handlePrev(int start)
Search backward for matching text, starting at a given location. Clients should not call this method directly; instead they should call SearchIterator.previous().

If a match is found, this method returns the index at which the match starts and calls SearchIterator.setMatchLength(int) with the number of characters in the target text that make up the match. If no match is found, the method returns DONE and does not call setMatchLength.

Overrides:
handlePrev in class SearchIterator
Parameters:
start - The index in the target text at which the search starts.
Returns:
The index at which the matched text in the target starts, or DONE if no match was found.

See Also:
SearchIterator.previous(), SearchIterator.DONE


Copyright (c) 2001 IBM Corporation and others.