com.ibm.text
Class SearchIterator

java.lang.Object
  |
  +--com.ibm.text.SearchIterator
Direct Known Subclasses:
StringSearch

public abstract class SearchIterator
extends java.lang.Object

SearchIterator is an abstract base class that provides methods to search for a pattern within a text string. Instances of SearchIterator maintain a current position and scan over the target text, returning the indices at which the pattern is matched and the length of each match.

SearchIterator is an abstract base class that defines a protocol for text searching. Subclasses provide concrete implementations of various search algorithms. For example, StringSearch implements language-sensitive pattern matching based on the comparison rules defined in a RuleBasedCollator object.

Internally, SearchIterator scans text using a CharacterIterator, and is thus able to scan text held by any object implementing that protocol. A StringCharacterIterator is used to scan String objects passed to setText.
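
The CharacterIterator protocol mentioned above can be illustrated with the standard java.text classes alone. The following is a minimal sketch, not SearchIterator's actual scanning code; the class name CharacterIteratorScan and the method readAll are hypothetical names chosen for this illustration.

```java
import java.text.CharacterIterator;
import java.text.StringCharacterIterator;

// Hypothetical helper, for illustration only.
final class CharacterIteratorScan {

    /** Collects every character the iterator yields, from first() until DONE. */
    static String readAll(CharacterIterator it) {
        StringBuilder sb = new StringBuilder();
        for (char c = it.first(); c != CharacterIterator.DONE; c = it.next()) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "quick brown fox";

        // An iterator over the whole string.
        System.out.println(readAll(new StringCharacterIterator(text)));

        // An iterator restricted to the sub-range [6, 11) of the same string.
        System.out.println(readAll(new StringCharacterIterator(text, 6, 11, 6)));
    }
}
```

Because the scan depends only on the CharacterIterator interface, the same loop works for any text holder that implements it, not just String.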

SearchIterator provides an API that is similar to that of other text iteration classes such as BreakIterator. Using this class, it is easy to scan through text looking for all occurrences of a given pattern. The following example uses a StringSearch object to find all instances of "fox" in the target string. Any other subclass of SearchIterator can be used in an identical manner.


 String target = "The quick brown fox jumped over the lazy fox";
 String pattern = "fox";

 SearchIterator iter = new StringSearch(pattern, target);

 for (int pos = iter.first(); pos != SearchIterator.DONE; pos = iter.next()) {
     System.out.println("Found match at " + pos +
                        ", length is " + iter.getMatchLength());
 }
 

See Also:
StringSearch

Field Summary
static int DONE
          DONE is returned by previous() and next() after all valid matches have been returned, and by first() and last() if there are no matches at all.
 
Constructor Summary
protected SearchIterator(java.text.CharacterIterator target, java.text.BreakIterator breaker)
          Constructor for use by subclasses.
 
Method Summary
 int first()
          Return the first index at which the target text matches the search pattern.
 int following(int pos)
          Return the first index greater than pos at which the target text matches the search pattern.
 java.text.BreakIterator getBreakIterator()
          Returns the BreakIterator that is used to restrict the points at which matches are detected.
 int getIndex()
          Return the current index in the text being searched.
 java.lang.String getMatchedText()
          Returns the text that was matched by the most recent call to first(), next(), previous(), or last().
 int getMatchLength()
          Returns the length of text in the target which matches the search pattern.
 java.text.CharacterIterator getTarget()
          Return the target text which is being searched.
protected abstract  int handleNext(int startAt)
          Abstract method which subclasses override to provide the mechanism for finding the next match in the target text.
protected abstract  int handlePrev(int startAt)
          Abstract method which subclasses override to provide the mechanism for finding the previous match in the target text.
 boolean isOverlapping()
          Determines whether overlapping matches are returned.
 int last()
          Return the last index in the target text at which it matches the search pattern and adjust the iteration to point to that position.
 int next()
          Return the index of the next point at which the text matches the search pattern, starting from the current position.
 int preceding(int pos)
          Return the first index less than pos at which the target text matches the search pattern.
 int previous()
          Return the index of the previous point at which the text matches the search pattern, starting at the current position.
 void setBreakIterator(java.text.BreakIterator iterator)
          Set the BreakIterator that will be used to restrict the points at which matches are detected.
protected  void setMatchLength(int length)
          Sets the length of the currently matched string in the target text.
 void setOverlapping(boolean allowOverlap)
          Determines whether overlapping matches are returned.
 void setTarget(java.text.CharacterIterator iterator)
          Set the target text which should be searched and reset the iterator's position to point before the start of the target text.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DONE

public static final int DONE
DONE is returned by previous() and next() after all valid matches have been returned, and by first() and last() if there are no matches at all.
Constructor Detail

SearchIterator

protected SearchIterator(java.text.CharacterIterator target,
                         java.text.BreakIterator breaker)
Constructor for use by subclasses.

Parameters:
target - The target text to be searched. This is for internal use by this class. Subclasses need to maintain their own reference to or iterator over the target text for use by their handleNext and handlePrev methods.
breaker - A BreakIterator that is used to restrict the points at which matches are detected. If handleNext or handlePrev finds a match, but the match's start or end index is not a boundary as determined by the BreakIterator, the match is rejected and handleNext or handlePrev is called again. If this parameter is null, no break detection is attempted.
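
The boundary check described for the breaker parameter can be sketched with java.text.BreakIterator alone. This is an illustrative approximation of the filtering rule, not the class's actual implementation; BoundaryFilterSketch and findMatches are hypothetical names, and a plain String.indexOf search stands in for handleNext.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, for illustration only.
final class BoundaryFilterSketch {

    /**
     * Returns the start offsets of pattern in target. When breaker is non-null,
     * a candidate is kept only if both its start and end offsets are boundaries.
     */
    static List<Integer> findMatches(String target, String pattern, BreakIterator breaker) {
        List<Integer> result = new ArrayList<>();
        if (pattern.isEmpty()) {
            return result;
        }
        if (breaker != null) {
            breaker.setText(target);
        }
        for (int start = target.indexOf(pattern); start >= 0;
                 start = target.indexOf(pattern, start + 1)) {
            int end = start + pattern.length();
            // A null breaker means no break detection is attempted.
            if (breaker == null || (breaker.isBoundary(start) && breaker.isBoundary(end))) {
                result.add(start);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String target = "The quick brown fox jumped over the foxhole";
        BreakIterator words = BreakIterator.getWordInstance(Locale.ENGLISH);

        // Whole-word matches only: the "fox" inside "foxhole" is rejected.
        System.out.println(findMatches(target, "fox", words));

        // No break detection: every occurrence is reported.
        System.out.println(findMatches(target, "fox", null));
    }
}
```

With a word-instance BreakIterator this gives whole-word matching; a character-instance or sentence-instance iterator would restrict matches at a different granularity.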
Method Detail

first

public final int first()
Return the first index at which the target text matches the search pattern. The iterator is adjusted so that its current index (as returned by getIndex()) is the match position if one was found and DONE if one was not.
Returns:
The character index of the first match, or DONE if there are no matches.

following

public final int following(int pos)
Return the first index greater than pos at which the target text matches the search pattern. The iterator is adjusted so that its current index (as returned by getIndex()) is the match position if one was found and DONE if one was not.
Returns:
The character index of the first match following pos, or DONE if there are no matches.

last

public final int last()
Return the last index in the target text at which it matches the search pattern and adjust the iteration to point to that position.
Returns:
The index of the last match, or DONE if there are no matches.

preceding

public final int preceding(int pos)
Return the first index less than pos at which the target text matches the search pattern. The iterator is adjusted so that its current index (as returned by getIndex()) is the match position if one was found and DONE if one was not.
Returns:
The character index of the first match preceding pos, or DONE if there are no matches.

next

public int next()
Return the index of the next point at which the text matches the search pattern, starting from the current position.

Returns:
The index of the next match after the current position, or DONE if there are no more matches.
See Also:
first()

previous

public int previous()
Return the index of the previous point at which the text matches the search pattern, starting at the current position.
Returns:
The index of the previous match before the current position, or DONE if there are no more matches.

getIndex

public int getIndex()
Return the current index in the text being searched. If the iteration has gone past the end of the text (or past the beginning for a backwards search), DONE is returned.

setOverlapping

public void setOverlapping(boolean allowOverlap)
Determines whether overlapping matches are returned. If this property is true, matches that begin within the boundary of the previous match are considered valid and will be returned. For example, when searching for "abab" in the target text "ababab", both offsets 0 and 2 will be returned as valid matches if this property is true.

The default setting of this property is true.
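
The "abab" in "ababab" case can be reproduced with a small stand-alone sketch. It uses String.indexOf rather than SearchIterator itself, and the names OverlapSketch and findAll are hypothetical; the only point is how the restart position differs between the two settings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class OverlapSketch {

    /** Returns all match offsets, with or without overlapping matches. */
    static List<Integer> findAll(String target, String pattern, boolean allowOverlap) {
        List<Integer> offsets = new ArrayList<>();
        if (pattern.isEmpty()) {
            return offsets;
        }
        int from = 0;
        while (true) {
            int start = target.indexOf(pattern, from);
            if (start < 0) {
                break;
            }
            offsets.add(start);
            // Overlapping: resume one character past the match start.
            // Non-overlapping: resume just past the end of the match.
            from = allowOverlap ? start + 1 : start + pattern.length();
        }
        return offsets;
    }

    public static void main(String[] args) {
        System.out.println(findAll("ababab", "abab", true));   // prints [0, 2]
        System.out.println(findAll("ababab", "abab", false));  // prints [0]
    }
}
```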


isOverlapping

public boolean isOverlapping()
Determines whether overlapping matches are returned.
See Also:
setOverlapping(boolean)

getMatchLength

public int getMatchLength()
Returns the length of text in the target which matches the search pattern. This call returns a valid result only after a successful call to first(), next(), previous(), or last(). Just after construction, or after a searching method returns DONE, this method will return 0.
Returns:
The length of the match in the target text, or 0 if there is no match currently.

setBreakIterator

public void setBreakIterator(java.text.BreakIterator iterator)
Set the BreakIterator that will be used to restrict the points at which matches are detected.
Parameters:
iterator - A BreakIterator that will be used to restrict the points at which matches are detected. If a match is found, but the match's start or end index is not a boundary as determined by the BreakIterator, the match will be rejected and another will be searched for. If this parameter is null, no break detection is attempted.
See Also:
getBreakIterator()

getBreakIterator

public java.text.BreakIterator getBreakIterator()
Returns the BreakIterator that is used to restrict the points at which matches are detected. This will be the same object that was passed to the constructor or to setBreakIterator. Note that null is a legal value; it means that break detection should not be attempted.
See Also:
setBreakIterator(java.text.BreakIterator)

setTarget

public void setTarget(java.text.CharacterIterator iterator)
Set the target text which should be searched and reset the iterator's position to point before the start of the target text. This method is useful if you want to re-use an iterator to search for the same pattern within a different body of text.
See Also:
getTarget()

getTarget

public java.text.CharacterIterator getTarget()
Return the target text which is being searched.
See Also:
setTarget(java.text.CharacterIterator)

getMatchedText

public java.lang.String getMatchedText()
Returns the text that was matched by the most recent call to first(), next(), previous(), or last(). If the iterator is not pointing at a valid match (e.g. just after construction or after DONE has been returned), returns an empty string.

handleNext

protected abstract int handleNext(int startAt)
Abstract method which subclasses override to provide the mechanism for finding the next match in the target text. This allows different subclasses to provide different search algorithms.

If a match is found, the implementation should return the index at which the match starts and should call setMatchLength with the number of characters in the target text that make up the match. If no match is found, the method should return DONE and should not call setMatchLength.

Parameters:
startAt - The index in the target text at which the search should start.
See Also:
setMatchLength(int)
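
The handleNext contract (return the match start and call setMatchLength on success; return DONE without calling setMatchLength on failure) can be sketched as follows. Because the real class is not reproduced here, the sketch uses a simplified stand-in base class, SketchIterator, that mimics only the relevant parts of the protocol. SketchIterator, NaiveSearch, and the DONE value of -1 are assumptions made for this illustration, not part of the documented API.

```java
// Simplified, hypothetical stand-in for the search protocol; not the real class.
abstract class SketchIterator {
    public static final int DONE = -1; // assumed sentinel value for this sketch

    private int matchLength = 0;
    private int index = DONE;

    /** Subclasses locate the next match at or after startAt. */
    protected abstract int handleNext(int startAt);

    protected void setMatchLength(int length) { matchLength = length; }

    public int getMatchLength() { return matchLength; }

    public int first() { return advance(0); }

    /** Simplified: once DONE has been reached, next() keeps returning DONE. */
    public int next() { return index == DONE ? DONE : advance(index + 1); }

    private int advance(int startAt) {
        matchLength = 0;              // no current match until handleNext reports one
        index = handleNext(startAt);
        return index;
    }
}

// A naive literal-match subclass that follows the handleNext contract.
final class NaiveSearch extends SketchIterator {
    private final String pattern;
    private final String target;

    NaiveSearch(String pattern, String target) {
        this.pattern = pattern;
        this.target = target;
    }

    @Override
    protected int handleNext(int startAt) {
        if (pattern.isEmpty()) {
            return DONE;
        }
        int pos = target.indexOf(pattern, startAt);
        if (pos < 0) {
            return DONE;                      // no match: do not call setMatchLength
        }
        setMatchLength(pattern.length());     // match: report its length
        return pos;                           // and return where it starts
    }

    public static void main(String[] args) {
        SketchIterator iter =
            new NaiveSearch("fox", "The quick brown fox jumped over the lazy fox");
        for (int pos = iter.first(); pos != DONE; pos = iter.next()) {
            System.out.println("Found match at " + pos
                               + ", length is " + iter.getMatchLength());
        }
    }
}
```

A language-sensitive subclass would differ mainly in that the reported match length can differ from the pattern length, which is why the length is reported separately through setMatchLength.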

handlePrev

protected abstract int handlePrev(int startAt)
Abstract method which subclasses override to provide the mechanism for finding the previous match in the target text. This allows different subclasses to provide different search algorithms.

If a match is found, the implementation should return the index at which the match starts and should call setMatchLength with the number of characters in the target text that make up the match. If no match is found, the method should return DONE and should not call setMatchLength.

Parameters:
startAt - The index in the target text at which the search should start.
See Also:
setMatchLength(int)

setMatchLength

protected void setMatchLength(int length)
Sets the length of the currently matched string in the target text. Subclasses' handleNext and handlePrev methods should call this when they find a match in the target text.


Copyright (c) 1998-2000 IBM Corporation and others.