(C) IBM Corp. 2000
DB2 Net Search Extender Administration and User's Guide
Use the db2ext.highlight function to get information
about why a document qualified as a search result. More specifically,
it can be used to:
- get hits
- get hits and surrounding text
- get the document with user-defined highlight tags surrounding the
hits.
Note that the db2ext.highlight function can only be used
with the db2ext.textsearch table-valued function. The
table-valued function searches the index and provides the hit
information that the HIGHLIGHT function uses.
For information on using the db2ext.textsearch function,
see DB2EXT.TEXTSEARCH.
Function syntax
>>-db2ext.highlight--------------------------------------------->
>--(--document-content--,--hit-information--,--hit-processing-information--)-><
Function parameters
The following are input parameters:
- document content CLOB(100K)
- Only UTF8 documents of a TEXT or XML format are supported. To
increase the 100 KB size limit, see DB2EXTHL (utility).
- hit information BLOB(20K)
- A string containing hit information. This is returned by the
db2ext.textsearch function, if the numberOfHits
parameter is specified.
- hit processing information VARCHAR(1024)
- This parameter is a comma-separated list of option-value pairs, with
each string value enclosed in double quotation marks ("). It
specifies how highlighting is processed for the specified document.
If no options are specified, the original document is returned.
- TAGS = ("STRING", "STRING")
- This option enables the user to specify the tags to be inserted before and
after a hit in the document. If this option is omitted, no tags are
added before and after a hit in the document.
- WINDOW_NUMBER = INTEGER
- This option specifies how many parts (or windows) of the document should
be returned by the highlight function. Each window contains one or more
hits and the first hit in each window determines the part of the document
returned to the user. These hits may or may not have text surrounding
the hit.
If this option is omitted, 0 is taken as the default and the entire
document containing start and end tags (if specified) is returned. In
this case, the WINDOW_SIZE option is ignored.
- WINDOW_SIZE = INTEGER
- This option specifies the recommended size of the window in bytes.
The actual size may vary, depending on the number of hits, the length of
the hits, and the start and end tag sizes. If the option is omitted, 0 is
the default and only hits without surrounding text are returned.
- WINDOW_SEPARATOR = "STRING"
- This option specifies the tag used to separate one window from the next
window. If the option is omitted, "..." is the
default value.
- FORMAT = "STRING"
- This option specifies the format of the document. Valid values are
XML or TEXT. If this option is omitted, then
TEXT is taken as the default value. Ensure that the format
value is the same as that specified during indexing.
- MODEL_NAME = "STRING"
- This option specifies the model name related to the specified XML
document. Note that if the FORMAT is TEXT, this
option results in an error condition.
- SECTIONS = ("section-name1", ..., "section-nameN")
- For XML documents, highlighting can be restricted to relevant
sections, for example those defined in the model file. To specify
more than one section, separate the section names with commas.
If this option is omitted, highlighting is performed on the
whole XML document. Note that if the FORMAT is
TEXT, this option is ignored.
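The three uses listed at the start of this section correspond to different option combinations. The following is a minimal sketch, not a definitive recipe. It assumes the patent table, the DB2EXT.TI_FOR_CONTENT index, and the XML document format used in the Usage example; the column aliases and the window values are chosen for illustration only.

```sql
SELECT p.id,
-- Hits only: windows are requested and WINDOW_SIZE is omitted (default 0)
       db2ext.highlight(p.content, t.hitinformation,
           'WINDOW_NUMBER = 3, FORMAT = "XML"') AS hits_only,
-- Hits with surrounding text: a recommended window size of 100 bytes
       db2ext.highlight(p.content, t.hitinformation,
           'WINDOW_NUMBER = 3, WINDOW_SIZE = 100, FORMAT = "XML"') AS hits_in_context,
-- Whole document with tags: WINDOW_NUMBER is omitted (default 0)
       db2ext.highlight(p.content, t.hitinformation,
           'TAGS = ("<bf>", "</bf>"), FORMAT = "XML"') AS tagged_document
FROM patent p,
     TABLE (db2ext.textsearch(
         '"relational database systems"',
         'DB2EXT',
         'TI_FOR_CONTENT',
         0,
         20,
         CAST(NULL AS BIGINT),
         15)) t
WHERE p.id = t.primkey
```

Because no TAGS option is given in the first two calls, the hits in those columns are returned without surrounding tags.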
Return parameters
The following is the return parameter:
- CLOB(200K)
- The HIGHLIGHT function returns a CLOB value containing the document parts
modified by the HIGHLIGHT function.
Usage
The following example shows how
you can use the HIGHLIGHT function:
    SELECT p.id,
           p.title,
           db2ext.highlight(p.content,
                            t.hitinformation,
                            'TAGS = ("<bf>", "</bf>"),
                             WINDOW_NUMBER = 5,
                             WINDOW_SIZE = 200,
                             WINDOW_SEPARATOR = "...",
                             FORMAT = "XML",
                             SECTIONS = ("section1-name", "section2-name")')
    FROM patent p, TABLE (db2ext.textsearch(
                     '"relational database systems"',
                     'DB2EXT',
                     'TI_FOR_CONTENT',
                     0,
                     20,
                     CAST(NULL AS BIGINT),
                     15)) t
    WHERE p.id = t.primkey
Using documents larger than 100 KB causes the SQL query to terminate
with an SQL error (SQL1476N and SQL error -433). To avoid this,
use the db2exthl command to increase the document content
size. For information, see DB2EXTHL (utility).
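Before running a highlight query, it can help to check whether any documents exceed the limit. The following is a sketch under stated assumptions: it reuses the patent table from the example above, assumes that its content column is a CLOB, and assumes that the default limit of 100 KB (102400 bytes) is still in effect.

```sql
-- List documents whose content is larger than the default 100 KB limit
SELECT p.id,
       LENGTH(p.content) AS content_bytes
FROM patent p
WHERE LENGTH(p.content) > 102400
```

If this query returns rows, those documents are candidates for the error described above unless the document content size has been increased.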
Note: Special characters, such as "newline", are returned as is.
Restrictions
- Only XML and flat text documents are supported.
- Only UTF8 databases are supported. For binary or datalink
documents, you need to ensure that the documents are in UTF8.
- Thai documents are not supported.
- If the document format specified at query time does not match the
format used during indexing, the HIGHLIGHT function returns
unpredictable results.
- Only hits found in the text parts of a document will be
highlighted.
- The highlight function can only be used with the
db2ext.textsearch function.
- String values cannot contain the " character.