To use the stored procedure search, you need to specify cache options during the CREATE INDEX command. This enables high performance, by moving all the specified data into main memory.
However, before the first index update for searching, ensure that your table contains documents to avoid updating an index on a non-populated table. This provides a better indexing performance and a solid estimation of cache memory requirements.
The stored procedure search allows you to quickly return predefined data that is associated with a document. Use the cache table option to define this in the CREATE INDEX command. The ACTIVATE CACHE command then moves the specified data into a memory cache.
Note |
---|
The SQL scalar search functions can also use this text index, if the text index is not created on a view. In a distributed DB2 environment, you must explicitely specify a tablespace for administration tables on a single node for the stored procedure and explicitely call on this node. To ensure that you connect to the correct node, use the DB2NODE environment variable. |
When creating a text index for stored procedure search, you must determine and calculate the following parameters:
The following types of cache are available:
The following methods of updating a text index are available:
This is also known as incremental update.
Use this option if you expect to insert more than 50% of your documents after the initial index activation.
Net Search Extender provides two SQL functions to help you determine the CREATE INDEX memory parameters. These are: MAXIMUM CACHE SIZE and PCTFREE.
DB2EXT.MAXIMUM_CACHE_SIZE(maximumNumberDocs INTEGER, averageRowLength INTEGER, numberOfCacheColumns INTEGER)
The following command returns the average row length parameter from your table:
SELECT AVG(LENGTH(cache column_1) + ... + LENGTH(cache column_n))
Note that the average may change significantly when further documents are inserted into your table. The number of cache columns relates to the number of column expressions used in the CACHE TABLE clause of the DB2TEXT CREATE INDEX command.
For additional information, see Appendix B, Using large amounts of memory.
DB2EXT.PCTFREE(actualNumberDocs INTEGER, maximumNumberDocs INTEGER)
The actual numbers of documents are the number of rows in your table at the time of the first ACTIVATE CACHE command, which creates the memory cache.
The maximum number of documents is an estimate of the maximum number of documents in your table before the next DB2TEXT ACTIVATE command (for a temporary cache), or DB2TEXT ACTIVATE CACHE RECREATE command (for a persistent cache) is run.
If you are recreating the index on each update, set the PCTFREE value to 0.
Examples |
---|
Assume that you have 10 000 rows in your table and you do not expect more then 20 000. Use the following call to calculate the PCTFREE value you require: db2 "values DB2EXT.PCTFREE(10000,20000) " Assume that your maximum row size is 20 000 and that you have 2 columns in your cache with an average size of 76. Use the following call to return the size: db2 " values DB2EXT.MAXIMUM_CACHE_SIZE(20000,76,2) " |
After determining suitable parameters, you can create your index and cache table by using the following call:
db2text CREATE INDEX db2ext.comment FOR TEXT ON db2ext.texttab (comment) CACHE TABLE (docid) PCTFREE 10 MAXIMUM CACHE SIZE 5
In this example, the docid column is built in addition to the index, using main memory for fast result table return. Ten percent of the cache memory is reserved for future documents and the cache is limited to a maximum of 5 MB.
To search on this index, you need to update and then activate the index. This copies the specified table cache expression from the database into memory.
If during ACTIVATE or UPDATE operations, the MAXIMUM CACHE SIZE is exceeded, the following actions are recommended:
If you expect frequent updates on documents, consider using fixed-size data types for the cache column expressions in the CACHE TABLE clause. The following example shows how you can use the same cache storage during update operations:
CACHE TABLE(cast(C1 as char(20)), cast(substr(C2,1,10) as char(10))....
In this case, you ensure that only the non-variable data types are used.
Before cache activation, perform any pending incremental updates to avoid a poor PCTFREE calculation.
To activate the text index, use the following command:
db2text ACTIVATE CACHE FOR INDEX db2ext.comment FOR TEXT
This command retrieves the specified cache table data out of the database and stores this in memory. The time taken depends on the size of the table.
Note |
---|
If you call update index when an index is activated, this will also update the cache tables. As deleted documents may take slots in cache memory, ensure that you set PCTFREE with a high enough value. |
The ACTIVATE CACHE call needs to be redone every time you stop your system. If you use the persistent cache, the new activate will be quicker.
Note |
---|
If an update occurs when the persistent cache is not activated, the persistent cache is dropped and recreated during the activate call. |
To save resources, you should also deactivate any indexes that are not currently required.
To check how much memory is left, use the following call:
db2text control show cache status for database cte index db2ext.comment
This displays whether the index has been activated and how much of the specified cache space is left.