Getting started with the Verity mkvdk utility

The following is the basic mkvdk syntax:

mkvdk -collection path [option] [...] [filespec] [...]

Where:

Square brackets ( [ ] ) indicate optional items.
An ellipsis (...) indicates repetition of the previous item. Thus, [filespec] [...] indicates an optional series of filespec items.
filespec represents a document filename or a list of document filenames. If filespec is a list of files, it should consist of an at sign (@) followed by the filename containing the list (for example, @filelist).
The -collection path argument creates or opens a collection. This argument is required.

Numerous syntax options are described in the sections that follow. All options must precede the first filespec parameter.
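
The ordering rule above can be sketched in Python. This is only an illustration and not part of the mkvdk distribution: the helper names write_filelist and build_mkvdk_args and all paths are hypothetical, and the list file is assumed to hold one filename per line, which this document does not specify. The sketch assembles an argument list and does not run mkvdk.

```python
import os
import tempfile


def write_filelist(list_path, documents):
    """Write document names to a list file and return the @-prefixed filespec.

    Assumption: one filename per line; the list-file layout is not
    specified in this document.
    """
    with open(list_path, "w", encoding="utf-8") as handle:
        for name in documents:
            handle.write(name + "\n")
    return "@" + list_path


def build_mkvdk_args(collection, options=(), filespecs=()):
    """Return an argument list with every option ahead of the first filespec."""
    args = ["mkvdk", "-collection", collection]
    args.extend(options)    # all options come first
    args.extend(filespecs)  # filespec items always go last
    return args


# Example: one list file plus one plain filename (hypothetical paths).
workdir = tempfile.mkdtemp()
spec = write_filelist(os.path.join(workdir, "filelist"), ["a.txt", "b.txt"])
command = build_mkvdk_args("/collections/mail", ["-synch"], [spec, "c.txt"])
print(" ".join(command))
```

Keeping options and filespec items in separate parameters makes it hard to place an option after a filespec by accident.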

Creating a collection

Creating a collection with the mkvdk utility involves setting up a collection directory structure and inserting documents into this structure. You can create a collection in two steps, using two separate commands.

To create a collection:

Set up a collection using the following syntax:
```
mkvdk -create -collection collectionname
```
Where collectionname is the path to the collection directory. Running this command creates a collection directory that includes style files with configuration information.
Insert documents using the following syntax:
```
mkvdk -collection collectionname -bulk -insert filespec
```
Where filespec is the name of a bulk insert file that specifies which documents to index and insert into the collection.

Alternatively, you can set up a collection and insert documents in one command, using the following syntax:

mkvdk -create -collection collectionname -bulk -insert filespec

Note: Use the -create option only once, to create the collection directory structure. After the directory structure has been created, do not use the -create option to update the collection.
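
A script that runs the same indexing job repeatedly can guard against reusing -create. The following Python sketch is an illustration under stated assumptions: collection_args is a hypothetical helper, docs.bif is a made-up bulk insert filename, and the check treats an existing directory as proof that the collection was already created, which is a simplification. It only builds the argument list and does not invoke mkvdk.

```python
import os
import tempfile


def collection_args(collection_dir, bulk_file):
    """Build mkvdk arguments, adding -create only for a new collection.

    Assumption: an existing directory means the collection was already
    created; a real check might need to inspect the directory contents.
    """
    args = ["mkvdk"]
    if not os.path.isdir(collection_dir):
        args.append("-create")  # first run: set up the directory structure
    args += ["-collection", collection_dir, "-bulk", "-insert", bulk_file]
    return args


base = tempfile.mkdtemp()
new_coll = os.path.join(base, "newcoll")  # does not exist yet
first_run = collection_args(new_coll, "docs.bif")
os.mkdir(new_coll)  # simulate a collection that has been created
later_run = collection_args(new_coll, "docs.bif")
print(first_run)
print(later_run)
```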

Accessing online Help for the mkvdk utility

To display a list of mkvdk command-line options, enter the following command:

mkvdk -help

Collection setup options

The mkvdk utility has a variety of collection setup options, which the following table describes:

Option	Description
-create	Creates a collection in the specified -collection directory. It creates the directory structure, determines the index contents, and sets up the documents table schema according to the style files used. If the specified collection already exists, the mkvdk utility exits rather than overwriting the existing collection.
-style dir	Specifies the style directory that contains the style files to use to create a collection. This option can only be used with the -create option. If you do not specify this option when you use the mkvdk utility to create a collection, the mkvdk utility uses the style files in the common/style directory.
-description desc	Sets the collection's description. Enter alphanumeric text, such as "This collection contains electronic mail from ABC Company." Include the quotation marks.
-words	Builds the word list for all partitions in the collection.

Examples: setting up collections

The following examples show the commands for creating a collection and building the word list.

Creating a collection

The following command creates a collection in path_2 using the style files in path_1, and submits and indexes the document(s) in filespec:

mkvdk -create -style path_1 -collection path_2 filespec

Building the word list

The following command builds the word list in the collection residing in the path directory:

mkvdk -words -collection path

General processing options

The mkvdk utility provides a variety of general processing options, which the following table describes:

Option	Description
-collection path	Specifies the path of the collection to create or open. This option is required to execute the mkvdk utility.
-nolock	Turns off file locking. Locking is on by default.
-synch	Performs work immediately. If this option is not used, indexing work is done in the background, as time permits.
-about	Shows information about the collection, such as its description and the date when it was last modified.
-datapath path	Specifies the datapath to use to find documents that are added to the specified collection. All relative document paths are relative to this setting. If you do not set this option, the mkvdk utility looks for documents next to the collection directory.
-topicset path	Creates a topic index for the collection, based on the specified topic set, and stores it in the collection directory. This facilitates quick and efficient searches over the collection data when using topics.
-mode mode	Sets the indexing mode. Values are case-insensitive. The valid settings are Generic, FastSearch, NewsfeedIdx, NewsfeedOpt, BulkLoad, ReadOnly, and any custom mode defined in the style.plc file. The default is Generic mode.
-common	Specifies the path of the Verity common directory. If you do not use this option, the Verity engine looks for the common directory in the directory containing the mkvdk executable, and then along the executable search path. The executable search path is determined by your operating system environment settings. It is the path used by the OS to find the programs you run.
-help	Displays the mkvdk utility syntax options.
-debug	Runs the mkvdk command in debugging mode.
-nooptimize	Prevents optimization by this instance of the mkvdk utility. Using this option turns off the service-level VdkServiceType_Optimize. The service types determine the type of work the Verity engine and its self-administration features will execute on a collection.
-nohousekeep	Prevents housekeeping by this instance of the mkvdk utility. Housekeeping includes deleting files that are no longer needed. Using this option turns off the service-level VdkServiceType_DBA. (Service types are described under -nooptimize.)
-noindex	Prevents indexing by this instance of mkvdk. Documents are not inserted or deleted. Using this option turns off the service-level VdkServiceType_Index. (Service types are described under -nooptimize.)
-charmap name	Specifies the name of the character set to which to map all strings for your application. Set this to a character set that your system can display properly. When you use the search engine with the English locale, the character set that any version of Windows displays is 8859. This is not the character set of the documents being indexed; it is only the character set that your display can handle properly. (The character set of the documents is set in the style.dft file using the /charmap option.) Valid options are 850 and 8859. The default is no mapping.
-locale name	Specifies the name of the Verity locale to be used by the mkvdk utility. The locale name must correspond to the name of an existing locale directory, which must exist in the install_dir/common/locale directory. Valid options are english, deutsch, and francais. The default is english.
-datefmt format	Converts a date field value into Verity's internal data representation. You can use this option in conjunction with the mkvdk options -extract (for the field extraction feature) and -bulk (for the bulk submit feature). The format string tells the date parsing routines the order in which the parts of a date are written when the date string consists only of numbers (for example, 03/03/96). Valid options are described in "Date format options". The default is MDY.
-servlev level	Specifies service level. The specifier, level, is a string consisting of keywords separated by hyphens, such as search-index-optimize. Valid keywords are described in "Service-level keywords".

Examples: processing documents

The following examples show the commands for processing documents.

Using the default options

By default, the mkvdk command submits and indexes documents specified in the command, and services the specified collection. The following command executes the default options:

mkvdk -collection path filespec

Servicing only

The following command performs servicing only. Use this command to only index submitted documents and service the collection:

mkvdk -collection path

Deleting documents from a collection

The following command deletes documents from a collection:

mkvdk -delete -collection path filespec

Bulk inserting or deleting

The following command specifies bulk insertion of a list of documents:

mkvdk -collection coll -bulk -insert filespec

Where filespec is the list of files to insert. Since insert is the default, the following command is equivalent to the preceding command:

mkvdk -collection coll -bulk filespec

The following command specifies bulk deletion of a list of documents:

mkvdk -collection coll -bulk -delete filespec

Where filespec is the list of files to delete. It can be the same file used to insert documents; the only difference is that -delete is specified instead of -insert (or no specification).

Date format options

The Verity engine supports many import date formats, including many textual date formats, and the numeric date formats listed in the following table:

Format variable	Description
MDY	Dates written as month-day-year (US format, the default)
DMY	Dates written as day-month-year (European format)
YMD	Dates written as year-month-day (ISO international format)
YDM	Dates written as year-day-month (Swedish format)
USA	Dates written in US format (the same as MDY)
EUR	Dates written in European format (the same as DMY)
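
To see why the format variable matters for an all-numeric date, the following Python sketch splits the same string under different formats. It is not Verity's date parser: split_numeric_date is a hypothetical helper, the accepted separators (slash, hyphen, period) are an assumption, and the year is returned exactly as written with no century inference.

```python
import re

# Order of components for each numeric format variable in the table above.
FORMAT_ORDER = {
    "MDY": ("month", "day", "year"),
    "DMY": ("day", "month", "year"),
    "YMD": ("year", "month", "day"),
    "YDM": ("year", "day", "month"),
}
# USA and EUR are documented as synonyms for MDY and DMY.
ALIASES = {"USA": "MDY", "EUR": "DMY"}


def split_numeric_date(text, fmt="MDY"):
    """Label the three numeric parts of a date according to the format variable.

    Assumption: parts are separated by '/', '-' or '.'; the document only
    shows the '/' form.
    """
    fmt = fmt.upper()
    order = FORMAT_ORDER[ALIASES.get(fmt, fmt)]
    parts = re.split(r"[/.\-]", text)
    if len(parts) != 3:
        raise ValueError("expected three numeric parts: %r" % text)
    return dict(zip(order, (int(part) for part in parts)))


us_style = split_numeric_date("03/04/96", "MDY")
eu_style = split_numeric_date("03/04/96", "EUR")
print(us_style)
print(eu_style)
```

The same string is read as March 4 under MDY and as April 3 under EUR, which is the ambiguity that the -datefmt option resolves.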

Service-level keywords

The following table describes the valid keywords for the -servlev option:

Keyword	Description
search	Enables search and retrieval
insert	Enables adding and updating documents
optimize	Enables opportunistic collection optimization
assist	Enables building of word list
housekeep	Enables housekeeping of unneeded files
delete	Enables document deletion
backup	Enables backup
purge	Enables background purging
repair	Enables collection repair
dataprep	Same as search-index-optimize-assist-housekeep
index	Same as insert-delete
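
Because dataprep and index are shorthand for other keywords, it can help to expand a level string before comparing two settings. The Python sketch below is an illustration only: expand_service_level is a hypothetical helper, and the keyword sets are copied from the table above. It does not call mkvdk.

```python
# Keywords that stand for a single service, from the table above.
BASIC = {
    "search", "insert", "optimize", "assist", "housekeep",
    "delete", "backup", "purge", "repair",
}
# Shorthand keywords and the hyphen-separated levels they stand for.
COMPOSITE = {
    "dataprep": "search-index-optimize-assist-housekeep",
    "index": "insert-delete",
}


def expand_service_level(level):
    """Expand a hyphen-separated level into basic keywords, without repeats."""
    expanded = []
    for keyword in level.split("-"):
        if keyword in COMPOSITE:
            parts = expand_service_level(COMPOSITE[keyword])
        elif keyword in BASIC:
            parts = [keyword]
        else:
            raise ValueError("unknown service-level keyword: %r" % keyword)
        for part in parts:
            if part not in expanded:
                expanded.append(part)
    return expanded


dataprep_services = expand_service_level("dataprep")
print("-".join(dataprep_services))
```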

Message options

The mkvdk utility provides a variety of messaging options, as described in the following table:

Option	Description
-quiet	Displays only fatal and error messages to the console. It overrides the -outlevel setting. For a list of message types, see the table in "The mkvdk utility syntax".
-outlevel (num)	Indicates which message types to display to the console. Valid values are determined by adding together the numbers that correspond to the desired message types. The default value is 15. For more information, see the table in "The mkvdk utility syntax".
-logfile filename	Saves messages in the specified file.
-loglevel (num)	Indicates which message types to route to the optional log file. Valid values are determined by adding numbers together that correspond to the desired message types. The default value is 15. For more information, see the table in "The mkvdk utility syntax".
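
The arithmetic behind an -outlevel or -loglevel value can be sketched as follows. The message type numbers themselves are listed in another section and are not reproduced here; the sketch assumes they are distinct powers of two, which is consistent with values being formed by addition and with the default of 15, but is not confirmed by this section. The function names are hypothetical.

```python
def combine_levels(type_numbers):
    """Add message type numbers into one level value, ignoring repeats."""
    return sum(set(type_numbers))


def split_level(value):
    """Break a level value into the power-of-two numbers that add up to it.

    Assumption: each message type number is a distinct power of two.
    """
    return [1 << bit for bit in range(value.bit_length()) if value & (1 << bit)]


default_parts = split_level(15)
print(default_parts)
print(combine_levels(default_parts))
```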

Document processing options

The mkvdk utility provides a variety of document processing options, as the following table describes:

Option	Description
-extract	Extracts field values from documents, using the field extraction rules specified in the style.tde file.
-insert	Adds documents to the collection. This is the default option for the mkvdk command.
-update	Adds documents to the collection by replacing all previous information about the specified documents.
-delete	Marks the specified documents as deleted, and makes them unavailable for searches. To actually remove deleted documents from the collection's internal documents table and word indexes, use the squeeze keyword (see "About squeezing deleted documents").
-nosave	Specifies that the work list, which the mkvdk utility generates automatically when you use the -extract option, is not saved. By default, the mkvdk utility saves the work list in the collection directory in a file called worklist (in the Verity bulk submit file format).
-nosubmit	Specifies that the work list, which the mkvdk utility generates automatically when you use the -extract option, is not submitted to the indexing engine and is instead saved in the collection directory in a file called worklist (in the Verity bulk submit file format). This option allows the mkvdk utility to process field extraction separately from other indexing tasks.

Bulk submit options

The mkvdk utility provides a variety of bulk submit options, as described in the following table:

Option	Description
-bulk	Interprets filespec as a bulk submit file. You can use this option with the -insert, -update, and -delete options.
-offset num	Specifies the offset into a bulk submit file or files. If you specify multiple bulk submit files and use the -offset option, the offset is applied to all of the bulk submit files.
-numdocs num	Specifies the number of documents to insert or delete from the bulk insert file or files. If you specify multiple bulk insert or delete files and use the -numdocs option, the -numdocs setting is applied to all of the bulk insert or delete files.
-autodel	Deletes the bulk submit file or files when the bulk submission work is finished.

Using bulk insert and delete options

The bulk submit feature supports the insertion of documents and related field values into collections.

To use the bulk submit feature to populate fields:

Define the fields in the style.sfl and style.ufl files, as appropriate.
Create a bulk submit file that specifies the documents to insert and the field values for each document.
Run the mkvdk utility using the -bulk option and specifying the bulk submit file or files.

Collection maintenance options

The mkvdk utility provides a variety of collection maintenance options, as described in the following table:

Option	Description
-backup dir	Backs up the collection into the specified directory. The backup does not include the tde subdirectory. The tde subdirectory is created by and for Topic Document Entry if Topic Document Entry is used to create or maintain the collection.
-repair	Repairs the collection. The repair is performed by an API call.
-purge	Waits the amount of time specified by the -purgewait option and then deletes all documents in the collection, but not the collection itself. It leaves the collection directory structure intact. The default wait is 600 seconds; to specify a different wait period, use the -purgewait option.
-purgeback	Used with the -purge option, performs a purge in the background.
-purgewait sec	Specifies to the -purge option how many seconds to wait. If you do not specify sec, the default is 600.
-noservice	Prevents collection servicing, which includes indexing, by this instance of the mkvdk command, performed by an API call.
-persist	Services the collection repeatedly, at default intervals of 30 seconds. Use the -sleeptime option to set a different interval.
-sleeptime sec	Specifies the interval between service calls when the mkvdk utility is run with the -persist option.
-optimize spec	Performs various optimizations on the collection, depending on the value of spec. The specifier, spec, is a string consisting of keywords separated by hyphens, such as maxmerge-squeeze-readonly. For valid keywords, see "Optimization keywords".
-noexit	Windows only. Causes the I/O window to remain after the program is finished. By default, the window closes and the program exits, so that scripts calling the mkvdk utility do not hang.

Examples: maintaining collections

The following examples show the commands for maintaining a collection.

Repairing a collection

The following command automatically repairs a collection, or enables it after manual repairs:

mkvdk -repair -collection path

Backing up a collection

The following command backs up a collection to the specified directory:

mkvdk -backup path_1 -collection path_2

Deleting a collection

To delete a collection, use the appropriate command for your operating system. For example, to remove the collection directory structure and control files on a UNIX system, use the following command:

rm -r collection_path

Purging a collection

The following command deletes all documents from a collection, but does not delete the collection itself:

mkvdk -purge -collection path

Purging a collection in the background

The following command purges the specified collection in the background:

mkvdk -purge -purgeback -collection path

Specifying persistent service

The following command runs the mkvdk command as a persistent process, so that servicing is performed repeatedly after num idle seconds:

mkvdk -persist -sleeptime num -collection path

Deleting a collection

The -purge option deletes all documents in a collection, but does not delete the collection itself. To delete a collection, use operating system commands, such as the rm command on UNIX, to remove the collection directory structure and control files.

Optimization keywords

The following table describes the optimization keywords for the -optimize option:

Keyword	Description
maxclean	Performs the most comprehensive housekeeping possible, and removes out-of-date collection files. Macromedia recommends this optimization only when you are preparing an isolated collection for publication. When using this type, if the collection is being searched, files sometimes get deleted too early, which can affect search results.
maxmerge	Performs maximal merging on the partitions to create partitions that are as large as possible. This creates partitions that can have up to 64000 documents in them.
readonly	Marks the collection as read-only and unchanged after the function call is done. This is appropriate for CD-ROM collections.
spanword	Creates a spanning word list across all the collection's partitions. A collection consists of numerous smaller units, called partitions, each of which includes a word list. Optionally, a spanning word list can be built with an ngram index.
ngramindex	Builds an ngram index for the collection. An ngram index is designed to improve the search performance for queries with the <TYPO> and <WILDCARD> operators. An ngram index cannot be built without a spanning word list. You can build a spanning word list and ngram index in the same command, for example: mkvdk -collection collname -optimize spanword-ngramindex
squeeze	Squeezes deleted documents from the collection. Squeezing deleted documents recovers space in a collection, and improves search performance. (For more information about squeeze, see "About squeezing deleted documents".) Using this option can invalidate search results that users are reviewing when the squeeze occurs.
vdbopt	Configures the collection's Verity databases (VDBs). Each collection consists of smaller units called VDBs. This keyword has the effect of linearizing the data in a VDB, and making the collection metadata contained in the VDB more streamlined. It also lets the VDB grow to a much larger size.
tuneup	Performs the same as combining the maxmerge, vdbopt, and spanword keywords.
publish	Performs the same as all of the optimization types combined. Use this keyword to optimize the collection for the best possible retrieval performance, such as for publication to a network on a server or on a CD-ROM.
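
A specifier for the -optimize option can be assembled and sanity-checked before it is passed to mkvdk. The Python sketch below is illustrative: build_optimize_spec is a hypothetical helper, the keyword groupings are taken from the table above, and the ngramindex check assumes the spanning word list is requested in the same command, although a previously built one might also satisfy the requirement.

```python
# Keywords that name a single optimization, from the table above.
BASIC = (
    "maxclean", "maxmerge", "readonly", "spanword",
    "ngramindex", "squeeze", "vdbopt",
)
# Shorthand keywords: tuneup and publish stand for groups of optimizations.
COMPOSITE = {
    "tuneup": ("maxmerge", "vdbopt", "spanword"),
    "publish": BASIC,
}


def build_optimize_spec(keywords):
    """Join keywords with hyphens after checking names and one dependency."""
    effective = set()
    for keyword in keywords:
        if keyword in COMPOSITE:
            effective.update(COMPOSITE[keyword])
        elif keyword in BASIC:
            effective.add(keyword)
        else:
            raise ValueError("unknown optimization keyword: %r" % keyword)
    # Assumption: the spanning word list must be requested in the same spec.
    if "ngramindex" in effective and "spanword" not in effective:
        raise ValueError("ngramindex needs a spanning word list (spanword)")
    return "-".join(keywords)


spec = build_optimize_spec(["maxmerge", "squeeze", "readonly"])
print(spec)
```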

About squeezing deleted documents

When a document is deleted from a collection, its space is not recovered. It is merely marked as deleted and not available for subsequent searches. Squeezing actually removes deleted documents from the collection's internal documents table and word indexes, thus creating a smaller collection and reducing the collection's disk space. A smaller collection has a more efficient structure that makes searching slightly faster and uses slightly less memory.

You can safely squeeze deleted documents for a collection at any time, because the mkvdk utility ensures that the collection is available for searching and servicing through its self-administration features. The application does not need to temporarily disable a collection to squeeze deleted documents, because when a squeeze request is made, the mkvdk utility assigns a new revision code to the collection. After a squeeze has occurred, the next time the application accesses the collection, the Verity engine notifies the application that dramatic changes have been made, and points the application to the new collection data.

Squeezing deleted documents out of a collection is a significant update to the collection. If users are reviewing search results at the time when squeezing occurs, the search results might be invalidated after the squeeze operation.

About optimized Verity databases

The Verity database (VDB) is the fundamental storage mechanism responsible for supporting dynamic access to documents in collections. A VDB consists of simple tables with rows and columns that relate to each other by row position. VDB tables are not relational, and their architecture supports quick and efficient searching over textual data. A VDB consists of segments that are packed into a single file. One of the advantages of having one packed VDB file is optimized search performance. The fewer files that need to be opened during search processing, the faster the search performance.

The VDB optimization option optimizes the packing of a collection's VDBs. When VDBs are built during normal indexing operations, the segments are not stored sequentially in the one-file VDB file system. VDB optimization reserializes the packed segments so that all segments are contiguous, which can improve performance and lets the VDBs grow larger. An optimized VDB can grow up to 2 gigabytes in size, as opposed to a maximum of 64 megabytes for an unoptimized one.

Using this option might degrade your indexing performance when certain indexing modes are set for the collection.

Performance tuning options

The mkvdk utility provides performance tuning options, as the following table describes:

Option	Description
-maxfiles num	Sets the maximum number of files that the mkvdk utility can have open at once. The default is 50.
-diskcache num	Sets the size of the mkvdk disk cache in kilobytes. The default is 128.
