Last Updated: 06 Jul 2022
Squiz Matrix keeps an index: a list of words that records where each word appears and how often it appears. For the Search Page, Search List, Search Folder and Quick Search to work, indexing needs to be turned on in the system.
Once it is turned on, assets will automatically be indexed when they are created or when a change to an asset is committed. By default, Squiz Matrix will index the attributes of the asset, including its content, as well as its metadata values. You can turn indexing off for an asset type, or for fields within an asset type, on either the Asset Weights, Asset Tree Weights or Global Weights screen of the Search Manager. You can also change which words are indexed, and how many characters a word must have before it is indexed, on the Details screen of the Search Manager. For more information on the Search Manager, refer to the Search Manager chapter in this manual.
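As a rough illustration of what a word index records, the sketch below counts how often each word appears in a piece of text and skips words shorter than a minimum length. It is not how Squiz Matrix builds its index: the function name `count_index_words` and the default minimum length of 3 are assumptions made for this example only, and the sketch covers the frequency part of an index, not where each word appears.

```shell
#!/bin/sh
# Hypothetical illustration only (not part of Squiz Matrix).
# Reads text on standard input and prints "word count" pairs,
# ignoring words shorter than the given minimum length.
count_index_words() {
    min_len=${1:-3}   # assumed default, chosen for this example
    # Split into one word per line, lower-case, then count with awk.
    tr -cs '[:alnum:]' '\n' |
        tr '[:upper:]' '[:lower:]' |
        awk -v min="$min_len" '
            length($0) >= min { count[$0]++ }
            END { for (w in count) print w, count[w] }
        ' |
        sort
}

# Example: words of fewer than 4 characters ("the") are not counted.
printf 'Search the search page\n' | count_index_words 4
```

Raising the minimum length shrinks the list of counted words, which is the same trade-off the minimum word length setting on the Details screen controls.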
Turning Indexing On
To turn indexing on, go to the Details screen of the Search Manager. Change Indexing Status to On and click Commit. Indexing will be turned on for the system, and any assets that are added or changed from now on will be indexed. Assets that were created previously, however, will not be indexed. To index these assets you will need to re-index the system.
Re-indexing the System
To re-index the system, go to the Details screen of the Search Manager. Select Reindex all assets in the system and click Commit. The system will re-index all assets in the system.
Alternatively, if you only want to re-index a certain part of the system, select the parent asset in the Root Node field under the Re-index Assets section and click Commit.
Whenever you change any of the settings on the Search Manager, you will need to perform a re-index. If you do not perform a re-index, the changes will not affect your search results.
Re-indexing the System from the Server
To re-index the system from the server you can run the reindexSearchIndex.php script, which is in <system_root>
The script accepts up to three parameters, the first of which is required:
php packages/search/scripts/reindexSearchIndex.php PATH_TO_SYSTEM_ROOT ROOT_NODE_IDS BATCH_SIZE
PATH_TO_SYSTEM_ROOT is a required parameter that sets the path on the server where Matrix is installed. In the example below, the user has already moved into the system root directory and is executing the script within the directory by passing `pwd` in as the first parameter.
ROOT_NODE_IDS is an optional parameter that lets you pass a comma-separated list of root node IDs. An example of valid parameter values for this parameter is 11,55,66 or a single value such as 11. If you do not pass any IDs the script will prompt you to reindex the entire system.
BATCH_SIZE is an optional parameter that will only work if ROOT_NODE_IDS is also passed. This parameter lets you define the batch size if you need to run the script in chunks. The default value is 100 and there is no upper limit. Any values less than 0 will reset the batch size to
This parameter is an advanced option and should be used with care: the default value is recommended.
An example of the usage of this script is given below:
$ php packages/search/scripts/reindexSearchIndex.php `pwd`
Enter the #IDs of the root nodes to reindex (comma separated) or press ENTER to reindex the whole system: 100,200
Do you want to reindex the root node #100 (yes/no) no
Do you want to reindex the root node #200 (yes/no) yes
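The parameter rules described above can be sketched as a small wrapper that builds the command line before it is run. This is a hypothetical helper, not something shipped with Squiz Matrix: the function name `build_reindex_cmd` and the path `/var/www/matrix` are assumptions for illustration. It only prints the command; whether the script still asks for confirmation when the IDs are passed as arguments is not covered here.

```shell
#!/bin/sh
# Hypothetical helper (not part of Squiz Matrix): prints the reindex command
# for the given arguments so it can be reviewed before it is run.
build_reindex_cmd() {
    system_root=${1:-}      # PATH_TO_SYSTEM_ROOT (required)
    root_node_ids=${2:-}    # ROOT_NODE_IDS, e.g. 11,55,66 (optional)
    batch_size=${3:-}       # BATCH_SIZE (optional, needs ROOT_NODE_IDS)

    if [ -z "$system_root" ]; then
        echo "PATH_TO_SYSTEM_ROOT is required" >&2
        return 1
    fi
    if [ -n "$batch_size" ] && [ -z "$root_node_ids" ]; then
        echo "BATCH_SIZE only works when ROOT_NODE_IDS is also passed" >&2
        return 1
    fi

    cmd="php packages/search/scripts/reindexSearchIndex.php $system_root"
    if [ -n "$root_node_ids" ]; then
        cmd="$cmd $root_node_ids"
    fi
    if [ -n "$batch_size" ]; then
        cmd="$cmd $batch_size"
    fi
    echo "$cmd"
}

# Example: /var/www/matrix is an assumed install path, 100 is the default batch size.
build_reindex_cmd /var/www/matrix 11,55,66 100
```

Printing the command rather than executing it keeps the sketch safe to try on any machine. The printed line is meant for review and does not quote paths that contain spaces.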
Indexing the Content of PDF Files and MS Word Documents
By default, Squiz Matrix will not index the content of PDF Files and MS Word Documents. In other words, when a user searches for a term, the search will not look inside the content of these documents.
To index the content of these documents, you need to enable Apache Tika. For more information on how to enable Apache Tika, refer to the External Tools Configuration chapter in the System Configuration manual.
Recommendations for Indexing
To help improve general backend performance and search performance for both front end and back end searching, it is recommended that you turn off indexing for:
- All general assets such as Designs, Design Areas, Bodycopies, Divisions, Metadata Schemas and Workflow Schemas
- Fields that are not being used on a Search Page, for example asset ID, created date, updated date and published date
- Certain parts of the Asset Map, for example the System Management Folder, the Designs Folder and the Users Folder
- The Metadata Schemas, sections within a Metadata Schema or the metadata fields that will not be used for searching.