Update the Search Index for Multilingual Datasets
If you are using the Metadata Server, either because you have set up multilingual datasets or you are using a relational database to store metadata, then you need to use the BuildMetadataSearchIndex script to keep the search index updated whenever there is a change to either the catalogue or the external multilingual/metadata database. If you are not using Metadata Server, follow the process for Single Language Datasets instead.
To update the search index you need to run the script BuildMetadataSearchIndex, which is located in the SuperADMIN program data directory. For example, on Windows if you installed to the default location, it will be located at C:\ProgramData\STR\SuperADMIN\MetaData\MetaDataUtilities\BuildMetadataSearchIndex.bat
Before starting, check that you have plenty of available disk space (the exact amount required to store the final index will depend on the size of your dataset catalogue, but can be several gigabytes for a large catalogue). In addition, the indexing process will create some temporary files while it builds the index. These will be cleaned up automatically at the end of the process, but may cause the index to temporarily grow to around double its final size before these files are cleaned up.
If you encounter disk space errors during indexing, you should either increase the size of the disk or use the DESTINATION_FOLDER setting to change the location of the index to a partition that has sufficient disk space.
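The disk space check described above can be sketched in Python before you start a long indexing run. This is a minimal illustration, not part of SuperSTAR: the function names and the 3 GB estimate are hypothetical, and the factor of 2 reflects the statement above that the index may temporarily grow to around double its final size.

```python
import shutil


def required_bytes(final_index_bytes: int, temp_factor: float = 2.0) -> int:
    """Estimate peak disk usage: the final index plus temporary files.

    temp_factor=2.0 is an assumption based on the "around double" guidance.
    """
    return int(final_index_bytes * temp_factor)


def has_enough_space(folder: str, final_index_bytes: int) -> bool:
    """Compare free space on the partition holding `folder` with the estimated peak."""
    free = shutil.disk_usage(folder).free
    return free >= required_bytes(final_index_bytes)


if __name__ == "__main__":
    # Hypothetical estimate: a 3 GB final index, checked against the current drive.
    estimate = 3 * 1024**3
    print(has_enough_space(".", estimate))
```

Pass the directory you intend to use for DESTINATION_FOLDER (or its parent, if the directory does not exist yet) as `folder`.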
Step 1 - Update databases.txt
This is a text file that specifies which SuperSTAR datasets you want to include in the index. It is located in the same directory as the indexing script.
You need to update this file so that it contains a list of all the datasets on your deployment that you want to be indexed.
You should include all your datasets in the index. If you have datasets that not all users have access to, then you can still include these in your index. SuperADMIN will automatically take care of permissions. When a user searches in SuperWEB2, the search results will automatically be filtered so that they only contain results from datasets and fields that the user has permission to access.
You can either update databases.txt manually or use the createdatabaselist command in SuperADMIN.
The file must use the following format (if you use the SuperADMIN command it will generate a list of all of your SuperSTAR datasets in this format):
<dataset_id>|<display_name>|<full_path_to_SXV4>
Where:
Placeholder | Description |
---|---|
<dataset_id> | The ID of the dataset in the SuperSTAR catalogue. |
<display_name> | The dataset display name from the SuperSTAR catalogue. |
<full_path_to_SXV4> | The full path to the .sxv4 file that contains the dataset, but without the .sxv4 file extension. |
For example, the shipped databases.txt file is as follows. This would instruct the batch process to index the sample People and Retail Banking datasets:
people|people|C:\ProgramData\STR\SuperSERVER SA\databases\People
bank|bank|C:\ProgramData\STR\SuperSERVER SA\databases\RetailBanking
When you have finished editing databases.txt, save the file.
It is recommended that you save this file in the same location as the standard shipped file. If for any reason you want to save the file to a different location, you must update the SET DB_FILE_LIST="databases.txt" line in BuildMetadataSearchIndex before running the script, so that it contains the full path to the new location of the databases.txt file.
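Because a malformed line in databases.txt is easy to miss, it can help to check the file before running the indexing script. The following Python sketch is not part of SuperSTAR. It assumes the file is UTF-8 text, that display names do not themselves contain a `|` character, and that the .sxv4 files are readable from the machine where you run the check. The function names are hypothetical.

```python
from pathlib import Path


def parse_entry(line: str) -> tuple[str, str, str]:
    """Split one databases.txt line into (dataset_id, display_name, path)."""
    parts = [part.strip() for part in line.strip().split("|")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected <dataset_id>|<display_name>|<full_path_to_SXV4>: {line!r}"
        )
    dataset_id, display_name, path = parts
    # The format requires the path WITHOUT the .sxv4 extension.
    if path.lower().endswith(".sxv4"):
        raise ValueError(f"path must not include the .sxv4 extension: {line!r}")
    return dataset_id, display_name, path


def check_file(list_path: str) -> list[str]:
    """Return a list of problems; an empty list means none were detected."""
    problems = []
    lines = Path(list_path).read_text(encoding="utf-8").splitlines()
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            _, _, path = parse_entry(line)
        except ValueError as err:
            problems.append(f"line {number}: {err}")
            continue
        if not Path(path + ".sxv4").is_file():
            problems.append(f"line {number}: no file found at {path}.sxv4")
    return problems
```

For example, `check_file("databases.txt")` run from the MetaDataUtilities directory returns an empty list when every line has three fields and every referenced .sxv4 file exists.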
Step 2 - Configure BuildMetadataSearchIndex
Before you can run the indexing script, you need to configure some settings so that it can connect to your external metadata database.
Open BuildMetadataSearchIndex in a text editor. Modify the following lines:
Setting | Make This Change |
---|---|
INDEX_DIR | (Linux only) Set this to the full path of the MetaDataUtilities directory where the indexing script is located. For example: export INDEX_DIR=/home/str/superadmin/Metadata/MetaDataUtilities |
DESTINATION_FOLDER | (Linux only) Set this to the full path to the location where you want to generate the index files. This must match the configuration in SuperADMIN, which is set to use the meta_search_index directory in the SuperADMIN program data directory by default. See the next step for more details on this setting. |
DB_DRIVER_CLASS | Add the details of the JDBC database driver to use to connect to your external database. For example, to use the Microsoft SQL Server JDBC driver, set the driver class as follows: SET DB_DRIVER_CLASS="com.microsoft.sqlserver.jdbc.SQLServerDriver" |
DB_DRIVER_LOCATION | Add the full path to the location of the JDBC driver (jar file) on your system. For example: SET DB_DRIVER_LOCATION="C:\drivers\SQLServer\sqljdbc_4.0\sqljdbc4.jar" |
DB_URL | Add the connection string the script will use to connect to your metadata database. For example, for SQL Server the connection string is similar to the following: SET DB_URL="jdbc:sqlserver://MYSERVER;databaseName=Metadata;user=mydbuser;password=myuserpassword;" Replace the server and user details with the appropriate values for your system. In this example, MYSERVER is the database server, Metadata is the database name, and mydbuser and myuserpassword are the login credentials. |
REPOSITORY | Add the repository ID. This must be the same value that you used when you created the external metadata database. For example: SET REPOSITORY=metadatadbid |
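After editing the script, a quick check that none of the database settings was left empty can save a failed run. The Python sketch below is an illustration only, not a SuperSTAR tool. It assumes the settings appear as simple `SET NAME=value` lines (Windows) or `export NAME=value` lines (Linux), as in the examples above, and the function names are hypothetical.

```python
import re

# Settings from the table above that apply on every platform.
REQUIRED = ("DB_DRIVER_CLASS", "DB_DRIVER_LOCATION", "DB_URL", "REPOSITORY")

# Matches lines such as: SET DB_URL="..."  or  export INDEX_DIR=/path
ASSIGNMENT = re.compile(r"^\s*(?:set|export)\s+(\w+)=(.*)$", re.IGNORECASE)


def read_settings(script_text: str) -> dict[str, str]:
    """Collect NAME=value assignments from the script text, stripping quotes."""
    settings = {}
    for line in script_text.splitlines():
        match = ASSIGNMENT.match(line)
        if match:
            name, value = match.group(1).upper(), match.group(2).strip()
            settings[name] = value.strip('"')
    return settings


def missing_settings(settings: dict[str, str]) -> list[str]:
    """Return the required setting names that are absent or empty."""
    return [name for name in REQUIRED if not settings.get(name)]
```

To use it, read the contents of BuildMetadataSearchIndex into a string, pass it to `read_settings`, and confirm that `missing_settings` returns an empty list. This checks only that values are present, not that the driver, connection string, or repository ID are correct.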
Step 3 - Check the Index Location in BuildMetadataSearchIndex.bat
There is a setting in the indexing script (DESTINATION_FOLDER) that tells it where to generate the index files. There is also a setting in SuperADMIN that determines where it will look for the generated index when a user performs a search (you can check this using the command gc search indexDirectory).
By default, these are both set to the meta_search_index directory in the SuperADMIN program data directory (by default, C:\ProgramData\STR\SuperADMIN\server\meta_search_index on Windows). The predefined index that is supplied with SuperSTAR (which covers the sample Retail Banking and People datasets) is located in this directory.
When you run the script to update the index, the first thing it will do is to remove the existing index. This means with the default settings, search would be unavailable to users until the index is rebuilt (which may take some time, particularly if you have a large number of datasets).
For this reason, you are recommended to take the following steps when updating the index:
- Update the DESTINATION_FOLDER setting in the BuildMetadataSearchIndex.bat script to point to a new location.
- Run the script.
- When the script finishes, update SuperADMIN to use the new location.
This will ensure that there is no downtime of the search index and that users can continue to search (using the old index files) while you are doing the update.
To update the index location and rebuild the index:
- Open BuildMetadataSearchIndex in a text editor.
- Check the DESTINATION_FOLDER setting. By default, this is set as follows:
  set DESTINATION_FOLDER="%SA_PROGRAM_DATA%\server\meta_search_index"
  This indicates that the search index files will be saved to the meta_search_index directory in the SuperADMIN program data directory (by default, C:\ProgramData\STR\SuperADMIN\server\meta_search_index).
  Update the location to a new directory. For example:
  set DESTINATION_FOLDER="%SA_PROGRAM_DATA%\server\meta_search_index_multilingual_july_2014"
  You do not need to create this directory; the index script will automatically create it when it runs. Note that if you are running the script on Linux, the script does not define the variable %SA_PROGRAM_DATA%, so you will need to specify the full path instead.
- Save your changes to the file.
- Run BuildMetadataSearchIndex, and wait for it to finish indexing your datasets.
- Go to SuperADMIN and use the following command to update the index location to your new location:
  > gc search indexDirectory value "meta_search_index_multilingual_july_2014"
- Go to SuperWEB2 and check that search now includes all your datasets and languages.
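Since search keeps using the old index until you change the location in SuperADMIN, it is worth confirming that the script actually produced output before you switch. The following Python sketch is a hypothetical helper, not part of SuperSTAR. It checks only that the new directory exists and is not empty, and does not verify that the index itself is valid.

```python
from pathlib import Path


def index_ready(folder: str) -> bool:
    """Return True if `folder` exists as a directory and contains at least one entry."""
    path = Path(folder)
    return path.is_dir() and any(path.iterdir())
```

For example, call `index_ready` with the full path of your new DESTINATION_FOLDER once the script has finished, and run the gc search indexDirectory command only if it returns True.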