sxv4drv.conf
This configuration file contains setting for the SXV4 driver. It allows configuration of various cache settings and buffers used by the driver when inserting records into SXV4 target databases, as well as the type of index used, and the level of warnings issued.
The following are some of the available settings. Under most circumstances you should not change these from the defaults.
The location of sxv4drv.conf will depend on where you installed SuperCHANNEL. By default, it is located in C:\ProgramData\STR\SuperCHANNEL\bin.
SXV4 Index Types
An index is a data structure that organizes data in a table so that you can find it again. SXV4 indexes are only important if you have primary keys on a table and you need to be able to search for records. There are two main reasons why you might need to search for records:
- You have two or more related fact tables in your SXV4 schema and you need to be able to search for individual records during cross tabulation.
- You want to be able to use the delete or update commands. To use the delete or update commands:
- The records must be uniquely identifiable.
- There must be a primary key defined for the tables involved.
- There must be an index to allow searching for records within that table.
SXV4 data is stored in insert order, to give the user maximum control over the optimization of SuperCHANNEL database build times and SuperSERVER cross tabulation performance. Choice of index type directly affects the time taken for SuperCHANNEL to build an SXV4 database. The use of sorted data will improve cross tabulation performance in SuperSERVER.
SXV4 databases support two index types:
Skip Lists | A skip list is a sorted index, which is best for sorted data and larger tables. A skip list index minimizes memory usage by storing all data on disk and reading in data as required. Use of sorted data minimises the number of disk reads needed to scan all the rows in the table as data can be read in blocks and cached. It may produce slower performance than a hash table but it scales well for very large fact tables. A skip list performs well when the table is too large to fit into system memory and is read from disk as it is processed. Skip lists are the default index type. |
---|---|
Hash Tables | A hash table is an unsorted index, which is best for storing unsorted data and smaller tables. A hash table index minimises disk writes by keeping the entire lookup table in memory, meaning that searching data in any order is fast and disk reads are only done when you find the data you are looking for. Hash tables perform well when all of the data can be stored in system memory. Hash tables may perform faster than skip lists up to a point but will eventually become impractical as the look up table becomes too large to load into main memory. |
Performance
Changing the type of index can improve performance; the most appropriate setting will depend on your table size and whether or not the data is sorted. The recommended settings are as follows:
| Table Size: Small | Table Size: Large |
---|---|---|
Data: Unsorted | Hash Table | Hash Table or Skip List |
Data: Sorted | Hash Table or Skip List | Skip List |
Configure SuperCHANNEL
To configure SuperCHANNEL to use Skip Lists, add the following parameter to sxv4drv.conf:
CTAC_DB_INDEXTYPE SORTED
To configure SuperCHANNEL to use Hash Tables, add the following parameter to sxv4drv.conf:
CTAC_DB_INDEXTYPE UNSORTED
Cache Size
Use the CTAC_DB_MAXCACHE
parameter to set the caching size. This is defined in cache blocks (16k row long blocks of the table; the physical size of the block depends on the number of columns and the data types chosen for each column).
To set the cache size, add the following parameter to sxv4drv.conf:
CTAC_DB_MAXCACHE <MAX_SIZE>
Replace <MAX_SIZE>
with a value between 0 and 2000.
The default is 2000. This will be used if the parameter is not present in sxv4drv.conf.
Space-Time Research recommends using 2000 for INSERT as this is the maximum limit, and 30 for UPDATE as testing has shown this to be optimal.
Cache Size and Wide Tables
Tuning the CTAC_DB_MAXCACHE
parameter can help with processing wide tables.
If the table is moderately wide then tuning CTAC_DB_MAXCACHE
will reduce memory usage and improve channelling performance. If the table is extremely wide (for example, more than 1000 columns), Space-Time Research recommends that you restructure your data into a cascading fact table chain, for best channelling and cross-tabulation performance.
Please contact support@spacetimeresearch.com for advice if you need to restructure an extremely wide table into a chain of tables.
Data Buffering
The GATHERED_COLUMN_BUFFER_SIZE
parameter controls how much data is buffered before being written to disk.
You can use the suffix "k" or "m" to denotate kilobytes or megabytes. For example, the following setting configures SuperCHANNEL to buffer 2 MB before writing to disk:
GATHERED_COLUMN_BUFFER_SIZE 2m
The default setting is 256 KB.
Maximum String Buffer Size
Use the MAX_STRING_CONVERSION_BUFFER_LENGTH parameter to configure the maximum buffer size available for an individual column string value during channelling. The default setting is 16 KB, so if the size of the string column is larger than this then you should adjust the buffer size accordingly.
You can use the suffix "k" or "m" to denotate kilobytes or megabytes. For example, the following setting sets the maximum buffer size to 32 KB:
MAX_STRING_CONVERSION_BUFFER_LENGTH 32k