Set Data Cleansing Actions - SuperCHANNEL
Cleansing actions allow you to automatically clean up some of your source data when building your SXV4 database.
Cleansing actions apply to columns that can only be one of a set of possible values. This includes:
- Classified columns - columns in fact tables that should be one of a set of possible values defined in a linked classification table.
- Foreign key columns - columns in fact tables that should be one of a set of possible values defined in another fact table or in a multi-response fact table.
For example, the Gender column might be linked to a classification table containing the codes M (Male), F (Female), and U (Unknown). What should SuperCHANNEL do if it encounters an empty value for this column in one of the records in the fact table?
By specifying a cleansing action, you control what SuperCHANNEL should do when it encounters a value that does not match one of the possible values.
You can choose from the following cleansing actions:
Cleansing Action | Description |
---|---|
Add To Classification | Add the new value to the classification table. This is the default action for classified columns. This option is only available for classified columns; it cannot be used for foreign key columns to other fact tables or between a fact table column and a multi-response fact table. If you try to set these to use the add to classification action then SuperCHANNEL will automatically revert to skip. |
Bin | Convert the value to a specified "bin" value. The bin value must already exist in the corresponding classification table. |
None | Do not capture exceptions. Use this action for non-SXV4 drivers. |
Skip Row | Skip this record. This can be a useful cleansing action for a column linking two fact tables. For example if you want SuperCHANNEL to skip all Account records that do not have associated Customer records. Use caution when selecting this option: your target may end up with fewer fact records than the source. |
Stop | Stop the build process. This is the default action for foreign key columns to other fact tables. |
Define the Cleansing Action
- Open the Target View and the Target Attributes pane.
- In the Target View, select the fact table column.
In the Target Attributes, select the Action from the Cleansing drop-down list.
If you select the bin action, you must also enter the bin Value.
Hierarchical Classifications
For hierarchical classifications, you can choose the add to classification action but it has no effect. If a fact table value for a hierarchical classification is not found in the classification table, it cannot be inserted because the rest of the links in the hierarchy are not known (the fact table value is always the bottom value in the hierarchy).
During the build, SuperCHANNEL will generate warning messages in the logs similar to the following:
ERROR 15:53:00:104 [main] XMLReader3_0 - Warning - TargetColumn.setCleansingAction: Column 'City': Cannot set cleansing action to ADD for a hierarchical classification reference: defaulting to SKIP.
These messages are informational rather than errors and can be safely ignored. The cleansing action is automatically changed to "skip" and the build continues.