Sparsity - Data Control
The SparsityCheck module prevents the release of tables that contain a high proportion of cells with very low values (0, 1, or 2). It applies to interior cells only; totals are not included. If SparsityCheck is enabled, each cross-tabulation result is checked to verify that the table is not too sparse for release.
If you need to use SparsityCheck in your deployment, please contact Space-Time Research support (support@spacetimeresearch.com) for advice on the appropriate threshold settings for your processing needs.
To configure the module, you need to define the sparsity check thresholds, which must be named ThresholdA and ThresholdB.
The names are case sensitive, and the default values are:
- ThresholdA - 0.25
- ThresholdB - 0.50
The module works as follows:
Assuming that:
- c is the number of interior cells in the table.
- c0 is the number of zero interior cells.
- c1 is the number of interior cells of value 1.
- c2 is the number of interior cells of value 2.
Then the table will not be released if any of the following conditions is true:
- c-c0 = 0 (the table has no non-zero interior cells; this condition is checked first to avoid a divide-by-zero error).
- c1/(c-c0) > ThresholdA (the ratio of cells with value 1, to the total number of cells with non-zero value).
- (c1+c2)/(c-c0) > ThresholdB (the ratio of cells with value 1 or 2, to the total number of cells with non-zero value).
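The three conditions above can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not the module's actual implementation: the function name `is_too_sparse` and its parameters are hypothetical, and the interior cell values are assumed to be supplied as a flat list of counts with totals already excluded.

```python
from typing import Iterable


def is_too_sparse(interior_cells: Iterable[float],
                  threshold_a: float = 0.25,
                  threshold_b: float = 0.50) -> bool:
    """Return True if the table would be withheld under the rules above.

    Hypothetical helper for illustration; defaults mirror the documented
    default values of ThresholdA and ThresholdB.
    """
    cells = list(interior_cells)
    c = len(cells)                          # number of interior cells
    c0 = sum(1 for v in cells if v == 0)    # zero cells
    c1 = sum(1 for v in cells if v == 1)    # cells of value 1
    c2 = sum(1 for v in cells if v == 2)    # cells of value 2

    non_zero = c - c0
    # Empty table: checked first so the divisions below are safe.
    if non_zero == 0:
        return True
    # Ratio of cells with value 1 to non-zero cells.
    if c1 / non_zero > threshold_a:
        return True
    # Ratio of cells with value 1 or 2 to non-zero cells.
    if (c1 + c2) / non_zero > threshold_b:
        return True
    return False


if __name__ == "__main__":
    # 4 non-zero cells, two of them equal to 1: 2/4 = 0.5 > 0.25, so withheld.
    print(is_too_sparse([1, 1, 5, 6, 0]))
    # Same table with ThresholdA = 0.5 and ThresholdB = 0.75: 0.5 is not > 0.5.
    print(is_too_sparse([1, 1, 5, 6, 0], threshold_a=0.5, threshold_b=0.75))
```

Both comparisons are strict, so a ratio exactly equal to a threshold does not block the table in this sketch.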
Apply the Plugin to a Dataset
Log in to SuperADMIN and create a new method:
CODE> method addmethod sparsity-method
Set the FREQ common property to true (recommended; this will configure SuperSERVER to base the calculation on the contribution count rather than the cross tabulation results):
CODE> method sparsity-method common addproperty FREQ "true"
Add the Data Control plugin to the method (the name of the plugin, SparsityCheck, is case sensitive):
CODE> method sparsity-method adddcplugin sparsitycheck SparsityCheck
Set the plugin properties:
CODE> method sparsity-method sparsitycheck addproperty ThresholdA "0.5"
> method sparsity-method sparsitycheck addproperty ThresholdB "0.75"
> method sparsity-method sparsitycheck addproperty Message "Table is too sparse"
> method sparsity-method sparsitycheck addproperty ConfidentialityModule "true"
Assign the method to a dataset (in this example we are assigning the method to a dataset with the ID bank):
CODE> cat bank addmethod sparsity-method
You can review the method details using the command cat <dataset_id> methods details <method_id>:
CODE> cat bank methods details sparsity-method
[ Method : sparsity-method (id:sparsity-method) (type:mandatory) ]
[ Common ]
[ FREQ : true ]
[ DCPlugin : SparsityCheck (id:sparsitycheck) (priority:1) ]
[ ThresholdA : 0.5 ]
[ ThresholdB : 0.75 ]
[ Message : Table is too sparse ]
[ ConfidentialityModule : true ]
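If you need to repeat this setup for several datasets, the command sequence used in this section can be generated from a small template. The following is a minimal sketch only: `build_sparsity_commands` is a hypothetical helper, not part of SuperADMIN, and it only builds the command strings shown in this section so they can be reviewed and pasted into a SuperADMIN session. It does not connect to SuperADMIN or run anything.

```python
from typing import List


def build_sparsity_commands(method_id: str,
                            plugin_id: str,
                            dataset_id: str,
                            threshold_a: str,
                            threshold_b: str,
                            message: str) -> List[str]:
    """Build the SuperADMIN commands from this section as plain strings.

    Hypothetical helper for illustration; the command syntax is copied
    from the steps above and only the IDs and values are substituted.
    """
    return [
        # Create the method.
        f"method addmethod {method_id}",
        # Base the calculation on the contribution count (recommended).
        f'method {method_id} common addproperty FREQ "true"',
        # Add the Data Control plugin; the name SparsityCheck is case sensitive.
        f"method {method_id} adddcplugin {plugin_id} SparsityCheck",
        # Set the plugin properties; the threshold names are case sensitive.
        f'method {method_id} {plugin_id} addproperty ThresholdA "{threshold_a}"',
        f'method {method_id} {plugin_id} addproperty ThresholdB "{threshold_b}"',
        f'method {method_id} {plugin_id} addproperty Message "{message}"',
        f'method {method_id} {plugin_id} addproperty ConfidentialityModule "true"',
        # Assign the method to the dataset.
        f"cat {dataset_id} addmethod {method_id}",
    ]


if __name__ == "__main__":
    commands = build_sparsity_commands(
        "sparsity-method", "sparsitycheck", "bank",
        "0.5", "0.75", "Table is too sparse",
    )
    for command in commands:
        print(command)
```

The thresholds are passed as strings so that the values appear in the commands exactly as typed.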