Configure Discrete Perturbation
Prerequisites
To use perturbation, you must have:
- R Keys in the unit records.
- A perturbation table (PTable) file for each dataset. This file is in CSV format and has the extension .pert.
Module Properties
The Perturbation module has the following properties you can configure.
Property | Description |
---|---|
RKEY | Set this to true to use the R Keys in the unit records. |
FREQ | Set this to true to perturb cell values based on the contribution count rather than the cross-tabulation (cell) value. |
SmallN | An integer to be used in deriving the lookup values for the perturbation table. Its value can be 10 or below. The default value is 10. |
RULESET | Use this option to perturb other results:
For example:
|
PTableSize | The size of the perturbation table. The default value is 30. |
BigN | An integer to be used as the modulo base when adding R Keys. Must not exceed 2^32 (4294967296). The default value is 4294967296. |
ConfidentialityModule | Set this to |
Message | A message to be displayed to users in the client. |
PTable | The location of the perturbation file for this dataset. If you do not set PTable, SuperSERVER expects the file to be in the same location as the SXV4 file, with the extension .pert instead of .sxv4. For example, if the SXV4 file is C:\ProgramData\STR\SuperSERVER SA\databases\RetailBanking.sxv4, then the perturbation file is expected at C:\ProgramData\STR\SuperSERVER SA\databases\RetailBanking.pert. To use a different location, set PTable to the location of the .pert file. You can use either an absolute path or a path relative to the SuperSERVER program data directory (C:\ProgramData\STR\SuperSERVER SA if you installed to the default location). Escape each backslash in the path with an additional backslash; forward slashes can also be used and do not need to be escaped. For example:
CODE
If you modify the perturbation file in any way, you must restart SuperSERVER for the change to take effect. For performance reasons, SuperSERVER caches the perturbation file so that it does not have to reload and parse it on every tabulation. |
PropagateZeroes | Whether to propagate zeros across all levels (fact tables) in a given table. Without this setting, perturbation is not coordinated across levels, so there is a risk that an attacker could exploit the lack of coordination to determine that a zero estimate was nonzero before perturbation was applied. When this setting is enabled, the perturbation method first perturbs all levels as usual. Then, if the estimate for any level in the table is zero, it sets the corresponding estimates for all other levels in the table to zero. This setting also coordinates perturbation with measures: if a fact table count is perturbed to 0, the measures for that fact table are also perturbed to 0. To apply zero propagation, use the following command:
CODE
The available settings are:
For example:
CODE
|
PropagateZeroesThreshold | The propagation threshold. Use the threshold to control whether a cell can be set to zero by zero propagation from a related level/record count:
CODE
If the record count of a cell is less than or equal to this threshold, then it can be set to zero by zero propagation. For example, the following command ensures that cells with record counts of 5 or less can be set to zero:
CODE
|
QUANTILEVALIDATION, QUANTILEPTABLE, QUANTILECONFIG | The locations of the configuration files for quantile perturbation. By default, these files should be in the same location as the SXV4 file, but you can use these properties to set an alternative location. See Quantiles and Ranges - Perturbation for more details. |
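The zero propagation behaviour described for PropagateZeroes and PropagateZeroesThreshold can be illustrated with a small Python sketch. This is not SuperSERVER's implementation. The LevelCell type, the propagate_zeroes function, and the level names are hypothetical. The sketch also assumes that the threshold is compared against the unperturbed record count of the cell at the level being zeroed.

```python
from dataclasses import dataclass


@dataclass
class LevelCell:
    """One table cell as seen from one level (fact table). Hypothetical type."""
    record_count: int  # unperturbed number of contributing records (assumption)
    perturbed: int     # estimate after ordinary perturbation


def propagate_zeroes(cell_by_level, threshold):
    """Return the estimate per level after zero propagation.

    cell_by_level maps a level name to the LevelCell for the same table cell.
    If any level was perturbed to zero, every level whose record count is
    less than or equal to the threshold is also set to zero.
    """
    any_zero = any(c.perturbed == 0 for c in cell_by_level.values())
    result = {}
    for level, c in cell_by_level.items():
        if any_zero and c.record_count <= threshold:
            result[level] = 0
        else:
            result[level] = c.perturbed
    return result


# Illustrative values only: the Customer level was perturbed to zero.
cell = {
    "Customer": LevelCell(record_count=2, perturbed=0),
    "Account": LevelCell(record_count=4, perturbed=5),
    "Transaction": LevelCell(record_count=40, perturbed=38),
}
print(propagate_zeroes(cell, threshold=5))
# {'Customer': 0, 'Account': 0, 'Transaction': 38}
```

With a threshold of 5, the Account estimate is set to zero because its record count is 4, while the Transaction estimate is left alone because its record count of 40 exceeds the threshold.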
Apply the Plugin
Log in to SuperADMIN and create a new method:
> method addmethod perturbation_method
This example sets the ID of the new method to perturbation_method. This ID is used in all the following examples, although you can replace it with your preferred ID.
Add the Perturbation Data Control plugin to the method:
> method perturbation_method adddcplugin perturbation Perturbation
This example sets the ID of the plugin within this method to perturbation. You can replace this with your preferred ID. The Perturbation at the end of this command is the library name for the perturbation module. This is case sensitive and must be specified exactly as shown here.
Set the plugin properties:
> method perturbation_method perturbation addproperty RKEY "true"
> method perturbation_method perturbation addproperty FREQ "true"
> method perturbation_method perturbation addproperty "SmallN" "10"
> method perturbation_method perturbation addproperty "PTableSize" "30"
> method perturbation_method perturbation addproperty "BigN" "4294967296"
> method perturbation_method perturbation addproperty ConfidentialityModule "true"
> method perturbation_method perturbation addproperty Message "Data has been perturbed"
Assign the method to a dataset (in this example, the method is assigned to a dataset with the ID bank):
> cat bank addmethod perturbation_method
You can review the method details using the command cat <dataset_id> methods details <method_id>:
> cat bank methods details perturbation_method
[ Method : perturbation_method (id:perturbation_method) (type:mandatory) ]
[ Common ]
[ DCPlugin : Perturbation (id:perturbation) (priority:1) ]
[ RKEY : true ]
[ FREQ : true ]
[ SmallN : 10 ]
[ PTableSize : 30 ]
[ BigN : 4294967296 ]
[ ConfidentialityModule : true ]
[ Message : Data has been perturbed ]
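The PTable row in the Module Properties table describes a default file location and a backslash-escaping rule. As a minimal sketch, the following Python helpers derive the default .pert path from an SXV4 path and build an addproperty command for a custom location. The helper names are hypothetical, and it is an assumption that PTable is set with the same addproperty form used for the other properties in this section.

```python
from pathlib import PureWindowsPath


def default_ptable_path(sxv4_path):
    """Default location: same folder and file name as the SXV4 file, with .pert."""
    return str(PureWindowsPath(sxv4_path).with_suffix(".pert"))


def escape_backslashes(path):
    """Escape each backslash with an additional backslash."""
    return path.replace("\\", "\\\\")


def ptable_command(method_id, plugin_id, pert_path):
    """Build an addproperty command for PTable (command form is an assumption)."""
    escaped = escape_backslashes(pert_path)
    return f'method {method_id} {plugin_id} addproperty PTable "{escaped}"'


sxv4 = r"C:\ProgramData\STR\SuperSERVER SA\databases\RetailBanking.sxv4"
pert = default_ptable_path(sxv4)
print(pert)
print(ptable_command("perturbation_method", "perturbation", pert))
```

The second line printed is a command string in which every backslash of the path is doubled. A path written with forward slashes passes through escape_backslashes unchanged.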
Perturbation with Weighted Datasets
If you have weighted datasets, then you must apply an additional data control module, Average_cellwgt, to your perturbation methods. This module effectively scales up the perturbed amount to account for the weighting.
When using weighted datasets:
- The average cell weight module must be added to the method after the perturbation module, as it uses the result of the perturbation as part of its calculation.
- The FREQ property must be set to true.
The following is a complete example of perturbation with weighted datasets:
method addmethod weighted_perturbation_example
method weighted_perturbation_example adddcplugin weighted_perturbation Perturbation
method weighted_perturbation_example weighted_perturbation addproperty RKEY "true"
method weighted_perturbation_example weighted_perturbation addproperty "SmallN" "10"
method weighted_perturbation_example weighted_perturbation addproperty "PTableSize" "30"
method weighted_perturbation_example weighted_perturbation addproperty "BigN" "4294967296"
method weighted_perturbation_example weighted_perturbation addproperty ConfidentialityModule "true"
method weighted_perturbation_example weighted_perturbation addproperty Message "Data has been perturbed"
method weighted_perturbation_example adddcplugin Average_cellwgt Average_cellwgt
method weighted_perturbation_example common addproperty FREQ "true"
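The two requirements for weighted datasets (plugin order and FREQ) are easy to get wrong when scripting method definitions. The following Python sketch shows one way to check a planned configuration before the commands are entered. The check_weighted_method function is hypothetical and is not part of SuperADMIN. It only inspects the list of library names and the FREQ value passed to it.

```python
def check_weighted_method(plugin_order, freq):
    """Return a list of problems with a planned weighted perturbation method.

    plugin_order: library names in the order they are added to the method.
    freq: the planned value of the FREQ property, as a string.
    """
    problems = []
    if "Perturbation" not in plugin_order:
        problems.append("Perturbation plugin is missing")
    if "Average_cellwgt" not in plugin_order:
        problems.append("Average_cellwgt plugin is missing")
    # Only compare positions when both plugins are present.
    if not problems and (
        plugin_order.index("Average_cellwgt") < plugin_order.index("Perturbation")
    ):
        problems.append("Average_cellwgt must be added after Perturbation")
    if freq != "true":
        problems.append("FREQ must be set to true")
    return problems


# Matches the order used in the example commands above.
print(check_weighted_method(["Perturbation", "Average_cellwgt"], "true"))
# []
```

An empty list means that the planned configuration satisfies both requirements.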