Confidentiality Plugin Overview
The supplied Data Control confidentiality plugin offers two techniques for applying confidentiality to cell values:
- Rounding - Values that meet certain criteria (typically very small values) are randomly rounded up or down to a different value.
- Cell Suppression - Cells that meet certain criteria are suppressed entirely.
Rounding Rules
Rounding rules adjust values up or down to a defined base.
Rounding is only intended for integer counts. If random rounding is applied to cells with real values, the results will always be integers.
The sample plugin includes the following rounding rules:
Rule | Description | Effect of Applying Rule |
---|---|---|
0-3 rounding | Randomly changes any values of 1 and 2 in the table to either 0 or 3. | |
Random rounding | All values are rounded up or down to base 3 after tabulation. | |
1-4 rounding | Randomly changes any values in the table between 1 and 4 inclusive to a random number between 1 and 4, with equal probability. | |
Graduated rounding | Similar to random rounding in that all values are rounded. However, the amount by which a value is rounded depends on the size of the value. | |
Rounding output is determined only by the input data, so the same randomly rounded results will be produced every time if the same table data and dimensions are used, and each recode has the same number of categories. Variations in results may occur when fields are added to tables in a different order.
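The following Python sketch illustrates 0-3 rounding and random rounding to base 3, with repeatable results. It is not the plugin's implementation. The helper names, the `cell_key` parameter, the hash-based seeding, and the rounding probabilities (round up with probability remainder/base) are all assumptions chosen for illustration.

```python
import hashlib
import random


def _rng_for(cell_key: str, value: int) -> random.Random:
    """Build a repeatable generator from the cell's identity and value.

    Assumption: seeding from the input stands in for "output is determined
    only by the input data". The plugin's real scheme is not documented here.
    """
    digest = hashlib.sha256(f"{cell_key}:{value}".encode()).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def round_0_3(value: int, cell_key: str = "") -> int:
    """0-3 rounding: values of 1 and 2 become 0 or 3; others are unchanged."""
    if value not in (1, 2):
        return value
    # Assumption: round up with probability value/3.
    return 3 if _rng_for(cell_key, value).random() < value / 3 else 0


def random_round(value: int, base: int = 3, cell_key: str = "") -> int:
    """Random rounding: move every value to an adjacent multiple of base."""
    remainder = value % base
    if remainder == 0:
        return value  # already a multiple of the base
    lower = value - remainder
    # Assumption: round up with probability remainder/base.
    go_up = _rng_for(cell_key, value).random() < remainder / base
    return lower + base if go_up else lower
```

Because the generator is seeded from the cell key and value, calling either function twice with the same arguments returns the same rounded result.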
Rounding and Totals
When using the Data Control plugin for rounding, any rounding of total values as a result of these rules is only applied after the total has been calculated from the unrounded microdata. This ensures that totals do not suffer from accumulated rounding errors.
For example, when the 0-3 rounding rule is in use, a total will differ from the unrounded total by at most 2.
Some of these rules can also be run directly in the SuperCROSS client. When the rules are applied client side in SuperCROSS, the totals are calculated from the rounded values, and may therefore suffer from accumulated rounding errors. If it is important to eliminate accumulated rounding errors in totals, use the Data Control plugin for rounding rather than client side rules.
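The difference between the two approaches can be sketched in Python. This is an illustration only: the cell values are hypothetical, the function names are invented for this example, and the equal chance of rounding to 0 or 3 is an assumption.

```python
import random


def round_0_3(value, rng):
    """0-3 rounding: 1 and 2 become 0 or 3 (equal chance is an assumption)."""
    if value in (1, 2):
        return rng.choice((0, 3))
    return value


def total_from_unrounded(cells, rng):
    """Plugin-style total: sum the unrounded cells, then round the total."""
    return round_0_3(sum(cells), rng)


def total_from_rounded(cells, rng):
    """Client-side-style total: round each cell, then sum the rounded cells."""
    return sum(round_0_3(c, rng) for c in cells)


cells = [1, 2, 1, 2, 10]  # hypothetical cell counts; the true total is 16
rng = random.Random(0)

plugin_total = total_from_unrounded(cells, rng)  # stays 16: only 1 and 2 change
client_total = total_from_rounded(cells, rng)    # anywhere from 10 to 22
```

In this sketch the plugin-style total is never more than 2 away from the true total, whereas the client-side total can drift by up to 2 for every small cell it sums.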
Rounding and Client Side Computations
Rounding from the Data Control plugin is always carried out first, before any client side computations that you might have configured (such as sum derivations, classification derivations, and percentages). The Data Control plugin provides the client with the already rounded data, and the client uses this to perform its calculations.
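As a small illustration of this ordering, the sketch below computes a client side percentage from values that have already been rounded. The numbers are hypothetical (unrounded cells of 2, 4 and 11 with a total of 17, delivered as 3, 3 and 12 with a total of 18 under random rounding to base 3), and the `percentages` function is invented for this example rather than taken from the client.

```python
def percentages(rounded_cells, rounded_total):
    """Percentage of total, computed from values already rounded by the plugin."""
    if rounded_total == 0:
        return [0.0 for _ in rounded_cells]  # avoid dividing by zero
    return [100.0 * c / rounded_total for c in rounded_cells]


# Hypothetical values as the client would receive them after rounding.
rounded_cells = [3, 3, 12]
rounded_total = 18

shares = percentages(rounded_cells, rounded_total)
# The first cell is reported as 3/18 of the total, not the unrounded 2/17.
```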
Examples
The following examples show the effect of the rounding rules on the Retail Banking sample database.
SuperCROSS
No rounding: | 0-3 Rounding: |
Random Rounding: | 1-4 Rounding: |
Graduated Rounding: |
SuperWEB2
No rounding: | 0-3 Rounding: |
Random Rounding: | 1-4 Rounding: |
Graduated Rounding: |
Cell Suppression Rules
Cell suppression rules allow you to avoid the risk of statistical disclosure. You define the rules for when a cell value should be suppressed, and any cell meeting those criteria will automatically have its value replaced with a confidentiality string.
By default the replacement string for confidential cells is ..C but you can change this to something else if you prefer.
From SuperSTAR version 8.0.4.10 onwards you can also choose to replace the cell value with 0 instead of using a confidentiality string. See Configure the Confidentiality Plugin for more details on this configuration adjustment.
The Data Control plugin includes the following rules:
Rule | Cell Values Will Be Concealed If... | For Example... |
---|---|---|
Top Contributors | The biggest n contributing values make up more than a certain percentage of the cell value. | If you set n to 3 and the percentage to 75, then cell values will be concealed if the biggest 3 contributors make up 75% or more of the total cell value. In this example, any cell with 3 or fewer contributors is always concealed, because those contributors are themselves the "biggest 3" and therefore make up 100% of the cell's value. |
Frequency | The number of contributors to the cell value is less than or equal to the specified frequency. | If you set the frequency to 30, any cell with 30 or fewer contributors will be concealed. |
Threshold | The value is at or below a certain threshold. | If you set the threshold to 300, any cell values of 300 or below will be concealed. |
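The three rules in the table can be sketched as simple predicates. This Python sketch is illustrative only and is not the plugin's code: the function names are hypothetical, the comparisons follow the worked examples in the table (75% or more, 30 or fewer, 300 or below), and the handling of a zero-valued cell is an assumption.

```python
def top_contributors_conceals(contributions, n, percentage):
    """True if the biggest n contributions reach the given share of the cell."""
    total = sum(contributions)
    if total == 0:
        return False  # assumption: an empty or zero cell is left to other rules
    top = sum(sorted(contributions, reverse=True)[:n])
    return 100.0 * top / total >= percentage


def frequency_conceals(contributor_count, frequency):
    """True if the cell has no more contributors than the set frequency."""
    return contributor_count <= frequency


def threshold_conceals(value, threshold):
    """True if the cell value is at or below the threshold."""
    return value <= threshold


def display(value, concealed, replacement="..C"):
    """Return the confidentiality string for concealed cells, else the value."""
    return replacement if concealed else value


# Hypothetical cell with four contributors; the top 3 make up 90% of it.
cell = [50, 30, 10, 10]
shown = display(sum(cell), top_contributors_conceals(cell, n=3, percentage=75))
```

With these hypothetical contributions the cell is concealed, so `shown` holds the default replacement string rather than the cell total.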
Threshold Rule Example
SuperCROSS, no disclosure control: | SuperWEB2, no disclosure control: |
SuperCROSS table concealing cells with values of 500 or below: | SuperWEB2 table concealing cells with values of 500 or below: |