If you publish data, you know how important it is to protect the confidentiality of the individuals it describes. In many countries, including Australia, releasing data that allows individuals to be identified can have serious legal consequences. Inadvertently releasing such data also compromises public trust.
On the other hand, if organisations do not release any data, or release only prescribed views of data, then open data transparency is compromised, and the potential economic and social benefits are lost.
WingArc Australia has developed a series of solutions for statistical disclosure control that will help you maintain the confidentiality of the individuals or organisations in your data, while allowing you to publish as much information as possible.
Approaches to Statistical Disclosure Control
There are two main approaches to controlling the disclosure of confidential data:
- Changing the data before dissemination so that the disclosure risk for the confidential data is reduced while as much of the information content as possible is retained.
- Reducing the information content of the data provided to the external user by concealing sensitive values. Confidential data values are replaced by a symbol or a zero value.
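The two approaches above can be illustrated with a minimal Python sketch. It is not SuperSTAR code: the function names, the rounding base of 3, the threshold of 5, and the `*` marker are all assumptions chosen for illustration. Unbiased random rounding stands in for the first approach (changing the data), and small-cell suppression stands in for the second (concealing sensitive values).

```python
import random


def random_round(value, base=3, rng=random):
    """Randomly round a non-negative integer count to a multiple of `base`.

    The count is rounded up with probability remainder/base, so the
    expected value of the result equals the original count.
    """
    remainder = value % base
    if remainder == 0:
        return value  # already a multiple of the base
    lower = value - remainder
    if rng.random() < remainder / base:
        return lower + base
    return lower


def suppress(value, threshold=5, marker="*"):
    """Replace a small non-zero count with a marker symbol."""
    return marker if 0 < value < threshold else value


counts = [0, 1, 4, 7, 12, 30]      # hypothetical table cells
rng = random.Random(42)            # seeded only to make the sketch repeatable

rounded = [random_round(c, base=3, rng=rng) for c in counts]
suppressed = [suppress(c, threshold=5) for c in counts]

print(rounded)
print(suppressed)
```

In this sketch every rounded cell is a multiple of 3 and lies within 2 of the original count, while the suppressed table hides the cells containing 1 and 4 and leaves the others unchanged.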
The SuperSTAR Approach
Applying confidentiality to tabular data has traditionally been a manual, time-consuming, and error-prone process.
SuperSTAR addresses this problem through its Data Control API. Modules created using the API can automatically modify the tabulation results before they are returned to the end user. While the API is predominantly used to address privacy and confidentiality concerns, Data Control modules can be used for any processing that needs to be done on query results, because they can access and adjust the query both before and after tabulation takes place.
A number of modules are supplied with SuperSTAR and are described in this section. You can also use the API to write your own modules if you have custom processing requirements.
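The idea of modules that act on a query before and after tabulation can be sketched as a simple pipeline. The Python below is purely conceptual and does not use or reproduce the SuperSTAR Data Control API: the `Module` class, its two hook methods, the example modules, and `fake_tabulate` are all hypothetical names invented for this illustration.

```python
class Module:
    """Hypothetical module with a hook on each side of tabulation."""

    def before_tabulation(self, query):
        return query

    def after_tabulation(self, query, table):
        return table


class LimitFields(Module):
    """Adjusts the query: keeps only the first `limit` requested fields."""

    def __init__(self, limit):
        self.limit = limit

    def before_tabulation(self, query):
        adjusted = dict(query)
        adjusted["fields"] = query["fields"][: self.limit]
        return adjusted


class RoundToBase(Module):
    """Adjusts the results: rounds each cell to the nearest multiple of `base`."""

    def __init__(self, base):
        self.base = base

    def after_tabulation(self, query, table):
        half = self.base // 2
        return [((v + half) // self.base) * self.base for v in table]


def fake_tabulate(query):
    """Stand-in for a real tabulation engine (hypothetical counts)."""
    counts = {"age": 7, "sex": 12, "region": 23}
    return [counts[f] for f in query["fields"]]


def run_query(query, tabulate, modules):
    for module in modules:            # hooks that run before tabulation
        query = module.before_tabulation(query)
    table = tabulate(query)
    for module in modules:            # hooks that run on the results
        table = module.after_tabulation(query, table)
    return table


result = run_query(
    {"fields": ["age", "sex", "region"]},
    fake_tabulate,
    [LimitFields(2), RoundToBase(5)],
)
print(result)  # [5, 10]
```

Chaining modules this way means each one stays small and can be combined with others, which is the general pattern the supplied and custom modules follow conceptually.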
SuperSTAR Data Control Modules
This section describes the following modules:
|Module|Description|
|---|---|
|Confidentiality|The Confidentiality Rule module supports random rounding and cell suppression.|
|Perturbation|Perturbation is our most advanced confidentiality solution. It offers consistent, repeatable cell adjustments that avoid revealing confidential information without introducing bias into your results.|
|Record Count|Used in conjunction with other modules, Record Count calculates the number of records that contribute to the results in a table.|
|Relative Standard Error|Allows you to show RSE figures in SuperWEB2 for weighted databases.|
|Scaling Factor and Precision|Allows you to set a scale value and precision setting to be applied to each cell in the table.|
|Sparsity|Prevents the release of tables that contain a high proportion of cells with very low values.|
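Two of the ideas in the table are easy to illustrate in a few lines of Python. The sketch below is not the SuperSTAR implementation: the function names, the "low value" cut-off of 3, and the 50% sparsity limit are assumptions made for the example. The relative standard error uses the standard definition, the standard error expressed as a percentage of the estimate.

```python
def sparsity_ratio(cells, low_value=3):
    """Proportion of cells whose value is at or below `low_value`."""
    if not cells:
        return 0.0
    low_cells = sum(1 for v in cells if v <= low_value)
    return low_cells / len(cells)


def is_too_sparse(cells, low_value=3, max_ratio=0.5):
    """True if the share of low-value cells exceeds `max_ratio`."""
    return sparsity_ratio(cells, low_value) > max_ratio


def relative_standard_error(estimate, standard_error):
    """RSE as a percentage: 100 * standard error / estimate."""
    if estimate == 0:
        raise ValueError("RSE is undefined for an estimate of zero")
    return 100 * standard_error / estimate


dense_table = [12, 40, 7, 19, 2, 33]    # hypothetical cell counts
sparse_table = [0, 1, 0, 2, 15, 1]

print(is_too_sparse(dense_table))        # False: 1 of 6 cells is low
print(is_too_sparse(sparse_table))       # True: 5 of 6 cells are low
print(relative_standard_error(200, 10))  # 5.0
```

A check like `is_too_sparse` would typically run on the whole table before release, so that a sparse table is withheld rather than released with many individually risky cells.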