Assessing and Mitigating Disclosure Risk: Essentials for Social Science
Disclosure risk, the risk of data subjects being reidentified in shared data, is of concern to all those involved in collecting, using, and distributing social science data. What constitutes risk? How is risk addressed?
First, this course will address the importance of the public-use file: the file from which much of the utility of a data collection is extracted and which often becomes the historical record of that collection. This introductory segment will also include a historical perspective on risk and on the value and significant uses of secondary data.
Next, the course will cover ways respondent confidentiality can be protected in shared files, including when files are shared publicly. In this segment, how to assess and mitigate disclosure risk will be addressed, and tools developed at ICPSR to assess and document risk will be reviewed. Elements of a disclosure analysis will be discussed, as will disclosure protection (statistical disclosure control, SDC) measures commonly used to create public-use data files. Examples of public-use files created from restricted-use data, steps that can be taken early in the research process to optimize distribution options, and methods of distributing restricted-use data when public-use files cannot be created will also be covered. Examples of disclosure work from ICPSR will be used to illustrate disclosure risk and protection methods.
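As an illustration only, and not one of the ICPSR tools mentioned above, one common element of a disclosure analysis is checking how many records share each combination of indirectly identifying variables (quasi-identifiers), then coarsening values where combinations are rare. The minimal Python sketch below assumes hypothetical field names (`age`, `county`, `income`), a hypothetical threshold `k=3`, and made-up toy records; the helper names `risky_records` and `top_code` are likewise invented for this example.

```python
from collections import Counter


def risky_records(records, quasi_identifiers, k=3):
    """Return indices of records whose quasi-identifier combination
    is shared by fewer than k records in the file."""
    keys = [tuple(rec[q] for q in quasi_identifiers) for rec in records]
    counts = Counter(keys)
    return [i for i, key in enumerate(keys) if counts[key] < k]


def top_code(records, field, cap):
    """Return copies of the records with values of `field` above `cap`
    replaced by `cap` (a simple top-coding step)."""
    return [{**rec, field: min(rec[field], cap)} for rec in records]


# Hypothetical toy data: three respondents have rare, very high ages.
records = [
    {"age": 34, "county": "A", "income": 40},
    {"age": 34, "county": "A", "income": 52},
    {"age": 34, "county": "A", "income": 47},
    {"age": 91, "county": "A", "income": 30},
    {"age": 93, "county": "A", "income": 28},
    {"age": 95, "county": "A", "income": 31},
]

# Assess: which records are unique or nearly unique on age and county?
before = risky_records(records, ["age", "county"], k=3)

# Mitigate: top-code age at 90, then assess again.
after = risky_records(top_code(records, "age", 90), ["age", "county"], k=3)

print(before)  # [3, 4, 5]
print(after)   # []
```

In this toy case, top-coding age removes the rare combinations at the cost of some detail, which is the kind of trade-off between protection and utility that arises when creating a public-use file.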
A third segment will include ways data can be shared when public-use files cannot be created, including via restricted online analysis systems with programmed disclosure controls. Finally, results from the National Survey of Researcher (NSR), which included a national sample of NIH and NSF awardees, will be used to demonstrate the extent to which researchers share data, how they share, and the relationship between data sharing and knowledge and use of disclosure protection methods. A test will be available for participants who want certification of course completion. Participants may bring examples of disclosure risk problems to class for discussion. Links to source documents and resources will be provided.
This course is designed for social scientists, information scientists, and others interested in becoming more conversant with disclosure risk and analysis, including the components of a disclosure analysis.
Fee: Members = $1400; Non-members = $2800