National Addiction & HIV Data Archive Program

What kind of documentation do you provide, and in what formats?

A study's data collection may include several types of documentation files:

  • Codebook: Information on the structure, contents, and layout of a data file. The codebook may also contain information on study design and methodology.

  • Dictionary file: Information on column locations and labeling of variables.

  • Data collection instrument: Original survey instrument or questionnaire.

  • Data map: Similar to a dictionary file.

  • Errata file: Errors noted for a particular collection, usually supplied by the principal investigator.

  • Frequency file: Frequency of response or descriptive statistics for selected variables in a collection.

  • Crosstabulation file: Crosstabulations for some or all variables in a collection.

  • User Guide: More detailed information about a particular collection, often provided by the principal investigator.

  • Manual: Instructions prepared by the principal investigator on some aspect of the data collection.

  • Appendices: Additional documentation.

  • Reports: Description of findings or results based on analysis of a dataset, prepared by the principal investigator.

  • Record layout file: Similar to a dictionary file.

  • Tables/Crosstables: Similar to frequency files but presented in tabular format.
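
To illustrate how the column locations in a dictionary file (or a data map or record layout file) are typically used, here is a minimal Python sketch that slices one fixed-width record into named variables. The variable names, column positions, labels, and sample record are hypothetical and are not taken from any actual collection; the sketch also assumes columns are numbered from 1 and that the end column is inclusive, which you should confirm against the documentation for the study you are using.

```python
# Hypothetical dictionary entries: (variable name, start column, end column, label).
# Assumption: columns are 1-indexed and the end column is inclusive.
DICTIONARY = [
    ("CASEID", 1, 4, "Case identifier"),
    ("AGE", 5, 6, "Age of respondent"),
    ("SEX", 7, 7, "Sex of respondent"),
]


def parse_record(line, dictionary):
    """Slice one fixed-width record into a {variable name: raw string} mapping."""
    record = {}
    for name, start, end, _label in dictionary:
        # Convert the 1-indexed inclusive range to a Python slice.
        record[name] = line[start - 1:end].strip()
    return record


# Hypothetical seven-character record laid out as described above.
row = parse_record("0001342", DICTIONARY)
print(row)
```

Values are returned as raw strings; whether a code such as "2" denotes a category or a number depends on the codebook for the collection.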

Our standard format for documentation is Portable Document Format (PDF), and we are moving toward compliance with the PDF/A standard. PDF was developed by Adobe Systems Incorporated, and PDF files can be opened with PDF reader software such as Adobe Acrobat Reader.

Some older studies may have ASCII or Word-processed documentation, but those files are being converted to PDF.