enabling social science research over time

Glossary

ICPSR has identified a set of terms that are used in discussing and documenting our digital preservation program. We have developed definitions for these terms that often reflect both the general use of the term and the use of the term within the context of ICPSR. Terms within online versions of ICPSR digital preservation documents are linked to definitions within the Glossary.

Access

Definition:The act of making information available. Digital preservation is a requirement for providing long-term access to digital content. Access is "the OAIS entity that contains the services and functions which make the archival information holdings and related services visible to Consumers." OAIS requires that an archive be able to find and deliver digital content to authorized users; delivery may be to an individual or to an access delivery system.
At ICPSR:The Collection Delivery Unit is responsible for providing access services and the digital preservation function preserves the capability to regenerate the DIPs (Dissemination Information Packages) as needed over time.
Source:ICPSR; OAIS: CCSDS 650.0-B-1, p. 1-7.
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:general digital content, lifecycle management, access management, OAIS, digital preservation

Administration

Definition:"The OAIS entity that contains the services and functions needed to control the operation of the other OAIS functional entities on a day-to-day basis." The OAIS Reference Model identifies the policies and other documents that are the responsibility of Administration and are required by an OAIS.
At ICPSR:The Administration function is currently provided by Computing and Network Services, which oversees the work of the Data Library staff, in conjunction with the Digital Preservation Officer, who develops requisite policies and guidance for digital preservation operations. Digital preservation policy development at ICPSR is informed by OAIS.
Source:OAIS: CCSDS 650.0-B-1, p. 1-7.
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:digital preservation, preservation planning, OAIS

Archival Information Collection (AIC)

Definition:"An Archival Information Package whose Content Information is an aggregation of other Archival Information Packages."
At ICPSR:
Source:OAIS: CCSDS 650.0-B-1, p. 1-7.
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:digital preservation, archival storage, OAIS

Archival Information Package (AIP)

Definition:"An Information Package, consisting of the Content Information and the associated Preservation Description Information (PDI), which is preserved within an OAIS."
At ICPSR:The AIP consists of the original files deposited, processed versions of data files and documentation, normalized files, and associated metadata.
Source:OAIS: CCSDS 650.0-B-1, p. 1-7.
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:digital preservation, archival storage, OAIS, preservation planning

Archival Storage

Definition:"The OAIS entity that contains the services and functions used for the storage and retrieval of Archival Information Packages."
At ICPSR:The Archival Storage function provides onsite and offsite redundancy through online copies (and a tape copy as extra backup) of ICPSR's digital content, both the archival copies and the access copies. ICPSR preserves the ability to regenerate the Dissemination Information Package (DIP); we do not preserve the software-dependent files (e.g., SAS, SPSS, Stata) that are distributed. Archival storage contributes to ensuring business continuity for ICPSR and is a component of the disaster planning at ICPSR.
Source:OAIS: CCSDS 650.0-B-1, p. 1-8.
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:digital preservation, archival storage, OAIS, disaster planning, lifecycle management

Archive

Definition:"An organization that intends to preserve information for access and use by a Designated Community."
At ICPSR:The whole of ICPSR functions as an archive that preserves social science research data for the social science research community (and other users).
Source:OAIS: CCSDS 650.0-B-1, p. 1-8.
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:digital preservation, archival storage, OAIS

ASCII

Definition:A character encoding scheme used by many computers. ASCII stands for American Standard Code for Information Interchange. ASCII is considered a preservation format for datasets because it is nonproprietary.
At ICPSR:We also use "ASCII" to refer to social science data conveyed in a raw ASCII text format as opposed to a proprietary statistical package format.
Source:ICPSR
Date added:April 6, 2007
Last updated:June 18, 2007
Keywords:general digital content, digital preservation
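
As an illustration of what "raw ASCII text" means in practice, the following is a minimal sketch that checks whether a file's bytes all fall in the 7-bit ASCII range. The function names and the sample record are hypothetical and are not part of any ICPSR tool.

```python
def is_plain_ascii(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range (0-127)."""
    return all(byte < 128 for byte in data)


def first_non_ascii_offset(data: bytes) -> int:
    """Return the offset of the first non-ASCII byte, or -1 if there is none."""
    for offset, byte in enumerate(data):
        if byte >= 128:
            return offset
    return -1


# Hypothetical fixed-width data record: respondent ID, age, income.
record = b"0001 34 052000\n"
print(is_plain_ascii(record))  # True

# An accented character encoded as UTF-8 is outside the ASCII range.
print(first_non_ascii_offset("café".encode("utf-8")))  # 3
```

A check like this only confirms the character encoding; it says nothing about whether the layout of the data is documented well enough to be usable.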

Born-digital

Definition:A descriptor for information that is created in digital form, as opposed to digitized from analog sources.
At ICPSR: The majority of deposits consist of born-digital content. There are some examples of hard copy and analog materials that might be made digital (digitized) by ICPSR. For example, the Data-PASS project is identifying older social science data that include documentation and other components in hard copy format, and there are some deposits that contain video in VHS format.
Source:ICPSR
Date added:April 6, 2007
Last updated:June 18, 2007
Keywords:general digital content, digital preservation

Business Continuity

Definition:"Describes the processes and procedures an organization puts in place to ensure that essential functions can continue during and after a disaster." [1] A note regarding preservation: "Backups vs Preservation: Disaster recovery strategies and backup systems are not sufficient to ensure survival and access to authentic digital resources over time. A backup is a short-term data recovery solution following loss or corruption and is fundamentally different to an electronic preservation archive." [2]
At ICPSR:We are addressing business continuity requirements by ensuring redundant backup of the preservation and access copies of ICPSR's digital content, by the establishment of a warm backup for the ICPSR Web server, by identifying our core functions for business continuity, by assessing the current backup and storage measures for our institutional records that support core functions to diminish the risk of loss in most emergency situations, by conducting a self-assessment of our information security program to comply with relevant standards, and by developing the requisite policies and procedures for business continuity.
Source:There are many definitions for business continuity. [1] SearchStorage.com. [2] "Continued access to authentic digital assets," JISC Digital Preservation Paper, Nov 26, 2006.
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:disaster planning, digital preservation

Canonical Formats

Definition:"In information technology, canonicalization is the process of making something [conform] with some specification... and is in an approved format. Canonicalization may sometimes mean generating canonical data from noncanonical data."[1]
At ICPSR:Canonical formats are widely supported and considered to be optimal for long-term preservation.
Source:[1] Definition of canonical format in the IT encyclopedia: Whatis?com. See also Clifford Lynch, "Canonicalization: A Fundamental Tool to Facilitate Preservation and Management of Digital Information," D-Lib Magazine, September 1999, Volume 5, Number 9.
Date added:April 6, 2007
Last updated:June 18, 2007
Keywords:digital preservation, digital content processing, general digital content

Codec

Definition:"A codec is the means by which sound and video files are compressed for storage and transmission purposes. There are various forms of compression: 'lossy' and 'lossless', but most codecs perform lossy compression because of the much larger data reduction ratios that occur [with lossy compression]. Most codecs are software, although in some areas codecs are hardware components of image and sound systems. Codecs are necessary for playback, since they uncompress [or decompress] the moving image and sound files and allow them to be rendered."
At ICPSR:ICPSR will have to specify which codecs it will use in creating digital files of video materials. Preferred codecs can change as frequently as preferred file formats, so it will be important to conduct current research to know which codecs are most appropriate.
Source:"Moving Images and Sound Archiving Study" (Word 1.4M)
Date added:July 18, 2008
Last updated:July 18, 2008
Keywords:digital preservation, digital video, digital content processing

Common Services

Definition:"The supporting services such as inter-process communication, name services, temporary storage allocation, exception handling, security, and directory services necessary to support the OAIS."
At ICPSR:Computing and Network Services (CNS) provides or acquires requisite services to provide Common Services to meet the requirements of digital preservation.
Source:OAIS: CCSDS 650.0-B-1, p. 1-8
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:digital preservation, OAIS

Compression ratio or reduction ratio

Definition:The ratio of the quantity of original data to the quantity of data after compression, used to describe how much a compression method reduces the size of a file.
At ICPSR:
Source:zenith.com, howstuffworks.com
Date added:July 18, 2008
Last updated:July 18, 2008
Keywords:digital preservation, digital video
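
The arithmetic behind this term can be shown with a small sketch. It assumes the convention in the definition above (original size divided by compressed size) and uses Python's standard zlib module, a lossless compressor, purely as a stand-in; video codecs work differently and are usually lossy. The sample data is made up.

```python
import zlib


def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Return original size divided by compressed size (e.g. 4.0 means 4:1)."""
    if compressed_size <= 0:
        raise ValueError("compressed_size must be positive")
    return original_size / compressed_size


# Hypothetical, highly repetitive data, which compresses well.
original = b"1 0 0 1 0 0 1 0 " * 1000
compressed = zlib.compress(original)

ratio = compression_ratio(len(original), len(compressed))
print(f"{len(original)} bytes -> {len(compressed)} bytes, ratio {ratio:.1f}:1")

# Lossless compression: decompressing returns exactly the original bytes.
assert zlib.decompress(compressed) == original
```

Some sources state the ratio the other way around (compressed size over original size), so it is worth confirming which convention a given document uses.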

Consumer

Definition:"The role played by those persons, or client systems, who interact with OAIS services to find preserved information of interest and to access that information in detail. This can include other OAISs, as well as internal OAIS persons or systems."
At ICPSR:Member institutions and other users are the Consumers of ICPSR digital assets.
Source:OAIS: CCSDS 650.0-B-1, p. 1-8
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:digital preservation, OAIS, access management

Data

Definition:For social science, data is generally numeric files originating from social research methodologies or administrative records, from which statistics are produced.
At ICPSR:At ICPSR, the majority of digital content matches this definition of data. ICPSR's collections are expanding to include audio, video, geospatial, Web-based and other digital content that pertains to social science research.
Source:ICPSR
Date added:April 6, 2007
Last updated:June 18, 2007
Keywords:general digital content, digital preservation

Data Management

Definition:"The OAIS entity that contains the services and functions for populating, maintaining, and accessing a wide variety of information. Some examples of this information are catalogs and inventories on what may be retrieved from Archival Storage, processing algorithms that may be run on retrieved data, Consumer access statistics, Consumer billing, Event Based Orders, security controls, and OAIS schedules, policies, and procedures."
At ICPSR:The pipeline incorporates a diagram and visualization of the Data Management function of OAIS for ICPSR. The increasingly comprehensive Oracle system provides Data Management services and content defined in OAIS, including information from the Deposit Form, the Study Tracking System, the metadata record, the current data library system, the growing preservation system, the turnover system, and other components of the lifecycle as they are automated. The process improvement initiative is reviewing and revising the lifecycle process at ICPSR.
Source:OAIS: CCSDS 650.0-B-1, p. 1-9.
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:digital preservation, OAIS, preservation metadata, lifecycle management

Data Processing

Definition:Within the field of information technology, data processing typically means the processing of information by machines.
At ICPSR:Data processing is defined by procedures designed to make a data collection easier to use, ensure its accuracy, enhance its utility, optimize its format, protect confidentiality, etc. For archival purposes, the process and results of data processing must be systematically and comprehensively captured so that the process applied to the data is transparent to users.
Source:ICPSR
Date added:April 6, 2007
Last updated:June 18, 2007
Keywords:digital preservation, digital content processing, quality control, lifecycle management

Designated Community

Definition:An OAIS concept describing the constituency for which the archived information should be relevant and understandable.
At ICPSR:The Designated Community includes depositors (Producers) and users (Consumers) who are typically members of the social science research community or extensions of that community, e.g., data librarians, digital archivists.
Source:ICPSR
Date added:April 6, 2007
Last updated:June 18, 2007
Keywords:digital preservation, OAIS, access management

Digital Curation

Definition:"Digital curation is all about maintaining and adding value to a trusted body of digital information for future and current use; specifically, the active management and appraisal of data over the entire life cycle. Digital curation builds upon the underlying concepts of digital preservation whilst emphasizing opportunities for added value and knowledge through annotation and continuing resource management. Preservation is a curation activity, although both are concerned with managing digital resources with no significant (or only controlled) changes over time."
At ICPSR:Digital curation is a fairly new term. Curation of social science research data has always been the mission and purpose of ICPSR, if not the term used to describe what we do. ICPSR is formalizing its data stewardship services at the University of Michigan and for member institutions.
Source:JISC Digital Preservation Paper, Nov 26, 2006
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:digital preservation, access management, lifecycle management

Digital Preservation

Definition:A term that encompasses all of the activities required to ensure that the digital content designated for long-term preservation is maintained in usable formats, for as long as access to that content is needed or desired, and can be made available in meaningful ways to current and future users.
At ICPSR:Digital preservation is a distributed function that includes the Digital Preservation Officer, who develops and promulgates requisite policies that reflect prevailing standards and practice in the digital preservation community, and Computing and Network Services, which oversees the archival storage function and the day-to-day operations of digital preservation and develops tools and procedures to perform digital preservation activities and meet archival requirements.
Source:ICPSR
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:digital preservation, lifecycle management

Digital Videotape Formats

Definition:"A related family of open bitstream encoding formats for recording digital video on physical media (tapes, hard disks) through digital video devices (digicams, camcorders)." Currently DV, DVCAM, and DVCPRO are the most widely used digital videotape formats.
At ICPSR:These digital videotape formats are distinct from the digital video file formats that will comprise the main thrust of ICPSR's digital video preservation program. However, it is likely that many depositors will use these formats, and ICPSR must be prepared to convert them. This will be somewhat challenging because it can be difficult to transcode these formats to data files.
Source:"Moving Images and Sound Archiving Study" (Word 1.4M)
"Long-Term Storage of Video in the Digital World"
Date added:July 18, 2008
Last updated:July 18, 2008
Keywords:digital preservation, digital video

Disclosure Limitation

Definition:Procedures undertaken to limit the risk of disclosure of individual identities in data files.
At ICPSR:The techniques used for disclosure limitation include data masking, recoding, topcoding, swapping, and perturbation (see other ICPSR sources for definitions of these terms). Like data processing, the process and results of disclosure limitation need to be systematically, comprehensively, and transparently documented for users.
Source:ICPSR
Date added:April 6, 2007
Last updated:June 18, 2007
Keywords:general digital content, digital preservation, access management, confidentiality, quality control
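
Two of the techniques named above, topcoding and recoding, can be sketched briefly. This is an illustration only: the ceiling, the age bands, and the sample values are hypothetical and do not reflect ICPSR's actual rules. Topcoding here means replacing any value above a ceiling with the ceiling itself; recoding means collapsing detailed values into broader categories.

```python
def topcode(values, ceiling):
    """Replace every value above the ceiling with the ceiling itself."""
    return [min(value, ceiling) for value in values]


def recode_age(age: int) -> str:
    """Collapse an exact age into a broad band (hypothetical bands)."""
    if age < 18:
        return "under 18"
    if age < 65:
        return "18-64"
    return "65 and over"


# Hypothetical income values; the two largest could identify individuals.
incomes = [28000, 54000, 61000, 250000, 1200000]
topcoded = topcode(incomes, 150000)
print(topcoded)  # [28000, 54000, 61000, 150000, 150000]

# Count how many values were altered, so the change can be documented.
changed = sum(1 for before, after in zip(incomes, topcoded) if before != after)
print(changed)  # 2

print([recode_age(age) for age in (12, 40, 71)])
# ['under 18', '18-64', '65 and over']
```

Counting the altered values is one simple way to support the documentation requirement noted above, since users need to know that a variable was topcoded and at what ceiling.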

Dissemination Information Package (DIP)

Definition:"The Information Package, derived from one or more AIPs, received by the Consumer in response to a request to the OAIS." An archive works with Consumers over time to ensure that DIPs remain useful.
At ICPSR:The DIPs are the access copies of files (data, documentation, supporting files, and related metadata) that are made available to users by download via the ICPSR Web site; by CD via the mail, for a subset of files that require a user agreement; or in the ICPSR data enclave onsite, for files that contain sensitive information and cannot otherwise be made available.
Source:OAIS: CCSDS 650.0-B-1, p. 1-10.
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:digital preservation, OAIS, access management

Documentation

Definition:Generically, any information on the structure, contents, and layout of a data file. Sometimes called "technical documentation" or "a codebook". Documentation may be considered a specialized form of metadata.
At ICPSR:Documentation has arrived in a wide array of formats since the establishment of ICPSR in 1962. To meet preservation requirements, documentation must be complete, correct, comprehensive, current, and compliant (with content and preservation standards). ICPSR produces documentation that conforms to the Data Documentation Initiative (DDI). (See the DDI Web site for current information about the version and current status of DDI.) As an XML-based format, DDI provides a preferred preservation format for documentation.
Source:ICPSR
Date added:April 6, 2007
Last updated:June 18, 2007
Keywords:general digital content, digital preservation, access management

Ingest

Definition:"The OAIS entity that contains the services and functions that accept Submission Information Packages from Producers, prepares Archival Information Packages for storage, and ensures that Archival Information Packages and their supporting Descriptive Information become established within the OAIS."
At ICPSR:Ingest covers the lifecycle stages of selection and appraisal (based on the Collection Development policy and criteria), acquisition (with the Deposit Form serving as a Submission Agreement), and processing (quality control) followed by the generation of the AIP for Archival Storage.
Source:OAIS: CCSDS 650.0-B-1, p. 1-11.
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:digital preservation, OAIS, quality control, lifecycle management

Management

Definition:"The role played by those who set overall OAIS policy as one component in a broader policy domain."
At ICPSR:The Director, the Digital Preservation Officer, and the Director's Group perform the role of Management in the OAIS context, with input from the ICPSR Advisory Council, including its approval of the highest-level policies.
Source:OAIS: CCSDS 650.0-B-1, p. 1-11.
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:digital preservation, OAIS, preservation planning

Metadata

Definition:A term that refers to structured data about data. Metadata is an old concept (e.g., card catalogs and indexes), and it is often essential for digital content to be useful and meaningful. Metadata can capture general or specific information about digital content that may define administrative, technical, or structural characteristics of the digital content. "Preservation metadata" is the term for a broader set of metadata that documents the lifecycle of digital content from creation through processing, storage, preservation, and use over time. Preservation metadata is required at the aggregate (e.g., collection and study) level and at the item (e.g., file and variable) level. For example, all preservation actions that are applied to digital content over time should be captured in preservation metadata. The Preservation Metadata Implementation Strategies (PREMIS) data dictionary is a digital preservation community development that is moving towards being a standard. There are also format-specific standards (e.g., the NISO Still Image data dictionary) and other standards that define additional metadata for preservation.
At ICPSR:We prepare a metadata record for each data collection, and we present a searchable database of metadata records on our public Web site. ICPSR has defined a set of file-level metadata elements for preservation and intends to comply with PREMIS as it develops. The process improvement initiative at ICPSR includes the identification of metadata at each stage of the pipeline.
Source:ICPSR
Date added:April 6, 2007
Last updated:June 18, 2007
Keywords:general digital content, digital preservation, preservation metadata
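
The idea of capturing each preservation action as file-level metadata can be sketched as a small data structure. This is a hypothetical illustration: the class and field names, the file name, and the event shown are invented here and do not follow the PREMIS data dictionary or ICPSR's own element set.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class PreservationEvent:
    """One action applied to a file (hypothetical fields, not PREMIS)."""
    action: str
    performed_on: str  # ISO 8601 date
    agent: str
    note: str = ""


@dataclass
class FileRecord:
    """File-level metadata with a running history of preservation events."""
    file_name: str
    file_format: str
    events: List[PreservationEvent] = field(default_factory=list)

    def record(self, event: PreservationEvent) -> None:
        """Append an event so the file's history stays complete."""
        self.events.append(event)


# Hypothetical file and event, for illustration only.
record = FileRecord(file_name="study-data.txt", file_format="ASCII text")
record.record(
    PreservationEvent(
        action="normalization",
        performed_on="2007-06-18",
        agent="Data Library staff",
        note="Converted from a proprietary statistical package format.",
    )
)

# Serialize the record so it can be stored alongside the file.
print(json.dumps(asdict(record), indent=2))
```

Keeping events as an append-only list mirrors the requirement that every action over time be captured rather than overwritten.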

Normalization

Definition: In a preservation context, normalization refers to a preservation strategy that involves the imposition of standard formats and rules to create preservable file formats. Normalization has specific connotations within the database (e.g., normalized tables), the Web (e.g., normalized URLs), and other communities, but the essence of the term is to standardize for more effective processing and exchange of information.
At ICPSR:We use normalization as a preservation strategy. We convert deposited files from their original format to an accepted preservation format as needed. Both the original file and the normalized file are retained.
Source:ICPSR
Date added:April 6, 2007
Last updated:June 18, 2007
Keywords:digital preservation, preservation planning, digital content processing, quality control
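
The practice of converting a deposited file while retaining the original can be sketched as follows. The target format used here (UTF-8 text with LF line endings), the assumed source encoding, and the file names are assumptions for illustration; this glossary does not specify ICPSR's accepted preservation formats.

```python
import tempfile
from pathlib import Path


def normalize_text_file(original: Path, source_encoding: str = "latin-1") -> Path:
    """Write a UTF-8, LF-terminated copy beside the original and return its path.

    The original file is left untouched so that both versions are retained.
    """
    text = original.read_bytes().decode(source_encoding)
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    normalized = original.with_name(original.stem + ".normalized.txt")
    normalized.write_bytes(text.encode("utf-8"))
    return normalized


with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical deposited file: Latin-1 text with Windows line endings.
    deposited = Path(workdir) / "codebook.txt"
    deposited.write_bytes("Q1: Caf\xe9 visits\r\nQ2: Age\r\n".encode("latin-1"))

    normalized = normalize_text_file(deposited)

    print(sorted(path.name for path in Path(workdir).iterdir()))
    # ['codebook.normalized.txt', 'codebook.txt']
    print(normalized.read_bytes())
    # b'Q1: Caf\xc3\xa9 visits\nQ2: Age\n'
```

Writing the normalized copy to a new file, rather than overwriting, is what keeps the original available for later comparison or re-normalization.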

OAIS

Definition:The Open Archival Information System (OAIS) Reference Model, an ISO standard that formally expresses the roles (producer, management, consumer, and implicitly archives), functions (common services, ingest, archival storage, data management, administration, preservation planning, and access), and content (submission information package, archival information collection, archival information package, and dissemination information package) of an archive. It was approved as an ISO standard in 2003. OAIS is undergoing a five-year review in 2007.
At ICPSR:The digital preservation policies, program, system, and function are being developed in conformance with OAIS.
Source:OAIS (ISO 14721: 2003). The Consultative Committee for Space Data Systems (CCSDS) of NASA coordinated the development of OAIS. The final version of OAIS produced by CCSDS before OAIS became an ISO standard was CCSDS 650.0-B-1. This is the version that is often cited and referenced because it is in the public domain. OAIS is an extensive document filled with descriptions and examples of additional concepts. These are the concepts most often used at ICPSR and defined in this glossary.
Date added:April 6, 2007
Last updated:June 18, 2007
Keywords:digital preservation, OAIS

Pipeline

Definition:In computer science, pipeline processing is "a category of techniques that provide simultaneous, or parallel, processing within the computer. It refers to overlapping operations by moving data or instructions into a conceptual pipe with all stages of the pipe processing simultaneously. For example, while one instruction is being executed, the computer is decoding the next instruction." The term pipeline calls to mind the assembly line approach in manufacturing.
At ICPSR:The pipeline refers to the flow of digital content from reception through processing to public release, with embedded preservation milestones.
Source:PC Magazine
Date added:April 6, 2007
Last updated:June 18, 2007
Keywords:digital content processing, digital preservation, quality control, lifecycle management

Preservation Planning

Definition:The OAIS entity that "provides the services and functions for monitoring the environment of the OAIS and providing recommendations to ensure that the information stored in the OAIS remains accessible to the Designated User Community over the long term, even if the original computing environment becomes obsolete. Preservation Planning functions include evaluating the contents of the archive and periodically recommending archival information updates to migrate current archive holdings, developing recommendations for archive standards and policies, and monitoring changes in the technology environment and in the Designated Community's service requirements and Knowledge Base. Preservation Planning also designs IP templates and provides design assistance and review to specialize these templates into SIPs and AIPs for specific submissions. Preservation Planning also develops detailed Migration plans, software prototypes and test plans to enable implementation of Administration migration goals."
At ICPSR:The Digital Preservation Officer is primarily responsible for Preservation Planning with the programming and technical infrastructure support of Computing and Network Services (CNS).
Source:OAIS: CCSDS 650.0-B-1, p. 4-2.
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:digital preservation, OAIS, preservation planning, quality control, lifecycle management

Producer

Definition:"The role played by those persons, or client systems, who provide the information to be preserved. This can include other OAISs or internal OAIS persons or systems."
At ICPSR:The Producer includes principal investigators, project managers, federal agencies, other data archives, and others; it is anyone who authorizes (or requires) ICPSR to preserve digital content.
Source:OAIS: CCSDS 650.0-B-1, p. 1-12.
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:digital preservation, OAIS

Restricted-Use Data

Definition:Data that contain sensitive information (usually about human subjects) that could permit the identification of individuals.
At ICPSR:To obtain access to these data through ICPSR, a user must complete a legal contract or in some cases travel to where the data are stored. The presence of sensitive information in deposited digital content presents a management challenge for long-term preservation. For example, archival storage arrangements that achieve distributed redundancy must also address confidentiality requirements.
Source:ICPSR
Date added:April 6, 2007
Last updated:June 18, 2007
Keywords:digital preservation, access management, confidentiality

Submission Information Package (SIP)

Definition:"An Information Package that is delivered by the Producer to the OAIS for use in the construction of one or more AIPs."
At ICPSR:The SIP includes the original files and associated metadata and documentation, including information provided on the ICPSR Deposit Form.
Source:OAIS: CCSDS 650.0-B-1, p. 1-13.
Date added:June 18, 2007
Last updated:June 18, 2007
Keywords:digital preservation, OAIS