Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)

NOTE: By downloading ICPSR metadata records, you agree to ICPSR's Conditions of Use regarding those records.

ICPSR provides study-level metadata via OAI-PMH. To use this service:

  • Ask your IT staff to install an OAI harvester.
  • In the harvester software, enter the base URL and metadataPrefix for the format you wish to download. (See below.)
  • Run the software.

Some harvesters run from the Unix/Linux command line; others use simple web interfaces. ICPSR tested its OAI-PMH implementation using jOAI, which uses a web interface.

For more information on the standard, please visit the OAI-PMH website, which also maintains a page on OAI-PMH Tools listing a few OAI harvesters.

Technical Details

An OAI-PMH request consists of a base URL for either study metadata or related citation metadata:

with one to three variables appended to the end of the URL. The three variables are:

  • metadataPrefix (format)
  • verb
  • identifier

So a URL to retrieve the study metadata record for ICPSR 6849 in Dublin Core format would look like this:

Variables are added to the end of a URL after a question mark. Individual variables are constructed as fieldname=value and are separated by ampersands.
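
The construction described above can be sketched in Python. This is only an illustration, not an ICPSR-provided tool: BASE_URL is a placeholder (substitute the actual base URL for study or citation metadata), and build_request_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Placeholder only (an assumption): substitute the ICPSR base URL here.
BASE_URL = "https://example.org/oai/studies"


def build_request_url(base_url, verb, metadata_prefix=None, identifier=None):
    """Append OAI-PMH variables to base_url as ?name=value&name=value."""
    params = {"verb": verb}
    if metadata_prefix is not None:
        params["metadataPrefix"] = metadata_prefix
    if identifier is not None:
        params["identifier"] = identifier
    # urlencode joins the fieldname=value pairs with ampersands.
    return base_url + "?" + urlencode(params)


# Study 6849 in Dublin Core format.
url = build_request_url(BASE_URL, "GetRecord", "oai_dc", "6849")
print(url)
```

With the placeholder base URL, the resulting query string is verb=GetRecord&metadataPrefix=oai_dc&identifier=6849.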

base URL


The metadataPrefix variable spells out the format of the output. ICPSR supports the following prefixes:


  • Dublin Core - metadataPrefix=oai_dc
  • DDI 2.5 - metadataPrefix=oai_ddi25
  • DDI 2.5 with Citations - metadataPrefix=oai_ddi25_citations
  • MARC21XML - metadataPrefix=oai_marc


  • Scholix - metadataPrefix=oai_scholix

Please note the Scholix feed returns only links to publications that have identifiers (DOI, URL, PMCID).

If you would like ICPSR to provide additional formats/objects, please contact us at


The verb variable spells out what kind of result you want to obtain. Not all OAI-PMH verbs are useful for our particular implementation of OAI-PMH; the useful verbs are:

  • ListRecords - Retrieves 50 records at a time. Because ICPSR has over 9,000 studies, responses include a resumptionToken, which lets scripts retrieve the entire collection in 50-record increments.
  • GetRecord - Returns an individual metadata record; requires an identifier
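
For readers scripting their own harvest rather than using an existing harvester, the ListRecords paging described above might look roughly like the following Python sketch. It assumes the response follows the standard OAI-PMH 2.0 XML namespace; the names harvest and fetch are hypothetical, the base URL must be supplied by the caller, and OAI-PMH error responses are not handled.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# Namespace used by OAI-PMH 2.0 response documents.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def fetch(url):
    """Download one OAI-PMH response and return its raw bytes."""
    with urlopen(url) as response:
        return response.read()


def harvest(base_url, metadata_prefix, fetch=fetch):
    """Yield every <record> element, following resumptionTokens page by page."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(base_url + "?" + urlencode(params)))
        yield from root.iter(OAI_NS + "record")
        token = root.find(".//" + OAI_NS + "resumptionToken")
        if token is None or not (token.text or "").strip():
            break  # no token, or an empty one, marks the last page
        # Per the OAI-PMH specification, a follow-up request carries only
        # the verb and the resumptionToken.
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}
```

A caller would iterate over harvest(base_url, "oai_dc") and process each record element as it arrives; the fetch parameter exists only so the download step can be swapped out, for example to add a delay between requests.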

In addition, there are other OAI-PMH verbs that we don't fully utilize:

  • Identify - Provides a little information on the OAI-PMH service and repository.
  • ListSets - Not used by ICPSR.
  • ListIdentifiers - Returns a list of ICPSR identifiers (and the release date for each). Since ICPSR identifiers are just the study numbers, this isn't typically useful.
  • ListMetadataFormats - Lists the available metadata formats for a given record; requires an identifier. Because ICPSR's supported formats are listed above, this verb is rarely needed.


The identifier variable specifies which object you wish to retrieve, in this case a study. ICPSR identifiers are simply the study number. You can use either the zero-padded 5-digit study number or the study number without padding. For example, both 6849 and 06849 will work.
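
Because both forms are accepted, a script that tracks which studies it has already retrieved may want to reduce identifiers to a single canonical form. The following Python sketch does this by zero-padding; normalize_study_id is a hypothetical helper, and the five-digit limit is an assumption taken from the description above.

```python
def normalize_study_id(identifier):
    """Return the study number zero-padded to five digits."""
    text = str(identifier).strip()
    # Assumption: study numbers are purely numeric and at most five digits.
    if not text.isdigit() or len(text) > 5:
        raise ValueError(f"not a study number: {identifier!r}")
    return text.zfill(5)


# Both spellings of the same study map to one key.
assert normalize_study_id("6849") == normalize_study_id("06849")
```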

Our citation identifiers are strictly internal, so it's unlikely you'll use them to perform a GetRecord.


ICPSR can provide some support for OAI-PMH if our server is not responding or the retrieved metadata is not valid. We can also add metadata formats if there is sufficient demand. We cannot provide support for installing or implementing OAI harvesters at your institution.

If you have questions, email us at

The OAI-PMH feed was last tested on 2018-02-21 using MarcEdit.