This study is provided by ICPSR.
Congressional Record for 104th-110th Congresses: Text and Phrase Counts (ICPSR 33501)
This qualitative data collection contains original and processed text from the United States Congressional Record for the 104th-110th Congresses. The Congressional Record includes text from both chambers: the United States House of Representatives and the United States Senate. For each Congress, the archive includes the original tagged text files, parsed files that separate the text into individual speeches, speaker metadata that can be linked to the parsed files, and counts of two-word phrases (bigrams) by speaker, party, and date.
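To illustrate what a count of two-word phrases by speaker means, here is a minimal Python sketch. It is not the procedure used to build this collection: the tokenization rule (lowercase alphabetic words only), the function names, and the sample speeches are all assumptions made for illustration, and the actual file layouts are documented in the ICPSR User Guide.

```python
import re
from collections import Counter


def bigram_counts(text):
    """Count adjacent word pairs in one speech.

    Tokenization is an assumption for this sketch: lowercase the text
    and keep runs of ASCII letters only.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(zip(tokens, tokens[1:]))


def counts_by_speaker(speeches):
    """Aggregate bigram counts per speaker.

    `speeches` is a hypothetical iterable of (speaker_id, text) pairs,
    standing in for parsed speeches linked to speaker metadata.
    """
    totals = {}
    for speaker_id, text in speeches:
        totals.setdefault(speaker_id, Counter()).update(bigram_counts(text))
    return totals


# Invented sample data, not taken from the Congressional Record.
sample = [
    ("A", "tax relief now"),
    ("B", "tax relief"),
    ("A", "tax relief"),
]
by_speaker = counts_by_speaker(sample)
```

The same grouping could be keyed on party or date instead of speaker by changing the first element of each pair.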
Data in this collection are available only to users at ICPSR member institutions.
WARNING: Because this study contains many datasets, the files must be downloaded one dataset at a time.
WARNING: This study is over 150MB in size and may take several minutes to download on a typical internet connection.
Gentzkow, Matthew, and Jesse Shapiro. Congressional Record for 104th-110th Congresses: Text and Phrase Counts. ICPSR33501-v4. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2015-12-01. http://doi.org/10.3886/ICPSR33501.v4
Persistent URL: https://doi.org/10.3886/ICPSR33501.v4
This study was funded by:
- National Science Foundation (SES-0617658 and SES-0922342)
Scope of Study
Geographic Coverage: United States
This collection has not been processed by ICPSR and is being released in the original ASCII format for convenience of use; no value labels are present in the data.
Please see the ICPSR User Guide for information about what each part of the data collection contains.
Please note that the files for this data collection are extremely large. Users should exercise discretion when downloading files.
The Congressional Record text was obtained from the Government Printing Office.
Original ICPSR Release: 2012-12-14
- 2015-12-01 This collection was updated to comply with new ICPSR file-naming conventions. No other changes were made to the collection.
- 2015-10-23 This collection was updated to include data for the 110th Congress, spanning the years 2007 and 2008.
- 2013-07-08 The User Guide was updated.