Congressional Record for 104th-110th Congresses: Text and Phrase Counts (ICPSR 33501)
Version Date: Dec 1, 2015
Principal Investigator(s):
Matthew Gentzkow, University of Chicago, and National Bureau of Economic Research;
Jesse Shapiro, University of Chicago, and National Bureau of Economic Research
https://doi.org/10.3886/ICPSR33501.v4
Version V4
Summary
Please note that inconsistencies have been identified in some of the data accompanying this collection related to the variable "speechID." Potential data users are advised that the files in DS15-DS20 may be compromised and should be used with caution.
This qualitative data collection contains original and processed text from the United States Congressional Record for the 104th-110th Congresses. The Congressional Record includes text from both chambers, the United States House of Representatives and the United States Senate. For each Congress the archive includes the original tagged text files, parsed files that separate the text into individual speeches, speaker metadata that can be linked to the parsed files, and counts of two-word phrases (bigrams) by speaker, party, and date.
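To make the idea of bigram counts by speaker concrete, the following is a minimal sketch in Python. It is not the investigators' processing pipeline: the tokenization rule, the `bigram_counts` helper, the speaker labels, and the sample sentences are all assumptions made up for illustration, and none of them come from the collection's files.

```python
import re
from collections import Counter

def bigram_counts(text):
    """Count adjacent two-word phrases (bigrams) in a piece of text."""
    # Assumed tokenization: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Pair each token with the one that follows it.
    return Counter(zip(tokens, tokens[1:]))

# Hypothetical (speaker, speech text) pairs; not taken from the data.
speeches = [
    ("speaker_a", "The death tax hurts family farms"),
    ("speaker_b", "The estate tax affects few family farms"),
    ("speaker_a", "Repeal the death tax"),
]

# Accumulate bigram counts separately for each speaker.
counts_by_speaker = {}
for speaker, text in speeches:
    counts_by_speaker.setdefault(speaker, Counter()).update(bigram_counts(text))
```

The same accumulation could be keyed by party or date instead of speaker, provided the metadata used as the key is linked reliably to each speech.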
Citation
Funding
Subject Terms
Geographic Coverage
Smallest Geographic Unit
United States
Distributor(s)
Time Period(s)
Date of Collection
Data Collection Notes
- This collection has not been processed by ICPSR and is being released in the original ASCII format for convenience of use; no value labels are present in the data.
- Please see the ICPSR User Guide for information about what each part of the data collection contains.
- Please note that the files for this data collection are extremely large. Users should exercise discretion when downloading files.
Study Design
Please refer to the Original P.I. Documentation in the ICPSR User Guide.
Sample
The data are not a sample, as this collection is an aggregation of data on Congressional speech.
Universe
Full text of the published Congressional Record for both chambers of the 104th-110th Congresses of the United States.
Unit(s) of Observation
Data Source
Congressional Records obtained from the Government Printing Office
Data Type(s)
Original Release Date
2012-12-14
Version History
- Gentzkow, Matthew, and Jesse Shapiro. Congressional Record for 104th-110th Congresses: Text and Phrase Counts. ICPSR33501-v4. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2015-12-01. http://doi.org/10.3886/ICPSR33501.v4
2015-12-01 This collection is being updated to comply with new ICPSR file-naming conventions. No other changes have been made to the collection.
2015-10-23 This collection is being updated to include data for the 110th Congress, spanning the years 2007 and 2008.
2013-07-08 The User Guide was updated.
Notes
These data are freely available to data users at ICPSR member institutions. The curation and dissemination of this study are provided by the institutional members of ICPSR.