Survey of Inmates in State and Federal Correctional Facilities
Designed by the Bureau of Justice Statistics and conducted by the Bureau of the Census, these surveys are part of a series of data gathering efforts undertaken to assist policymakers in assessing and remedying deficiencies in the nation's correctional institutions. The surveys gathered extensive information on demographic, socioeconomic, and criminal history characteristics. Also obtained were details of inmates' military service records such as time of service and branch of service, eligibility for benefits, type of discharge, and contact with veterans' groups. Other variables include age, ethnicity, education, gun possession and use, lifetime drug use and alcohol use and treatment, prior incarceration record, and prearrest annual income. Data on characteristics of victims and on prison activities, programs and services are provided as well.
Using the Resource Guide
NACJD, a part of the Inter-University Consortium for Political and Social Research (ICPSR) at the University of Michigan, designed this Resource Guide for World Wide Web users to learn about the Survey of Inmates in State and Federal Correctional Facilities dataset and to connect to other related information sources.
With this guide, first time users or experienced analysts can:
About the Data
The Survey of Inmates in State and Federal Correctional Facilities comprises two distinct surveys. Both surveys used the same data collection instrument, and data files resulting from the combination of the two have the same variables and record layout. The Survey of Inmates in State Correctional Facilities (SISCF) was conducted for the Bureau of Justice Statistics (BJS) by the Bureau of the Census. The Survey of Inmates in Federal Correctional Facilities (SIFCF) was also conducted for the BJS and the Federal Bureau of Prisons (BOP) by the Bureau of the Census. These surveys provide nationally representative data on State prison inmates and sentenced Federal inmates held in Federally owned and operated facilities. Through personal interviews from June through October of the survey year, inmates in both State and Federal prisons provided information about their current offense and sentence, criminal history, family background and personal characteristics, prior drug and alcohol use and treatment programs, gun possession and use, and prison activities, programs and services. Surveys of State prison inmates were conducted in 1974, 1979, 1986, 1991 and 1997. Sentenced Federal prison inmates were first interviewed in the 1991 survey. Beginning in 1997, data collected for the State and Federal surveys were combined into one file.
The dataset comprises two raw data files, a machine-readable codebook, data collection instruments, and SPSS and SAS data definition statements. The first data file, Numeric data, includes the majority of the responses to the questionnaire items. The second data file, Alphanumeric data, includes all of the literal responses, including the "Other-specify" question responses.
The estimation procedures for the SISCF and SIFCF involved weighting the responses from the sampled, interviewed inmates to produce estimates with some calculable degree of sampling error. A series of adjustment factors were applied to the basic weight of each interviewed inmate. Weights for Federal and State inmates were calculated separately.
Basic Weight (BW)
The initial weight, or basic weight, for each sampled inmate is the inverse of the probability of selection. This weight changes every year; for the basic weight for a specific year, see the codebook corresponding to that year.
Drug Subsampling Factor (DSSF)
The Drug Subsampling Factor was calculated for the SIFCF only. To compensate for subsampling drug offenders by taking only a third of those originally selected, this adjustment multiplied the weights of drug offenders by 3 and the weights of nondrug offenders by 1.
Weighting Control Factor (WCF)
In some prisons, the sampling rate for a facility was adjusted because the actual number of persons in a prison on the sampling date was different from the expected number from earlier Census of State and Federal Correctional Facilities reports or lists from the BOP. When the actual number was less than 80% or more than 120% of the expected number, the weighting control factor was applied to account for adjusting the inmate sampling rate. The weighting control factor is equal to the number of inmates in a facility on the interview date divided by the number expected for that facility. If the actual number was within 20% of the expected number, the weighting control factor was 1.
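The rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not the Census Bureau's production code; the function name and the example counts are hypothetical.

```python
def weighting_control_factor(actual: int, expected: int) -> float:
    """Return the WCF for one facility (illustrative sketch).

    actual   -- inmates in the facility on the interview date
    expected -- inmates expected from the earlier census report or BOP list
    """
    if expected <= 0:
        raise ValueError("expected count must be positive")
    ratio = actual / expected
    # Actual count within 20% of the expected count: no adjustment.
    if 0.8 <= ratio <= 1.2:
        return 1.0
    # Otherwise the factor is the ratio of actual to expected.
    return ratio


print(weighting_control_factor(500, 1000))   # 0.5
print(weighting_control_factor(1100, 1000))  # 1.0
```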
Duplication Control Factor (DCF)
Several of the very smallest prisons have a total inmate population that is smaller than the number to be sampled in each facility in a particular stratum. For example, if a sample prison contained 15 inmates in a stratum in which 55 were expected to be interviewed, there would be a shortage of inmates. The DCF is used to adjust for the workload shortfall in such prisons. It is equal to the expected number of sample inmates in each facility in a stratum divided by the number of inmates in the prison on the date of the sample. In most prisons, the calculated DCF is less than one because the prison had more total inmates than the expected number in the sample for that stratum; in this case the DCF is set to 1.
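As a rough sketch of the DCF rule, the following Python function takes the ratio described above and floors it at 1. The function name is hypothetical, and the figures reuse the 15-inmate, 55-expected example from the text.

```python
def duplication_control_factor(expected_sample: int, inmates_in_prison: int) -> float:
    """Return the DCF for one prison (illustrative sketch).

    expected_sample   -- expected number of sample inmates per facility in the stratum
    inmates_in_prison -- inmates in the prison on the date of the sample
    """
    if inmates_in_prison <= 0:
        raise ValueError("prison population must be positive")
    ratio = expected_sample / inmates_in_prison
    # A ratio below 1 means the prison had enough inmates; the DCF is then set to 1.
    return max(1.0, ratio)


small_prison = duplication_control_factor(55, 15)   # shortfall: ratio is 55/15
large_prison = duplication_control_factor(55, 800)  # no shortfall: set to 1
print(small_prison, large_prison)
```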
Noninterview Factor (NIF)
This factor was applied to adjust the weights to account for noninterviewed inmates. The NIF was calculated as follows:
- Basic demographic data on noninterviewed inmates were obtained by interviewers from prison records after they completed interviewing in a facility for the SISCF or from BOP for SIFCF.
- Inmate records, including noninterviewed inmates, were separated by gender, stratum, race (Black, nonBlack), and age.
- If there were fewer than 30 unweighted cases in a cell, it was collapsed with those in the nearest age category.
- For each cell, the adjusted weights (BW x WCF x DCF) were summed separately for interviewed inmates (I) and for noninterviewed inmates (N).
- A noninterview adjustment factor was calculated for each cell as the sum of the adjusted weights (BW x WCF x DCF) for both interviewed and noninterviewed inmates divided by the adjusted weights for the interviewed, or NIF=(I + N)/I.
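The per-cell calculation NIF=(I + N)/I might be sketched as follows. The record layout (the keys "cell", "adjusted_weight", and "interviewed") is an assumption made for illustration and is not the layout of the data files. The sketch also assumes that cells have already been collapsed where they held fewer than 30 unweighted cases.

```python
from collections import defaultdict


def noninterview_factors(records):
    """Return {cell: NIF} where NIF = (I + N) / I (illustrative sketch).

    Each record is a dict with hypothetical keys:
      "cell"            -- gender/stratum/race/age cell label
      "adjusted_weight" -- BW x WCF x DCF for that inmate
      "interviewed"     -- True if the inmate was interviewed
    """
    interviewed = defaultdict(float)     # I: summed weights of interviewed inmates
    noninterviewed = defaultdict(float)  # N: summed weights of noninterviewed inmates
    for rec in records:
        target = interviewed if rec["interviewed"] else noninterviewed
        target[rec["cell"]] += rec["adjusted_weight"]
    # Only cells with at least one interviewed inmate get a factor.
    return {cell: (i + noninterviewed[cell]) / i for cell, i in interviewed.items()}


sample = [
    {"cell": "A", "adjusted_weight": 10.0, "interviewed": True},
    {"cell": "A", "adjusted_weight": 10.0, "interviewed": True},
    {"cell": "A", "adjusted_weight": 10.0, "interviewed": True},
    {"cell": "A", "adjusted_weight": 10.0, "interviewed": False},
    {"cell": "B", "adjusted_weight": 20.0, "interviewed": True},
]
factors = noninterview_factors(sample)
print(factors)  # cell "A" is (30 + 10) / 30, cell "B" is 1.0
```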
Offense Category Ratio Adjustment Factor (OCRAF)
The OCRAF was used to adjust the weighted sample to reflect varying interview rates among inmates in different offense categories. The OCRAF was computed separately for males and females for a number of different offense categories for State inmates and offense categories for Federal inmates. It was calculated as the weighted count of interviewed and noninterviewed inmates through application of the DCF divided by the weighted count for each stratum through application of the NIF.
Control Count Ratio Adjustment Factor (CCRAF)
The CCRAF adjusts the weighted interviews to stratum-level counts as of a specific date, which varies by year. For the date used in a given collection year, see the codebook corresponding to that year. For the SISCF these counts were from the National Prisoners Statistics series (NPS-1A). For the SIFCF, the BOP provided counts of sentenced Federal prisoners as of a specific date (see the codebook).
Thus the final weight (FW) is the product of the basic weight and all the adjustment factors.
For the SISCF:
FW = BW x WCF x DCF x NIF x OCRAF x CCRAF
For the SIFCF:
FW = BW x DSSF x WCF x DCF x NIF x OCRAF x CCRAF
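The two products above can be expressed as a single Python function, with the DSSF defaulting to 1 so that the same code covers the SISCF. This is a minimal sketch; the function name and every factor value in the example are hypothetical, not taken from any codebook.

```python
from math import prod


def final_weight(bw, wcf, dcf, nif, ocraf, ccraf, dssf=1.0):
    """Return FW as the product of the basic weight and all adjustment factors.

    dssf applies to the SIFCF only; leave it at 1.0 for the SISCF.
    """
    return prod([bw, dssf, wcf, dcf, nif, ocraf, ccraf])


# Hypothetical State inmate: no drug subsampling factor.
state_fw = final_weight(bw=20.0, wcf=1.0, dcf=1.0, nif=1.25, ocraf=1.0, ccraf=1.0)

# Hypothetical Federal drug offender: DSSF of 3.
federal_fw = final_weight(bw=20.0, wcf=1.0, dcf=1.0, nif=1.25, ocraf=1.0, ccraf=1.0, dssf=3.0)

print(state_fw, federal_fw)  # 25.0 75.0
```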
Accuracy of Estimates
Since the SISCF and SIFCF estimates come from a sample, they may differ from figures from a complete census using the same questionnaire, instructions, and enumerators. A sample survey has two possible types of errors: sampling and nonsampling. The accuracy of an estimate depends on both types of errors, but the full extent of the nonsampling error is unknown. Consequently, one should be particularly careful when interpreting results based on a relatively small number of cases or small differences between estimates. The standard errors for SISCF and SIFCF estimates primarily indicate the magnitude of sampling error. They also partially measure the effect of some nonsampling errors in responses and enumeration, but do not measure systematic biases in the data. (Bias is the average over all possible samples of the differences between the sample estimates and the desired value.)
There are several sources of nonsampling errors, including the following:
- Inability to obtain information about all cases in the sample
- Definition difficulties
- Differences in the interpretation of questions
- Respondents' inability or unwillingness to provide correct information
- Respondents' inability to recall information
- Errors made in data collection such as in recording or coding the data
- Errors made in processing the data
- Errors made in estimating values for missing data
- Failure to represent all units within the sample
Nonresponse in the SISCF and SIFCF resulted from failing to obtain cooperation from sample prisons (first stage nonresponse) or failing to obtain completed interviews with sampled inmates (second stage nonresponse). In the weighting of the sample, the NIF adjusted the weights for second stage nonresponse. The NIF was calculated based on gender, race, age and stratum. However, biases exist in the estimates to the extent that noninterviewed inmates have different characteristics from those of interviewed inmates in the same age-gender-race-stratum group. Total nonresponse for each survey includes both first and second stage nonresponse.
Comparability of data
Data obtained from the SISCF and SIFCF are not entirely comparable with data from other sources. This is due to differences in interviewer training and experience and in differing survey processes. This is an example of nonsampling variability not reflected in the standard errors. Caution should be used when comparing results from different sources.
Note on results based upon a small number of cases or small differences in estimates: When summary measures (such as medians and percent distributions) are computed on a base smaller than 5,000 for the SISCF and 1,000 for SIFCF, they probably do not reveal useful information because of the large standard errors involved. In addition, nonsampling errors may result in small differences which may appear to be borderline significant, but are not really different.
Sampling variability is the variation that occurs by chance because a sample was surveyed rather than the entire population. Standard errors are primarily measures of sampling variability, although they may include some nonsampling error. The sample estimate and its standard error enable one to construct a confidence interval, a range that would include the average result of all possible samples with a known probability. A particular confidence interval may or may not contain the average estimate derived from all possible samples; however, one can say with specified confidence that the interval includes it. Standard errors may also be used to perform hypothesis testing.
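A confidence interval can be built from an estimate and its standard error as in the sketch below. It assumes the usual normal approximation, under which a multiplier of 1.96 gives an approximately 95% interval. The function name and the example figures are hypothetical.

```python
def confidence_interval(estimate, standard_error, z=1.96):
    """Return (lower, upper) bounds of the interval estimate +/- z * standard_error.

    z=1.96 corresponds to roughly 95% confidence under a normal approximation.
    """
    if standard_error < 0:
        raise ValueError("standard error cannot be negative")
    margin = z * standard_error
    return estimate - margin, estimate + margin


# Hypothetical estimate of 40.0 percent with a standard error of 2.0 points.
lower, upper = confidence_interval(40.0, 2.0)
print(lower, upper)
```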
Generalized variance estimates
A number of approximations are required to derive, at a moderate cost, standard errors applicable to estimates from these two surveys. Instead of providing an individual standard error for each estimate, two parameters, a and b, are provided to calculate standard errors for each type of characteristic. For more information, please see the codebook specific to your data.
Variances were calculated using Vplex, a Bureau of the Census software package designed to calculate variances for data derived from multistage complex sample designs. Variances were calculated for the total sample and for gender, marital status and race/ethnicity subgroups (male or female; married or single; and black, nonblack, or Hispanic). Variables for which variances were estimated included criminal justice status, prior sentence to incarceration, prior sentence to probation, current offense (murder or manslaughter, sexual offense, assault, robbery, other violent, drug offense), marital status, ever used marijuana, ever used cocaine or crack, alcohol use, armed during crime, HIV status, military service, one or more victims, education, age (not used for SIFCF), monthly income prior to arrest, whether physically or sexually abused, family member ever in prison, employment status at arrest, sentenced status, whether under the influence of drugs at time of arrest, whether under the influence of alcohol at time of arrest, whether maximum sentence was less or more than 5 years, whether had a disability, whether had children, whether received help for a mental or emotional problem, and who the inmate lived with while growing up. These variances were calculated for the general form
σ² = ax² + bx
The variances were then transformed logarithmically and fitted by regression over several iterations, excluding outliers until a best fit was obtained. Hence, the a values are the intercept and the b values the slope of the line.
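Given a pair of a and b parameters, an approximate standard error for an estimate x might be computed as in the sketch below. It assumes that the general form above gives the variance, so that the standard error is its square root. The parameter values in the example are placeholders; the real values must be taken from the codebook for the relevant survey year.

```python
from math import sqrt


def gvf_standard_error(x, a, b):
    """Approximate standard error of an estimate x from parameters a and b.

    Assumes the generalized variance form a*x^2 + b*x is the variance,
    so the standard error is its square root.
    """
    variance = a * x**2 + b * x
    if variance < 0:
        raise ValueError("parameters give a negative variance for this estimate")
    return sqrt(variance)


# Placeholder parameters, NOT codebook values.
se = gvf_standard_error(100, a=0.0, b=4.0)
print(se)  # 20.0
```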
Tests may be performed at various levels of significance. A significance level is the probability of concluding that the characteristics are different when, in fact, they are the same. To conclude that two parameters are different at the .05 level of significance, for example, the absolute value of the estimated difference between characteristics must be greater than or equal to 1.96 times the standard error of the difference.
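The .05-level comparison described above might look like the following sketch. The guide does not say how to obtain the standard error of a difference; the code assumes the two estimates are independent, so that the standard error of the difference is the square root of the sum of the squared standard errors. The function name and example figures are hypothetical.

```python
from math import sqrt


def differs_at_05(est1, se1, est2, se2):
    """Return True if two estimates differ at the .05 level of significance.

    Assumption: the estimates are independent, so
    SE(difference) = sqrt(se1^2 + se2^2).
    """
    se_diff = sqrt(se1**2 + se2**2)
    return abs(est1 - est2) >= 1.96 * se_diff


# Hypothetical percentages with standard errors of 2.0 points each.
print(differs_at_05(45.0, 2.0, 38.0, 2.0))  # True
print(differs_at_05(45.0, 2.0, 42.0, 2.0))  # False
```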
More detailed information on standard errors can be found in the codebook.
Specific variance estimates
Standard error estimates for specific variables can be derived using software packages developed to generate standard errors for data obtained from a complex sample survey design. Variables have been added to the data files for use with such software packages. Variables v2068 through v2079 give sample and universe information: v2068 and v2069 give total numbers of males and females in the universe; v2070 and v2071, total numbers of males and females within stratum; v2074 and v2075, total numbers of males and females within each sampled prison; v2076 and v2077, the number of males and females interviewed within each sampled prison; and v2078 and v2079, the total number of prisons in the universe within each stratum. In addition, v2048 gives the stratum; v2050 and v2051, the male and female population according to the universe file; and v2047, the number of inmates sampled.
The online Survey Documentation and Analysis is a set of programs for the documentation and web-based analysis of survey data. It is recommended for users who would like to do the following:
- search for variables of interest in a dataset
- review frequencies or summary statistics for key variables to determine what further analyses are appropriate
- review frequencies or summary statistics for missing data
- produce simple summary statistics for reports
- create statistical tables from raw data
- create custom subsets of cases or variables from a particularly large collection to save time in downloading or space on a personal computer
Other Survey of Inmates in State and Federal Correctional Facilities Resources
Bureau of Justice Statistics - Corrections Statistics
The Survey of Inmates in State Correctional Facilities and the Survey of Inmates in Federal Correctional Facilities
The link below will search the ICPSR citations database for citations of publications with "inmate" in the title. Users can create their own searches or browse the citations database through our Publications Bibliography web page.
Search for Survey of Inmates Publications