Numerical Meanings of Probabilistic Expressions (ICPSR 6046)

Published: Jan 12, 2006

Principal Investigator(s):
Frederick Mosteller; Cleo Youtz

Version V1

These data were collected to obtain a clearer understanding of the quantitative meanings that people perceive in common words used to describe probabilistic outcomes. For example, in everyday language, people apply the expressions "always" and "certain" to events that occur in fewer than 100 percent of their opportunities. In this study, science writers were surveyed and asked to quantify, as a percentage, their understanding of each of 52 expressions. They were also asked to indicate how they thought their readers would quantify each expression, giving both the upper and lower limits they thought their readers would set. One group of expressions included the word "probability", and ranged from "very high probability" to "very low probability". Another used various forms of the word "probable", such as "very probable" and "improbable". Other expressions were centered around the word "chance", from "better than even chance" to "less than even chance". The survey also included words like "always", "often", "frequently", "never", and "sometimes". Also tested were expressions with regularly used modifiers such as "very", or negation (not, un-, im-, in-), so that the effect of such modifiers could be evaluated. The sample of respondents was split to permit assessment of the effects of order of presentation: half received a form that ranked the expressions within 15 groups from high probability to low, while the other half received a form ordering the expressions from low probability to high.

Mosteller, Frederick, and Youtz, Cleo. Numerical Meanings of Probabilistic Expressions. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2006-01-12.

National Science Foundation (SES 8401422)


The data are provided, as received from the producer, in 104 discrete small files, each corresponding to one of the 52 probabilistic expressions and to one of the two survey forms. The files have been edited by ICPSR for easier handling by statistical software. Specifically, two lines of comment, which identified the expression to which the file's data referred, have been removed from each file. In their place, two variables have been added to each record: one identifying the expression and a second identifying the form code. In addition, the respondent identification code was edited to remove blanks. Users should note that the data are not arranged in these files in fixed columns but in a free-format list, with one record per line and each variable delimited by a comma.
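As a rough illustration of the free-format layout described above, the following Python sketch parses comma-delimited records (one per line) and stacks several files into one list. The function names, the sample values, and the field order are invented for illustration; the actual variable order and coding are defined in the study's codebook.

```python
import csv
import io


def read_free_format(text):
    """Parse free-format records: one record per line, fields split on commas.

    Returns a list of records, each a list of whitespace-stripped strings.
    """
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    records = []
    for row in reader:
        fields = [field.strip() for field in row]
        if any(fields):  # ignore blank lines
            records.append(fields)
    return records


def combine(texts):
    """Concatenate the records parsed from several file contents."""
    combined = []
    for text in texts:
        combined.extend(read_free_format(text))
    return combined


# Hypothetical file contents. The field order used here (respondent id,
# expression code, form code, then three percentages) is an assumption,
# not the documented layout.
sample_a = "R001, 1, 1, 95, 90, 99\nR002, 1, 1, 90, 80, 100\n"
sample_b = "R003, 1, 2, 98, 95, 100\n"

rows = combine([sample_a, sample_b])
print(len(rows), "records read")
```

Because every record carries its own expression and form codes, the combined list can be filtered or grouped on those fields without keeping track of which file a record came from.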

The sample comprised 238 respondents from the population (37 percent response rate).

The universe consisted of 637 members of the National Association of Science Writers in the United States and Canada.

self-enumerated questionnaires

survey data



2006-01-12 All files were removed from dataset 106 and flagged as study-level files, so that they will accompany all downloads.

2006-01-12 All files were removed from dataset 105 and flagged as study-level files, so that they will accompany all downloads.

1994-10-19 ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection:

  • Performed recodes and/or calculated derived variables.
  • Checked for undocumented or out-of-range codes.


  • Data in this collection are available only to users at ICPSR member institutions.

  • The citation of this study may have changed due to the new version control system that has been implemented.

This study is provided by ICPSR. ICPSR provides leadership and training in data access, curation, and methods of analysis for a diverse and expanding social science research community.