This study is provided by ICPSR. ICPSR provides leadership and training in data access, curation, and methods of analysis for a diverse and expanding social science research community.

Global Entrepreneurship Monitor (GEM): Adult Population Survey Data Set, 1998-2003 (ICPSR 20320)

Principal Investigator(s):


The Global Entrepreneurship Monitor (GEM) was designed to capture various aspects of firm creation and entrepreneurship across countries. The data were collected over six years (1998-2003) and include responses from individuals in over 40 countries. The data were gathered from 138 separate surveys, and this dataset is a harmonized file capturing the results from all of them. Respondents were queried on the following main topics: general entrepreneurship, start-up activities, ownership and management of the firm, and business angels (angel investors).
Respondents were initially screened by way of a series of general questions pertaining to starting a business, such as whether they were currently trying to start a new business, whether they knew anyone who had started a new business, whether they thought it was a good time to start a new business, and how they perceived the income potential and prestige associated with starting a new business. Respondents were also asked about the process of starting up a new business. In particular, they were asked whether they had done anything to start a new business in the past 12 months, whether they would own all, part, or none of the business, and how many additional people would be involved in owning and/or managing the new business. They were also asked about wages paid and profits generated by the new business. Respondents were asked to specify what sort of business they were starting, what they would sell, how it would be listed in the phone book, and whether potential customers would find the products or services new and unfamiliar. They were also asked whether their client base would be domestic or foreign, and whether production would be local. Additional questions sought information on the number of people working for the firm, total start-up costs, and the various sources of the start-up money.
The survey also included a series of questions pertaining to the ownership and management of the new business. In particular, respondents were asked whether they owned all or part of the business, whether family members were involved in the ownership of the new business, whether the new business was a split from an existing family-owned business, and for what reason they were involved with the particular venture. Finally, respondents were asked about business angels, individuals who contribute capital investment to the start-up of a business. Specifically, respondents were asked to give information regarding funds provided by a business angel, what their relationship was to the business angel, and whether the business angel received or would receive a share in ownership in exchange for the investment. The dataset also contains variables that describe the month and year the survey was conducted, the country in which the respondent was surveyed, and the respondent's age, gender, labor force status, income, and educational attainment.

Access Notes

  • These data are available only to users at ICPSR member institutions.



Study Description


Reynolds, Paul Davidson, and Diana Hechavarria. Global Entrepreneurship Monitor (GEM): Adult Population Survey Data Set, 1998-2003. ICPSR20320-v2. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2009-05-13. http://doi.org/10.3886/ICPSR20320.v2


Scope of Study

Subject Terms:   attitudes, business, business conditions, business ownership, businesses, entrepreneurs, investments, investors, occupations, perceptions, prestige, small businesses, startup companies

Geographic Coverage:   Argentina, Australia, Belgium, Brazil, Canada, Chile, China (Peoples Republic), Croatia, Denmark, Finland, France, Germany, Global, Greece, Hong Kong, Hungary, Iceland, India, Ireland, Israel, Italy, Japan, Mexico, Netherlands, New Zealand, Norway, Poland, Portugal, Russia, Scotland, Singapore, Slovenia, South Africa, South Korea, Spain, Sweden, Switzerland, Taiwan, Thailand, Uganda, United Kingdom, United States, Venezuela, Wales

Time Period:  

  • 1998--2003

Date of Collection:  

  • 1998--2003

Unit of Observation:   individual

Universe:   Adult populations of over 40 countries.

Data Types:   survey data

Data Collection Notes:

The most current version of the codebook, version 16, reflects the dropping of 3 temporary working variables and adjusting missing values for 9 variables.

Due to certain Stata restrictions, value labels were removed from the variable CTRYALP. Value labels for this variable can be found in the SAS and SPSS setup files. In addition, these value labels can be found in the codebook processing notes.

Due to certain Stata limitations, the "DK/R/Missing" and "Missing" values have been recoded to '-9' for the following variables: SUMONTOT, SUMONOWN, SUMONFAM, BAFUND, and BAFUNDUS.

There is one error in the data regarding the variable SUMONEY3. Missing data for this variable exist in two forms. A total of 102,878 cases have been coded '999999' and labeled SYSTEM MISSING; users may wish to recode these values to missing data. In addition, 250,802 cases are actually system missing. Thus, the total number of cases that should be system missing for SUMONEY3 is 353,680 (102,878 + 250,802). The codebook lists the data coded '999999' and the system missing data separately.
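The recoding described in the notes above could be sketched as follows. This is a minimal illustration in plain Python, assuming each respondent has already been loaded as a dictionary keyed by variable name; the loading step, the function name clean_record, and the toy records are hypothetical and do not come from the study files. The '-9' rule applies to the Stata version of the data.

```python
# Sentinel codes taken from the data collection notes.
SUMONEY3_SENTINEL = 999999          # coded value labeled SYSTEM MISSING
STATA_MISSING_CODE = -9             # "DK/R/Missing" and "Missing" in the Stata file
RECODED_VARS = ("SUMONTOT", "SUMONOWN", "SUMONFAM", "BAFUND", "BAFUNDUS")


def clean_record(record):
    """Return a copy of one respondent record with sentinel codes set to None."""
    cleaned = dict(record)
    if cleaned.get("SUMONEY3") == SUMONEY3_SENTINEL:
        cleaned["SUMONEY3"] = None
    for name in RECODED_VARS:
        if cleaned.get(name) == STATA_MISSING_CODE:
            cleaned[name] = None
    return cleaned


# Hypothetical toy records for illustration only, not real GEM data.
rows = [
    {"SUMONEY3": 999999, "BAFUND": -9},    # both values are sentinels
    {"SUMONEY3": None, "BAFUND": 5000},    # SUMONEY3 already system missing
    {"SUMONEY3": 12000, "BAFUND": -9},     # valid SUMONEY3, missing BAFUND
]
cleaned_rows = [clean_record(r) for r in rows]

# After recoding, both forms of missing SUMONEY3 data are counted together.
missing_sumoney3 = sum(r["SUMONEY3"] is None for r in cleaned_rows)
```

Applied to the full file, the count of missing SUMONEY3 values after this step should equal the 353,680 total given in the note.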


Sample:   Developing representative samples of adults was a two-stage process. The first step involved a random selection of households leading to a contact with an adult resident. In countries where a high proportion of households have land line telephones, this was done by creating a random set of numbers considered to be household phone numbers. Numbers were then called, generally up to three times, until an adult respondent answered the phone. In countries with a low proportion of households with phones, geographic areas were selected at random for personal contacts by interviewers, who then approached households for a face-to-face interview. In most cases, the first adult contacted was asked to complete the interview. In a few surveys, an adult would be randomly selected for the interview from those adults living in the household. In many developed countries there was a deliberate attempt (quota sampling) to complete half of all interviews with men and half with women. While the most common form of household sampling was the use of random digit dialing (RDD) procedures to select household land line phones at random, there was wide variation among the commercial survey research firms providing the data. As response rates at the household and individual level also varied widely, all survey vendors used post-stratification weighting procedures to develop case weights. This involved comparing the nature of the survey sample with the most recent reliable national statistics on the characteristics of the adult population. Case weights were then assigned to the sample population so the characteristics of the sample would match the national population.
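The post-stratification step described above can be illustrated with a generic cell-based sketch. The vendors' actual procedures varied and are not documented here, so this is only one common approach under stated assumptions: a single stratification variable (gender), known population shares, and a weight equal to the population share divided by the sample share. The function name and toy values are hypothetical.

```python
from collections import Counter


def post_stratification_weights(sample_cells, population_shares):
    """Weight each case by (population share of its cell) / (sample share of its cell)."""
    n = len(sample_cells)
    counts = Counter(sample_cells)
    return [population_shares[cell] / (counts[cell] / n) for cell in sample_cells]


# Hypothetical sample: women are over-represented relative to the population.
sample = ["F", "F", "F", "M"]
population = {"F": 0.5, "M": 0.5}   # assumed national shares, for illustration

weights = post_stratification_weights(sample, population)

# Weighted share of women in the sample after adjustment.
weighted_f_share = sum(w for w, c in zip(weights, sample) if c == "F") / sum(weights)
```

With these weights the weighted sample reproduces the assumed population shares, and the weights sum to the number of cases.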

Weight:   Original weights provided by the individual survey vendors were based on adjusting the sample case weights to match distributions using the best available national data. These have been adjusted to provide, for the current dataset, four types of weights:

  • WEIGHT: Original weights provided by the survey research vendor, re-centered (adjusted) such that the average value for the sample for each year equals 1.000. (The sum of the weights equals the sum of the cases.)
  • WEIGHT_L: Original weights adjusted so they are only available for those aged 18-64, an estimate of the ages at which individuals are assumed to be active in the labor force and the only age range included in all national samples by survey vendors. These weights were re-centered for each country for each year.
  • WEIGHT_A: Original weights adjusted so they are only available for those aged 18 and older, considered an appropriate range for assessments involving informal investors, many of whom are older and retired from the labor force. These weights were re-centered for each country for each year.
  • WEIGHT_P: The population sampling ratio, estimated as the total number of adults aged 18-64 in the population divided by the total number of adults in the sample in this age range. This is convenient for estimating the total number of individuals in the population involved in business creation and management activities.

The number of cases assigned the derived weights is reduced from the total sample because of the restriction on age. Most important are the omissions of those under age 18, who were included in many countries where those under age 18 are assumed eligible for the labor force. In addition, a small proportion of cases (1,011 of 358,278, or about 0.3 percent) did not provide the age of the respondent, and these cases are excluded from all derived weights.
In order to increase confidence in a standardized weighting scheme, weights across all national samples for 2001, 2002, and 2003 were compared to a standard source of current population age and gender characteristics for all countries in the sample. This created weight is provided for these three years. For 1998, 1999, and 2000 WEIGHT_L and WEIGHT_A were based solely on the survey firm values of WEIGHT.
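As a rough illustration of how the derived weights relate to WEIGHT, the sketch below re-centers weights within each country and year for respondents aged 18-64, and computes a population sampling ratio. It is not the procedure used to build the file. The field names COUNTRY, YEAR, and AGE, the function names, and all values are hypothetical; only WEIGHT is a variable name from the dataset.

```python
from collections import defaultdict


def recenter_by_country_year(records, low=18, high=64):
    """Re-center WEIGHT within each (country, year) for cases in the age range.

    Cases outside the range, or with no reported age, get None, mirroring
    their exclusion from the derived weights.
    """
    groups = defaultdict(list)
    for i, rec in enumerate(records):
        age = rec.get("AGE")
        if age is not None and low <= age <= high:
            groups[(rec["COUNTRY"], rec["YEAR"])].append(i)

    derived = [None] * len(records)
    for indices in groups.values():
        mean_w = sum(records[i]["WEIGHT"] for i in indices) / len(indices)
        for i in indices:
            # After division the group's weights sum to its number of cases.
            derived[i] = records[i]["WEIGHT"] / mean_w
    return derived


def population_sampling_ratio(population_18_64, sample_18_64):
    """Adults aged 18-64 in the population per sampled adult in that range."""
    return population_18_64 / sample_18_64


# Hypothetical toy records for illustration only.
records = [
    {"COUNTRY": "A", "YEAR": 2001, "AGE": 30, "WEIGHT": 2.0},
    {"COUNTRY": "A", "YEAR": 2001, "AGE": 50, "WEIGHT": 4.0},
    {"COUNTRY": "A", "YEAR": 2001, "AGE": 70, "WEIGHT": 1.0},    # outside 18-64
    {"COUNTRY": "A", "YEAR": 2001, "AGE": None, "WEIGHT": 1.0},  # age not provided
]
weight_l = recenter_by_country_year(records)
ratio = population_sampling_ratio(1_000_000, 2)
```

Passing a larger upper bound for the age range gives an analogue of the 18-and-older restriction used for WEIGHT_A.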

Mode of Data Collection:   computer-assisted telephone interview (CATI), face-to-face interview, telephone interview

Response Rates:   Available response rates for the Global Entrepreneurship Monitor, by country and survey year, are as follows:

  • 1998: unavailable
  • 1999: Canada: 5 percent, Germany: 22 percent, Finland: 22 percent, France: 17 percent, Israel: 9 percent, Italy: 14 percent, Japan: 63 percent, United Kingdom: 12 percent, United States: 5 percent
  • 2000: Argentina: 23 percent, Germany: 39 percent, Italy: 19 percent, Sweden: 77 percent, Singapore: 82 percent
  • 2001: Argentina: 14 percent, Australia: 10 percent, Canada: 7 percent, Denmark: 22 percent, Italy: 19 percent, Norway: 22.9 percent, New Zealand: 47 percent, Sweden: 78 percent, Singapore: 70-80 percent, United States: 8.39 percent
  • 2002: Argentina: 12.5 percent, Australia: 47 percent, Canada: 7 percent, Denmark: 12.5 percent, Hong Kong: 63 percent, Hungary: 20 percent, Israel: 38 percent, South Korea: 9.3 percent, Norway: 22.9 percent, New Zealand: 47 percent, Sweden: 78 percent, Slovenia: 34 percent
  • 2003: Australia: 18 percent, Belgium: 22.45 percent, Chile: 10 percent, Germany: 50 percent, Denmark: 35 percent, Spain: 70 percent, Finland: 5 percent, France: 2.7 percent, Greece: 29.5 percent, Hong Kong: 33.7 percent, Croatia: 38 percent, Iceland: 54.4 percent, Japan: 8 percent, Norway: 80 percent, New Zealand: 10.8 percent, Sweden: 77 percent, Singapore: 50 percent, Slovenia: 40 percent, Switzerland: 30 percent, Uganda: 100 percent, United Kingdom: 30 percent, South Africa: 69 percent


Original ICPSR Release:  

Version History:

  • 2009-05-13 Corrections have been made to the data and codebook regarding variable TEA_MOT (TEA index motivation).


