India Human Development Survey Panel (IHDS, IHDS-II), 2005, 2011-2012 (ICPSR 37382)

Version Date: Nov 19, 2019

Principal Investigator(s):
Sonalde Desai, University of Maryland; Reeve Vanneman, University of Maryland; National Council of Applied Economic Research, New Delhi


Version V1

IHDS Panel 2005, 2011-2012

The India Human Development Survey (IHDS) is a nationally representative, multi-topic survey of 42,152 households in 1,503 villages and 971 urban neighborhoods across India. Data were originally collected from households during 2004-2005. Interviewers returned in 2011-2012 to re-interview these same households. During both waves of data collection, two one-hour interviews were conducted covering a large range of topics. The goal of the IHDS program is to document changes in the daily lives of Indian households in a society undergoing rapid transition.

This data collection merges the two waves of IHDS (known as IHDS and IHDS-II) into harmonized files from the perspectives of individuals, households, and eligible women. The data are presented in three formats (cross-sectional, wide, and long) to support a broader range of analyses. Because of the geographic specificity and the inclusion of sensitive or identifying topics, there is a public-use and a restricted-use rendition of each of the nine data files.

Desai, Sonalde, Vanneman, Reeve, and National Council of Applied Economic Research, New Delhi. India Human Development Survey Panel (IHDS, IHDS-II), 2005, 2011-2012. Inter-university Consortium for Political and Social Research [distributor], 2019-11-19.

United States Department of Health and Human Services. National Institutes of Health. Eunice Kennedy Shriver National Institute of Child Health and Human Development (R03HD091315 and R01HD041455)


Users are reminded that these data are to be used solely for statistical analysis and reporting of aggregated information, and not for the investigation of specific individuals or organizations.

Users interested in the restricted renditions of the data will need to complete a Restricted Data Use Agreement. These forms can be accessed by selecting the "Access Restricted Data" button from the study home page.

Inter-university Consortium for Political and Social Research
2004 -- 2005, 2011 -- 2012
2004-11 -- 2005-10, 2011-01 -- 2013-05

The India Human Development Survey (IHDS) is a collaborative research program produced by the National Council of Applied Economic Research (NCAER), New Delhi, and the University of Maryland. The data collection is funded primarily by the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD).

The values for state and district identification codes correspond to the 2001 Indian census.

The variables HHWAVES [IN ALL 18 DATASETS], PWAVES [IN DATASETS 1-6 AND 13-18], and PSUWAVES [IN ALL 18 DATASETS] report the waves in which a household, person, or PSU was interviewed.

Household and individual identification numbers have been recalculated so that the same person has the same ID in each wave. Because the cross-sectional appended files now use the wave-specific PERSONID, the old identification variables are also included in the new files. Also added to the files are:

  • HHBASE: The household ID for the base household in the original IHDS (survey wave + complete hh id at first entry into IHDS). [IN ALL 18 DATASETS]
  • HHFAM2: The single digit ID (1-6) for each of the split households if the original IHDS household was divided by IHDS-II (+hhbase = unique HH id in wave 2). [IN ALL 18 DATASETS]
  • PBASE: A unique person ID within each original IHDS base household. Thus, HHBASE and PBASE uniquely identify every individual whether in IHDS, IHDS-II, or both waves (unique multisurvey person id). [IN DATASETS 1-6 AND 13-18]
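As a rough illustration of how these identifiers combine, the sketch below builds composite keys in Python. It assumes records are plain dictionaries keyed by variable name and that a tuple of the ID components is an acceptable key. The storage types of the real variables are not described here, and the helper names `person_key` and `wave2_household_key`, along with all sample values, are hypothetical.

```python
# Sketch only: composite keys from the panel ID variables.
# Assumes each record is a dict keyed by variable name; the sample
# values below are invented for illustration.

def person_key(record):
    """HHBASE + PBASE identify a person across IHDS and IHDS-II."""
    return (record["HHBASE"], record["PBASE"])

def wave2_household_key(record):
    """HHBASE + HHFAM2 identify a (possibly split) household in IHDS-II."""
    return (record["HHBASE"], record["HHFAM2"])

records = [
    {"HHBASE": 1001, "HHFAM2": 1, "PBASE": 1},
    {"HHBASE": 1001, "HHFAM2": 2, "PBASE": 2},  # same base household, split off
    {"HHBASE": 1002, "HHFAM2": 1, "PBASE": 1},
]

people = {person_key(r) for r in records}
households = {wave2_household_key(r) for r in records}
```

Keeping the components separate in a tuple, rather than concatenating them into one number or string, avoids collisions when the components have varying widths.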

As part of harmonizing the data, the Principal Investigators made the variable names match between the two data collection waves. All variables in these panel data are named as they are in IHDS-II. For instance, the variable C07X represents "Last month's consumption of meat, chicken or fish" in IHDS-II, but it was named C08X in the original IHDS. In these panel data, the variable name from IHDS-II is maintained. Both of these variables exist in the household data files [DATASETS 7-12].

Some question content changed between surveys. For instance the monthly consumption of electricity and fuels was asked separately in two questions in IHDS-II (CO21 and CO22) but was a single question in the original IHDS (CO20). In the merged panel data there are variables CO21 and CO22 defined only for IHDS-II (their values for IHDS households are missing values) plus an additional variable CO21_22 which contains the value for fuel and electricity consumption from the original IHDS (named CO20 there) and which adds together CO21 and CO22 for IHDS-II. This is the best approximation for comparing values across the two surveys. The variable labels try to make this as clear as possible. So, in the merged panel files, the CO21 variable specifies "HQ24 14.21 HH fuel Rs [IHDS2]", indicating that it is only available in IHDS-II. The newly created variable, CO21_22 has a variable label of "HQ24 14.21,22 Fuel, electricity [est. as in IHDS1]" which indicates that it is the best estimation of what the combined value is in IHDS-II. These variables exist in the household data files [DATASETS 7-12].
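The logic behind a combined variable like CO21_22 can be sketched as follows. This is only an illustration of the rule described above, not the Principal Investigators' code: `WAVE` is a hypothetical wave indicator (1 for the original IHDS, 2 for IHDS-II), `None` stands in for a missing value, and the decision to return missing when either IHDS-II component is missing is an assumption.

```python
# Sketch only: how a combined fuel-and-electricity value like CO21_22
# could be derived. WAVE is a hypothetical wave indicator (1 = IHDS,
# 2 = IHDS-II); None stands in for a missing value.

def combined_fuel_electricity(record):
    if record["WAVE"] == 1:
        # The original IHDS asked one combined question (CO20).
        return record.get("CO20")
    co21, co22 = record.get("CO21"), record.get("CO22")
    if co21 is None or co22 is None:
        return None  # assumption: propagate missing rather than treat as 0
    return co21 + co22

ihds1 = {"WAVE": 1, "CO20": 350, "CO21": None, "CO22": None}
ihds2 = {"WAVE": 2, "CO21": 200, "CO22": 180}
print(combined_fuel_electricity(ihds1))  # 350
print(combined_fuel_electricity(ihds2))  # 380
```

Because CO21_22 is already supplied in the household files, users would not normally need to recompute it; the sketch only shows why the value is comparable across waves.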

There are an additional small number of variables where the answer categories for the same question changed between the two surveys. In these instances there are two variables in the merged panel files -- one that repeats the exact categories in each survey and a revised version that tries to create a combined variable that is more comparable across surveys. For example, SA4, the type of toilet, had four response categories in each survey, but the categories changed between the original IHDS and IHDS-II. The Principal Investigators repeated the four categories in each survey in SA4X, but have also created a more comparable variable, SA4, which has only three response categories that are more comparable across the two waves. The variable label designates that SA4X has the "original" values, while SA4 has "revised" values. Both of these variables exist in the household data files [DATASETS 7-12].

The values reported for all questions regarding finances (income, debt, expenses, investments, etc.) are in Indian Rupees.

Errors in the data may exist, but the Principal Investigators decided to leave the data as they are so that they accurately reflect interviewees' responses.

To maintain confidentiality and protect participating respondents against disclosure, ICPSR created public-use and restricted-use renditions of each data file. ICPSR did not alter the data in the restricted-use renditions. For the public-use renditions, ICPSR revised the data files by:

  • Masking valid entries of string variables pertaining to geography or occupation.
  • Masking valid entries of caste designations.
  • Top- and/or bottom-coding variables relating to sensitive topics (e.g., number of abortions) or that have the potential to be identifying (e.g., household composition).

A list of the individual variables that were masked or recoded in any manner is available as a processing note in the beginning of each codebook.
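For readers unfamiliar with top- and bottom-coding, the general technique can be sketched as below. The thresholds and sample values are hypothetical and are not those applied by ICPSR; the codebook processing notes remain the authoritative source for what was recoded.

```python
# Sketch only: top- and bottom-coding a numeric variable.
# The thresholds are hypothetical, not those used by ICPSR.

def recode_bounds(value, lower=None, upper=None):
    """Clamp value into [lower, upper]; None bounds are not applied."""
    if value is None:
        return None
    if lower is not None and value < lower:
        return lower
    if upper is not None and value > upper:
        return upper
    return value

household_sizes = [1, 4, 7, 12, 23]
top_coded = [recode_bounds(n, upper=10) for n in household_sizes]
print(top_coded)  # [1, 4, 7, 10, 10]
```

A top-coded value of 10 in this sketch means "10 or more", so means and maxima computed from public-use files may understate the restricted-use values.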

Additional information and resources related to IHDS are available from DSDR through the IHDS data guide and the IHDS-II data guide.

For additional information regarding the India Human Development Survey, please visit the India Human Development Survey Web site.

The purpose of this study is to harmonize the data from the two waves of data collection to analyze changes over time.

The overall goal of IHDS is to document changes in the daily lives of Indian households in an era of rapid transformation.

The Principal Investigators harmonized the data from the original IHDS (ICPSR 22626) and IHDS-II (ICPSR 36151) for three primary file types - individuals, households, and eligible women. There are also three different possible ways to merge the data. This approach resulted in 9 different files for analysis (3 file types X 3 merge types). These merge types include:

  • Append: The two waves are appended one after the other as if they were two separate cross-sections.
  • Wide: All IHDS variables have been added to the end of each IHDS-II record, with each variable renamed by adding the prefix "X". For example, if the variable in IHDS-II was named CO7X, then the variable for the original IHDS data would be XCO7X. The "wide" files make it easiest to compare change from one survey to the next.
  • Long: For those individuals, households, and eligible women who were interviewed in both surveys, these include two records with the same variable names, but the first reporting the data from IHDS, and the second record reporting data from IHDS-II. These "long" files are best for "fixed effect" panel designs. For households that divided between the two surveys, the IHDS household record is repeated for each of the IHDS-II sub-households.
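The relationship between the "wide" and "long" layouts can be sketched as below. This is a minimal illustration that assumes every original-IHDS variable is the IHDS-II name with an "X" prefix and that the ID variables appear once, unprefixed; `ID_VARS`, `wide_to_long`, and the sample values are hypothetical.

```python
# Sketch only: turning one "wide" record into two "long" records.
# Assumes every IHDS variable is the IHDS-II name with an "X" prefix
# and that ID_VARS (a hypothetical list) appear once, unprefixed.

ID_VARS = ("HHBASE", "HHFAM2")

def wide_to_long(wide, value_vars):
    ids = {k: wide[k] for k in ID_VARS}
    first = dict(ids, **{v: wide.get("X" + v) for v in value_vars})  # IHDS
    second = dict(ids, **{v: wide.get(v) for v in value_vars})       # IHDS-II
    return [first, second]

wide = {"HHBASE": 1001, "HHFAM2": 1, "XCO21_22": 350, "CO21_22": 380}
long_rows = wide_to_long(wide, ["CO21_22"])
change = long_rows[1]["CO21_22"] - long_rows[0]["CO21_22"]
print(change)  # 30
```

In the wide layout the same change is a single subtraction within one record, which is why that format suits simple before-and-after comparisons, while the two-row layout suits models that expect one observation per unit per wave.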

Re-interviewed respondents are only included in the Individuals - Wide Panel and Individuals - Long Panel file types.

The India Human Development Survey (IHDS) was conducted in all states and union territories of India (with the exception of Andaman Nicobar and Lakshadweep). The sample consists of 27,010 rural and 13,126 urban households. The sample was drawn using stratified random sampling and contains 13,900 rural households who were interviewed in 1993-94 in a previous survey by the National Council of Applied Economic Research (NCAER) and 28,428 new households. Of the 612 districts in India in 2001, 382 are included in IHDS. The sample is spread across 1,503 villages and 971 urban blocks.

Longitudinal: Panel

2005 Urban and rural household population of India primarily aged 15 and older.

individual, household, eligible women
survey data

There is significant overlap of variables among the three merge types. Nearly 100 percent of the variables in the "long" file are also in the "appended" file, with identical variable names and labels. However, the "appended" file contains additional variables not present in the "long" file. Likewise, the "wide" file contains nearly 100 percent of the same variables as the "appended" file, but the number of variables is doubled: each variable name and label is duplicated, with the letter "X" added to the beginning of the name as explained above.

  • Individuals - Appended Cross-Sections: 474 variables / 420,311 cases (DS1 and DS2)
  • Individuals - Wide Panel: 836 variables / 150,983 cases (DS3 and DS4)
  • Individuals - Long Panel: 449 variables / 301,971 cases (DS5 and DS6)

Some of the major topical sections in the "individuals" file type include questions regarding education, employment, finances, household composition, migration, physical health, and substance use.

  • Households - Appended Cross-Sections: 813 variables / 83,706 cases (DS7 and DS8)
  • Households - Wide Panel: 1,364 variables / 40,018 cases (DS9 and DS10)
  • Households - Long Panel: 579 variables / 80,036 cases (DS11 and DS12)

Some of the major topical sections in the "households" file type include questions regarding assets/debts/investments, farming, household operations, household possessions, income, living arrangements, and recreational pursuits.

  • Eligible Women - Appended Cross-Sections: 733 variables / 73,115 cases (DS13 and DS14)
  • Eligible Women - Wide Panel: 1,257 variables / 25,479 cases (DS15 and DS16)
  • Eligible Women - Long Panel: 576 variables / 50,958 cases (DS17 and DS18)

Some of the major topical sections in the "eligible women" file type include questions regarding childbirth, early childhood care, decision making responsibilities, household responsibilities, immunizations, marriage, menstrual cycles, pregnancy, sexual activity, and weddings.

For the initial IHDS the response rates were calculated as 82 percent for the recontact sample, 98 percent for the new sample, and 92 percent for the total response rate.

For the follow-up IHDS-II 85 percent of the original households from 2004-2005 were re-interviewed.


2019-11-19 ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection:

  • Checked for undocumented or out-of-range codes.

The data are not weighted. However, the datasets contain the following weight variables that users may wish to apply for analyses:

  • Individuals: DS1 and DS2: WT and FWT / DS3 and DS4: WT, FWT, XWT, and XFWT / DS5 and DS6: WT2005, WT, and FWT
  • Households: DS7 and DS8: WT and FWT / DS9 and DS10: WT2005, FWT2005, WT2012, and FWT2012 / DS11 and DS12: WT2005, FWT2005, WT2012, and FWT2012
  • Eligible Women: DS13 and DS14: WTEW, FWTEW, and WTHH / DS15 and DS16: WTEW, XWTEW, FWTEW, XFWTEW, WTHH, XWTHH and WTEW2012 / DS17 and DS18: WTEW, FWTEW, and WTHH

Weights adjust for differential sampling proportions from rural districts and from urban towns and cities, and for the probability of villages or towns/cities being sampled.
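As a minimal sketch of applying one of these weight variables, the example below computes a weighted mean. It assumes records are dictionaries keyed by variable name; `INCOME` is a placeholder variable name and all values are invented. Which weight is appropriate for a given file and analysis should be taken from the study documentation.

```python
# Sketch only: applying a weight variable such as WT to a mean.
# Records and values are invented; "INCOME" is a placeholder name.

def weighted_mean(records, var, weight="WT"):
    # Keep only records where both the value and the weight are present.
    pairs = [(r[var], r[weight]) for r in records
             if r.get(var) is not None and r.get(weight) is not None]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("no usable weights")
    return sum(x * w for x, w in pairs) / total_weight

sample = [
    {"INCOME": 100, "WT": 1.0},
    {"INCOME": 200, "WT": 3.0},
]
print(weighted_mean(sample, "INCOME"))  # 175.0
```

The unweighted mean of the same two values is 150, so the difference shows how a unit with a larger weight pulls the estimate toward its value.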


  • The public-use data files in this collection are available for access by the general public. Access does not require affiliation with an ICPSR member institution.

  • One or more files in this data collection have special restrictions. Restricted data files are not available for direct download from the website; click on the Restricted Data button to learn more.

  • The citation of this study may have changed due to the new version control system that has been implemented.