Data Merge Tool (Beta)

This tool is still under development and may contain bugs. Please report problems with this tool to

Many data collections consist of files at different levels of analysis. For example, a project may collect information about individuals, households, and communities, or it may collect data about students, teachers, and schools. Data from different levels of aggregation are often stored in separate files. The Data Merge Tool enables data users to combine data from separate files into a single dataset at the selected level of aggregation. For example, if the household is the unit of analysis, the Data Merge Tool will create variables describing individuals within the household as well as the community in which the household resides.

The Data Merge Tool provides three ways of adding variables from other files: one for one-to-many links and two for many-to-one links.

  1. One-to-many links: Variables from a higher level of aggregation are copied to the primary analytical file. For example, community-level variables may be added to all households residing in that community, or household attributes may be added to all individuals in the household.

  2. Many-to-one links: There are two ways to add variables from lower levels of aggregation.

    1. Replication. When a higher level of aggregation (e.g. household) is linked to multiple records at a lower level of aggregation (e.g. individual), a new variable may be created for the values on each linked record. For example, if a household includes four people, variables describing the ages of those people would be represented as AGE01, AGE02, AGE03, and AGE04 on the household record.

    2. Aggregation. Data from multiple linked records may also be aggregated into summary variables. By using filters, a data user can create summary variables for Count_0_14 (number of persons aged 0 to 14), Count_15_64 (number of persons aged 15 to 64), and Count_65+ (number of persons aged 65 or older).
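
The three kinds of links described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Data Merge Tool's own implementation: the sample records, the field names HHID, COMMID, and REGION, and the function merge_to_households are all hypothetical. The household is taken as the unit of analysis, and the sketch assumes every household's COMMID appears in the community table.

```python
from collections import defaultdict

# Hypothetical sample data; all identifiers and values are illustrative.
communities = {"C1": {"REGION": "North"}, "C2": {"REGION": "South"}}
households = [
    {"HHID": "H1", "COMMID": "C1"},
    {"HHID": "H2", "COMMID": "C2"},
]
persons = [
    {"HHID": "H1", "AGE": 41},
    {"HHID": "H1", "AGE": 39},
    {"HHID": "H1", "AGE": 12},
    {"HHID": "H1", "AGE": 70},
    {"HHID": "H2", "AGE": 67},
]


def merge_to_households(households, communities, persons):
    """Build one merged record per household (the unit of analysis)."""
    # Group person records by the household they belong to.
    members = defaultdict(list)
    for person in persons:
        members[person["HHID"]].append(person)

    # Largest household size, so every record gets the same AGEnn columns.
    max_size = max((len(group) for group in members.values()), default=0)

    merged = []
    for household in households:
        row = dict(household)

        # One-to-many: copy community-level variables onto the household.
        row.update(communities[household["COMMID"]])

        ages = [person["AGE"] for person in members[household["HHID"]]]

        # Many-to-one, replication: one variable per linked person record.
        # Households with fewer members are padded with None.
        for i in range(1, max_size + 1):
            row[f"AGE{i:02d}"] = ages[i - 1] if i <= len(ages) else None

        # Many-to-one, aggregation: filtered counts by age group.
        row["Count_0_14"] = sum(1 for age in ages if age <= 14)
        row["Count_15_64"] = sum(1 for age in ages if 15 <= age <= 64)
        row["Count_65+"] = sum(1 for age in ages if age >= 65)

        merged.append(row)
    return merged


merged = merge_to_households(households, communities, persons)
```

In this sketch the four-person household H1 ends up with AGE01 through AGE04 plus the three count variables, while the one-person household H2 carries a value only in AGE01 and None in the remaining age variables, so that all household records share the same set of columns.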

Studies that can be used with the Data Merge Tool

The Data Merge Tool is currently unavailable.

The Data Merge Tool was created as part of the Data Sharing for Demographic Research (DSDR) project with support from the Eunice Kennedy Shriver National Institute of Child Health and Human Development.