Los Angeles Family and Neighborhood Survey (L.A.FANS): Household Economy/Imputed Income Data

QUESTION: I have a question about the faminc variable. There are 112 people with faminc==0; they are a screwy bunch; originally there were only about 39 missing faminc values.

How does one explain a faminc of 0? How was this variable created? These 112 people are an influential bunch of numbers.

I’ve imputed them as missing in a NON-multilevel analysis, and then income works out as it’s supposed to. Could they possibly originally be missing?

RESPONSE:

As noted in the documentation for IMPINC1/IMPINC1R (imputed_income.doc), faminc is the sum of famearn, alltrans, and astotinc.

In IMPINC1/IMPINC1R, there are 138 households with FAMINC=0. 68 of them did not do a household module (no_hhmod=1). So there are only 70 where faminc is zero and a household module was done.
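The definition and the filter above can be sketched in a few lines of Python. This is only an illustration: the records are invented, and the assumption that each record is a plain dictionary keyed by the variable names (famearn, alltrans, astotinc, no_hhmod) is ours, not a description of how the files are actually stored.

```python
# Hypothetical records; the field names mirror the variables discussed above,
# but the values are invented for illustration.
records = [
    {"famearn": 0, "alltrans": 0, "astotinc": 0, "no_hhmod": 1},
    {"famearn": 0, "alltrans": 0, "astotinc": 0, "no_hhmod": 0},
    {"famearn": 25000, "alltrans": 1200, "astotinc": 0, "no_hhmod": 0},
]

def family_income(rec):
    """Return faminc as the sum of its three documented components."""
    return rec["famearn"] + rec["alltrans"] + rec["astotinc"]

# All records with a family income of zero.
zero_all = [r for r in records if family_income(r) == 0]

# Only those zero-income records where a household module was done.
zero_with_module = [r for r in zero_all if r["no_hhmod"] == 0]

print(len(zero_all), len(zero_with_module))
```

Applied to the real file, the same two steps should reproduce the counts quoted above (138 zeros, 70 of them with a household module).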

Some of those 70 faminc=0 cases might well be true zeros. Remember that this is family income, not household income. So if, say, the RSA has no spouse or children, did not work in the previous year, and has no transfer or asset income, then faminc could legitimately be zero. This RSA might be age 27 and living at home with his/her parents. Due to the definitions of who would be given the household module, such an RSA would get the household module and not his/her parents. Another example is an RSA who lives with friends and neither earned anything nor received any transfer or asset income in the previous year. The faminc=0 cases that are a bit more suspect are those where the household contains only the respondent (and their spouse/partner and children, if relevant).

On the other hand, some of those with faminc=0 may be cases where the respondent said in the household module that they had no earnings or transfer income or asset income when in fact they did. In those cases, faminc=0 would be a nonresponse issue.

The IMPINC1/IMPINC1R data were created solely from the data collected in the household module. Earnings that might have been reported in the EHC were not used, since the EHC does not contain yearly earnings. Based on the EHC, around 14 of the 70 faminc=0 cases appear to have had a job in the previous year. However, those 14 said in the household module that they did not earn anything in the previous year. Those might be cases to treat as “missing” faminc as opposed to a legitimate zero.

The public assistance data in the EHC were also not used, since they provide no amounts. However, they could be used to see whether the respondent said in the EHC that they received SSI, food stamps, or public assistance in the previous year. If it appears from the EHC that this respondent did get public assistance in the previous year, then the faminc=0 case could be treated as “missing” as opposed to a true zero.

You’ll need to do a little cross-checking with the EHC to see which of the FAMINC=0 cases might well be missing as opposed to true zeros.
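One possible way to organize that cross-check is sketched below in Python. The two EHC flags (ehc_had_job, ehc_public_assistance) are hypothetical names for indicators the analyst would have to build from the EHC files; they are not actual L.A.FANS variables, and the rule itself is just one reasonable choice.

```python
def classify_faminc(faminc, ehc_had_job, ehc_public_assistance):
    """Label a faminc value using hypothetical EHC cross-check flags.

    faminc: family income, or None if already missing.
    ehc_had_job: True if the EHC suggests a job in the previous year.
    ehc_public_assistance: True if the EHC suggests SSI, food stamps,
        or public assistance in the previous year.
    """
    if faminc is None:
        return "missing"
    if faminc != 0:
        return "nonzero"
    # A zero that contradicts the EHC looks like nonresponse, not a true zero.
    if ehc_had_job or ehc_public_assistance:
        return "suspect_missing"
    return "plausible_zero"

# Invented examples of the three interesting situations.
print(classify_faminc(0, True, False))    # zero, but the EHC shows a job
print(classify_faminc(0, False, False))   # zero, nothing contradicts it
print(classify_faminc(18000, True, False))
```

Cases labeled "suspect_missing" could then be recoded to missing before analysis, while the "plausible_zero" cases are kept as zeros.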

QUESTION: I am confused about the variable “Roster Age” (AGE_YR) in the HHINC module. Whose age is being reported here? I would assume it is the respondent to the survey, but this module is for household income and there are a number of responses indicating that the “Roster Age” is 0-4 years old (93), 5 – 9 years old (153), etc.

RESPONSE:

As mentioned in the codebook for the HHINC file, the file has one record for each person in the household who received a type of income listed in the A12-A13 and A16-A17 questions of the household questionnaire.

The PID variable refers to the person receiving the income. Thus the AGE_YR and SEX characteristics are for the person receiving income, as are the variables MARITAL, SPOUS_ID, SP_RA11, RB1, and RB2_1-RB2_6. They are not for the household module respondent.

The RESP_ID variable is the id of the household module respondent for the family to which the person receiving income belongs.

HHRF indicates which family (among those for which a household module was done) the person receiving income belongs to.

In the HHLD file, which has one record per household, the PID and PID_B variables are the ids of the household module respondents, and the age/sex/etc. characteristics in the HHLD file are for the household module respondents.
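As a rough illustration of the distinction between the income recipient and the household module respondent, the Python sketch below flags HHINC-style records where the two differ. The records are invented, HHID is an assumed name for a household key, and the comparison assumes PID and RESP_ID share the same id space within a household; check the codebook before relying on any of this.

```python
# Hypothetical HHINC-style records. HHID is an assumed household key;
# PID, RESP_ID, and AGE_YR are the variables described above.
hhinc = [
    {"HHID": 101, "PID": 1, "RESP_ID": 1, "AGE_YR": 34},
    {"HHID": 101, "PID": 3, "RESP_ID": 1, "AGE_YR": 4},
    {"HHID": 102, "PID": 2, "RESP_ID": 1, "AGE_YR": 61},
]

def describe(rec):
    """Summarize whose characteristics the record carries."""
    return {
        "household": rec["HHID"],
        "recipient": rec["PID"],
        "recipient_age": rec["AGE_YR"],  # age of the recipient, not the respondent
        "recipient_is_respondent": rec["PID"] == rec["RESP_ID"],
    }

described = [describe(r) for r in hhinc]

# Records where the income recipient is someone other than the respondent.
non_respondent_recipients = [
    d for d in described if not d["recipient_is_respondent"]
]
for d in non_respondent_recipients:
    print(d)
```

In this toy data, the 4-year-old in household 101 is a recipient whose income was reported by a different person, which is the situation that prompted the question.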

QUESTION: How does a child age 0-5 receive any income? For example, in the codebook I see two records that appear to indicate that children, one age 3 and one age 5, received Social Security Income (HA16D_= 1).

RESPONSE:

Regarding young recipients of income in the HHINC file, we can only go by what respondents reported. Some respondents might have interpreted the question about who receives income of a particular type to include everyone in the family of the person getting the income. While the public assistance check or Social Security check may go to person A, the respondent may view it as the family of A receiving the money. There might also have been some interviewer error, but when there’s nothing to prove definitively that such error is the reason for an odd response, we don’t change the response.

As you work with household survey data, you’ll discover that respondents will give answers that are not what anyone expected. Some respondents will interpret questions differently than expected; others may be confused about what is being asked but not ask the interviewer for clarification. Others may be making up answers. Interviewer typos can also happen.

It’s up to the analyst to decide how to deal with responses that seem unusual or contradictory.

QUESTION: My understanding is that the imputed income/assets file (IMPINC1R) contains one observation for each household in HHLD1 (even those for which income/assets values were not imputed). So if we merge the imputed values into the household data, we can replace the household income/assets values with the imputed values for all cases. Is this right?

Also, I cannot seem to locate an imputed “asset” measure; that is, assets that include housing value. I can locate the NONHASST. Do I need to add the house value onto the value of NONHASST to get an imputed composite “assets” measure?

RESPONSE:

You are correct regarding imputed assets. NONHASST is imputed non-housing assets. You would have to add imputed home value (C_HOUSE) to that for a composite measure. Unfortunately, home value was only asked as a categorical variable. So only a category was imputed. You’ll have to figure out how to add a categorical value to NONHASST.
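One common workaround is to replace each home-value category with a representative dollar amount (for example, the midpoint of its range) and add that to NONHASST. The Python sketch below shows the idea only: the category codes and dollar amounts are hypothetical placeholders, not the actual C_HOUSE categories, and treating a missing C_HOUSE as "no home" is an assumption the analyst should verify.

```python
# HYPOTHETICAL mapping from C_HOUSE category code to a representative
# dollar value (e.g. the midpoint of the category). Replace with the real
# category bounds from the codebook.
HOME_VALUE_MIDPOINTS = {
    1: 50_000,
    2: 150_000,
    3: 350_000,
}

def composite_assets(nonhasst, c_house):
    """Return NONHASST plus an approximate dollar home value.

    nonhasst: imputed non-housing assets in dollars, or None if unavailable.
    c_house: imputed home-value category code, or None (assumed here to
        mean the family has no home value to add).
    """
    if nonhasst is None:
        return None
    if c_house is None:
        return nonhasst
    # An unknown category code raises KeyError rather than guessing a value.
    return nonhasst + HOME_VALUE_MIDPOINTS[c_house]

print(composite_assets(12_000, 2))
print(composite_assets(12_000, None))
```

Midpoints are only one choice; an open-ended top category in particular needs a separate decision about what value to assign.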

Remember that if the RSA is not the PCG or spouse of the PCG of the RSC, then a second household module was done with the PCG as the respondent (this was only a couple hundred households or so).

Thus, there is one record in IMPINC1R for each selected household module respondent (HHRF is the family number within the household) for households where more than just a roster was completed. If a household only had a roster done and no other module, we did not impute income and assets for them.

QUESTION: The imputed home value (C_HOUSE) is the market value of the house, right? If the individual doesn’t fully own the house yet, this is equity plus debt. If we want to measure wealth, we need to subtract off the debt first. Do we have a measure of equity (housing wealth)? I couldn’t find an imputed measure of equity or an imputed amount of debt remaining on the house in the new data, and also could not find non-imputed measures in the codebook.

RESPONSE:

Yes, C_HOUSE is the market value: it’s what you think you’d get if you sold the house today. The variable PRINLEFT is an estimate of the mortgage principal that would still be outstanding (assuming a 30-year fixed-rate mortgage); it was calculated using a standard mortgage formula. It is discussed in the documentation for IMPINC1R.
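For readers unfamiliar with that kind of calculation, the Python sketch below shows the textbook remaining-balance formula for a fixed-rate, fully amortizing loan with monthly payments. It is not taken from the IMPINC1R documentation, so the exact inputs and conventions used to build PRINLEFT may differ; the function and parameter names are ours.

```python
def remaining_principal(principal, annual_rate, years_elapsed, term_years=30):
    """Outstanding balance on a fixed-rate, fully amortizing mortgage.

    Uses the standard result B_k = P * ((1+r)^n - (1+r)^k) / ((1+r)^n - 1),
    where r is the monthly rate, n the total number of monthly payments,
    and k the number of payments already made.
    """
    r = annual_rate / 12
    n = term_years * 12
    k = years_elapsed * 12
    if k >= n:
        return 0.0  # loan fully paid off
    if r == 0:
        return principal * (1 - k / n)  # zero-interest loan pays down linearly
    growth_n = (1 + r) ** n
    growth_k = (1 + r) ** k
    return principal * (growth_n - growth_k) / (growth_n - 1)

# Invented example: $200,000 borrowed at 6% a year, 10 years into the term.
balance = remaining_principal(200_000, 0.06, 10)
print(round(balance, 2))
```

Housing equity would then be approximated as the home's dollar value minus the outstanding principal, bearing in mind that C_HOUSE is a category and first has to be converted to a dollar amount.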