Los Angeles Family and Neighborhood Survey (L.A.FANS): Adult Module

QUESTION: In the LAFANS-2 Adult module questionnaire, there is no skip listed for AC41=5 (no, do not have a visa, etc.). Do they get asked the AC42 to AC48 questions about visas and other documents?

RESPONSE:

No, those with AC41=5 were skipped to AC49 in the CAPI program, which is what was supposed to happen. There is just a simple typo in the L.A.FANS-2 Adult questionnaire where the note on the skip was inadvertently omitted.

QUESTION: I’ve been compiling a longitudinal dataset based on 1055 RSCs and SIBs, aged 11 or over, who completed the Child Module at LAFANS1 (W1). I’m looking now at the data for those respondents who were interviewed at LAFANS2 (W2). Out of the 1055, I’ve identified 413 who had “aged-up” (age >=18) and completed an Adult Module at W2. I am starting to look at the W2 education-related information (Section D) for these 413 cases. My work involves looking at W2 education reports in comparison to W1 reports. I am running into some significant problems.

I’ve constructed a data file that incorporates the W2 adult data (for those who responded to it) into the W1 child data. It is confusing because some of the item numbers and skip patterns in the codebook and datafile do not clearly match those in the questionnaire. The most troubling issue is: Although I have data to verify that these 413 cases are panel respondents (a mix of RSCs and SIBs from W1), I am encountering data that are completely missing on items on which I would expect there to be data, and data are present on items that I would expect to not have data. As an example, for these 413 cases:

AD1A: (CAPI check) Respondent Was Not Interviewed at W1? Yes (1) = 413
AD1: (CAPI check) Respondent born in USA Yes (1) = 332; No (5) = 81
AD2-AD10 do appear to have the correct # of respondents (#s are based on the N=81 from AD1 and are reduced further based on response to AD2).

Based on info in the questionnaire, I was expecting only those respondents who had not completed a W1 interview (and who completed some/all schooling in US) to have valid data on AD11. That is not the case as all 413 have a value in AD11. In contrast, based on info about my sample and the questionnaire, I was expecting ALL of my respondents to have something recorded on AD26 — the item that marks the start of questions confirming (or correcting) educational attainment and status reported at W1 (AD26-AD41). But, that is not the case as AD26-AD41 are missing for all 413 respondents.

Could you please help me understand what is going on?

RESPONSE:

As you are discovering, the W2 adult module mostly had preloaded information for those who did the W1 adult module. W1 info for the aged up RSC/SIB is largely not in the W2 adult module because preloaded info was mainly pulled from the W1 adult module, which those kids obviously did not do.

The wording for the “2” response on AD1A in the questionnaire has a typo: it should not have mentioned RSCs or SIBs. What AD1A really is, as confirmed by checking the CAPI coding, is an indicator for whether the person did the W1 ADULT module. That is what it was always supposed to be, so the CAPI did what we had intended.

RSCs and SIBs did not do the adult module in W1, so they would have AD1A=1 (yes, did not do W1 adult module). Thus they would be asked the education questions that immediately follow. This was much simpler than trying to mirror W1 education preloaded info by creating it from W1 child and parent module data and then use it to ask them AD26-AD58.

So all RSC18+ and SIB18+ would do AD1-AD22 and then skip to AD62 where they are then asked the detailed questions about schooling since W1.

The W1_AD19 only applies to those who did the W1 adult module as does the W1_SCHCOMP since those variables were pulled from W1 adult module data. You would have to create something similar for those RSC18+ and SIB18+ using data from their W1 child and parent modules.

QUESTION: I found an error in the L.A. FANS adult data. It seems that the individual with hhid=28659 and pid=1 has a mistake in the adult public data set. Although their interview date was 10/1/00, and their EHC start date is thus 10/1/98, the variables “bwmo”, “bwday” & “bwyear” were coded as 3/27/98. I discovered it when I found an error in a variable I calculated using both the bw variables and the interview date variables.

RESPONSE:

Case HHID=28659 is one where the person was interviewed twice–probably to finish up an interview. The FI redid the adult module for the person some months later. So the ADATE reflects the second interview as CAPI used it to overwrite the first date. However, the BWMO/BWDAY/BWYEAR, which are fields created by the CAPI program, were apparently set at the first interview and not reset at the second by the CAPI program. So, the BWMO/BWDAY/BWYEAR represent what the CAPI program was using for the start date of the 2 year interval in the Adult module at the time of the later interview. Thus the two year window might be a bit longer depending on when a section using the 2-year window was done.

Needless to say, the CAPI program was not designed to deal with a module interview being stretched out over a long period of time. The assumption is that a module will be done all in one day or within a few days of when the interview started. No one assumed there’d be a case where it might take 6 months.

I wouldn’t worry about the possibility that the 2-year window may be 2.5 years for this one case. It’s likely not to have made any difference unless the person had an event change in the extra part of the window (i.e., 3/27/98 to 10/1/98). Also, since we have no way to know for sure if any of the dates a respondent reported are when something actually happened (people don’t always remember dates that well), there’s already a certain amount of error that naturally exists in any computed intervals from recall data.
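To see how far a case like this departs from the nominal 2-year window, the window length can be computed directly from the two dates. The following is a minimal sketch in Python, not part of the L.A.FANS documentation; the function names and the 31-day tolerance are illustrative assumptions.

```python
from datetime import date

def window_years(bw_start, interview):
    """Length of the recall window in years (using 365.25-day years)."""
    return (interview - bw_start).days / 365.25

def is_stretched(bw_start, interview, nominal_years=2.0, tolerance_days=31):
    """True when the window exceeds the nominal length by more than the tolerance.

    The tolerance is an arbitrary illustrative choice, not an L.A.FANS rule.
    """
    extra_days = (interview - bw_start).days - nominal_years * 365.25
    return extra_days > tolerance_days

# Case HHID=28659: BWMO/BWDAY/BWYEAR = 3/27/98, interview date = 10/1/00
bw = date(1998, 3, 27)
adate = date(2000, 10, 1)

print(round(window_years(bw, adate), 2))            # 2.52
print(is_stretched(bw, adate))                      # True
print(is_stretched(date(1998, 10, 1), adate))       # False: the intended window
```

Running the same check over every adult record would show whether any other cases have a window noticeably longer than two years.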

QUESTION: There are discrepancies in the numbers of foreign-born and US-born adult respondents, so I am unsure which to use. As far as I can tell, there are 3 variables in the public data set regarding country of birth, AC34_CYR, CAPI check AC34_4, and AD1. The total number of US-born vs. foreign-born in each of these variables contradict each other.

I am also curious about the numbers of legitimate skips appearing in these questions… In what situation would question C34 and D1 in the adult module be skipped?

RESPONSE:

With regard to AC34 and AD1, there were some problems in the CAPI program and with FI errors. We tried to clean those up as best we could, but we did not catch everything. Note that AC34_STR may have a value for California or a US region while AC34_CYR is missing. AC34_4 is based on the responses to both AC34_STR and AC34_CYR, not on AC34_CYR alone. AD1 had various problems, as noted in the adult1 codebook. It has people with AD1=. who have AC34 info that denotes native vs. foreign birth. We did not always go in and clean up missings or DK responses because we had info from elsewhere that suggested what the value really should be. If we did not have confirming info from another source, we often left inconsistencies for analysts to decide. As noted in the main codebook intro materials, we tried not to “over clean” the data. That results in these sorts of little discrepancies.

The term “legitimate skip” is used for blank values. 99 percent of the time, blanks mean the person was deliberately not asked that question. However, there were CAPI and FI problems that resulted in blanks where questions should have been asked. We tried to assign -5 values to such cases, but did not catch all of them. AD1, as noted above, had a lot of problems, so it’s not surprising that it would show more discrepancies. There are 3 people who have AC34_4=1 but AD1=. They are also blank on AD2-AD10, which is expected for the native born. Setting AD1=1 for them was just missed.

Basically, AC34_4 is the best measure for actually being born in the US. We did try to clean that up so it was correct as best as we could tell.
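One way to see where the two measures disagree is to cross-tabulate AC34_4 against AD1 and list the cases where AC34_4=1 but AD1 is blank. The sketch below uses plain Python with made-up records; `None` stands in for the “.” blank, and the `crosstab` helper is a hypothetical name, not something supplied with the data.

```python
from collections import Counter

def crosstab(records, row_var, col_var):
    """Count records for each (row_var, col_var) pair; None stands in for '.'."""
    return Counter((r.get(row_var), r.get(col_var)) for r in records)

# Hypothetical records for illustration only; not actual L.A.FANS data.
adult = [
    {"AC34_4": 1, "AD1": 1},
    {"AC34_4": 1, "AD1": 1},
    {"AC34_4": 1, "AD1": None},   # US-born per AC34_4, but AD1 left blank
    {"AC34_4": 1, "AD1": None},
]

tab = crosstab(adult, "AC34_4", "AD1")
for (ac34_4, ad1), n in sorted(tab.items(), key=str):
    print(f"AC34_4={ac34_4}  AD1={ad1}  n={n}")

# Cases where AD1 could reasonably be treated as 1 in an analysis file
fix_candidates = [r for r in adult if r["AC34_4"] == 1 and r["AD1"] is None]
print(len(fix_candidates))
```

Whether to recode such cases or simply rely on AC34_4 is an analyst's decision; the sketch only identifies them.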

QUESTION: I have a set of questions are in regard to the sibling questions (G7-G19) of the adult module. What I would like to use is the educational attainment, gender, and age of the sibling closest in age to the respondent, asked about in questions G14-16. However, there are 1,430 legitimate skips in these 3 questions that I cannot explain (I have also included the distribution of these 3 variables from the adult dataset in the attached file). This number of skips does not match the numbers of respondents who reported having no brothers or sisters in question G7 (N=248), or those who reported that all their brothers and sisters are no longer living in question G8 (N=693). Do you know why 1,430 skipped these questions? Were they able to opt out in some way other than reporting they don’t have a sibling, refusing to answer, or reporting “don’t know”?

I am also unclear which sibling the respondents are being asked about in questions G10-13. Only 448 respondents are answering these questions, and I cannot figure out from the questionnaire on the LA FANS website or the codebook which of the previously reported siblings in questions G7-9 is being asked about in G10-13.

RESPONSE:

Regarding Section G, if you have read the questionnaire, remember that only RSAs were given Section G. Thus any PCG-only respondent or emancipated minor (EM) would not do Section G. Initially, however, the FIs were giving Section G to PCG-onlys/EMs, so before that was corrected, 309 PCGs got Section G. That leaves 629 PCG-onlys/EMs who did not get Section G. There were also around 40 people who started the Adult module but did not make it through to Section G. Remember that aside from those who said they had no siblings (AG7=0) or had no siblings who are still alive (AG9=1), those who have AG13>. (i.e., a nonmissing AG13, meaning those with only one living sibling) will also skip to AG19. Accounting for all of those gets a number close to 1,400.

So, basically, the responses in AG14-AG16 are for all the people who were supposed to get those questions. I think you just forgot to check all the skip patterns in Section G.
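The skip accounting described above can be checked by assigning each record a reason for a blank AG14-AG16 and counting the reasons. This is a rough sketch only: `GOT_SECTION_G` is a hypothetical flag the analyst would need to construct (covering PCG-onlys/EMs who were not given Section G and respondents who broke off before it), and the records are invented for illustration.

```python
from collections import Counter

def ag14_skip_reason(rec):
    """Return why AG14-AG16 would be blank, or None if they should have been asked.

    None values stand in for the '.' blank in the data files.
    """
    if not rec.get("GOT_SECTION_G", False):   # hypothetical analyst-built flag
        return "not given Section G"
    if rec.get("AG7") == 0:
        return "no siblings (AG7=0)"
    if rec.get("AG9") == 1:
        return "no living siblings (AG9=1)"
    if rec.get("AG13") is not None:
        return "one living sibling (asked AG10-AG13)"
    return None

# Hypothetical records for illustration only; not actual L.A.FANS data.
adult = [
    {"GOT_SECTION_G": False},
    {"GOT_SECTION_G": True, "AG7": 0},
    {"GOT_SECTION_G": True, "AG7": 3, "AG9": 1},
    {"GOT_SECTION_G": True, "AG7": 1, "AG13": 2},
    {"GOT_SECTION_G": True, "AG7": 4},
]

reasons = Counter(ag14_skip_reason(r) for r in adult)
for reason, n in reasons.items():
    print(reason, n)
```

Records that come back with no reason but are still blank on AG14-AG16 would be the ones worth a closer look.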

The questions in G10-G13, if you read the questionnaire, are for those with only one living sibling.

The LAFANS individual codebooks are not meant to supplant the LAFANS questionnaires since the codebooks do not have the actual questions. I think you’ll find it helpful to use the questionnaires in conjunction with the codebooks in figuring out response patterns.

QUESTION: How can I tell if the spouse of the Adult respondent is an RSA or PCG or neither?

RESPONSE:

The easiest way to know whether the spouse was the RSA, the PCG, or neither is to merge the RRSAi and RPCGi variables from ROSTHH1 onto ADULT1 and compare them to SPOUS_ID.

To know if the spouse is in the household, look at SP_RA11 (1 means FT hhld member, 2 means PT hhld member, . means no spouse in roster). To know if the spouse did an adult, pcg or parent module, use the SP_ADLT, SP_PCG and SP_PAR variables in ADULT1.
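The merge-and-compare step might look like the following in plain Python. This is a sketch under stated assumptions: that the two files are linked by a household identifier (called `HHID` here), that RRSAi and RPCGi hold person IDs directly comparable to SPOUS_ID, and that the records shown are invented. The `spouse_role` helper is a hypothetical name.

```python
def spouse_role(adult_rec, roster_by_hhid):
    """Classify the respondent's spouse as 'RSA', 'PCG', 'RSA/PCG', or 'neither'."""
    spouse = adult_rec.get("SPOUS_ID")
    if spouse is None:                       # None stands in for the '.' blank
        return "no spouse listed"
    hh = roster_by_hhid.get(adult_rec["HHID"], {})
    is_rsa = hh.get("RRSAi") == spouse
    is_pcg = hh.get("RPCGi") == spouse
    if is_rsa and is_pcg:
        return "RSA/PCG"
    if is_rsa:
        return "RSA"
    if is_pcg:
        return "PCG"
    return "neither"

# Hypothetical records for illustration only; not actual L.A.FANS data.
rosthh1 = [
    {"HHID": 101, "RRSAi": 1, "RPCGi": 2},
    {"HHID": 102, "RRSAi": 1, "RPCGi": 1},
]
adult1 = [
    {"HHID": 101, "PID": 1, "SPOUS_ID": 2},
    {"HHID": 101, "PID": 2, "SPOUS_ID": 1},
    {"HHID": 102, "PID": 1, "SPOUS_ID": 3},
]

# "Merge" the household-level roster onto the adult records by household ID
roster_by_hhid = {r["HHID"]: r for r in rosthh1}
roles = [spouse_role(a, roster_by_hhid) for a in adult1]
print(roles)  # ['PCG', 'RSA', 'neither']
```

The actual key variables and ID formats should be confirmed against the ROSTHH1 and ADULT1 codebooks before relying on this logic.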

QUESTION: I wanted to know the immigration status (type of visa etc) and health insurance status of the RSA’s spouse but could not find any information except for current health insurance (yes or no) from the household roster. Is this the only information in LAFANS about the spouse’s health insurance and immigration or am I missing something?

RESPONSE:

If the spouse of the RSA also was given the adult module (i.e., the spouse is the PCG or is one of a few cases where the spouse inadvertently was given the adult module), you can get that immigration information from the adult module record for them, and health insurance coverage can be found also in the EHC.

However, if the RSA’s spouse was not interviewed, you don’t know anything about the spouse’s immigration status. With regard to health insurance info other than the roster, you might be able to make some assumptions from the type of coverage mentioned in the EHC by the RSA.

Unfortunately, no specific questions on immigration or detailed questions on health insurance were asked about RSA spouses who were not also adult module respondents.

In cases where the RSA is a married male with children under 18, you’ll likely have detailed info on the spouse because she’s usually the PCG and thus has an adult module. However, when the RSA is a married female, you have more limited info on her husband. Similarly, married RSA males without children will also have limited info on their wives.

QUESTION: Is there a way to differentiate between different Latino groups e.g. Mexican Americans and Puerto Ricans?

RESPONSE:

If you look at the Adult module questionnaire and codebook, you’ll see that at question C27 we asked Adult respondents who said they were Latino to tell us if they were Mexican, Central American, Puerto Rican, Cuban, or some other Latin American group. We only asked this question of Adult respondents (i.e., RSAs and PCGs).

In the public use data, we combined Puerto Rican and Cuban to address privacy concerns. The Restricted Version 1 data contains the original more detailed C27 response.

QUESTION: Regarding non-response codes, I know from the codebooks (on p. 63) that “-5” is the code used for missing data (where a question should have been asked but was not) and that “.” is a legitimate skip code. For some variables, it looks as if they should not have been skipped, yet there are “.” values for these variables.

For example, I am looking at RSA’s (both RSA only’s and RSA/PCG), which is a total sample of n=2,623. I was looking specifically at variable am5 (which asks “Do you smoke cigarettes?”). From what I can tell from the questionnaire and the codebooks, it looks like this question should have been asked of all RSA’s, so the frequencies should add up to 2,623. But the total is 2,547, so it looks like 76 are missing (2623 minus 2547=76). So I was wondering why those weren’t coded “-5” rather than “.” Similarly for some other variables, I see that some are legitimate skips based on the skip pattern, but others seem to be missing data and are coded “.” rather than “-5”. I see though that with other questions, there are “-5” codes.

Can you provide any clarification on that?

RESPONSE:

The 76 with AM5=. in your example are adult respondents who never got to the AM5 question due to a breakoff. They all have ADLTCOMP=0. Those with ADLTCOMP=0 were not given -5 missing codes because they never got to those questions and thus were treated like those with ADLTCOMP=1 who legitimately skipped those questions.

The -5 missing code is used for people who actually got to the relevant section of the questionnaire but due to some FI or CAPI problem, a question they should have been asked was accidentally skipped.

It’s possible that there may be a few cases where we missed assigning a -5 code to such problems as we had to do these ex post, and as you know there are a lot of variables and a lot of skip patterns in the LAFANS.

Depending on what you’re doing, you may want to drop those with ADLTCOMP=0 to avoid such confusion.
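A minimal sketch of that restriction in plain Python follows. The records are invented, `None` stands in for the “.” blank, and the yes/no codes 1 and 5 for AM5 are assumed by analogy with other items; check the codebook for the actual values. The `tabulate` helper is a hypothetical name.

```python
from collections import Counter

def tabulate(records, var, completed_only=True):
    """Frequency count of var, optionally restricted to ADLTCOMP=1 cases."""
    if completed_only:
        records = [r for r in records if r.get("ADLTCOMP") == 1]
    return Counter(r.get(var) for r in records)

# Hypothetical records for illustration only; not actual L.A.FANS data.
adult = [
    {"ADLTCOMP": 1, "AM5": 1},
    {"ADLTCOMP": 1, "AM5": 5},
    {"ADLTCOMP": 1, "AM5": -5},    # should have been asked, accidentally skipped
    {"ADLTCOMP": 0, "AM5": None},  # breakoff before reaching AM5
    {"ADLTCOMP": 0, "AM5": None},
]

all_cases = tabulate(adult, "AM5", completed_only=False)
completed = tabulate(adult, "AM5")

print(all_cases[None])   # blanks among all cases, including breakoffs
print(completed[None])   # blanks remaining once breakoffs are dropped
print(completed[-5])     # true missing data among completed cases
```

After the breakoffs are dropped, any remaining blanks on an item that everyone should have answered are candidates for the missed -5 codes mentioned above.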
