Proposal for the documentation and handling of the repeated cross-national survey as well as of related comparison procedures

1. Working name for proposal
RC-NS (The Repeated Cross-National Survey approach to comparison issues)

2. Working Group
SIDOS / MetaDater / CDG

3. Alliance Member Sponsors
Center for Surveys, Methods and Analysis (GESIS/ZUMA)
Central Archive (GESIS/ZA)
Centro de Investigaciones Sociológicas (CIS)
Finnish Social Science Data Archive (FSD)
German Socio-Economic Panel (SOEP)
Inter-university Consortium for Political and Social Research (ICPSR)
Minnesota Population Center
Norwegian Social Science Data Services (NSD)
Roper Center
Swedish Social Science Data Service (SDA)
Swiss information and data archive service for the social sciences (SIDOS)
Washington State University

4. Architect
Reto Hadorn, SIDOS
reto.hadorn@sidos.unine.ch
Submitted on 16.9.2005

5. Summary

5.1 Class of studies being addressed
The proposal addresses so-called comparative studies. Because data files come in various structures, they are not an adequate reference here. What is at stake is the set of actions undertaken to achieve comparability; these actions involve data files in a variety of states and relationships. Since the term 'comparative' is rather poorly defined, we prefer to distinguish clearly between the space and the time dimensions, using dedicated terms to describe data elements and relationships in each case. On the space axis, we focus on the cross-national study as the paradigm for the more general cross-sample comparison. The cross-national study program, which is comparative by design, is treated as the paradigm for all kinds of comparisons.
The harmonization study, which treats comparisons between unconnected studies and datasets, can be considered a 'harmonization program'. A harmonization study must be defined to describe the intent of the comparisons to be made; it occupies the same position in the overall structure as the program study of the cross-national program. Many of the structures and processes defined for the cross-national study program can be re-used for the harmonization study.

It is our epistemological and methodological position that the comparison of any pair of variables across the whole database makes little sense, even if the whole hierarchy of metadata from variable to study and project level is taken into account. Any decision about the comparability of two variables, or about the harmonization procedure, depends on the scope of the comparison. The 'harmonization study' is the appropriate place to give that information. Consequently, no comparison should be made or documented outside the framework given by a study. This position establishes the RC-NS program as a general paradigm for describing comparative relationships between any kinds of variables.

5.2 Brief problem statement
The data in an RC-NS go through multiple stages, which DDI 2.0 does not account for. Starting with a standard definition, one gets a set of sample-specific datasets, which can be integrated over space or cumulated over time. The data model shall describe those various states.

The RC-NS involves series of identical or similar questions and variables. These series must be documented. They can be used to facilitate the treatment of the whole set of metadata (repetitions interpreted in terms of inheritance or reference). Once built, the series can be used to support the harmonization, integration and cumulation processes.
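The idea of series built from references can be illustrated with a small sketch. It is not part of the proposal's data model: the class Variable, the fields space_ref and time_ref, the functions series and variations, and the dataset names are all hypothetical, chosen only to show how a reference to the standard (space axis) and a reference to the previous wave (time axis) could be followed and used to document variations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variable:
    """One variable definition in a standard or a sample-specific dataset."""
    name: str
    dataset: str                              # hypothetical dataset identifier
    labels: dict                              # category code -> category label
    space_ref: Optional["Variable"] = None    # reference to the standard definition
    time_ref: Optional["Variable"] = None     # reference to the previous occurrence


def series(var):
    """Follow the time references back and return the series, oldest first."""
    chain = [var]
    while chain[-1].time_ref is not None:
        chain.append(chain[-1].time_ref)
    return list(reversed(chain))


def variations(var):
    """Return the category codes whose labels differ from the referenced standard.

    Each entry maps a code to the pair (label in standard, label in this variable).
    """
    ref = var.space_ref
    if ref is None:
        return {}
    codes = set(var.labels) | set(ref.labels)
    return {
        code: (ref.labels.get(code), var.labels.get(code))
        for code in sorted(codes)
        if ref.labels.get(code) != var.labels.get(code)
    }


# Illustrative data only: two waves of a standard and of one country dataset.
std_w1 = Variable("trust", "standard-w1", {1: "yes", 2: "no"})
ch_w1 = Variable("trust", "CH-w1", {1: "yes", 2: "no", 3: "don't know"},
                 space_ref=std_w1)
std_w2 = Variable("trust", "standard-w2", {1: "yes", 2: "no"}, time_ref=std_w1)
ch_w2 = Variable("trust", "CH-w2", {1: "yes", 2: "no"},
                 space_ref=std_w2, time_ref=ch_w1)

print([v.dataset for v in series(ch_w2)])
print(variations(ch_w1))
```

In this sketch the country variable does not repeat what it shares with the standard; only the deviation (an extra category) shows up when the reference is evaluated, which is one possible reading of "repetitions interpreted in terms of inheritance or reference".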
DDI 2.0 is basically a single-file approach, which does not support the definition of structural relationships between studies, between datasets, or between questions or variables from distinct studies or datasets.

The DDI definition lacks a conceptual level on which 'functional objects' can be defined, that is, objects which are not represented by a specific metadata element. The various compound datasets described in the proposal are such functional or virtual objects.

The study description object is not differentiated enough to map structures on the more finely defined levels (research project, data schema, concrete dataset) where the relationships have to be defined. This subdivision of the present study description is not developed in the present proposal.

The DDI often uses the 'statement' approach to structure various elements of information. The description of 'comparison' in the form of a 'comparability statement' therefore appears to be a legitimate way to describe comparisons. Unfortunately, it illustrates the variable-to-variable approach to comparison, not the comparison in context advocated by this proposal. From our point of view, one comparability statement can refer to only one specific comparison scope, so there is room for a multiplicity of variable-to-variable comparability statements for the same pair of variables. There are practical limits as well: imagine having to integrate into each variable-to-variable comparability statement information on all the metadata levels involved.

The challenge of extending the DDI data model is less a modeling problem than an epistemological and conceptual issue regarding the definition of the conditions of comparison. We should not skip that part of the discussion.

Other issues are supposed to be addressed by other working groups. The question structure is an example of this.
In this proposal, the close relationship between the question and the variables it creates is central (see the definition of the Question-and-Variable in the documents): this is actually what defines the structure of questions. We hope that the description of the questionnaire as a distinct object is not the ultimate scope of the instrument definition group.

5.3 Brief verbal summary of solution
This proposal takes the life-cycle stance and tries to answer the question: what are the various states that a study goes through when it is planned to be comparative across space (e.g. a cross-national study) and time (e.g. a repeated cross-national study)?

The idea is that a metadata model covering the life cycle of such a study must fulfill the requirements of the various steps of the work with the various datasets involved: the standard definition (questions and variable definitions), the country datasets, the dataset resulting from the integration of the country datasets, the next wave's standard, the corresponding country datasets and integrated dataset, the cumulated datasets on country level, and the cumulation of the integrated datasets.

References between study descriptions, questions and variables describe the kinship relationships across space (reference to the standard or coordination study) and over time (references to previous occurrences of the same question and variable). Those relationships are the backbones along which the synthetic datasets can be processed (the integration of country datasets as well as the cumulation of the repeated cross-sections or integrated datasets).

The metadata for the integrated and cumulated datasets can be derived from the metadata captured for describing the country studies/datasets and their relationships to the standard and its evolution through time.

Given the two generic approaches to the comparison issue ("start with the planned comparative study" vs. "just pick two variables at random and tell me how far they compare"), the proposal decidedly takes the first approach. Any comparison is based on a conceptual framework, which is set by a 'comparison study'. In the case of a comparative study program like EB or ISSP, this role is played by the study program or coordination study, which describes the coordination efforts. In the case of datasets or variables taken from uncoordinated studies, a 'comparison study' has to be set up, which sets the frame for the comparison of the critical information at all related levels (concepts, methodology, variable definitions). Variables compare not in abstracto but with reference to a conceptual and methodological framework. So the planned comparative study sets the scene for any kind of comparison.

The main data elements necessary for this approach are described at the end of the text explaining the proposal.

If this conceptual proposal is accepted, the next step concerns the representation of that concept for several related applications involved in the process. Concretely: what additional metadata elements are necessary for several IT systems using the same general model to communicate? Think of a comparative study involving a coordination team and several local teams, all using distinct instances of a system based on the same metadata model. This would be just one more step in the life-cycle stance.

6. Issues
Basically, everything that follows is about 'doing' something, not just describing; the description is essentially a by-product of the action. Life cycle means 'living data and metadata'.

Setting up the standard definition for an RC-NS, including standard questions and the expected (possibly pre-harmonized) dataset definition.
Using the standard definition to facilitate the definition of the metadata sets for the country datasets.
Defining references from the country questions and variables to the standard definition; the references also describe possible variations on the standard.
Making a global and detailed diagnosis of the variations between standard and country Q/V.
Defining an integrated dataset: adding harmonized variables, copying them into the country dataset definitions, defining the necessary computations in the local datasets, integrating harmonized data into the integrated dataset. All this is done using the references defined between related questions and variables.
Using the first standard definition to facilitate the definition of the standard for a repetition of the survey; references from the second to the first standard are created automatically, documenting the possible variations.
Defining references on country level between the first and the second implementations of the standard; those references document possible variations.
Making a global and detailed diagnosis of the variations between the first and the second wave (more generally: multiple waves).
Defining a cumulated dataset: adding harmonized variables, copying them into the wave-specific dataset definitions, defining the necessary computations for the various waves, cumulating harmonized data into the cumulated dataset. All this is done using the references defined between related questions and variables. The procedure is the same for cumulating country datasets or integrated datasets.
Defining a harmonization study, building references from candidate single studies to the harmonization study, defining possible harmonized variables, building references from variables which are candidates for harmonization. This results in a space x time structure of references along which integration and cumulation can be carried out in the same way as in the planned RC-NS.

Needless to say, some problems can be addressed only once real metadata reveal the limits of this conceptual model. This is a normal situation in the development of a complex model and has been so since the publication of DDI 1.0.

7. End result sought
To be able to support metadata capture for all metadata involved in an RC-NS and related comparative procedures;
to document accurately the relationships between the metadata elements and the variations between similar objects;
to support efficiently the procedures involved in publishing metadata for the compound datasets;
to support the analysis of variations over space and time and the integration/cumulation in synthetic files;
to facilitate the publication of the full documentation for the synthetic files, based on the references built into the metadata model and the metadata entered and produced along the way, including as a by-product of variable harmonization and computation procedures;
to define and conduct harmonization studies, in which data from uncoordinated surveys are harmonized and compared.

8. Rationale for change
RC-NS are among the most used datasets.
RC-NS offer an appropriate framework for other kinds of comparative projects.
RC-NS are a type of dataset in which metadata collected on various levels can be inherited and composed in various ways for the publication of the metadata in their final state (integrated/cumulated).

9. Use cases
The approach is defined in detail in a set of documents available on CDG's website (http://info1.za.uni-koeln.de/ddicdg/documents/RCNS_050706.zip). Please refer to them.

10. Background information
Included in the documents referred to in topic 9.

11. Data elements
The data elements are described in the sheet RC-NS.xls included in the set above.
To be noted: other parts of the DDI have to be remodelled so that the RC-NS approach fits in. These are notably the study description and the question structure.
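The integration and cumulation steps listed under topic 6 can be sketched in a few lines, under strong simplifying assumptions. Nothing here comes from RC-NS.xls: the functions harmonize and stack, the mapping format (harmonized name -> local name plus optional recode), and all variable names and values are hypothetical. The sketch only illustrates that, once each local variable carries a reference to a harmonized variable, the same stacking procedure can serve for integration over space and cumulation over time.

```python
def harmonize(records, mapping):
    """Compute harmonized variables for one local dataset.

    mapping: harmonized name -> (local variable name, recode dict or None).
    The mapping plays the role of the references between local and
    harmonized variables (an assumption of this sketch).
    """
    out = []
    for rec in records:
        row = {}
        for target, (source, recode) in mapping.items():
            value = rec.get(source)
            # Apply the recode if one is defined; unknown values pass through.
            row[target] = recode.get(value, value) if recode else value
        out.append(row)
    return out


def stack(parts, key):
    """Combine harmonized datasets, adding an identifier column named `key`."""
    return [{key: ident, **row} for ident, rows in parts.items() for row in rows]


# Illustrative data only: two country datasets in wave 1, one in wave 2.
ch_map = {"trust": ("q1", None)}
de_map = {"trust": ("f1", {"J": 1, "N": 2})}

ch_w1 = [{"q1": 1}, {"q1": 2}]
de_w1 = [{"f1": "J"}, {"f1": "N"}]
ch_w2 = [{"q1": 2}]

# Integration over space: country datasets -> integrated dataset.
integrated_w1 = stack({"CH": harmonize(ch_w1, ch_map),
                       "DE": harmonize(de_w1, de_map)}, "country")
integrated_w2 = stack({"CH": harmonize(ch_w2, ch_map)}, "country")

# Cumulation over time: the same procedure applied to integrated datasets.
cumulated = stack({1: integrated_w1, 2: integrated_w2}, "wave")
print(cumulated)
```

The mapping objects double as documentation: they record which local variable and which recode produced each harmonized value, which is the sense in which the description is a by-product of the action.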