Proposal of the DDI's Comparative Data Group
(draft version 1, August 31st, 2005)

[Open topics or questions in blue.]

1. Working name for proposal

Comparative Study Proposal (working title)

2. Working Group

Comparative Data Group

Iris Alfredsson, Swedish Data Archive (SDA)
Atle Alvheim, Norwegian Social Science Data Services (NSD)
Jan Goebel, German Socio-Economic Panel (SOEP)
Peter Granda, ICPSR
Reto Hadorn, Swiss Data Archive (SIDOS)
Ryan Johnson, Washington State University
Mari Kleemola, Finnish Data Archive (FSD)
Mark Maynard, Roper Center
Meinhard Moschner, Central Archive (GESIS/ZA)
Pilar Rey del Castillo, Centro de Investigaciones Sociológicas (CIS)
Ingo Sieber, German Socio-Economic Panel (SOEP)
Wendy Thomas, Minnesota Population Center
Achim Wackerow, Center for Surveys, Methods and Analysis (GESIS/ZUMA)
Oliver Watteler, Central Archive (GESIS/ZA)

3. Alliance Member Sponsors

Center for Surveys, Methods and Analysis (GESIS/ZUMA)
Central Archive (GESIS/ZA)
Centro de Investigaciones Sociológicas (CIS)
Finnish Social Science Data Archive (FSD)
German Socio-Economic Panel (SOEP)
Inter-university Consortium for Political and Social Research (ICPSR)
Minnesota Population Center
Norwegian Social Science Data Services (NSD)
Roper Center
Swedish Social Science Data Service (SDA)
Swiss information and data archive service for the social sciences (SIDOS)
Washington State University

4. Architect

[Who?]

[Role of the Architect according to the Manual: The lead organization designates an individual who will act as an Architect for the specification. The Architect is the contact person for any questions or comments from the Director, Expert Committee members, or the public. The Architect is responsible for shepherding the proposal through the review process.
Tasks include:
- Serve as Primary Contact for all questions
- Create and maintain the Comment Log, in which comments are received, logged, brought to the group (including the Implementer), discussed, and decided upon
- Add disposition information to the Comment Log, at which point an issue is closed (the same comment or issue cannot be raised twice)]

5. Summary

5.1 Class of data files [better: "studies"?] being addressed

The studies addressed comprise all those that are considered comparative or potentially comparable. This covers a great variety of cases in which comparison runs along the lines of time, space, or both. There is no easy definition, and the expanded DDI standard should not impose any restrictions.

[Examples from list Moschner]

5.2 Brief problem statement

[DDI 2.0 not able to handle certain problems of comparative studies so far.]

Comparison of studies is a wide area that needs to be broken down in order to handle it appropriately for documentation purposes. The major problem is that we are facing an abundant list of cases that can hardly be defined in a common way. Furthermore, the perspective of the DDI Alliance (and of the DDI Group before it) has shifted from a mere translation of existing codebooks into XML towards, for example, the definition of subsets without physical integration of the data, supersets covering series of studies or groups of studies with similar content (e.g. election studies), and the wide field of relationships between and within individual studies that might be turned into technical references (links) for the retrieval of information. We are increasingly turning away from human readability and towards machine actionability, since tools are being designed and built that revolutionize the world of documentation. In every case a kind of comparability statement should be included to show the relationship between the elements being compared.

[What is a study?
Do we need to define this, or is it irrelevant for us in this situation?]

Different questions can be raised when documenting a comparison:
a) What is being compared?
b) Where is the comparison being done?
c) Who does the comparison? (Authoring statement)
d) When is the comparison being done? Was it planned before (comparison by design) or done after the fact?
e) How is the comparison being done? Was an alteration process involved (e.g. recoding, translation)?

The comparison does not necessarily have to include a physical integration (appropriate label?) of the data underlying the study. Here we follow the proposal of Reto Hadorn and distinguish between a stage of virtual combination and a stage of physical combination. Following Hadorn, we use the term "compound" to designate datasets that might be compared over space and/or time (space-compound, time-compound, space-time-compound). It is important to note that at this level we are talking about datasets, not studies. A study can include more than one dataset, and a dataset can be present in more than one study. If the data is physically integrated, Hadorn suggests using the term "integration" for the combination of data over space (e.g. countries) and "cumulation" for the combination over time.

The Comparative Data Group is well aware that those terms, as well as terms like "study", are widely used throughout the social science community. We nevertheless deem it necessary to pin them down to a specific meaning that suits the needs of this proposal. The details of Hadorn's proposal are laid out in the Repeated Cross-national Dataset-package.

a) Questions a and b can be combined: What is being compared where?

Comparison is done on two levels: 1) the study level and 2) the variable level.

On the study level we need to consider the following topics:
1.1) the sampling process > Who was asked where, when and how?
[Do we share a common definition of sample?
Does it only cover the sampling method, or the entire sampling process including the choice of universe, the definition of the sample and the actual sampling process?]
- Who was asked? Demographic items.
- Where was this person asked? Geographic coverage → GEOGRAPHY WORKING GROUP
- When was this person asked? Field time, time and date of interview.
- How was the person asked? Interviewing instrument used. → INSTRUMENT WORKING GROUP
1.2) the theoretical concept

The major comparisons are done on the variable level. Here we have to consider the following topics:
2.1) the questions
- Question wording
- Answer categories and codes
- Interviewing instructions
- Theoretical concept
- Translation of the question wording, the answer categories or the interviewing instructions, done either following a master questionnaire or after the fact in a harmonization process.
- Adaptation of the question wording. This is similar to translation but does not necessarily involve another language (e.g. American and British English).
[What about the theoretical element "indicator"? We talk about questions and their theoretical concept, but the indicator should be part of the chain, since the theoretical concept is turned into one or more indicators, which in turn are operationalized by questions.]

b) Where is the comparison being done? (combined with question a)
[Tools?]

c) Who does the comparison? (Authoring statement)

d) Question d can be combined with question e: When and how is the comparison being done?

The two major points in time at which comparisons are considered are 1) before the fact (comparison by design) and 2) after the fact. In both cases harmonization steps might be necessary to complete the comparison. This harmonization process may include:
- the translation and adaptation of question wordings, answer categories and interviewing instructions,
- the recoding of answer categories,
- the construction of new variables.
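The recoding step named in the list above can be illustrated with a minimal Python sketch. All names in it (`Recode`, `apply`, the variable names, the example codes and the three-level target scheme) are hypothetical and are not part of DDI or of this proposal; the sketch only shows that a recode can carry its authoring and standard information together with the mapping itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recode:
    """Hypothetical record of one recoding step (not a DDI element)."""
    source_variable: str   # variable in the original dataset
    target_variable: str   # harmonized variable
    mapping: dict          # source code -> harmonized code
    author: str            # who did the recoding
    standard: str          # scheme the target codes follow

    def apply(self, values):
        # Codes without a mapping become None rather than being guessed.
        return [self.mapping.get(v) for v in values]


# Illustrative only: a five-point national education item collapsed
# into an assumed three-level comparative scheme.
education = Recode(
    source_variable="educ_national",
    target_variable="educ_harmonized",
    mapping={1: 1, 2: 1, 3: 2, 4: 3, 5: 3},
    author="coordination team",
    standard="assumed three-level scheme",
)

harmonized = education.apply([1, 3, 5, 9])
print(harmonized)  # [1, 2, 3, None]
```

Returning `None` for unmapped codes is a design choice of this sketch: it keeps any loss of information visible instead of silently folding unknown codes into an existing category.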
Therefore the following topics need to be taken into account:
d.1) translation > Who translates what, according to which standard, and using what method?
d.2) recoding > Who recodes what, how, and according to which standard? The question of how the recoding is done includes a construction statement, which might also comprise the actual programming code used, e.g. in a statistical software package, to calculate the new categories.
d.3) construction > Who constructs which variable, for which purpose, and according to which standard? The same prerequisites as for recoding hold true for the construction process. Furthermore, a derived variable can be made up of more than one indicator, each of which is in turn operationalized by a single question. Therefore the following relationships need to be kept in mind:
d.3.a) relations between two or more variables (n variables can make up a new variable),
d.3.b) relations between indicators (resp. questions) and variables (n indicators can make up one variable).

Harmonization can be seen as a continuum. At one end is a perfectly designed comparative study that needs no adaptive work after the surveys have been taken; at the other end is a comparative study by chance (or by accident), where the samples have to be weighted or filtered and the majority of variables have to be rearranged. Reality lies between those poles.

Harmonization can be divided into technical harmonization and harmonization in content. In the first case all information is kept (e.g. adaptation of codes over time); in the latter, information might be reduced to make variables comparable (e.g. construction of indices).

Standardization might be seen as a specific form of harmonization that uses a defined scheme according to which the changes are made.

If the variable is at the core of the comparison, the entire bulk of information on the study level becomes the context of that single variable.
This leads to a shift in magnitude similar to that of the ISO/IEC 11179 standard, which puts this single variable at the center of attention.

An important role in the field of harmonization is played by controlled vocabularies. Different groups of terms can be summarized under this topic:
1) Classification schemes (e.g. for the description of professions or educational levels),
2) Keywords in the form of a plain or hierarchical list, and
[A list of keywords can be derived from a textual analysis or another source. It does not need to be ordered or have a specific hierarchy. It might only be controlled by the person using it.]
3) Thesauri.
[A thesaurus is a controlled list of terms in alphabetic and systematic order. The systematic order concerns the semantic relationships between the terms. It covers a certain area of application.]

e) How is the comparison being done? (Combined with question d)

5.3 Brief verbal summary of solution

The Comparative Data Group does not propose specific elements or attributes to solve the shortcomings of the DDI standard so far. This is because (a) DDI already includes a great many elements and attributes that can be used for the documentation of comparisons, and (b) the redesign of the standard along a life-cycle model, including its modularization, opens up greater possibilities of repeating entire blocks of information that are needed for comparison.

Two proposals have been made to the CDG by members, which will be briefly introduced here. For a full description, see the corresponding documents on the Web page of the CDG.

Wackerow

Hadorn

This proposal takes the life-cycle stance and tries to answer the question: which are the various states that a study goes through when it is planned to be comparative through space (e.g. a cross-national study) and time (e.g. a repeated cross-national study)?
The idea is that a metadata model covering the life cycle of such a study must fulfill the requirements of the various steps of work with the various datasets involved: the standard definition (questions and variable definitions), the country datasets, the dataset resulting from the integration of the country datasets, the next wave's standard, the corresponding country datasets and integrated dataset, the cumulated datasets on country level, and the cumulation of the integrated datasets.

References between study descriptions, questions and variables describe the kinship relationships across space (reference to the standard or coordination study) and over time (references to previous occurrences of the same question and variable). Those relationships are the backbones along which the synthetic datasets can be processed (the integration of country datasets as well as the cumulation of the repeated cross-sections or integrated datasets). The metadata for the integrated and cumulated datasets can be derived from the metadata captured for describing the country studies/datasets and their relationships to the standard and its evolution through time.

Given the two generic approaches to the comparison issue ("start with the planned comparative study" vs. "just pick two variables at random and tell me how far they compare"), the proposal decidedly takes the first approach. Any comparison is based on a conceptual framework, which is set by a 'comparison study'. In the case of a comparative study program like EB or ISSP, this role is played by the study program or coordination study, which describes the coordination efforts. In the case of datasets or variables taken from uncoordinated studies, a 'comparison study' has to be set up, which sets the frame for the comparison of the critical information at all related levels (concepts, methodology, variable definitions). Variables compare not in abstracto but with reference to a conceptual and methodological framework.
So the planned comparative study sets the scene for any kind of comparison. The main data elements necessary for this approach are described at the end of the text explaining the proposal.

If this conceptual proposal is accepted, the next step concerns the representation of that concept for several related applications involved in the process. Concretely: what additional metadata elements are necessary for several IT systems using the same general model to communicate (think of a comparative study involving a coordination team and several local teams, all using distinct instances of a system based on the same metadata model)? This would just be one step more in the life-cycle stance.

6. Issues

[I put together topics that we have discussed so far. They are coming from the sources on our website and are covered under point 5.2.]

Comparability statement
External classification schemes
External keyword lists (including thesauri)
Harmonization process
- Translation and adaptation of question wordings, answer texts and interviewing instructions
- Recoding
- Computing or construction
- Application of standards before or after the fact (standardization)
Relations between
- two or more variables
- two or more questions concerning their wording
- a variable and n questions (or indicators)
- m datasets and n studies
- m projects and n studies
- two or more studies concerning the sampling
- two or more studies concerning the theoretical concept

7. End result sought

8. Rationale for change

9. Use cases

10. Background information

All necessary documents are available through the CDG's website at <http://info1.za.uni-koeln.de/ddicdg/main.html>. Of special importance is the Repeated Cross-national Dataset-package, which includes detailed descriptions of the terminology used throughout the proposal.

[List of comparative studies?]