Reto Hadorn

Notes
Element Structure (Name)
Nesting Level
Data Type
Occurrence
Machine Actionable

GENERAL COMMENT

Project
Table
Study
Description of a research or data collection project: the idea behind it, the authors, the funding agency, the general methodology, etc. A research project can be described before any data collection is made, e.g. when the project is planned or accepted by the funding agency (life cycle perspective). One project can feed several Studies, defined as data schemas, and one Study can be funded under several successive projects. Details should be developed elsewhere.
Link_Project_Study
Dataset
Defines the set of data expressing a single concrete sample.
In RC-NS terminology, it is a simple dataset. This is the place for time of collection, sample size, etc. Foreign key: Study. Normally, a dataset cannot belong to more than one Study (data schema). But in the context of an RC-NS, a dataset can be considered both from a 'local' perspective and from the 'cross-national' perspective. If the program and the local implementations are both described as studies, then there is an m:n relationship between studies and datasets and a link table is needed.

General data schema. Refers to a universe, a sampling type, a general design (single, comparison, repetition). If the very same schema is used repeatedly, or over distinct samples but without any variation, several datasets can be linked to the same study. The fields must be distributed accordingly between the two objects. It has yet to be decided whether a study program like ISSP or EB can be described with the data elements in the Study object or whether a distinct object has to be defined. This has to be discussed elsewhere in more detail.

Link_Study_Dataset
Foreign keys: id_Study and id_Dataset; only necessary if the study program is described as a study (which is the working hypothesis in this proposal).
Foreign keys: id_Project and id_Study
StudyDesign
Values
Cross-section (default) | Cross-sample | Longitudinal | Longitudinal Cross-sample (Repeated Cross-sample)

The reference data schema supposes the subdivision of the information presently in the Study description between three objects: the Study, the Project and the Dataset, where the relationships may be of various types (not a clear hierarchy at all steps). Although the RC-NS proposal supposes this kind of organisation, it concerns the general design of the metadata and cannot be further discussed here. Open question: should the whole program be described in a separate object, Program, or can it be described as a Study of the Program type? This is also a general design problem, which has to be discussed.
In the first case, a table has to be added below, and links defined with the other objects available; in the second case, we need a field for characterising the type of study (program or implementation study) and a reference between studies. The following relies on the second approach.

StudyStudyRef
Only if the special metadata structures for RC-NS are to be used. Informs the systems using the metadata about the use of the data structure; is not redundant with any other present DDI element. Cross-sample is preferred to cross-national as being more general.
StudyLevel
Program | Implementation
Only if the special metadata structures for RC-NS are to be used. Informs the systems using the metadata about the use of the data structure.
Id of the Coordination or Harmonization study
Id of country studies, uncoordinated studies involved in harmonization, or successive studies needed in the same space
Explanation of special characteristics of the link
Id_Super
Id_Sub
LinkType
Comment
Coded
Numeric
Long text
Qualifies the reference
TimeCoord
SpaceCoord
The position of a simple dataset on the time axis in a given survey program. For more explanations, see above.
Default value = "Single" * Example of a somewhat fancy list: Standard | Creta | Naxos | Zermatt | Florida | Compound | Integrated

The position of a simple dataset on the space axis in a given survey program. The position of a simple dataset within a compound dataset is defined by a set of two coordinates, one for time and one for space. These are not redundant with space and time information already available in the DDI; these coordinates must refer to a restricted list of points in space and time, which are appropriate for the given study. The value domains as designed in the MetaDater model are appropriate for the definition of such lists. In this case, the domain to be used is defined at Study level. The program will automatically add the values 'Standard', 'Compound' and 'Integrated' to the value list.
For more explanations, see documents.
Default value = "Single" * Example of values: 1998 | 2000 | 2002 | 2004 | Compound | Cumulated
Defines a reference between (local or successive) implementation studies and the program study (or study describing the program activities: coordination, standard definition, control of consistency, control of translation management, data processing, etc.).

The data elements described in this table relate to the concepts defined in the set of documents about the RC-NS approach.

A data model supporting the life cycle of the data and metadata necessarily represents the information in distinct stages of elaboration and in various versions. So, a special challenge of the representation of cross-national and diachronic datasets is to account for (1) changes over space and time and (2) various states of the 'thing' (standard, compounds, integrated/cumulated).

There is a special difficulty with cross-national or cross-sample data: various parts of the work are done in various places, by distinct agents, using distinct instruments. Even if they use the same instrument, say, a metadata management instrument, they may use various instances of it. This situation is part of the life cycle of (meta)data.

The present proposal attempts to formalise the various states in which (meta)data may exist, due to the various stages they go through. It does not solve the problems related to distributed work in the case of the cross-national study. In this sense, the proposal is not a definitive solution but an information structure, which would be a starting point for asking the next best questions.

The most important difficulty in drafting the proposal relates to the choice of the model of reference. Some details of the proposed solution may vary depending on the way the rest of the data model is built. This makes it impossible to make a fixed proposal: decisions must be taken in related domains.
Revising the DDI for more complex datasets cannot be done solely by adding new elements, as implied in the update procedure, but supposes a deeper revision of the concepts.

Cardinality of the relationship - The relationship between studies is basically a 1:n one: one superordinate study to several subordinate studies. Two arguments nevertheless favour a special linking table: (1) Those links are used only for a small number of studies, so it is not appropriate to add fields to the Study table, which will mostly stay empty. (2) Experience will possibly show the need for other kinds of links between studies, where more flexibility is needed. The link table diminishes the risk of a future design change on this matter.

To be defined

The elements defined in this table refer to a relational data model, including not only references to other elements defined in the document, in the way usually done in XML, but also linkage elements defined only by foreign keys and some attributes characterising the link. To be underscored: the evaluation of the model should not depend on the language used (ERD) but on the concepts it defines.

DsDsRef
No references between datasets have been evoked in the construction of the compound dataset as exposed in the reference document, probably because of a focus on studies and variables. They are not absolutely necessary. But as for the study design, it may be good practice to declare the relationships between datasets, as this may enable the program to give the user better support while checking the relationships at question and variable level.
Id of the Standard dataset or of any anterior dataset; id of integrated or cumulated dataset; id of dataset in harmonization study
Id of country datasets or any posterior dataset
Space | Time
IdSuper
IdSub

Cardinality - The relationship between datasets is basically a 1:n one: one superordinate dataset to several subordinate datasets.
Two arguments nevertheless favour a special linking table: (1) Those links are used only for a small number of datasets, so it is not appropriate to add fields, which will mostly stay empty; (2) Experience will possibly show the need for other kinds of links between datasets, where more flexibility is needed. The link table diminishes the risk of a future design change on this matter.

Table/Object
Question
Variable
Data types
Table/Object: a table in a relational model, with much information in related tables. Coded: a numeric value associated with multilingual semantics.

The position held here is that the definition of questions and variables is concomitant (there is no question which does not imply one or several variables of some type). It is a conceptual error to think of the description of a questionnaire independently of the variables fed by the questions. This does not mean that variables can only be defined by questions; but as far as question structures are the issue, the variables are necessarily part of the definition.

Question/Variable (Q/V)
Concept/Object
A question element represents what the user of a questionnaire usually identifies as a single question, whatever its internal complexity. Complexity is rendered by (1) a self-reference of the question to itself and (2) a relationship to all variables fed by the same question, organised by variable domains. For more details, refer to the Q/V package passed on to Tom Piazza. There should be no hierarchy between Question and Variable.
Field
Well... define it however you want.

References
Concept
References play a central role in constructing the compound datasets and preparing the computation of the integrated, the cumulated, the cumulated integrated and the harmonized datasets. Questions and variables have been said to constitute one single, but complex, object (see definitions in the first chapter of this document). How will these references be represented in the database?
The simplest way would be to do it with just a reference from question to question, assuming that the corresponding variable set is linked the same way. This would, however, force too many constraints on the definition of variables and reduce the flexibility of the system. In addition, variables that are not defined by questions could not have references. For this reason, reference information is necessary at both the question and the variable level.

QuestQuestRef
VarVarRef
IdSource: Id of the Standard, the Previous etc.
IdTarget: Id of the country, posterior object etc.
Standard | Integrated | Previous | Cumulated | Original | Computation
IdSource
IdTarget
RefType
VarCode
Variation
Comment

VariationId
Identical | Country question is wholly identical with question in reference questionnaire
Word | Var. in wording | Wording of question is different, aiming at similar content
VarStruc | Var. in question structure | Subquestion list is possibly different or multiple where standard is simple etc.
Word+VarStruc | Var. in wording and quest. struct. | Both wording and subquestion list are different
Subquestion wording (if available) and value structure identical with question in reference questionnaire
VarWord | Var. wording subQ | Variation in subquestion wording
VarVal | Var. values structure | Variation in the structure of values
VarWord+Val | Var. wording and values | Variations both in subquestion wording and structure of values
NotAv | Not available in standard Q. | Subquestion not available in reference questionnaire, although question is.
NotAppropriate | Evaluation not appropriate | For example: the question is open, answers are postcoded

VariationType
Reference type
Standard | Standard Q/V | Country Q/V
Integrated | Integrated Q/V
Previous | First occurrence of Q/V | Posterior wave Q/V
Cumulated | Any wave Q/V | Cumulated Q/V
Original | Q/V in any dataset in any study | Copy of Q/V in harmonization dataset
Computation | (Q)/V used in construction | Constructed V

Summary of the uses of the reference tables for various reference types
Cf. summary below
Cf. values on the right

References between study descriptions, questions and variables describe the kinship relationships across space (reference to the standard or coordination study) and over time (references to previous occurrences of the same question and variable). Those relationships are the backbones along which the synthetic datasets can be processed (the integration of country datasets as well as the cumulation of the repeated cross-sections or integrated datasets). The metadata for the integrated and cumulated datasets can be derived from the metadata captured for describing the country studies/datasets and their relationships to the standard and its evolution through time.

Given the two generic approaches to the comparison issue ("start with the planned comparative study" vs. "just pick two variables at random and tell me how far they compare"), the proposal decidedly takes the first approach. Any comparison is based on a conceptual framework, which is set by a 'comparison study'. In the case of a comparative study program like EB or ISSP, this role is played by the study program or coordination study, which describes the coordination efforts. In the case of datasets or variables taken from uncoordinated studies, a 'comparison study' has to be set up, which sets the frame for the comparison of the critical information at all related levels (concepts, methodology, variable definitions). Variables compare not in abstracto but with reference to a conceptual and methodological framework. So the planned comparative study sets the scene for any kind of comparison.
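As an illustration only, the QuestQuestRef link table discussed in this document can be sketched as a relational table with Python's standard sqlite3 module. The field names IdSource, IdTarget, RefType, Variation and Comment and the two value lists are taken from the text; everything else is an assumption made for the sketch: the minimal Question table and its Label column, the column types, the composite primary key, the omission of VarCode, and the sample rows. It is not the proposal's actual schema.

```python
import sqlite3

# Value lists quoted from the document.
REF_TYPES = ("Standard", "Integrated", "Previous", "Cumulated", "Original", "Computation")
QUESTION_VARIATIONS = ("Identical", "Word", "VarStruc", "Word+VarStruc")


def quoted(values):
    """Render a tuple of strings as a SQL list literal for a CHECK constraint."""
    return ", ".join("'" + v.replace("'", "''") + "'" for v in values)


def build_demo_db():
    """Create an in-memory database with a hypothetical Question table
    and a QuestQuestRef link table (types and keys are assumptions)."""
    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")
    con.execute("CREATE TABLE Question (Id INTEGER PRIMARY KEY, Label TEXT NOT NULL)")
    con.execute(
        f"""
        CREATE TABLE QuestQuestRef (
            IdSource  INTEGER NOT NULL REFERENCES Question(Id),  -- the Standard, the Previous etc.
            IdTarget  INTEGER NOT NULL REFERENCES Question(Id),  -- the country, posterior object etc.
            RefType   TEXT NOT NULL CHECK (RefType IN ({quoted(REF_TYPES)})),
            Variation TEXT CHECK (Variation IN ({quoted(QUESTION_VARIATIONS)})),
            Comment   TEXT,
            PRIMARY KEY (IdSource, IdTarget, RefType)
        )
        """
    )
    return con


def targets_of(con, id_source, ref_type):
    """Return (IdTarget, Variation) pairs linked to one source question."""
    rows = con.execute(
        "SELECT IdTarget, Variation FROM QuestQuestRef "
        "WHERE IdSource = ? AND RefType = ? ORDER BY IdTarget",
        (id_source, ref_type),
    )
    return rows.fetchall()


con = build_demo_db()
# Hypothetical sample data: one standard question and two country questions.
con.executemany(
    "INSERT INTO Question (Id, Label) VALUES (?, ?)",
    [(1, "standard question"), (2, "country A question"), (3, "country B question")],
)
con.executemany(
    "INSERT INTO QuestQuestRef VALUES (?, ?, ?, ?, ?)",
    [(1, 2, "Standard", "Identical", None), (1, 3, "Standard", "Word", "wording adapted")],
)
print(targets_of(con, 1, "Standard"))  # [(2, 'Identical'), (3, 'Word')]
```

A separate link table of this kind keeps the 1:n references out of the Question table itself, which matches the argument made for the study and dataset link tables; a VarVarRef table could presumably follow the same pattern with the variable-level variation codes.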