From ilona_e at berkeley.edu Tue Oct 4 11:16:09 2005 From: ilona_e at berkeley.edu (Ilona Einowski) Date: Tue Oct 4 11:15:09 2005 Subject: FW: [DDI-ADG] Aggregate Spreadsheet Message-ID: Skipped content of type multipart/alternative-------------- next part -------------- A non-text attachment was scrubbed... Name: IE 10 03 05 nCubeCompleteChanges.xls Type: application/vnd.ms-excel Size: 41472 bytes Desc: not available Url : http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20051004/213ec547/IE100305nCubeCompleteChanges-0001.xls From ilona_e at berkeley.edu Wed Oct 5 14:14:14 2005 From: ilona_e at berkeley.edu (Ilona Einowski) Date: Wed Oct 5 14:14:05 2005 Subject: [DDI-ADG] need help clarifying descriptions Message-ID: Wendy - after you left the call yesterday we had much discussion on these two elements: dimensionReference - Row 20 identifies the dimensionVariable describing this dimension variableReference - Row 21 identifies the dimensionVariable describing this dimension The new descriptions I attempted were as follows: dimensionReference - Row 20 indicates the name of the variable to be used as one of the coordinates of the nCube, e.g. gender (GENDER1), or age (AGE), or party affiliation (PARTYID). I'm not sure if this note should be here describing dimensionReference or if it belongs in Row 21 to describe dimensionVariable and this Row 20 should say "unique name for this particular dimension of this variable" variableReference - Row 21 see note above for Row 20.
Or is it: identifies the variable in the file describing this dimension in the special cases where the catValu (Section 4 Variable 4.3.18.1) provides the field content and the labl (Section 4 Variable 4.3.18.2) provides the code because in the file there is a physical record for each reported value of the variable; for example, in County Business Patterns there is a variable for Industry Code. In current Scheme it is defined as: an IDREF that points to the variable that makes up this dimension of the nCube. I promised the group I would forward this to you and see if you could untangle it for us. Please reply all so we can all get on the same page ASAP. Thanks! Ilona Ilona Einowski Assistant Director, User Services UC Data Archive & Technical Assistance 2538 Channing Way #5100 Berkeley, CA 94720-5100 510-642-6571 ilona_e@berkeley.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20051005/2212b6d3/attachment.html From wlt at pop.umn.edu Wed Oct 5 14:19:19 2005 From: wlt at pop.umn.edu (Wendy Thomas) Date: Wed Oct 5 14:19:46 2005 Subject: [DDI-ADG] need help clarifying descriptions In-Reply-To: Message-ID: I'll try to get to this today, but since these are additions from J's version I need to sit down and look at the whole thing. I apologize that this has been low on my list, but it's just the way things go sometimes. wendy On Wed, 5 Oct 2005, Ilona Einowski wrote: > Wendy - after you left the call yesterday we had much discussion on these > two elements: > > > > > dimensionReference - Row 20 > identifies the dimensionVariable describing this dimension > > > > > > variableReference - Row 21 > > identifies the dimensionVariable describing this dimension > > > > The new descriptions I attempted were as follows > > dimensionReference - Row 20 > > indicates the name of the variable to be used as one of the coordinates of > the nCube e.g.
gender (GENDER1), or age (AGE), or party affiliation > (PARTYID) . I'm not sure if this note should be here describing > dimensionReference or if it belong in Row 21 to describe dimensionVariable > and this Row 20 should say "unique name for this particular dimension of > this variable" > > variableReference - Row 21 > > > > see note above for Row 20. > 0r is it: identifies the variable in the file describing this dimension in > the special cases where the catValu (Section 4 Variable 4.3.18.1) provides > the field content and the labl (Section 4 Variable 4.3.18.2) provides the > code because in the file there is a physical record for each reported value > of the variable for example, in County Business Patterns there is a > variable for Industry Code > > In current Scheme it is defined as: an IDREF that points to the variable > that makes up this dimension of the nCube. > > I promised the group I would forward this to you and see if you could > untangle it for us. > > Please reply all so we can all get on the same page ASAP. > > Thanks! > > Ilona > > > Ilona Einowski > Assistant Director, User Services > UC Data Archive & Technical Assistance > 2538 Channing Way #5100 > Berkeley, CA 94720-5100 > 510-642-6571 > ilona_e@berkeley.edu > > > Wendy L. Thomas Phone: +1 612.624.4389 Data Access Core Director Fax: +1 612.626.8375 Minnesota Population Center Email: wlt@pop.umn.edu University of Minnesota 50 Willey Hall 225 19th Avenue South Minneapolis, MN 55455 From wlt at pop.umn.edu Wed Oct 5 15:45:25 2005 From: wlt at pop.umn.edu (Wendy Thomas) Date: Wed Oct 5 15:45:55 2005 Subject: [DDI-ADG] need help clarifying descriptions In-Reply-To: Message-ID: On Wed, 5 Oct 2005, Ilona Einowski wrote: > Wendy - after you left the call yesterday we had much discussion on these > two elements: > > > > > dimensionReference - Row 20 > identifies the dimensionVariable describing this dimension > I don't know what this is, as it was not in Version 2.0. It seems redundant.
The varRef (Row 21) provides the IDREF to the variable describing this dimension. The name of the dimension would be obtained from the label of the referenced variable. Sorry I can't shed any more light on this. The cohort (the ability to identify a subset of the categories of the dimension variable) is a separate thing and appears to be in rows 22-30. I'll have to go over this whole thing carefully. Wendy > > > > > variableReference - Row 21 > > identifies the dimensionVariable describing this dimension > > > > The new descriptions I attempted were as follows > > dimensionReference - Row 20 > > indicates the name of the variable to be used as one of the coordinates of > the nCube e.g. gender (GENDER1), or age (AGE), or party affiliation > (PARTYID) . I'm not sure if this note should be here describing > dimensionReference or if it belong in Row 21 to describe dimensionVariable > and this Row 20 should say "unique name for this particular dimension of > this variable" > > variableReference - Row 21 > > > > see note above for Row 20. > 0r is it: identifies the variable in the file describing this dimension in > the special cases where the catValu (Section 4 Variable 4.3.18.1) provides > the field content and the labl (Section 4 Variable 4.3.18.2) provides the > code because in the file there is a physical record for each reported value > of the variable for example, in County Business Patterns there is a > variable for Industry Code > > In current Scheme it is defined as: an IDREF that points to the variable > that makes up this dimension of the nCube. > > I promised the group I would forward this to you and see if you could > untangle it for us. > > Please reply all so we can all get on the same page ASAP. > > Thanks! > > Ilona > > > Ilona Einowski > Assistant Director, User Services > UC Data Archive & Technical Assistance > 2538 Channing Way #5100 > Berkeley, CA 94720-5100 > 510-642-6571 > ilona_e@berkeley.edu > > > Wendy L.
Thomas Phone: +1 612.624.4389 Data Access Core Director Fax: +1 612.626.8375 Minnesota Population Center Email: wlt@pop.umn.edu University of Minnesota 50 Willey Hall 225 19th Avenue South Minneapolis, MN 55455 From ilona_e at berkeley.edu Wed Oct 5 17:13:32 2005 From: ilona_e at berkeley.edu (Ilona Einowski) Date: Wed Oct 5 17:13:48 2005 Subject: [DDI-ADG] Aggregate spreadsheet with IE changes Message-ID: Skipped content of type multipart/alternative-------------- next part -------------- A non-text attachment was scrubbed... Name: IE 10 05 05 nCubeCompleteChanges.xls Type: application/vnd.ms-excel Size: 46080 bytes Desc: not available Url : http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20051005/7eb4a4f1/IE100505nCubeCompleteChanges-0001.xls From mcneillh at MIT.EDU Thu Oct 6 17:25:11 2005 From: mcneillh at MIT.EDU (Katherine McNeill-Harman) Date: Thu Oct 6 17:25:27 2005 Subject: [DDI-ADG] Aggregate spreadsheet with KM changes Message-ID: <5.2.1.1.2.20051006171531.04412ed0@po11.mit.edu> Attached is the latest version of the aggregate data spreadsheet--revised the one Ilona just sent out--with the following changes: - I changed the name of the sheet to be called Aggregate Data, as I think that's a better description of what we're describing (ncubes are one aspect of the structure). - I went through the original notes column and fixed a few typos. - I added a sentence describing each module to the top of the sheet; someone suggested also labelling the sheets as such, but I didn't see a way to do that concisely; someone else could always try. And some questions (not enough to justify a notes column): - On the logical structure sheet, should 7-16 have descriptions? I don't know enough to give it a try. - On the same sheet, weren't we going to rename line 22 something different, as "cohort" has another meaning w/in the survey context? 
I can't find details in my notes from our meeting when we discussed it ~2 weeks ago, but I recall Mary mentioning something like subcategory or something. I'll be spending the rest of my time before Tuesday drafting the proposal document. I know J wanted to also finalize the spreadsheet by then, so it'll take a fair amount of someone's work to combine/reconcile the notes columns (plus deal w/the lingering questions). Kate ___________________________________________ Katherine McNeill-Harman Data Services Librarian Dewey Library for Management and Social Sciences Massachusetts Institute of Technology 77 Massachusetts Avenue, E53-100 Cambridge, MA 02139 mcneillh@mit.edu 617-253-0787 -------------- next part -------------- A non-text attachment was scrubbed... Name: KM 10 06 05 AggregateData.xls Type: application/octet-stream Size: 44032 bytes Desc: not available Url : http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20051006/6c779481/KM100605AggregateData-0001.obj From mcneillh at MIT.EDU Fri Oct 7 16:43:07 2005 From: mcneillh at MIT.EDU (Katherine McNeill-Harman) Date: Fri Oct 7 16:43:28 2005 Subject: [DDI-ADG] draft Aggregate Change Proposal Message-ID: <5.2.1.1.2.20051007143610.00bf0610@po11.mit.edu> (for those of you who weren't in last week's meeting, we're trying to organize the Aggregate data proposal quickly to prepare for the Ann Arbor meeting; this is a draft of part of that proposal) Attached is a first draft of the Aggregate Change Proposal; please send comments/feedback to the group (and feel free to add to it and send out new versions). It is missing the use cases (which J will insert), but in writing it I realized I also have the following specific questions for the group: 1) In geography proposal, J had section for known issues (that is, things that we know are problems with the proposal, that we want to make clear). Do we have any of these? 2) How is the description of the logical structure, specifically the attributes? 
Do we need more detail (and if so, what)? 3) The description of the modules in the physical structure is fairly general, without references to/descriptions of specific elements. Do people feel more detail is needed? If so, I'm still not crystal clear of exactly the role of all elements w/in the different modules (w/o the notes being finalized, found myself confused), therefore feel free to expand where you think it would be useful. Kate ___________________________________________ Katherine McNeill-Harman Data Services Librarian Dewey Library for Management and Social Sciences Massachusetts Institute of Technology 77 Massachusetts Avenue, E53-100 Cambridge, MA 02139 mcneillh@mit.edu 617-253-0787 -------------- next part -------------- A non-text attachment was scrubbed... Name: AggregateChangeProposal.doc Type: application/msword Size: 32768 bytes Desc: not available Url : http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20051007/6f59e82a/AggregateChangeProposal-0001.doc From j.b.gager at gmail.com Tue Oct 11 08:35:29 2005 From: j.b.gager at gmail.com (J Gager) Date: Tue Oct 11 08:36:09 2005 Subject: [DDI-ADG] Today's Meeting Message-ID: <001801c5ce60$43ff6340$6401a8c0@JGAGERLT> Hello All - I am wondering if today's meeting is necessary. I have everything I need for the proposals, and am hard at work finishing them. I think time may be better spent finishing these as opposed to meeting. Any objections? J -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20051011/c7194d10/attachment.html From sandai at umich.edu Tue Oct 11 08:46:13 2005 From: sandai at umich.edu (Sanda Ionescu) Date: Tue Oct 11 08:50:00 2005 Subject: [DDI-ADG] Today's Meeting Message-ID: <93111EED84D98E4C95F33D6114AB396830F7D5@isr-mail2.ad.isr.umich.edu> No problem - I can cancel, but I would like as many of you as possible to respond before, say, 10:30 am (EST) if they agree with canceling. OK? 
I personally don't need today's call, but I would like a chance to look at the aggregate data proposal in its final form, before it is actually sent out to the SRG. I suggest that J resends everything to the group when he's done, and we agree not to take longer than a couple of hours to read it through, so that he can submit it ASAP. Thanks :-) Sanda. ________________________________ From: ddi-adg-bounces@icpsr.umich.edu [mailto:ddi-adg-bounces@icpsr.umich.edu] On Behalf Of J Gager Sent: Tuesday, October 11, 2005 8:35 AM To: DDI-ADG Subject: [DDI-ADG] Today's Meeting Hello All - I am wondering if today's meeting is necessary. I have everything I need for the proposals, and am hard at work finishing them. I think time may be better spent finishing these as opposed to meeting. Any objections? J -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20051011/dc4709d7/attachment.html From mcneillh at MIT.EDU Tue Oct 11 09:03:32 2005 From: mcneillh at MIT.EDU (Katherine McNeill-Harman) Date: Tue Oct 11 09:03:47 2005 Subject: [DDI-ADG] Today's Meeting In-Reply-To: <93111EED84D98E4C95F33D6114AB396830F7D5@isr-mail2.ad.isr.umich.edu> Message-ID: <5.2.1.1.2.20051011090245.01689318@po11.mit.edu> All sounds good. As I said in my earlier email, there still seems to be a fair amount of work needed to be done on the spreadsheet (i.e. combining the notes fields), so I agree with sending those to the group prior to mailing. Kate At 08:46 AM 10/11/2005 -0400, Sanda Ionescu wrote: >No problem - I can cancel, but I would like as many of you as possible to >respond before, say, 10:30 am (EST) if they agree with canceling. > >OK?
> > > >I personally don't need today's call, but I would like a chance to look at >the aggregate data proposal in its final form, before it is actually sent >out to the SRG. > >I suggest that J resends everything to the group when he's done, and we >agree not to take longer than a couple of hours to read it through, so >that he can submit it ASAP. > > > >Thanks :-) > >Sanda. > > > >---------- >From: ddi-adg-bounces@icpsr.umich.edu >[mailto:ddi-adg-bounces@icpsr.umich.edu] On Behalf Of J Gager >Sent: Tuesday, October 11, 2005 8:35 AM >To: DDI-ADG >Subject: [DDI-ADG] Today's Meeting > > > >Hello All - > > > >I am wondering if today's meeting is necessary. I have everything I need >for the proposals, and am hard at work finishing them. I think time may >be better spent finishing these as opposed to meeting. Any objections? > > > >J >_______________________________________________ >DDI-ADG mailing list >DDI-ADG@icpsr.umich.edu >http://www.icpsr.umich.edu/mailman/listinfo/ddi-adg ___________________________________________ Katherine McNeill-Harman Data Services Librarian Dewey Library for Management and Social Sciences Massachusetts Institute of Technology 77 Massachusetts Avenue, E53-100 Cambridge, MA 02139 mcneillh@mit.edu 617-253-0787 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20051011/04b1ceda/attachment.html From vardigan at umich.edu Tue Oct 11 09:13:40 2005 From: vardigan at umich.edu (Mary Vardigan) Date: Tue Oct 11 09:14:46 2005 Subject: [DDI-ADG] Today's Meeting Message-ID: <93111EED84D98E4C95F33D6114AB3968311D75@isr-mail2.ad.isr.umich.edu> I agree with the others that the meeting doesn't sound necessary. Thanks so much to everyone for all you have done. This group has been quite amazing.
Mary ________________________________ From: ddi-adg-bounces@icpsr.umich.edu [mailto:ddi-adg-bounces@icpsr.umich.edu] On Behalf Of J Gager Sent: Tuesday, October 11, 2005 8:35 AM To: DDI-ADG Subject: [DDI-ADG] Today's Meeting Hello All - I am wondering if today's meeting is necessary. I have everything I need for the proposals, and am hard at work finishing them. I think time may be better spent finishing these as opposed to meeting. Any objections? J -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20051011/571fa706/attachment.html From j.b.gager at gmail.com Tue Oct 11 09:29:45 2005 From: j.b.gager at gmail.com (J Gager) Date: Tue Oct 11 09:30:42 2005 Subject: [DDI-ADG] Latest Spreadsheet Message-ID: <003201c5ce67$db5596e0$6401a8c0@JGAGERLT> Skipped content of type multipart/alternative-------------- next part -------------- A non-text attachment was scrubbed... Name: FINAL_AggregateData.xls Type: application/vnd.ms-excel Size: 49152 bytes Desc: not available Url : http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20051011/5ea091bb/FINAL_AggregateData-0001.xls From wlt at pop.umn.edu Tue Oct 11 10:09:46 2005 From: wlt at pop.umn.edu (Wendy Thomas) Date: Tue Oct 11 10:10:37 2005 Subject: [DDI-ADG] Today's Meeting In-Reply-To: <93111EED84D98E4C95F33D6114AB396830F7D5@isr-mail2.ad.isr.umich.edu> Message-ID: works for me...I have budget meeting this morning and would have to leave early anyway wlt On Tue, 11 Oct 2005, Sanda Ionescu wrote: > No problem - I can cancel, but I would like as many of you as possible > to respond before, say, 10:30 am (EST) if they agree with canceling. > > OK? > > > > I personally don't need today's call, but I would like a chance to look > at the aggregate data proposal in its final form, before it is actually > sent out to the SRG. 
> > I suggest that J resends everything to the group when he's done, and we > agree not to take longer than a couple of hours to read it through, so > that he can submit it ASAP. > > > > Thanks :-) > > Sanda. > > > > ________________________________ > > From: ddi-adg-bounces@icpsr.umich.edu > [mailto:ddi-adg-bounces@icpsr.umich.edu] On Behalf Of J Gager > Sent: Tuesday, October 11, 2005 8:35 AM > To: DDI-ADG > Subject: [DDI-ADG] Today's Meeting > > > > Hello All - > > > > I am wondering if today's meeting is necessary. I have everything I > need for the proposals, and am hard at work finishing them. I think > time may be better spent finishing these as opposed to meeting. Any > objections? > > > > J > > Wendy L. Thomas Phone: +1 612.624.4389 Data Access Core Director Fax: +1 612.626.8375 Minnesota Population Center Email: wlt@pop.umn.edu University of Minnesota 50 Willey Hall 225 19th Avenue South Minneapolis, MN 55455 From mcneillh at MIT.EDU Tue Oct 11 12:02:48 2005 From: mcneillh at MIT.EDU (Katherine McNeill-Harman) Date: Tue Oct 11 12:03:01 2005 Subject: [DDI-ADG] Latest Spreadsheet In-Reply-To: <003201c5ce67$db5596e0$6401a8c0@JGAGERLT> Message-ID: <5.2.1.1.2.20051011112310.03f766e8@po11.mit.edu> J and others, Have a couple of questions/concerns about this new sheet; would be interested in others' opinions: 1) Can you explain a bit the reasons for changing the descriptions at the top of each module sheet? I don't care that we use exactly the words I drafted, but yours seem to have different meaning and I want to make sure we're all on the same page. Namely, - For modules 2 and 3, you seem to be emphasizing that it's for a "single nCube structure"--can you expand upon what you mean by that? - Also, I'd like to ask you to consider putting back some of the wording I'd written for module 3 that makes it clear that the metadata and data are in one single DDI file; I don't think that comes across in your phrasing. 
- Lastly, for module 1, I believe it'd be helpful to include some explicit reference to the fact that the external file contains no metadata (to distinguish it from module 2). 2) I'm not sure about including only the additions in the first sheet, as I think the other fields provide helpful context. Might there be a way to distinguish the new fields (as you had done, e.g., w/color) while still keeping all of them? 3) Your change to module 1, while I have no objections, seems to be significant enough that we should discuss it as a group. I don't quite understand the purpose of this. Is this something you think you could explain/we could discuss in more detail over email (or maybe phone)? 4) Plus a couple of other questions about the elements: -- in describing the attribute location choice, you refer separately to a data file vs. a spreadsheet. I understand what you're trying to do, but am a little concerned about the mutually-exclusive manner in which they're described (b/other places we use the term "data file" to include all sorts of formats, including spreadsheets, and think it should still keep that broad meaning). So I'd suggest changing the terms to say something like "fixed-format/delimited data file" and "spreadsheet data file" to distinguish the types to clarify that we consider them both data files. - In module 2 F18, what do you mean by " the structure describes all data and meta data for the cube"--that sounds to me more like module 3. - Module 3 F19; the notes contains a question; can that be deleted or should it be moved to G19? When I agreed to cancelling today's meeting, I didn't realize that you'd have such significant changes, so if it's best to discuss these over the phone, maybe we can arrange another call. Kate P.S. 
Plus a couple of typos - Module 1, F25, should be "measurement"--also applies to M2 F31 - Module 1, F20, ID should be capitalized - Module 2, F20, should be "coordinates" At 09:29 AM 10/11/2005 -0400, J Gager wrote: >All - > >Here is the latest spreadsheet. Note there are few significant changes >that stemmed from a long discussion Wendy and I had. > >The first is the Logical sheet. I have gone back to just including the >new fields. I felt it best to do this, since we weren't changing any >existing fields, and I want the focus to only be on these additions. > >The second is the Physical Sheets - I have changed the name of these to >Record Layout, since that is what we a truly representing. > >Finally, I have changed module 1, to allow for data items to exist outside >of nCubes. Basically what I have done is create a way to reference an >nCube and its attached attributes. The basic concept that we had >originally is still there, it is just less deviant from the original, and >oft used structure. > >Please let me know of any structural issues ASAP as the samples and write >up is based on this. > >J > >_______________________________________________ >DDI-ADG mailing list >DDI-ADG@icpsr.umich.edu >http://www.icpsr.umich.edu/mailman/listinfo/ddi-adg ___________________________________________ Katherine McNeill-Harman Data Services Librarian Dewey Library for Management and Social Sciences Massachusetts Institute of Technology 77 Massachusetts Avenue, E53-100 Cambridge, MA 02139 mcneillh@mit.edu 617-253-0787 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20051011/24d9a317/attachment.html From j.b.gager at gmail.com Tue Oct 11 12:59:15 2005 From: j.b.gager at gmail.com (J Gager) Date: Tue Oct 11 13:00:05 2005 Subject: [DDI-ADG] Latest Spreadsheet In-Reply-To: <5.2.1.1.2.20051011112310.03f766e8@po11.mit.edu> Message-ID: <004901c5ce85$20deed20$6401a8c0@JGAGERLT> Kate - Thanks for your comments. Please see responses below. In general, I didn't think any change was significant enough to warrant further discussion. If anyone is still uncomfortable with these changes after this discussion, then we can schedule another meeting, but time is very very short for me, and the write ups for this aggregate piece are far more complicated and time consuming than I anticipated (there are a lot of details that need to be clearly explained). J -----Original Message----- From: Katherine McNeill-Harman [mailto:mcneillh@MIT.EDU] Sent: Tuesday, October 11, 2005 12:03 PM To: jgager@umich.edu; DDI-ADG Subject: Re: [DDI-ADG] Latest Spreadsheet J and others, Have a couple of questions/concerns about this new sheet; would be interested in others' opinions: 1) Can you explain a bit the reasons for changing the descriptions at the top of each module sheet? I don't care that we use exactly the words I drafted, but yours seem to have different meaning and I want to make sure we're all on the same page. Namely, - For modules 2 and 3, you seem to be emphasizing that it's for a "single nCube structure"--can you expand upon what you mean by that? What is meant by this for module 2 is that the file cannot contain multiple cubes. For instance, if there were 2 cubes, say population by region, gender, and age (cube 1) and population by region and gender (cube 2), the combination of these 2 cubes in the data file would look something like this.
MN M 50- 5300
MN M 50+ 6700
MN M     12000
MN F 50- 6800
MN F 50+ 5000
MN F     11800
Module 2 does not support this.
Its intention is to describe a file where all rows describe the same cube data. The same thing applies for module 3, since it is grouped by nCube. You would use module 3 to describe a single nCube at a time, and not a mix of nCubes and non-cubed data. - Also, I'd like to ask you to consider putting back some of the wording I'd written for module 3 that makes it clear that the metadata and data are in one single DDI file; I don't think that comes across in your phrasing. I think it is too constricting. It does allow for all to be in one file, but don't we want to allow the data to also be used this way in a separate file? - Lastly, for module 1, I believe it'd be helpful to include some explicit reference to the fact that the external file contains no metadata (to distinguish it from module 2). That may be misleading, since we are saying that attributes can exist in there, which are technically metadata. The distinction lies in the fact that module 1 data files do not state any of the cube coordinate values in them. 2) I'm not sure about including only the additions in the first sheet, as I think the other fields provide helpful context. Might there be a way to distinguish the new fields (as you had done, e.g., w/color) while still keeping all of them? It was really a matter of time, and of focusing attention. The model was incomplete to start with, and the effort of fleshing out all existing things from the tag library, and putting in definitions for them, is more work than it would be worth (in my opinion). 3) Your change to module 1, while I have no objections, seems to be significant enough that we should discuss it as a group. I don't quite understand the purpose of this. Is this something you think you could explain/we could discuss in more detail over email (or maybe phone)? The purpose of the change is to basically not change what is already in place.
When I spoke with Wendy, she pointed out how important it is to many people marking up data files to do so in the order in which the data occur in the file. So there may be a mix of cubed and non-cubed data. Furthermore, module 1 did not allow for any non-cubed data (everything was grouped into an nCube container). The change simply replaced the inclusion of a data item in an nCube by containership with inclusion by reference. The concept we initially had is still there, just represented differently. 4) Plus a couple of other questions about the elements: -- in describing the attribute location choice, you refer separately to a data file vs. a spreadsheet. I understand what you're trying to do, but am a little concerned about the mutually-exclusive manner in which they're described (b/other places we use the term "data file" to include all sorts of formats, including spreadsheets, and think it should still keep that broad meaning). So I'd suggest changing the terms to say something like "fixed-format/delimited data file" and "spreadsheet data file" to distinguish the types to clarify that we consider them both data files. - In module 2 F18, what do you mean by " the structure describes all data and meta data for the cube"--that sounds to me more like module 3. - Module 3 F19; the notes contains a question; can that be deleted or should it be moved to G19? I will make these corrections. When I agreed to cancelling today's meeting, I didn't realize that you'd have such significant changes, so if it's best to discuss these over the phone, maybe we can arrange another call. Time is VERY critical. We are presenting this to the SRG in one week, and need to send this out ASAP. I think the important thing is that we have something for the group to work with. I just don't have the time to finish the proposals AND meet. Kate P.S.
Plus a couple of typos - Module 1, F25, should be "measurement"--also applies to M2 F31 - Module 1, F20, ID should be capitalized - Module 2, F20, should be "coordinates" Will fix. At 09:29 AM 10/11/2005 -0400, J Gager wrote: All - Here is the latest spreadsheet. Note there are few significant changes that stemmed from a long discussion Wendy and I had. The first is the Logical sheet. I have gone back to just including the new fields. I felt it best to do this, since we weren't changing any existing fields, and I want the focus to only be on these additions. The second is the Physical Sheets - I have changed the name of these to Record Layout, since that is what we a truly representing. Finally, I have changed module 1, to allow for data items to exist outside of nCubes. Basically what I have done is create a way to reference an nCube and its attached attributes. The basic concept that we had originally is still there, it is just less deviant from the original, and oft used structure. Please let me know of any structural issues ASAP as the samples and write up is based on this. J _______________________________________________ DDI-ADG mailing list DDI-ADG@icpsr.umich.edu http://www.icpsr.umich.edu/mailman/listinfo/ddi-adg ___________________________________________ Katherine McNeill-Harman Data Services Librarian Dewey Library for Management and Social Sciences Massachusetts Institute of Technology 77 Massachusetts Avenue, E53-100 Cambridge, MA 02139 mcneillh@mit.edu 617-253-0787 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20051011/3eb28f8a/attachment-0001.html

From mcneillh at MIT.EDU Tue Oct 11 14:29:04 2005
From: mcneillh at MIT.EDU (Katherine McNeill-Harman)
Date: Tue Oct 11 14:29:18 2005
Subject: [DDI-ADG] Latest Spreadsheet
In-Reply-To: <004901c5ce85$20deed20$6401a8c0@JGAGERLT>
References: <5.2.1.1.2.20051011112310.03f766e8@po11.mit.edu>
Message-ID: <5.2.1.1.2.20051011142341.04048e30@po11.mit.edu>

Comments w/in (others, please comment as well; the most significant item
starts w/a *** below):

At 12:59 PM 10/11/2005 -0400, J Gager wrote:
>Kate -
>
>Thanks for your comments. Please see responses below. In general, I
>didn't think any change was significant enough to warrant further
>discussion. If anyone is still uncomfortable with these changes after
>this discussion, then we can schedule another meeting, but time is very
>very short for me, and the write ups for this aggregate piece are far more
>complicated and time consuming than I anticipated (there are a lot of
>details that need to be clearly explained).
>
>J
>-----Original Message-----
>From: Katherine McNeill-Harman [mailto:mcneillh@MIT.EDU]
>Sent: Tuesday, October 11, 2005 12:03 PM
>To: jgager@umich.edu; DDI-ADG
>Subject: Re: [DDI-ADG] Latest Spreadsheet
>
>J and others,
>
>Have a couple of questions/concerns about this new sheet; would be
>interested in others' opinions:
>
>1) Can you explain a bit the reasons for changing the descriptions at the
>top of each module sheet? I don't care that we use exactly the words I
>drafted, but yours seem to have different meaning and I want to make sure
>we're all on the same page. Namely,
>- For modules 2 and 3, you seem to be emphasizing that it's for a "single
>nCube structure"--can you expand upon what you mean by that?
>
>What is meant by this for module 2 is that the file cannot contain
>multiple cubes. For instance if there were 2 cubes, say population by
>region, gender, and age (cube 1) and population by region and gender (cube
>2), the combination of these 2 cubes in the data file would look something
>like this.
>MN M 50- 5300
>MN M 50+ 6700
>MN M 12000
>MN F 50- 6800
>MN F 50+ 5000
>MN F 11800
>Module 2 does not support this. Its intention is to describe a file where
>all rows describe the same cube data.
>
>The same thing applies for module 3, since it is grouped by nCube. You
>would use module 3 to describe a single nCube at a time, and not a mix of
>nCubes and non cubed data.

That's more clear; however, in module 2, how would one treat, e.g., a
spreadsheet file containing multiple sheets with a different cube on each
sheet?

>
> - Also, I'd like to ask you to consider putting back some of the wording
> I'd written for module 3 that makes it clear that the metadata and data
> are in one single DDI file; I don't think that comes across in your phrasing.
>
>I think it is too constricting. It does allow for all to be in one file,
>but don't we want to allow the data to also be used this way in a separate
>file?

***I believe that it's only the former, that if it lives in a separate
file it would be under module 1 or 2. Others, please confirm.

>
>- Lastly, for module 1, I believe it'd be helpful to include some explicit
>reference to the fact that the external file contains no metadata (to
>distinguish it from module 2).
>
>That may be misleading, since we are saying that attributes can exist in
>there, which are technically metadata. The distinction lies in the fact
>that module 1 data files do not state any of the cube coordinate values in
>them.

I can live w/that if others don't have any other suggestions.

>
>2) I'm not sure about including only the additions in the first sheet, as
>I think the other fields provide helpful context. Might there be a way to
>distinguish the new fields (as you had done, e.g., w/color) while still
>keeping all of them?
>
>It was really a time issue, and an issue of focusing attention. The model
>was incomplete to start with, and the effort of fleshing out all existing
>things from the tag library, and putting in definitions for them is more
>work than it would be worth (in my opinion).

Understand. I'd still lean the other way but will happily go w/the group
consensus.

>
>3) Your change to module 1, while I have no objections, seems to be
>significant enough that we should discuss it as a group. I don't quite
>understand the purpose of this. Is this something you think you could
>explain/we could discuss in more detail over email (or maybe phone)?
>
>The purpose of the change is to basically not change what is already in
>place. In speaking with Wendy, she pointed out how important it is to
>many people marking up data files to do so in the order in which the data
>occurs in the file. So there may be a mix of cubed and non cubed
>data. Furthermore, module 1 did not allow for any non cubed data
>(everything was grouped into a nCube container). The change simply
>replaced the inclusion of a data item into an nCube by containership, with
>inclusion by reference. The concept we initially had is still there, just
>represented differently.

Sounds OK to me; I'll leave others to comment.

>
>4) Plus a couple of other questions about the elements:
>-- in describing the attribute location choice, you refer separately to a
>data file vs. a spreadsheet. I understand what you're trying to do, but
>am a little concerned about the mutually-exclusive manner in which they're
>described (b/c other places we use the term "data file" to include all sorts
>of formats, including spreadsheets, and think it should still keep that
>broad meaning). So I'd suggest changing the terms to say something like
>"fixed-format/delimited data file" and "spreadsheet data file" to
>distinguish the types to clarify that we consider them both data files.
>- In module 2 F18, what do you mean by " the structure describes all data
>and meta data for the cube"--that sounds to me more like module 3.
>- Module 3 F19; the notes contain a question; can that be deleted or
>should it be moved to G19?
>
>I will make these corrections.
>
>When I agreed to cancelling today's meeting, I didn't realize that you'd
>have such significant changes, so if it's best to discuss these over the
>phone, maybe we can arrange another call.
>
>Time is VERY critical. We are presenting this to the SRG in one week, and
>need to send this out ASAP. I think the important thing is that we have
>something for the group to work with. I just don't have the time to
>finish the proposals AND meet.
>
>Kate
>
>P.S. Plus a couple of typos
>- Module 1, F25, should be "measurement"--also applies to M2 F31
>- Module 1, F20, ID should be capitalized
>- Module 2, F20, should be "coordinates"
>
>Will fix.
>
>At 09:29 AM 10/11/2005 -0400, J Gager wrote:
>>All -
>>
>>Here is the latest spreadsheet. Note there are a few significant changes
>>that stemmed from a long discussion Wendy and I had.
>>
>>The first is the Logical sheet. I have gone back to just including the
>>new fields. I felt it best to do this, since we weren't changing any
>>existing fields, and I want the focus to only be on these additions.
>>
>>The second is the Physical Sheets - I have changed the name of these to
>>Record Layout, since that is what we are truly representing.
>>
>>Finally, I have changed module 1, to allow for data items to exist
>>outside of nCubes. Basically what I have done is create a way to
>>reference an nCube and its attached attributes. The basic concept that
>>we had originally is still there, it is just less deviant from the
>>original, and oft used structure.
>>
>>Please let me know of any structural issues ASAP as the samples and write
>>up are based on this.
>>
>>J
>>
>>_______________________________________________
>>DDI-ADG mailing list
>>DDI-ADG@icpsr.umich.edu
>>http://www.icpsr.umich.edu/mailman/listinfo/ddi-adg
>
>___________________________________________
>Katherine McNeill-Harman
>Data Services Librarian
>Dewey Library for Management and Social Sciences
>Massachusetts Institute of Technology
>77 Massachusetts Avenue, E53-100
>Cambridge, MA 02139
>mcneillh@mit.edu
>617-253-0787

___________________________________________
Katherine McNeill-Harman
Data Services Librarian
Dewey Library for Management and Social Sciences
Massachusetts Institute of Technology
77 Massachusetts Avenue, E53-100
Cambridge, MA 02139
mcneillh@mit.edu
617-253-0787
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20051011/154b18ba/attachment.html

From vardigan at umich.edu Tue Oct 11 16:07:17 2005
From: vardigan at umich.edu (Mary Vardigan)
Date: Tue Oct 11 16:07:31 2005
Subject: [DDI-ADG] Latest Spreadsheet
Message-ID: <93111EED84D98E4C95F33D6114AB3968311D96@isr-mail2.ad.isr.umich.edu>

Kate, J, and others,

I hesitate to put in my two cents since I haven't been as involved in this lately and may have missed some critical information, but I was under the same impression as Kate that Module 3 was designed to hold data values inline and not point to an external file. I know we are really pressed for time, though, so rather than discuss this over email or in a phone call, perhaps Sanda and I can raise it during the SRG meeting next week and get clarification there. We will then report back after the meeting. Does this work?
Mary

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20051011/8fbbbbfc/attachment-0001.html

From j.b.gager at gmail.com Tue Oct 11 16:14:48 2005
From: j.b.gager at gmail.com (J Gager)
Date: Tue Oct 11 16:15:49 2005
Subject: [DDI-ADG] Latest Spreadsheet
In-Reply-To: <93111EED84D98E4C95F33D6114AB3968311D96@isr-mail2.ad.isr.umich.edu>
Message-ID: <006c01c5cea0$743f93a0$6401a8c0@JGAGERLT>

Module 3 *is* designed to hold data inline. The point I was trying to make is that I am not sure we want to say the data always has to be in the same DDI instance of the meta data. Module 3 does not support any external data files.

-----Original Message-----
From: Mary Vardigan [mailto:vardigan@umich.edu]
Sent: Tuesday, October 11, 2005 4:07 PM
To: Katherine McNeill-Harman; jgager@umich.edu; DDI-ADG
Subject: RE: [DDI-ADG] Latest Spreadsheet

Kate, J, and others,

I hesitate to put in my two cents since I haven't been as involved in this lately and may have missed some critical information, but I was under the same impression as Kate that Module 3 was designed to hold data values inline and not point to an external file.
I know we are really pressed for time, though, so rather than discuss this over email or in a phone call, perhaps Sanda and I can raise it during the SRG meeting next week and get clarification there. We will then report back after the meeting. Does this work?

Mary

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20051011/7f09c682/attachment-0001.html

From wlt at pop.umn.edu Tue Oct 11 16:21:18 2005
From: wlt at pop.umn.edu (Wendy Thomas)
Date: Tue Oct 11 16:21:48 2005
Subject: [DDI-ADG] Latest Spreadsheet
In-Reply-To: <006c01c5cea0$743f93a0$6401a8c0@JGAGERLT>
Message-ID:

Module 1 describes data that resides in an external file of data only
Module 2 describes data that resides in an external file that has both data and some level of metadata (category labels, title line etc)
Module 3 describes data that resides in the metadata; there is no external file

At least that's the way it reads to me.

wendy

On Tue, 11 Oct 2005, J Gager wrote:

> Module 3 *is* designed to hold data inline. The point I was trying to
> make is that I am not sure we want to say the data always has to be in
> the same DDI instance of the meta data. Module 3 does not support any
> external data files.

Wendy L. Thomas                 Phone: +1 612.624.4389
Data Access Core Director       Fax: +1 612.626.8375
Minnesota Population Center     Email: wlt@pop.umn.edu
University of Minnesota
50 Willey Hall
225 19th Avenue South
Minneapolis, MN 55455

From mcneillh at MIT.EDU Tue Oct 11 16:37:22 2005
From: mcneillh at MIT.EDU (Katherine McNeill-Harman)
Date: Tue Oct 11 16:37:37 2005
Subject: [DDI-ADG] Latest Spreadsheet
In-Reply-To:
References: <006c01c5cea0$743f93a0$6401a8c0@JGAGERLT>
Message-ID: <5.2.1.1.2.20051011163509.015bf4e0@po11.mit.edu>

Based on J's response, I guess I would wonder when the data would not be in the same DDI instance of the meta data. You would have two DDI metadata files and only one would have the data? I'm having a hard time conceptualizing this.

Know we're short on time, but think this is a basic thing we should try to agree on quickly if possible (i.e. to get on the same page ourselves about our recommendation as opposed to waiting to talk to the SRG).

Kate

At 03:21 PM 10/11/2005 -0500, Wendy Thomas wrote:
>Module 1 describes data that resides in an external file of data only
>Module 2 describes data that resides in an external file that has both
>data and some level of metadata (category labels, title line etc)
>Module 3 describes a data that resides in the metadata, there is no
>external file
>
>At least that's the way it reads to me.
>
>wendy
>
>On Tue, 11 Oct 2005, J Gager wrote:
>
> > Module 3 *is* designed to hold data inline. The point I was trying to
> > make is that I am not sure we want to say the data always has to be in
> > the same DDI instance of the meta data. Module 3 does not support any
> > external data files.
> >
> > -----Original Message-----
> > From: Mary Vardigan [mailto:vardigan@umich.edu]
> > Sent: Tuesday, October 11, 2005 4:07 PM
> > To: Katherine McNeill-Harman; jgager@umich.edu; DDI-ADG
> > Subject: RE: [DDI-ADG] Latest Spreadsheet
> >
> > Kate, J, and others,
> >
> > I hesitate to put in my two cents since I haven't been as involved in
> > this lately and may have missed some critical information, but I was
> > under the same impression as Kate that Module 3 was designed to hold
> > data values inline and not point to an external file. I know we are
> > really pressed for time, though, so rather than discuss this over email
> > or in a phone call, perhaps Sanda and I can raise it during the SRG
> > meeting next week and get clarification there. We will then report back
> > after the meeting. Does this work?
> >
> > Mary
> >
> > _____
> >
> > From: ddi-adg-bounces@icpsr.umich.edu
> > [mailto:ddi-adg-bounces@icpsr.umich.edu] On Behalf Of Katherine
> > McNeill-Harman
> > Sent: Tuesday, October 11, 2005 2:29 PM
> > To: jgager@umich.edu; 'DDI-ADG'
> > Subject: RE: [DDI-ADG] Latest Spreadsheet
> >
> > Comments w/in (others, please comment as well; the most significant item
> > starts w/a *** below):
> >
> > At 12:59 PM 10/11/2005 -0400, J Gager wrote:
> >
> > Kate -
> >
> > Thanks for your comments. Please see responses below. In general, I
> > didn't think any change was significant enough to warrant further
> > discussion. If anyone is still uncomfortable with these changes after
> > this discussion, then we can schedule another meeting, but time is very
> > very short for me, and the write ups for this aggregate piece are far
> > more complicated and time consuming than I anticipated (there are a lot
> > of details that need to be clearly explained).
> >
> > J
> >
> > -----Original Message-----
> > From: Katherine McNeill-Harman [mailto:mcneillh@MIT.EDU]
> > Sent: Tuesday, October 11, 2005 12:03 PM
> > To: jgager@umich.edu; DDI-ADG
> > Subject: Re: [DDI-ADG] Latest Spreadsheet
> >
> > J and others,
> >
> > Have a couple of questions/concerns about this new sheet; would be
> > interested in others' opinions:
> >
> > 1) Can you explain a bit the reasons for changing the descriptions at
> > the top of each module sheet? I don't care that we use exactly the
> > words I drafted, but yours seem to have different meaning and I want to
> > make sure we're all on the same page. Namely,
> >
> > - For modules 2 and 3, you seem to be emphasizing that it's for a
> > "single nCube structure"--can you expand upon what you mean by that?
> >
> > What is meant by this for module 2 is that the file cannot contain
> > multiple cubes. For instance, if there were 2 cubes, say population by
> > region, gender, and age (cube 1) and population by region and gender
> > (cube 2), the combination of these 2 cubes in the data file would look
> > something like this.
> >
> > MN  M  50-   5300
> > MN  M  50+   6700
> > MN  M       12000
> > MN  F  50-   6800
> > MN  F  50+   5000
> > MN  F       11800
> >
> > Module 2 does not support this. Its intention is to describe a file
> > where all rows describe the same cube data.
> >
> > The same thing applies for module 3, since it is grouped by nCube. You
> > would use module 3 to describe a single nCube at a time, and not a mix
> > of nCubes and non cubed data.
> >
> > That's more clear, however, in module 2, how would one treat, e.g., a
> > spreadsheet file containing multiple sheets with a different cube on
> > each sheet?
___________________________________________
Katherine McNeill-Harman
Data Services Librarian
Dewey Library for Management and Social Sciences
Massachusetts Institute of Technology
77 Massachusetts Avenue, E53-100
Cambridge, MA 02139
mcneillh@mit.edu
617-253-0787

From ilona_e at berkeley.edu Tue Oct 11 17:00:50 2005
From: ilona_e at berkeley.edu (Ilona Einowski)
Date: Tue Oct 11 16:59:55 2005
Subject: [DDI-ADG] Latest Spreadsheet
In-Reply-To: <5.2.1.1.2.20051011163509.015bf4e0@po11.mit.edu>
Message-ID:

OK...and here is my 2cents....

I thought Module 3 was an example of a spreadsheet where the whole shebang -
table titles, row labels, column headers, data cells, footnotes, etc. were
represented.... Did I miss the boat on this????
Ilona

_______________________________________________
DDI-ADG mailing list
DDI-ADG@icpsr.umich.edu
http://www.icpsr.umich.edu/mailman/listinfo/ddi-adg

From j.b.gager at gmail.com Tue Oct 11 17:09:48 2005
From: j.b.gager at gmail.com (J Gager)
Date: Tue Oct 11 17:10:37 2005
Subject: [DDI-ADG] Aggregate Proposal
Message-ID: <007c01c5cea8$202488e0$6401a8c0@JGAGERLT>

Skipped content of type multipart/alternative
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AggregateProposal.zip
Type: application/x-zip-compressed
Size: 24129 bytes
Desc: not available
Url : http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20051011/0567784e/AggregateProposal-0001.bin

From j.b.gager at gmail.com Tue Oct 11 17:03:34 2005
From: j.b.gager at gmail.com (J Gager)
Date: Tue Oct 11 17:16:54 2005
Subject: [DDI-ADG] Latest Spreadsheet
In-Reply-To:
Message-ID: <007b01c5cea7$41856a50$6401a8c0@JGAGERLT>

No, module 2 describes a spreadsheet containing data, but it is a little
more limited than what you described. Wendy and I discussed the case you
described below, and it is out of scope of our group's work, but it is on
the radar for the SRG. This sort of information would be better suited to
the gross file description. Hypothetically, the spreadsheet you described
below could contain a section of data that could be used in module 2;
however, the table must contain data for only 1 nCube (see my response to
Kate earlier for clarification on this).
-----Original Message----- From: Ilona Einowski [mailto:ilona_e@berkeley.edu] Sent: Tuesday, October 11, 2005 5:01 PM To: 'Katherine McNeill-Harman'; 'Wendy Thomas'; jgager@umich.edu Cc: 'DDI-ADG' Subject: RE: [DDI-ADG] Latest Spreadsheet OK...and here is my 2cents.... I though Module 3 was an example of a spreadsheet where the whole shebang - table titles, row labels, column headers, data cells, footnotes, etc were represented.... Did I miss the boat on this???? Ilona -----Original Message----- From: ddi-adg-bounces@icpsr.umich.edu [mailto:ddi-adg-bounces@icpsr.umich.edu] On Behalf Of Katherine McNeill-Harman Sent: Tuesday, October 11, 2005 1:37 PM To: Wendy Thomas; jgager@umich.edu Cc: 'DDI-ADG' Subject: RE: [DDI-ADG] Latest Spreadsheet Based on J's response, I guess I would wonder when the data would not be in the same DDI instance of the meta data. You would have two DDI metadata files and only one would have the data? I'm having a hard time conceptualizing this. Know we're short on time, but think this is a basic thing we should try to agree on quickly if possible (i.e. to get on the same page ourselves about our recommendation as opposed to waiting to talk to the SRG). Kate At 03:21 PM 10/11/2005 -0500, Wendy Thomas wrote: >Module 1 describes data that resides in an external file of data only >Module 2 describes data that resides in an external file that has both >data and some level of metadata (category labels, title line etc) >Module 3 describes a data that resides in the metadata, there is no >external file > >At least that's the way it reads to me. > >wendy > > > > >On Tue, 11 Oct 2005, J Gager wrote: > > > Module 3 *is* designed to hold data inline. The point I was trying > > to make is that I am not sure we want to say the data always has to > > be in the same DDI instance of the meta data. Module 3 does not > > support any external data files. 
> > > > -----Original Message----- > > From: Mary Vardigan [mailto:vardigan@umich.edu] > > Sent: Tuesday, October 11, 2005 4:07 PM > > To: Katherine McNeill-Harman; jgager@umich.edu; DDI-ADG > > Subject: RE: [DDI-ADG] Latest Spreadsheet > > > > > > > > Kate, J, and others, > > > > > > > > I hesitate to put in my two cents since I haven't been as involved > > in this lately and may have missed some critical information, but I > > was under the same impression as Kate that Module 3 was designed to > > hold data values inline and not point to an external file. I know we > > are really pressed for time, though, so rather than discuss this > > over email or in a phone call, perhaps Sanda and I can raise it > > during the SRG meeting next week and get clarification there. We > > will then report back after the meeting. Does this work? > > > > > > > > Mary > > > > > > _____ > > > > > > From: ddi-adg-bounces@icpsr.umich.edu > > [mailto:ddi-adg-bounces@icpsr.umich.edu] On Behalf Of Katherine > > McNeill-Harman > > Sent: Tuesday, October 11, 2005 2:29 PM > > To: jgager@umich.edu; 'DDI-ADG' > > Subject: RE: [DDI-ADG] Latest Spreadsheet > > > > > > > > Comments w/in (others, please comment as well; the most significant > > item starts w/a *** below): > > > > At 12:59 PM 10/11/2005 -0400, J Gager wrote: > > > > > > > > Kate - > > > > Thanks for your comments. Please see responses below. In general, > > I didn't think any change was significant enough to warrant further > > discussion. If anyone is still uncomfortable with these changes > > after this discussion, then we can schedule another meeting, but > > time is very very short for me, and the write ups for this aggregate > > piece are far more complicated and time consuming than I anticipated > > (there are a lot of details that need to be cleary explained). 
> > > > J > > > > -----Original Message----- > > > > From: Katherine McNeill-Harman [mailto:mcneillh@MIT.EDU] > > > > Sent: Tuesday, October 11, 2005 12:03 PM > > > > To: jgager@umich.edu; DDI-ADG > > > > Subject: Re: [DDI-ADG] Latest Spreadsheet > > > > J and others, > > > > Have a couple of questions/concerns about this new sheet; would be > > interested in others' opinions: > > > > 1) Can you explain a bit the reasons for changing the descriptions > > at the top of each module sheet? I don't care that we use exactly > > the words I drafted, but yours seem to have different meaning and I > > want to make sure we're all on the same page. Namely, > > > > - For modules 2 and 3, you seem to be emphasizing that it's for a > > "single nCube structure"--can you expand upon what you mean by that? > > > > > > > > What is meant by this for module 2 is that the file cannot contain > > multiple cubes. For instance if there were 2 cubes, say population > > by region, gender, and age (cube 1) and population by region and > > gender (cube 2), the combination of these 2 cubes in the data file > > would look something like this. > > > > MN M 50- 5300 > > > > MN M 50+ 6700 > > > > MN M 12000 > > > > MN F 50- 6800 > > > > MN F 50+ 5000 > > > > MN F 11800 > > > > Module 2 does not support this. Its intention is describe a file > > where all rows describe the same cube data. > > > > > > > > The same thing applies for module 3, since it is grouped by nCube. > > You would used module 3 to describe a single nCube at a time, and > > not a mix of nCubes and non cubed data. > > > > > > That's more clear, however, in module 2, how would one treat, e.g., > > a spreadsheet file containing multiple sheets with a different cube > > on each sheet? 
> >
> > - Also, I'd like to ask you to consider putting back some of the wording I'd written for module 3 that makes it clear that the metadata and data are in one single DDI file; I don't think that comes across in your phrasing.
> >
> > I think it is too constricting. It does allow for all to be in one file, but don't we want to allow the data to also be used this way in a separate file?
> >
> > ***I believe that it's only the former, that if it lives in a separate file it would be under module 1 or 2. Others, please confirm.
> >
> > - Lastly, for module 1, I believe it'd be helpful to include some explicit reference to the fact that the external file contains no metadata (to distinguish it from module 2).
> >
> > That may be misleading, since we are saying that attributes can exist in there, which are technically metadata. The distinction lies in the fact that module 1 data files do not state any of the cube coordinate values in them.
> >
> > I can live w/that if others don't have any other suggestions.
> >
> > 2) I'm not sure about including only the additions in the first sheet, as I think the other fields provide helpful context. Might there be a way to distinguish the new fields (as you had done, e.g., w/color) while still keeping all of them?
> >
> > It was really a matter of time, and an issue of focusing attention. The model was incomplete to start with, and the effort of fleshing out all existing things from the tag library, and putting in definitions for them is more work than it would be worth (in my opinion).
> >
> > Understand. I'd still lean the other way but will happily go w/the group consensus.
> >
> > 3) Your change to module 1, while I have no objections, seems to be significant enough that we should discuss it as a group.
> > I don't quite understand the purpose of this. Is this something you think you could explain/we could discuss in more detail over email (or maybe phone)?
> >
> > The purpose of the change is to basically not change what is already in place. In speaking with Wendy, she pointed out how important it is to many people marking up data files to do so in the order in which the data occurs in the file. So there may be a mix of cubed and non cubed data. Furthermore, module 1 did not allow for any non cubed data (everything was grouped into an nCube container). The change simply replaced the inclusion of a data item into an nCube by containership, with inclusion by reference. The concept we initially had is still there, just represented differently.
> >
> > Sounds OK to me; I'll leave others to comment.
> >
> > 4) Plus a couple of other questions about the elements:
> >
> > -- in describing the attribute location choice, you refer separately to a data file vs. a spreadsheet. I understand what you're trying to do, but am a little concerned about the mutually-exclusive manner in which they're described (b/c other places we use the term "data file" to include all sorts of formats, including spreadsheets, and think it should still keep that broad meaning). So I'd suggest changing the terms to say something like "fixed-format/delimited data file" and "spreadsheet data file" to distinguish the types to clarify that we consider them both data files.
> >
> > - In module 2 F18, what do you mean by "the structure describes all data and meta data for the cube"--that sounds to me more like module 3.
> >
> > - Module 3 F19; the notes contain a question; can that be deleted or should it be moved to G19?
> >
> > I will make these corrections.
> >
> > When I agreed to cancelling today's meeting, I didn't realize that you'd have such significant changes, so if it's best to discuss these over the phone, maybe we can arrange another call.
> >
> > Time is VERY critical. We are presenting this to the SRG in one week, and need to send this out ASAP. I think the important thing is that we have something for the group to work with. I just don't have the time to finish the proposals AND meet.
> >
> > Kate
> >
> > P.S. Plus a couple of typos
> >
> > - Module 1, F25, should be "measurement"--also applies to M2 F31
> > - Module 1, F20, ID should be capitalized
> > - Module 2, F20, should be "coordinates"
> >
> > Will fix.
> >
> > At 09:29 AM 10/11/2005 -0400, J Gager wrote:
> >
> > All -
> >
> > Here is the latest spreadsheet. Note there are a few significant changes that stemmed from a long discussion Wendy and I had.
> >
> > The first is the Logical sheet. I have gone back to just including the new fields. I felt it best to do this, since we weren't changing any existing fields, and I want the focus to only be on these additions.
> >
> > The second is the Physical Sheets - I have changed the name of these to Record Layout, since that is what we are truly representing.
> >
> > Finally, I have changed module 1, to allow for data items to exist outside of nCubes. Basically what I have done is create a way to reference an nCube and its attached attributes. The basic concept that we had originally is still there, it is just less deviant from the original, and oft used structure.
> >
> > Please let me know of any structural issues ASAP as the samples and write up are based on this.
> >
> > J
> >
> > _______________________________________________
> > DDI-ADG mailing list
> > DDI-ADG@icpsr.umich.edu
> > http://www.icpsr.umich.edu/mailman/listinfo/ddi-adg
> >
> > ___________________________________________
> > Katherine McNeill-Harman
> > Data Services Librarian
> > Dewey Library for Management and Social Sciences
> > Massachusetts Institute of Technology
> > 77 Massachusetts Avenue, E53-100
> > Cambridge, MA 02139
> > mcneillh@mit.edu
> > 617-253-0787
>
>Wendy L. Thomas Phone: +1 612.624.4389
>Data Access Core Director Fax: +1 612.626.8375
>Minnesota Population Center Email: wlt@pop.umn.edu
>University of Minnesota
>50 Willey Hall
>225 19th Avenue South
>Minneapolis, MN 55455

___________________________________________
Katherine McNeill-Harman
Data Services Librarian
Dewey Library for Management and Social Sciences
Massachusetts Institute of Technology
77 Massachusetts Avenue, E53-100
Cambridge, MA 02139
mcneillh@mit.edu
617-253-0787

_______________________________________________
DDI-ADG mailing list
DDI-ADG@icpsr.umich.edu
http://www.icpsr.umich.edu/mailman/listinfo/ddi-adg

From ilona_e at berkeley.edu Tue Oct 11 17:22:42 2005
From: ilona_e at berkeley.edu (Ilona Einowski)
Date: Tue Oct 11 17:24:13 2005
Subject: [DDI-ADG] Latest Spreadsheet
In-Reply-To: <007b01c5cea7$41856a50$6401a8c0@JGAGERLT>
Message-ID:

OK...only 1 nCube...that makes sense...I guess I still don't understand the situation for Module 3...
On another topic...small point but, Jay, the header on sheet 1, the FINAL Aggregate.xls, still says "Logical Additions IE Comments" ---it needs to be changed to Logical Structure - Additions.

Ilona

-----Original Message-----
From: J Gager [mailto:j.b.gager@gmail.com]
Sent: Tuesday, October 11, 2005 2:04 PM
To: ilona_e@berkeley.edu; 'Katherine McNeill-Harman'; 'Wendy Thomas'
Cc: 'DDI-ADG'
Subject: RE: [DDI-ADG] Latest Spreadsheet

No, module 2 describes a spreadsheet containing data, but it is a little more limited than what you described. Wendy and I discussed the case you described below, and it is out of scope of our group's work, but it is on the radar for the SRG. This sort of information would be better suited in the gross file description.

Hypothetically, the spreadsheet you described below could contain a section of data that could be used in module 2, however, the table must contain data for only 1 nCube (see my response to Kate earlier for clarification on this).

-----Original Message-----
From: Ilona Einowski [mailto:ilona_e@berkeley.edu]
Sent: Tuesday, October 11, 2005 5:01 PM
To: 'Katherine McNeill-Harman'; 'Wendy Thomas'; jgager@umich.edu
Cc: 'DDI-ADG'
Subject: RE: [DDI-ADG] Latest Spreadsheet

OK...and here is my 2cents....

I thought Module 3 was an example of a spreadsheet where the whole shebang - table titles, row labels, column headers, data cells, footnotes, etc were represented....

Did I miss the boat on this????

Ilona

-----Original Message-----
From: ddi-adg-bounces@icpsr.umich.edu [mailto:ddi-adg-bounces@icpsr.umich.edu] On Behalf Of Katherine McNeill-Harman
Sent: Tuesday, October 11, 2005 1:37 PM
To: Wendy Thomas; jgager@umich.edu
Cc: 'DDI-ADG'
Subject: RE: [DDI-ADG] Latest Spreadsheet

Based on J's response, I guess I would wonder when the data would not be in the same DDI instance of the meta data. You would have two DDI metadata files and only one would have the data? I'm having a hard time conceptualizing this.
Know we're short on time, but think this is a basic thing we should try to agree on quickly if possible (i.e. to get on the same page ourselves about our recommendation as opposed to waiting to talk to the SRG).

Kate

At 03:21 PM 10/11/2005 -0500, Wendy Thomas wrote:
>Module 1 describes data that resides in an external file of data only
>Module 2 describes data that resides in an external file that has both data and some level of metadata (category labels, title line etc)
>Module 3 describes data that resides in the metadata, there is no external file
>
>At least that's the way it reads to me.
>
>wendy
>
>On Tue, 11 Oct 2005, J Gager wrote:
>
> > Module 3 *is* designed to hold data inline. The point I was trying to make is that I am not sure we want to say the data always has to be in the same DDI instance of the meta data. Module 3 does not support any external data files.
> >
> > -----Original Message-----
> > From: Mary Vardigan [mailto:vardigan@umich.edu]
> > Sent: Tuesday, October 11, 2005 4:07 PM
> > To: Katherine McNeill-Harman; jgager@umich.edu; DDI-ADG
> > Subject: RE: [DDI-ADG] Latest Spreadsheet
> >
> > Kate, J, and others,
> >
> > I hesitate to put in my two cents since I haven't been as involved in this lately and may have missed some critical information, but I was under the same impression as Kate that Module 3 was designed to hold data values inline and not point to an external file. I know we are really pressed for time, though, so rather than discuss this over email or in a phone call, perhaps Sanda and I can raise it during the SRG meeting next week and get clarification there. We will then report back after the meeting. Does this work?

From mcneillh at MIT.EDU Tue Oct 11 17:24:50 2005
From: mcneillh at MIT.EDU (Katherine McNeill-Harman)
Date: Tue Oct 11 17:25:05 2005
Subject: [DDI-ADG] Latest Spreadsheet
In-Reply-To: <007b01c5cea7$41856a50$6401a8c0@JGAGERLT>
References:
Message-ID: <5.2.1.1.2.20051011172214.00bcb228@po11.mit.edu>

Understood--and others, don't be confused by the order of responses; I believe J was responding to my question about a spreadsheet that contains multiple sheets, not to Ilona's comment about Module 3.

And I see that J also sent out a "final" version of the package to us; I sent out a separate email directly to him suggesting that we stick w/what seems to be our collective understanding of module 3 referring to a single combined ddi/data file, and that--given the time pressure--it's late to be recommending a change. So I hope that feedback is taken and incorporated into the truly final version sent on.

Kate

At 05:03 PM 10/11/2005 -0400, J Gager wrote:
>No, module 2 describes a spreadsheet containing data, but it is a little more limited than what you described. Wendy and I discussed the case you described below, and it is out of scope of our groups work, but it is on the radar for the SRG. This sort of information would be better suited in the gross file description.
>
>Hypothetically, the spreadsheet you described below could contain a section of data that could be used in module 2, however, the table must contain data for only 1 nCube (see my response to Kate earlier for clarification on this).
>
>-----Original Message-----
>From: Ilona Einowski [mailto:ilona_e@berkeley.edu]
>Sent: Tuesday, October 11, 2005 5:01 PM
>To: 'Katherine McNeill-Harman'; 'Wendy Thomas'; jgager@umich.edu
>Cc: 'DDI-ADG'
>Subject: RE: [DDI-ADG] Latest Spreadsheet
>
>OK...and here is my 2cents....
>
>I thought Module 3 was an example of a spreadsheet where the whole shebang - table titles, row labels, column headers, data cells, footnotes, etc were represented....
>
>Did I miss the boat on this????
>
>Ilona
>
>-----Original Message-----
>From: ddi-adg-bounces@icpsr.umich.edu [mailto:ddi-adg-bounces@icpsr.umich.edu] On Behalf Of Katherine McNeill-Harman
>Sent: Tuesday, October 11, 2005 1:37 PM
>To: Wendy Thomas; jgager@umich.edu
>Cc: 'DDI-ADG'
>Subject: RE: [DDI-ADG] Latest Spreadsheet
>
>Based on J's response, I guess I would wonder when the data would not be in the same DDI instance of the meta data. You would have two DDI metadata files and only one would have the data? I'm having a hard time conceptualizing this. Know we're short on time, but think this is a basic thing we should try to agree on quickly if possible (i.e. to get on the same page ourselves about our recommendation as opposed to waiting to talk to the SRG).
>
>Kate
>
>At 03:21 PM 10/11/2005 -0500, Wendy Thomas wrote:
> >Module 1 describes data that resides in an external file of data only
> >Module 2 describes data that resides in an external file that has both data and some level of metadata (category labels, title line etc)
> >Module 3 describes data that resides in the metadata, there is no external file
> >
> >At least that's the way it reads to me.
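
[Editorial sketch] Wendy's three-way reading of the modules amounts to a small decision rule, sketched below. This is a hedged illustration only: the enum, the function name, and the two boolean parameters are hypothetical, and the rule follows Wendy's summary as quoted above. J's reply elsewhere in the thread draws the line between modules 1 and 2 slightly differently (whether the file states cube coordinate values), so the second parameter should be read loosely.

```python
from enum import Enum


class Module(Enum):
    """The three aggregate-data modules discussed in the thread."""
    MODULE_1 = 1  # external file, data only
    MODULE_2 = 2  # external file, data plus some metadata (labels, title line)
    MODULE_3 = 3  # data held inline in the DDI metadata, no external file


def pick_module(data_in_external_file: bool, file_has_metadata: bool = False) -> Module:
    """Choose a module following Wendy's summary (an assumption, not a spec)."""
    if not data_in_external_file:
        # No external file at all: the data lives inline with the metadata.
        return Module.MODULE_3
    return Module.MODULE_2 if file_has_metadata else Module.MODULE_1


print(pick_module(data_in_external_file=True))
print(pick_module(data_in_external_file=True, file_has_metadata=True))
print(pick_module(data_in_external_file=False))
```

The open question in the thread (whether module 3 data may live in a DDI instance separate from its metadata) is deliberately not modelled here.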
> >
> >wendy
> >
> >On Tue, 11 Oct 2005, J Gager wrote:
> >
> > > Module 3 *is* designed to hold data inline. The point I was trying to make is that I am not sure we want to say the data always has to be in the same DDI instance of the meta data. Module 3 does not support any external data files.
> > >
> > > -----Original Message-----
> > > From: Mary Vardigan [mailto:vardigan@umich.edu]
> > > Sent: Tuesday, October 11, 2005 4:07 PM
> > > To: Katherine McNeill-Harman; jgager@umich.edu; DDI-ADG
> > > Subject: RE: [DDI-ADG] Latest Spreadsheet
> > >
> > > Kate, J, and others,
> > >
> > > I hesitate to put in my two cents since I haven't been as involved in this lately and may have missed some critical information, but I was under the same impression as Kate that Module 3 was designed to hold data values inline and not point to an external file. I know we are really pressed for time, though, so rather than discuss this over email or in a phone call, perhaps Sanda and I can raise it during the SRG meeting next week and get clarification there. We will then report back after the meeting. Does this work?
> > >
> > > Mary
> > >
> > > _____
> > >
> > > From: ddi-adg-bounces@icpsr.umich.edu [mailto:ddi-adg-bounces@icpsr.umich.edu] On Behalf Of Katherine McNeill-Harman
> > > Sent: Tuesday, October 11, 2005 2:29 PM
> > > To: jgager@umich.edu; 'DDI-ADG'
> > > Subject: RE: [DDI-ADG] Latest Spreadsheet
> > >
> > > Comments w/in (others, please comment as well; the most significant item starts w/a *** below):
> > >
> > > At 12:59 PM 10/11/2005 -0400, J Gager wrote:
> > >
> > > Kate -
> > >
> > > Thanks for your comments. Please see responses below. In general, I didn't think any change was significant enough to warrant further discussion.
> > > If anyone is still uncomfortable with these changes after this discussion, then we can schedule another meeting, but time is very very short for me, and the write ups for this aggregate piece are far more complicated and time consuming than I anticipated (there are a lot of details that need to be clearly explained).
> > >
> > > J
> > >
> > > -----Original Message-----
> > > From: Katherine McNeill-Harman [mailto:mcneillh@MIT.EDU]
> > > Sent: Tuesday, October 11, 2005 12:03 PM
> > > To: jgager@umich.edu; DDI-ADG
> > > Subject: Re: [DDI-ADG] Latest Spreadsheet
> > >
> > > J and others,
> > >
> > > Have a couple of questions/concerns about this new sheet; would be interested in others' opinions:
> > >
> > > 1) Can you explain a bit the reasons for changing the descriptions at the top of each module sheet? I don't care that we use exactly the words I drafted, but yours seem to have different meaning and I want to make sure we're all on the same page. Namely,
> > >
> > > - For modules 2 and 3, you seem to be emphasizing that it's for a "single nCube structure"--can you expand upon what you mean by that?
> > >
> > > What is meant by this for module 2 is that the file cannot contain multiple cubes. For instance if there were 2 cubes, say population by region, gender,