[DDI-ADG] Latest Spreadsheet

Katherine McNeill-Harman mcneillh at MIT.EDU
Tue Oct 11 14:29:04 EDT 2005


Comments w/in (others, please comment as well; the most significant item 
starts w/a *** below):

At 12:59 PM 10/11/2005 -0400, J Gager wrote:
>Kate -
>
>Thanks for your comments.  Please see responses below.  In general, I 
>didn't think any change was significant enough to warrant further 
>discussion.  If anyone is still uncomfortable with these changes after 
>this discussion, then we can schedule another meeting, but time is very 
>very short for me, and the write ups for this aggregate piece are far more 
>complicated and time consuming than I anticipated (there are a lot of 
>details that need to be clearly explained).
>
>J
>-----Original Message-----
>From: Katherine McNeill-Harman [mailto:mcneillh at MIT.EDU]
>Sent: Tuesday, October 11, 2005 12:03 PM
>To: jgager at umich.edu; DDI-ADG
>Subject: Re: [DDI-ADG] Latest Spreadsheet
>
>J and others,
>
>Have a couple of questions/concerns about this new sheet; would be 
>interested in others' opinions:
>
>1) Can you explain a bit the reasons for changing the descriptions at the 
>top of each module sheet?  I don't care that we use exactly the words I 
>drafted, but yours seem to have different meaning and I want to make sure 
>we're all on the same page.  Namely,
>- For modules 2 and 3, you seem to be emphasizing that it's for a "single 
>nCube structure"--can you expand upon what you mean by that?
>
>What is meant by this for module 2 is that the file cannot contain 
>multiple cubes.  For instance if there were 2 cubes, say population by 
>region, gender, and age (cube 1) and population by region and gender (cube 
>2), the combination of these 2 cubes in the data file would look something 
>like this.
>MN    M    50-    5300
>MN    M    50+   6700
>MN    M    12000
>MN    F    50-    6800
>MN    F    50+    5000
>MN    F    11800
>Module 2 does not support this.  Its intention is to describe a file where 
>all rows describe the same cube data.
>
>The same thing applies for module 3, since it is grouped by nCube.  You 
>would use module 3 to describe a single nCube at a time, and not a mix of 
>nCubes and non cubed data.

That's clearer; however, in module 2, how would one treat, e.g., a 
spreadsheet file containing multiple sheets with a different cube on each 
sheet?
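
To make the mixed-cube case concrete, here is a minimal Python sketch using the rows from J's example above. It assumes a whitespace-delimited file in which the last field of each row is the measure and the preceding fields are coordinates, and it treats a differing number of coordinates as a sign that rows belong to different cubes. That is only a heuristic for illustration; the function names are hypothetical and nothing here is part of the DDI proposal.

```python
from collections import defaultdict

# Rows copied from the example quoted above (whitespace-delimited).
ROWS = """\
MN M 50- 5300
MN M 50+ 6700
MN M 12000
MN F 50- 6800
MN F 50+ 5000
MN F 11800
"""


def group_rows_by_dimension_count(text):
    """Group rows by how many coordinate fields precede the measure.

    Assumption: the last field is the measure, everything before it is a
    cube coordinate.
    """
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        *coords, measure = fields
        groups[len(coords)].append((tuple(coords), int(measure)))
    return dict(groups)


def is_single_cube(text):
    """Heuristic: every row has the same number of coordinates."""
    return len(group_rows_by_dimension_count(text)) <= 1


groups = group_rows_by_dimension_count(ROWS)
for n_dims, rows in sorted(groups.items()):
    print(n_dims, "coordinates:", len(rows), "rows")
print("single cube?", is_single_cube(ROWS))
```

Under these assumptions the rows split into a three-coordinate group (region, gender, age) and a two-coordinate group (region, gender), which is the kind of file module 2 is described as not supporting.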

>
>  - Also, I'd like to ask you to consider putting back some of the wording 
> I'd written for module 3 that makes it clear that the metadata and data 
> are in one single DDI file; I don't think that comes across in your phrasing.
>
>I think it is too constricting.  It does allow for all to be in one file, 
>but don't we want to allow the data to also be used this way in a separate 
>file?

***I believe that it's only the former, that if it lives in a separate file 
it would be under module 1 or 2.  Others, please confirm.

>
>- Lastly, for module 1, I believe it'd be helpful to include some explicit 
>reference to the fact that the external file contains no metadata (to 
>distinguish it from module 2).
>
>That may be misleading, since we are saying that attributes can exist in 
>there, which are technically metadata.  The distinction lies in the fact 
>that module 1 data files do not state any of the cube coordinate values in 
>them.

I can live w/that if others don't have any other suggestions.

>
>2) I'm not sure about including only the additions in the first sheet, as 
>I think the other fields provide helpful context.  Might there be a way to 
>distinguish the new fields (as you had done, e.g., w/color) while still 
>keeping all of them?
>
>It was really an issue of time, and of focusing attention.  The model was 
>incomplete to start with, and the effort of fleshing out all existing 
>things from the tag library, and putting in definitions for them is more 
>work than it would be worth (in my opinion).

Understand.  I'd still lean the other way but will happily go w/the group 
consensus.

>
>3) Your change to module 1, while I have no objections, seems to be 
>significant enough that we should discuss it as a group.  I don't quite 
>understand the purpose of this.  Is this something you think you could 
>explain/we could discuss in more detail over email (or maybe phone)?
>
>The purpose of the change is to basically not change what is already in 
>place.  In speaking with Wendy, she pointed out how important it is to 
>many people marking up data files to do so in the order in which the data 
>occurs in the file.  So there may be a mix of cubed and non cubed 
>data.  Furthermore, module 1 did not allow for any non cubed data 
>(everything was grouped into a nCube container).  The change simply 
>replaced the inclusion of a data item into an nCube by containership, with 
>inclusion by reference.  The concept we initially had is still there, just 
>represented differently.

Sounds OK to me; I'll leave others to comment.
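
For anyone trying to picture the difference between inclusion by containership and inclusion by reference, here is a rough Python sketch. All class, field, and identifier names are hypothetical stand-ins chosen for illustration; they are not the element names in the spreadsheet or the DDI tag library. The record layout lists data items in file order, and an item joins a cube only by carrying that cube's ID.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NCube:
    cube_id: str
    dimensions: tuple


@dataclass
class DataItem:
    name: str
    start: int                       # starting column in the record
    width: int
    ncube_ref: Optional[str] = None  # None means non cubed data


# Cubes are declared once, outside the record layout.
cubes = {
    "pop_by_region_gender_age": NCube(
        "pop_by_region_gender_age", ("region", "gender", "age")
    ),
}

# Items appear in the order they occur in the file, cubed or not.
record_layout = [
    DataItem("record_id", 1, 8),
    DataItem("population", 9, 6, ncube_ref="pop_by_region_gender_age"),
    DataItem("comment", 15, 20),
]


def items_for_cube(layout, cube_id):
    """Names of the data items that reference the given cube."""
    return [item.name for item in layout if item.ncube_ref == cube_id]


def unresolved_refs(layout, known_cubes):
    """References that point at a cube ID that was never declared."""
    return [
        item.ncube_ref
        for item in layout
        if item.ncube_ref is not None and item.ncube_ref not in known_cubes
    ]


print(items_for_cube(record_layout, "pop_by_region_gender_age"))
print(unresolved_refs(record_layout, cubes))
```

With containership, the two non cubed items would have had no place in the layout; with a reference, they simply leave the cube ID empty, and the file order is preserved.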

>
>4) Plus a couple of other questions about the elements:
>-- in describing the attribute location choice, you refer separately to a 
>data file vs. a spreadsheet.  I understand what you're trying to do, but 
>am a little concerned about the mutually-exclusive manner in which they're 
>described (b/c in other places we use the term "data file" to include all sorts 
>of formats, including spreadsheets, and think it should still keep that 
>broad meaning).  So I'd suggest changing the terms to say something like 
>"fixed-format/delimited data file" and "spreadsheet data file" to 
>distinguish the types to clarify that we consider them both data files.
>- In module 2 F18, what do you mean by " the structure describes all data 
>and meta data for the cube"--that sounds to me more like module 3.
>- Module 3 F19; the notes contains a question; can that be deleted or 
>should it be moved to G19?
>
>I will make these corrections.
>
>When I agreed to cancelling today's meeting, I didn't realize that you'd 
>have such significant changes, so if it's best to discuss these over the 
>phone, maybe we can arrange another call.
>
>Time is VERY critical.  We are presenting this to the SRG in one week, and 
>need to send this out ASAP.  I think the important thing is that we have 
>something for the group to work with.  I just don't have the time to 
>finish the proposals AND meet.
>
>Kate
>
>P.S.  Plus a couple of typos
>- Module 1, F25, should be "measurement"--also applies to M2 F31
>- Module 1, F20, ID should be capitalized
>- Module 2, F20, should be "coordinates"
>
>Will fix.
>
>At 09:29 AM 10/11/2005 -0400, J Gager wrote:
>>All -
>>
>>Here is the latest spreadsheet.  Note there are a few significant changes 
>>that stemmed from a long discussion Wendy and I had.
>>
>>The first is the Logical sheet.  I have gone back to just including the 
>>new fields.  I felt it best to do this, since we weren't changing any 
>>existing fields, and I want the focus to only be on these additions.
>>
>>The second is the Physical Sheets - I have changed the name of these to 
>>Record Layout, since that is what we are truly representing.
>>
>>Finally, I have changed module 1, to allow for data items to exist 
>>outside of nCubes.  Basically what I have done is create a way to 
>>reference an nCube and its attached attributes.  The basic concept that 
>>we had originally is still there, it is just less deviant from the 
>>original, and oft used structure.
>>
>>Please let me know of any structural issues ASAP as the samples and write 
>>up are based on this.
>>
>>J

___________________________________________
Katherine McNeill-Harman
Data Services Librarian
Dewey Library for Management and Social Sciences
Massachusetts Institute of Technology
77 Massachusetts Avenue, E53-100
Cambridge, MA 02139
mcneillh at mit.edu
617-253-0787 