[DDI-ADG] Latest Spreadsheet
Katherine McNeill-Harman
mcneillh at MIT.EDU
Tue Oct 11 14:29:04 EDT 2005
Comments w/in (others, please comment as well; the most significant item
starts w/a *** below):
At 12:59 PM 10/11/2005 -0400, J Gager wrote:
>Kate -
>
>Thanks for your comments. Please see responses below. In general, I
>didn't think any change was significant enough to warrant further
>discussion. If anyone is still uncomfortable with these changes after
>this discussion, then we can schedule another meeting, but time is very
>very short for me, and the write ups for this aggregate piece are far more
>complicated and time consuming than I anticipated (there are a lot of
>details that need to be clearly explained).
>
>J
>-----Original Message-----
>From: Katherine McNeill-Harman [mailto:mcneillh at MIT.EDU]
>Sent: Tuesday, October 11, 2005 12:03 PM
>To: jgager at umich.edu; DDI-ADG
>Subject: Re: [DDI-ADG] Latest Spreadsheet
>
>J and others,
>
>Have a couple of questions/concerns about this new sheet; would be
>interested in others' opinions:
>
>1) Can you explain a bit the reasons for changing the descriptions at the
>top of each module sheet? I don't care that we use exactly the words I
>drafted, but yours seem to have different meaning and I want to make sure
>we're all on the same page. Namely,
>- For modules 2 and 3, you seem to be emphasizing that it's for a "single
>nCube structure"--can you expand upon what you mean by that?
>
>What is meant by this for module 2 is that the file cannot contain
>multiple cubes. For instance if there were 2 cubes, say population by
>region, gender, and age (cube 1) and population by region and gender (cube
>2), the combination of these 2 cubes in the data file would look something
>like this.
>MN M 50- 5300
>MN M 50+ 6700
>MN M 12000
>MN F 50- 6800
>MN F 50+ 5000
>MN F 11800
>Module 2 does not support this. Its intention is to describe a file where
>all rows describe the same cube data.
>
>The same thing applies for module 3, since it is grouped by nCube. You
>would use module 3 to describe a single nCube at a time, and not a mix of
>nCubes and non cubed data.
That's clearer; however, in module 2, how would one treat, e.g., a
spreadsheet file containing multiple sheets with a different cube on each
sheet?
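To make the mixed-cube case above concrete, here is a minimal sketch (not part of the proposal) that checks whether every row in a file belongs to the same cube. It assumes the six sample rows are whitespace-delimited, that the last field is the measure, and that a row from the smaller cube simply omits the age field. The function names are hypothetical.

```python
from collections import defaultdict

# The six rows from the example above. Assumption: whitespace-delimited,
# with the measure (population) as the last field on each row.
SAMPLE = """\
MN M 50- 5300
MN M 50+ 6700
MN M 12000
MN F 50- 6800
MN F 50+ 5000
MN F 11800
"""


def split_by_dimension_count(text):
    """Group rows by how many coordinate fields precede the measure."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        *coords, measure = fields
        groups[len(coords)].append((tuple(coords), int(measure)))
    return dict(groups)


def is_single_cube(text):
    """True when every row carries the same number of coordinates."""
    return len(split_by_dimension_count(text)) <= 1


groups = split_by_dimension_count(SAMPLE)
print(sorted(groups))           # the distinct coordinate counts found
print(is_single_cube(SAMPLE))   # whether the file fits a single-cube layout
```

On the sample, rows with three coordinates (region, gender, age) and rows with two (region, gender) land in separate groups, which is the mix that module 2 is described as not supporting.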
>
> - Also, I'd like to ask you to consider putting back some of the wording
> I'd written for module 3 that makes it clear that the metadata and data
> are in one single DDI file; I don't think that comes across in your phrasing.
>
>I think it is too constricting. It does allow for all to be in one file,
>but don't we want to allow the data to also be used this way in a separate
>file?
***I believe that it's only the former, that if it lives in a separate file
it would be under module 1 or 2. Others, please confirm.
>
>- Lastly, for module 1, I believe it'd be helpful to include some explicit
>reference to the fact that the external file contains no metadata (to
>distinguish it from module 2).
>
>That may be misleading, since we are saying that attributes can exist in
>there, which are technically metadata. The distinction lies in the fact
>that module 1 data files do not state any of the cube coordinate values in
>them.
I can live w/that if others don't have any other suggestions.
>
>2) I'm not sure about including only the additions in the first sheet, as
>I think the other fields provide helpful context. Might there be a way to
>distinguish the new fields (as you had done, e.g., w/color) while still
>keeping all of them?
>
>It was really an issue of time and of focusing attention. The model was
>incomplete to start with, and the effort of fleshing out all existing
>things from the tag library, and putting in definitions for them is more
>work than it would be worth (in my opinion).
Understand. I'd still lean the other way but will happily go w/the group
consensus.
>
>3) Your change to module 1, while I have no objections, seems to be
>significant enough that we should discuss it as a group. I don't quite
>understand the purpose of this. Is this something you think you could
>explain/we could discuss in more detail over email (or maybe phone)?
>
>The purpose of the change is to basically not change what is already in
>place. In speaking with Wendy, she pointed out how important it is to
>many people marking up data files to do so in the order in which the data
>occurs in the file. So there may be a mix of cubed and non cubed
>data. Furthermore, module 1 did not allow for any non cubed data
>(everything was grouped into a nCube container). The change simply
>replaced the inclusion of a data item into an nCube by containership, with
>inclusion by reference. The concept we initially had is still there, just
>represented differently.
Sounds OK to me; I'll leave others to comment.
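For anyone following along, the difference between containership and inclusion by reference can be sketched roughly as below. This is only an illustration under my own assumptions: the class, field, and item names are hypothetical and are not DDI elements or anything from the spreadsheet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataItem:
    """One data item in the record layout (hypothetical structure)."""
    name: str
    # Reference to an nCube ID; None means the item belongs to no nCube.
    ncube_ref: Optional[str] = None


# Items are listed in the order they occur in the file, so cubed and
# non-cubed items can be interleaved instead of grouped in a container.
record_layout = [
    DataItem("pop_total", ncube_ref="cube1"),
    DataItem("source_note"),
    DataItem("pop_density", ncube_ref="cube1"),
]


def items_in_cube(layout, cube_id):
    """Names of the items that reference the given nCube, in file order."""
    return [item.name for item in layout if item.ncube_ref == cube_id]


def non_cubed_items(layout):
    """Names of the items that reference no nCube at all."""
    return [item.name for item in layout if item.ncube_ref is None]


print(items_in_cube(record_layout, "cube1"))
print(non_cubed_items(record_layout))
```

The membership of each nCube can still be recovered from the references, while the layout itself keeps the file's own ordering.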
>
>4) Plus a couple of other questions about the elements:
>-- in describing the attribute location choice, you refer separately to a
>data file vs. a spreadsheet. I understand what you're trying to do, but
>am a little concerned about the mutually-exclusive manner in which they're
>described (b/c in other places we use the term "data file" to include all sorts
>of formats, including spreadsheets, and think it should still keep that
>broad meaning). So I'd suggest changing the terms to say something like
>"fixed-format/delimited data file" and "spreadsheet data file" to
>distinguish the types to clarify that we consider them both data files.
>- In module 2 F18, what do you mean by " the structure describes all data
>and meta data for the cube"--that sounds to me more like module 3.
>- Module 3 F19; the notes contains a question; can that be deleted or
>should it be moved to G19?
>
>I will make these corrections.
>
>When I agreed to cancelling today's meeting, I didn't realize that you'd
>have such significant changes, so if it's best to discuss these over the
>phone, maybe we can arrange another call.
>
>Time is VERY critical. We are presenting this to the SRG in one week, and
>need to send this out ASAP. I think the important thing is that we have
>something for the group to work with. I just don't have the time to
>finish the proposals AND meet.
>
>Kate
>
>P.S. Plus a couple of typos
>- Module 1, F25, should be "measurement"--also applies to M2 F31
>- Module 1, F20, ID should be capitalized
>- Module 2, F20, should be "coordinates"
>
>Will fix.
>
>At 09:29 AM 10/11/2005 -0400, J Gager wrote:
>>All -
>>
>>Here is the latest spreadsheet. Note there are a few significant changes
>>that stemmed from a long discussion Wendy and I had.
>>
>>The first is the Logical sheet. I have gone back to just including the
>>new fields. I felt it best to do this, since we weren't changing any
>>existing fields, and I want the focus to only be on these additions.
>>
>>The second is the Physical Sheets - I have changed the name of these to
>>Record Layout, since that is what we are truly representing.
>>
>>Finally, I have changed module 1, to allow for data items to exist
>>outside of nCubes. Basically what I have done is create a way to
>>reference an nCube and its attached attributes. The basic concept that
>>we had originally is still there, it is just less deviant from the
>>original, and oft used structure.
>>
>>Please let me know of any structural issues ASAP as the samples and write
>>up are based on this.
>>
>>J
>>
>
>___________________________________________
>Katherine McNeill-Harman
>Data Services Librarian
>Dewey Library for Management and Social Sciences
>Massachusetts Institute of Technology
>77 Massachusetts Avenue, E53-100
>Cambridge, MA 02139
>mcneillh at mit.edu
>617-253-0787