[DDI-ADG] Latest Spreadsheet

Katherine McNeill-Harman mcneillh at MIT.EDU
Tue Oct 11 16:37:22 EDT 2005


Based on J's response, I guess I would wonder when the data would not be in
the same DDI instance as the metadata.  You would have two DDI metadata
files and only one would have the data?  I'm having a hard time 
conceptualizing this.  Know we're short on time, but think this is a basic 
thing we should try to agree on quickly if possible (i.e. to get on the 
same page ourselves about our recommendation as opposed to waiting to talk 
to the SRG).

Kate

At 03:21 PM 10/11/2005 -0500, Wendy Thomas wrote:
>Module 1 describes data that resides in an external file of data only
>Module 2 describes data that resides in an external file that has both
>data and some level of metadata (category labels, title line, etc.)
>Module 3 describes data that resides in the metadata; there is no
>external file
>
>At least that's the way it reads to me.
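
Wendy's three-way reading above can be sketched as a small decision function. This is only an illustration of that reading, not something taken from the spreadsheet: the function name and both parameters are hypothetical.

```python
from typing import Optional


def choose_module(data_inline: bool,
                  external_file_has_metadata: Optional[bool] = None) -> int:
    """Pick a module number following the three-way summary (hypothetical helper)."""
    if data_inline:
        # Module 3: the data values live inside the metadata; no external file.
        return 3
    if external_file_has_metadata is None:
        raise ValueError("for an external file, say whether it carries metadata")
    # Module 2: external file holds data plus some metadata (labels, title line).
    # Module 1: external file holds data only.
    return 2 if external_file_has_metadata else 1


print(choose_module(data_inline=True))
print(choose_module(data_inline=False, external_file_has_metadata=True))
print(choose_module(data_inline=False, external_file_has_metadata=False))
```
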
>
>wendy
>
>
>
>
>On Tue, 11 Oct 2005, J Gager wrote:
>
> > Module 3 *is* designed to hold data inline.  The point I was trying to
> > make is that I am not sure we want to say the data always has to be in
> > the same DDI instance of the meta data.  Module 3 does not support any
> > external data files.
> >
> > -----Original Message-----
> > From: Mary Vardigan [mailto:vardigan at umich.edu]
> > Sent: Tuesday, October 11, 2005 4:07 PM
> > To: Katherine McNeill-Harman; jgager at umich.edu; DDI-ADG
> > Subject: RE: [DDI-ADG] Latest Spreadsheet
> >
> >
> >
> > Kate, J, and others,
> >
> >
> >
> > I hesitate to put in my two cents since I haven't been as involved in
> > this lately and may have missed some critical information, but I was
> > under the same impression as Kate that Module 3 was designed to hold
> > data values inline and not point to an external file. I know we are
> > really pressed for time, though, so rather than discuss this over email
> > or in a phone call, perhaps Sanda and I can raise it during the SRG
> > meeting next week and get clarification there. We will then report back
> > after the meeting. Does this work?
> >
> >
> >
> > Mary
> >
> >
> >   _____
> >
> >
> > From: ddi-adg-bounces at icpsr.umich.edu
> > [mailto:ddi-adg-bounces at icpsr.umich.edu] On Behalf Of Katherine
> > McNeill-Harman
> > Sent: Tuesday, October 11, 2005 2:29 PM
> > To: jgager at umich.edu; 'DDI-ADG'
> > Subject: RE: [DDI-ADG] Latest Spreadsheet
> >
> >
> >
> > Comments w/in (others, please comment as well; the most significant item
> > starts w/a *** below):
> >
> > At 12:59 PM 10/11/2005 -0400, J Gager wrote:
> >
> >
> >
> > Kate -
> >
> > Thanks for your comments.  Please see responses below.  In general, I
> > didn't think any change was significant enough to warrant further
> > discussion.  If anyone is still uncomfortable with these changes after
> > this discussion, then we can schedule another meeting, but time is very
> > very short for me, and the write ups for this aggregate piece are far
> > more complicated and time consuming than I anticipated (there are a lot
> > of details that need to be clearly explained).
> >
> > J
> >
> > -----Original Message-----
> > From: Katherine McNeill-Harman [mailto:mcneillh at MIT.EDU]
> > Sent: Tuesday, October 11, 2005 12:03 PM
> > To: jgager at umich.edu; DDI-ADG
> > Subject: Re: [DDI-ADG] Latest Spreadsheet
> >
> > J and others,
> >
> > Have a couple of questions/concerns about this new sheet; would be
> > interested in others' opinions:
> >
> > 1) Can you explain a bit the reasons for changing the descriptions at
> > the top of each module sheet?  I don't care that we use exactly the
> > words I drafted, but yours seem to have different meaning and I want to
> > make sure we're all on the same page.  Namely,
> >
> > - For modules 2 and 3, you seem to be emphasizing that it's for a
> > "single nCube structure"--can you expand upon what you mean by that?
> >
> >
> >
> > What is meant by this for module 2 is that the file cannot contain
> > multiple cubes.  For instance if there were 2 cubes, say population by
> > region, gender, and age (cube 1) and population by region and gender
> > (cube 2), the combination of these 2 cubes in the data file would look
> > something like this.
> >
> > MN    M    50-    5300
> > MN    M    50+    6700
> > MN    M    12000
> > MN    F    50-    6800
> > MN    F    50+    5000
> > MN    F    11800
> >
> > Module 2 does not support this.  Its intention is to describe a file
> > where all rows belong to the same cube.
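
A rough way to spot the mixed-cube situation in a delimited file is to count the coordinate columns in each row. The sketch below reuses the sample rows from the example and assumes whitespace-delimited fields with the value in the last column; the function names are hypothetical, and column count is only a heuristic, not anything defined by the modules themselves.

```python
from collections import defaultdict

# Rows from the example: region, gender, [age band], population.
SAMPLE = """\
MN M 50- 5300
MN M 50+ 6700
MN M 12000
MN F 50- 6800
MN F 50+ 5000
MN F 11800
"""


def rows_by_dimension_count(text):
    """Group rows by how many coordinate columns precede the final value."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        groups[len(fields) - 1].append(fields)
    return dict(groups)


def is_single_cube(text):
    """True when every row has the same number of coordinate columns."""
    return len(rows_by_dimension_count(text)) <= 1


print(sorted(rows_by_dimension_count(SAMPLE)))  # [2, 3]: two row shapes
print(is_single_cube(SAMPLE))                   # False: two cubes are mixed
```

Two cubes with the same number of dimensions would pass this check, so it can only flag the obvious case shown in the example.
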
> >
> >
> >
> > The same thing applies for module 3, since it is grouped by nCube.  You
> > would use module 3 to describe a single nCube at a time, and not a mix
> > of nCubes and non-cubed data.
> >
> >
> > That's clearer; however, in module 2, how would one treat, e.g., a
> > spreadsheet file containing multiple sheets with a different cube on
> > each sheet?
> >
> >
> >
> >
> >
> >
> >  - Also, I'd like to ask you to consider putting back some of the
> > wording I'd written for module 3 that makes it clear that the metadata
> > and data are in one single DDI file; I don't think that comes across in
> > your phrasing.
> >
> >
> >
> > I think it is too constricting.  It does allow for all to be in one
> > file, but don't we want to allow the data to also be used this way in a
> > separate file?
> >
> >
> > ***I believe that it's only the former, that if it lives in a separate
> > file it would be under module 1 or 2.  Others, please confirm.
> >
> >
> >
> >
> >
> >
> > - Lastly, for module 1, I believe it'd be helpful to include some
> > explicit reference to the fact that the external file contains no
> > metadata (to distinguish it from module 2).
> >
> >
> >
> > That may be misleading, since we are saying that attributes can exist in
> > there, which are technically metadata.  The distinction lies in the fact
> > that module 1 data files do not state any of the cube coordinate values
> > in them.
> >
> >
> > I can live w/that if others don't have any other suggestions.
> >
> >
> >
> >
> >
> >
> > 2) I'm not sure about including only the additions in the first sheet,
> > as I think the other fields provide helpful context.  Might there be a
> > way to distinguish the new fields (as you had done, e.g., w/color) while
> > still keeping all of them?
> >
> >
> >
> > It was really an issue of time and of focusing attention.  The model was
> > incomplete to start with, and the effort of fleshing out all existing
> > things from the tag library and putting in definitions for them is more
> > work than it would be worth (in my opinion).
> >
> >
> > Understand.  I'd still lean the other way but will happily go w/the
> > group consensus.
> >
> >
> >
> >
> >
> >
> > 3) Your change to module 1, while I have no objections, seems to be
> > significant enough that we should discuss it as a group.  I don't quite
> > understand the purpose of this.  Is this something you think you could
> > explain/we could discuss in more detail over email (or maybe phone)?
> >
> >
> >
> > The purpose of the change is to basically not change what is already in
> > place.  In speaking with Wendy, she pointed out how important it is to
> > many people marking up data files to do so in the order in which the
> > data occurs in the file.  So there may be a mix of cubed and non-cubed
> > data.  Furthermore, module 1 did not allow for any non-cubed data
> > (everything was grouped into an nCube container).  The change simply
> > replaced the inclusion of a data item into an nCube by containership,
> > with inclusion by reference.  The concept we initially had is still
> > there, just represented differently.
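
The difference between inclusion by containership and inclusion by reference can be sketched as follows. Every name here (DataItem, ncube_ref, the item names, "cube1") is hypothetical and chosen only for illustration; none of it comes from the spreadsheet or the tag library.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DataItem:
    name: str
    # ID of the nCube this item belongs to; None marks a non-cubed item.
    ncube_ref: Optional[str] = None


# Record layout in file order: cubed and non-cubed items can be interleaved,
# because membership is carried by the reference rather than by nesting.
layout: List[DataItem] = [
    DataItem("region_code"),
    DataItem("pop_m_under50", ncube_ref="cube1"),
    DataItem("notes"),
    DataItem("pop_m_50plus", ncube_ref="cube1"),
]


def items_by_cube(items: List[DataItem]) -> Dict[str, List[str]]:
    """Recover the per-cube grouping that containership used to express."""
    cubes: Dict[str, List[str]] = {}
    for item in items:
        if item.ncube_ref is not None:
            cubes.setdefault(item.ncube_ref, []).append(item.name)
    return cubes


print(items_by_cube(layout))
```

The grouping by cube is still recoverable, which matches the point that the original concept is kept and only represented differently.
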
> >
> >
> > Sounds OK to me; I'll leave others to comment.
> >
> >
> >
> >
> >
> >
> > 4) Plus a couple of other questions about the elements:
> >
> > -- in describing the attribute location choice, you refer separately to
> > a data file vs. a spreadsheet.  I understand what you're trying to do,
> > but am a little concerned about the mutually-exclusive manner in which
> > they're described (because in other places we use the term "data file"
> > to include all sorts of formats, including spreadsheets, and think it should still
> > keep that broad meaning).  So I'd suggest changing the terms to say
> > something like "fixed-format/delimited data file" and "spreadsheet data
> > file" to distinguish the types to clarify that we consider them both
> > data files.
> >
> > - In module 2 F18, what do you mean by "the structure describes all
> > data and meta data for the cube"?  That sounds to me more like module 3.
> >
> > - Module 3 F19; the notes contains a question; can that be deleted or
> > should it be moved to G19?
> >
> >
> >
> > I will make these corrections.
> >
> >
> >
> > When I agreed to cancelling today's meeting, I didn't realize that you'd
> > have such significant changes, so if it's best to discuss these over the
> > phone, maybe we can arrange another call.
> >
> >
> >
> > Time is VERY critical.  We are presenting this to the SRG in one week,
> > and need to send this out ASAP.  I think the important thing is that we
> > have something for the group to work with.  I just don't have the time
> > to finish the proposals AND meet.
> >
> >
> >
> > Kate
> >
> > P.S.  Plus a couple of typos
> >
> > - Module 1, F25, should be "measurement"--also applies to M2 F31
> >
> > - Module 1, F20, ID should be capitalized
> >
> > - Module 2, F20, should be "coordinates"
> >
> >
> >
> > Will fix.
> >
> >
> >
> > At 09:29 AM 10/11/2005 -0400, J Gager wrote:
> >
> >
> >
> > All -
> >
> >
> >
> > Here is the latest spreadsheet.  Note there are a few significant changes
> > that stemmed from a long discussion Wendy and I had.
> >
> >
> >
> > The first is the Logical sheet.  I have gone back to just including the
> > new fields.  I felt it best to do this, since we weren't changing any
> > existing fields, and I want the focus to only be on these additions.
> >
> >
> >
> > The second is the Physical Sheets - I have changed the name of these to
> > Record Layout, since that is what we are truly representing.
> >
> >
> >
> > Finally, I have changed module 1, to allow for data items to exist
> > outside of nCubes.  Basically what I have done is create a way to
> > reference an nCube and its attached attributes.  The basic concept that
> > we had originally is still there; it just deviates less from the
> > original, oft-used structure.
> >
> >
> >
> > Please let me know of any structural issues ASAP as the samples and
> > write-up are based on this.
> >
> >
> >
> > J
> >
> >
> > ___________________________________________
> > Katherine McNeill-Harman
> > Data Services Librarian
> > Dewey Library for Management and Social Sciences
> > Massachusetts Institute of Technology
> > 77 Massachusetts Avenue, E53-100
> > Cambridge, MA 02139
> > mcneillh at mit.edu
> > 617-253-0787
> >
> >
> >
>
>Wendy L. Thomas                          Phone: +1 612.624.4389
>Data Access Core Director               Fax:   +1 612.626.8375
>Minnesota Population Center              Email: wlt at pop.umn.edu
>University of Minnesota
>50 Willey Hall
>225 19th Avenue South
>Minneapolis, MN 55455

___________________________________________
Katherine McNeill-Harman
Data Services Librarian
Dewey Library for Management and Social Sciences
Massachusetts Institute of Technology
77 Massachusetts Avenue, E53-100
Cambridge, MA 02139
mcneillh at mit.edu
617-253-0787 


