[DDI-ADG] Latest Spreadsheet

Wendy Thomas wlt at pop.umn.edu
Tue Oct 11 16:21:18 EDT 2005


Module 1 describes data that resides in an external file of data only
Module 2 describes data that resides in an external file that has both
data and some level of metadata (category labels, title line, etc.)
Module 3 describes data that resides in the metadata; there is no
external file

At least that's the way it reads to me.
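
As a rough illustration of that reading (a sketch only; the names
Module and pick_module are made up here and are not part of the
spreadsheet or of DDI), the choice between the three modules comes
down to two questions: is there an external file, and does that file
carry any metadata of its own?

```python
from enum import Enum


class Module(Enum):
    """Hypothetical labels for the three modules as summarized above."""
    DATA_ONLY_FILE = 1        # external file that holds data only
    DATA_WITH_METADATA = 2    # external file with data plus some metadata
    INLINE = 3                # data resides in the metadata, no external file


def pick_module(has_external_file: bool, file_has_metadata: bool = False) -> Module:
    """Pick a module from the two questions in the summary above."""
    if not has_external_file:
        return Module.INLINE
    if file_has_metadata:
        return Module.DATA_WITH_METADATA
    return Module.DATA_ONLY_FILE
```

For example, pick_module(True, file_has_metadata=True) gives
Module.DATA_WITH_METADATA. This follows the summary above only; it
does not capture the finer point raised in the thread about attributes
in module 1 files.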

wendy




On Tue, 11 Oct 2005, J Gager wrote:

> Module 3 *is* designed to hold data inline.  The point I was trying to
> make is that I am not sure we want to say the data always has to be in
> the same DDI instance as the metadata.  Module 3 does not support any
> external data files.
>
> -----Original Message-----
> From: Mary Vardigan [mailto:vardigan at umich.edu]
> Sent: Tuesday, October 11, 2005 4:07 PM
> To: Katherine McNeill-Harman; jgager at umich.edu; DDI-ADG
> Subject: RE: [DDI-ADG] Latest Spreadsheet
>
>
>
> Kate, J, and others,
>
>
>
> I hesitate to put in my two cents since I haven't been as involved in
> this lately and may have missed some critical information, but I was
> under the same impression as Kate that Module 3 was designed to hold
> data values inline and not point to an external file. I know we are
> really pressed for time, though, so rather than discuss this over email
> or in a phone call, perhaps Sanda and I can raise it during the SRG
> meeting next week and get clarification there. We will then report back
> after the meeting. Does this work?
>
>
>
> Mary
>
>
>   _____
>
>
> From: ddi-adg-bounces at icpsr.umich.edu
> [mailto:ddi-adg-bounces at icpsr.umich.edu] On Behalf Of Katherine
> McNeill-Harman
> Sent: Tuesday, October 11, 2005 2:29 PM
> To: jgager at umich.edu; 'DDI-ADG'
> Subject: RE: [DDI-ADG] Latest Spreadsheet
>
>
>
> Comments w/in (others, please comment as well; the most significant item
> starts w/a *** below):
>
> At 12:59 PM 10/11/2005 -0400, J Gager wrote:
>
>
>
> Kate -
>
> Thanks for your comments.  Please see responses below.  In general, I
> didn't think any change was significant enough to warrant further
> discussion.  If anyone is still uncomfortable with these changes after
> this discussion, then we can schedule another meeting, but time is very
> very short for me, and the write ups for this aggregate piece are far
> more complicated and time consuming than I anticipated (there are a lot
> of details that need to be clearly explained).
>
> J
>
> -----Original Message-----
>
> From: Katherine McNeill-Harman [mailto:mcneillh at MIT.EDU]
>
> Sent: Tuesday, October 11, 2005 12:03 PM
>
> To: jgager at umich.edu; DDI-ADG
>
> Subject: Re: [DDI-ADG] Latest Spreadsheet
>
> J and others,
>
> Have a couple of questions/concerns about this new sheet; would be
> interested in others' opinions:
>
> 1) Can you explain a bit the reasons for changing the descriptions at
> the top of each module sheet?  I don't care that we use exactly the
> words I drafted, but yours seem to have different meaning and I want to
> make sure we're all on the same page.  Namely,
>
> - For modules 2 and 3, you seem to be emphasizing that it's for a
> "single nCube structure"--can you expand upon what you mean by that?
>
>
>
> What is meant by this for module 2 is that the file cannot contain
> multiple cubes.  For instance if there were 2 cubes, say population by
> region, gender, and age (cube 1) and population by region and gender
> (cube 2), the combination of these 2 cubes in the data file would look
> something like this.
>
> MN    M    50-    5300
> MN    M    50+    6700
> MN    M    12000
> MN    F    50-    6800
> MN    F    50+    5000
> MN    F    11800
>
> Module 2 does not support this.  Its intention is to describe a file where
> all rows describe the same cube data.
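
A minimal sketch of how such a mixed file could be spotted, assuming
whitespace-delimited rows where the last field is the measure and the
fields before it are coordinate values. The function names are
hypothetical, and the check is only a heuristic: two different cubes
with the same number of dimensions would not be told apart.

```python
# The six sample rows from the example above.
rows = """\
MN M 50- 5300
MN M 50+ 6700
MN M 12000
MN F 50- 6800
MN F 50+ 5000
MN F 11800
""".splitlines()


def dimension_counts(lines):
    """Return the set of coordinate counts seen across non-empty rows.

    Assumes the last whitespace-delimited field is the measure value.
    """
    return {len(line.split()) - 1 for line in lines if line.strip()}


def looks_like_single_cube(lines):
    """True when every row has the same number of coordinate fields."""
    return len(dimension_counts(lines)) <= 1
```

On the sample rows, dimension_counts(rows) contains both 2 and 3, so
looks_like_single_cube(rows) is False, which matches the point that
the rows do not all describe the same cube.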
>
>
>
> The same thing applies for module 3, since it is grouped by nCube.  You
> would use module 3 to describe a single nCube at a time, and not a mix
> of nCubes and non-cubed data.
>
>
> That's more clear, however, in module 2, how would one treat, e.g., a
> spreadsheet file containing multiple sheets with a different cube on
> each sheet?
>
>
>
>
>
>
>  - Also, I'd like to ask you to consider putting back some of the
> wording I'd written for module 3 that makes it clear that the metadata
> and data are in one single DDI file; I don't think that comes across in
> your phrasing.
>
>
>
> I think it is too constricting.  It does allow for all to be in one
> file, but don't we want to allow the data to also be used this way in a
> separate file?
>
>
> ***I believe that it's only the former, that if it lives in a separate
> file it would be under module 1 or 2.  Others, please confirm.
>
>
>
>
>
>
> - Lastly, for module 1, I believe it'd be helpful to include some
> explicit reference to the fact that the external file contains no
> metadata (to distinguish it from module 2).
>
>
>
> That may be misleading, since we are saying that attributes can exist in
> there, which are technically metadata.  The distinction lies in the fact
> that module 1 data files do not state any of the cube coordinate values
> in them.
>
>
> I can live w/that if others don't have any other suggestions.
>
>
>
>
>
>
> 2) I'm not sure about including only the additions in the first sheet,
> as I think the other fields provide helpful context.  Might there be a
> way to distinguish the new fields (as you had done, e.g., w/color) while
> still keeping all of them?
>
>
>
> It was really an issue of time and of focusing attention.  The model was
> incomplete to start with, and the effort of fleshing out all existing
> things from the tag library, and putting in definitions for them, is more
> work than it would be worth (in my opinion).
>
>
> Understand.  I'd still lean the other way but will happily go w/the
> group consensus.
>
>
>
>
>
>
> 3) Your change to module 1, while I have no objections, seems to be
> significant enough that we should discuss it as a group.  I don't quite
> understand the purpose of this.  Is this something you think you could
> explain/we could discuss in more detail over email (or maybe phone)?
>
>
>
> The purpose of the change is to basically not change what is already in
> place.  In speaking with Wendy, she pointed out how important it is to
> many people marking up data files to do so in the order in which the
> data occurs in the file.  So there may be a mix of cubed and non-cubed
> data.  Furthermore, module 1 did not allow for any non-cubed data
> (everything was grouped into a nCube container).  The change simply
> replaced the inclusion of a data item into an nCube by containership,
> with inclusion by reference.  The concept we initially had is still
> there, just represented differently.
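
To illustrate the difference between containership and reference, here
is a sketch with invented names (DataItem, ncube_ref, the item names
and cube IDs are all hypothetical, not taken from the spreadsheet).
Data items stay in file order, and an item belongs to an nCube only by
pointing at its ID, so non-cubed items can sit between cubed ones.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataItem:
    """One data item in the record layout, listed in file order."""
    name: str
    ncube_ref: Optional[str] = None  # None means the item is non-cubed


# Items appear in the order they occur in the data file; cube membership
# is expressed by reference rather than by nesting inside a cube container.
record_layout = [
    DataItem("region_code"),
    DataItem("pop_region_gender_age", ncube_ref="cube1"),
    DataItem("notes"),
    DataItem("pop_region_gender", ncube_ref="cube2"),
]


def items_for(cube_id, layout):
    """Collect the names of the items that reference the given nCube."""
    return [item.name for item in layout if item.ncube_ref == cube_id]


def non_cubed(layout):
    """Collect the names of the items that reference no nCube at all."""
    return [item.name for item in layout if item.ncube_ref is None]
```

The grouping that containership gave is still recoverable with
items_for("cube1", record_layout), while the layout itself keeps the
file order.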
>
>
> Sounds OK to me; I'll leave others to comment.
>
>
>
>
>
>
> 4) Plus a couple of other questions about the elements:
>
> -- in describing the attribute location choice, you refer separately to
> a data file vs. a spreadsheet.  I understand what you're trying to do,
> but am a little concerned about the mutually-exclusive manner in which
> they're described (b/c in other places we use the term "data file" to include
> all sorts of formats, including spreadsheets, and I think it should still
> keep that broad meaning).  So I'd suggest changing the terms to say
> something like "fixed-format/delimited data file" and "spreadsheet data
> file" to distinguish the types to clarify that we consider them both
> data files.
>
> - In module 2 F18, what do you mean by "the structure describes all
> data and meta data for the cube"?  That sounds to me more like module 3.
>
> - Module 3 F19: the notes contain a question; can that be deleted or
> should it be moved to G19?
>
>
>
> I will make these corrections.
>
>
>
> When I agreed to cancelling today's meeting, I didn't realize that you'd
> have such significant changes, so if it's best to discuss these over the
> phone, maybe we can arrange another call.
>
>
>
> Time is VERY critical.  We are presenting this to the SRG in one week,
> and need to send this out ASAP.  I think the important thing is that we
> have something for the group to work with.  I just don't have the time
> to finish the proposals AND meet.
>
>
>
> Kate
>
> P.S.  Plus a couple of typos
>
> - Module 1, F25, should be "measurement"--also applies to M2 F31
>
> - Module 1, F20, ID should be capitalized
>
> - Module 2, F20, should be "coordinates"
>
>
>
> Will fix.
>
>
>
> At 09:29 AM 10/11/2005 -0400, J Gager wrote:
>
>
>
> All -
>
>
>
> Here is the latest spreadsheet.  Note there are a few significant changes
> that stemmed from a long discussion Wendy and I had.
>
>
>
> The first is the Logical sheet.  I have gone back to just including the
> new fields.  I felt it best to do this, since we weren't changing any
> existing fields, and I want the focus to only be on these additions.
>
>
>
> The second is the Physical Sheets - I have changed the name of these to
> Record Layout, since that is what we are truly representing.
>
>
>
> Finally, I have changed module 1, to allow for data items to exist
> outside of nCubes.  Basically what I have done is create a way to
> reference an nCube and its attached attributes.  The basic concept that
> we had originally is still there; it just deviates less from the
> original, oft-used structure.
>
>
>
> Please let me know of any structural issues ASAP, as the samples and
> write-up are based on this.
>
>
>
> J
>
> _______________________________________________
>
> DDI-ADG mailing list
>
> DDI-ADG at icpsr.umich.edu
>
> http://www.icpsr.umich.edu/mailman/listinfo/ddi-adg
>
>
>
> ___________________________________________
>
> Katherine McNeill-Harman
>
> Data Services Librarian
>
> Dewey Library for Management and Social Sciences
>
> Massachusetts Institute of Technology
>
> 77 Massachusetts Avenue, E53-100
>
> Cambridge, MA 02139
>
> mcneillh at mit.edu
>
> 617-253-0787
>
>
>

Wendy L. Thomas                          Phone: +1 612.624.4389
Data Access Core Director                Fax:   +1 612.626.8375
Minnesota Population Center              Email: wlt at pop.umn.edu
University of Minnesota
50 Willey Hall
225 19th Avenue South
Minneapolis, MN 55455


