[DDI-ADG] Latest Spreadsheet
Katherine McNeill-Harman
mcneillh at MIT.EDU
Tue Oct 11 17:24:50 EDT 2005
Understood--and others, don't be confused by the order of responses; I
believe J was responding to my question about a
spreadsheet that contains multiple sheets, not to Ilona's comment about
Module 3.
And I see that J also sent out a "final" version of the package to us; I
sent out a separate email directly to him suggesting that we stick w/what
seems to be our collective understanding of module 3 referring to a single
combined ddi/data file, and that--given the time pressure--it's late to be
recommending a change. So I hope that feedback is taken and incorporated
into the truly final version sent on.
Kate
At 05:03 PM 10/11/2005 -0400, J Gager wrote:
>No, module 2 describes a spreadsheet containing data, but it is a little
>more limited than what you described. Wendy and I discussed the case
>you described below, and it is out of scope of our group's work, but it
>is on the radar for the SRG. This sort of information would be better
>suited in the gross file description.
>
>Hypothetically, the spreadsheet you described below could contain a
>section of data that could be used in module 2; however, the table must
>contain data for only 1 nCube (see my response to Kate earlier for
>clarification on this).
>
>-----Original Message-----
>From: Ilona Einowski [mailto:ilona_e at berkeley.edu]
>Sent: Tuesday, October 11, 2005 5:01 PM
>To: 'Katherine McNeill-Harman'; 'Wendy Thomas'; jgager at umich.edu
>Cc: 'DDI-ADG'
>Subject: RE: [DDI-ADG] Latest Spreadsheet
>
>
>OK...and here is my 2cents....
>
>I thought Module 3 was an example of a spreadsheet where the whole
>shebang - table titles, row labels, column headers, data cells,
>footnotes, etc were represented....
>
>Did I miss the boat on this????
>
>Ilona
>
>-----Original Message-----
>From: ddi-adg-bounces at icpsr.umich.edu
>[mailto:ddi-adg-bounces at icpsr.umich.edu] On Behalf Of Katherine
>McNeill-Harman
>Sent: Tuesday, October 11, 2005 1:37 PM
>To: Wendy Thomas; jgager at umich.edu
>Cc: 'DDI-ADG'
>Subject: RE: [DDI-ADG] Latest Spreadsheet
>
>Based on J's response, I guess I would wonder when the data would not be
>in the same DDI instance of the meta data. You would have two DDI
>metadata files and only one would have the data? I'm having a hard time
>conceptualizing this. Know we're short on time, but think this is a
>basic thing we should try to agree on quickly if possible (i.e. to get
>on the same page ourselves about our recommendation as opposed to
>waiting to talk to the SRG).
>
>Kate
>
>At 03:21 PM 10/11/2005 -0500, Wendy Thomas wrote:
> >Module 1 describes data that resides in an external file of data only
> >Module 2 describes data that resides in an external file that has both
> >data and some level of metadata (category labels, title line etc)
> >Module 3 describes data that resides in the metadata; there is no
> >external file
> >
> >At least that's the way it reads to me.
> >
> >wendy
> >
> >
> >
> >
> >On Tue, 11 Oct 2005, J Gager wrote:
> >
> > > Module 3 *is* designed to hold data inline. The point I was trying
> > > to make is that I am not sure we want to say the data always has to
> > > be in the same DDI instance of the meta data. Module 3 does not
> > > support any external data files.
> > >
> > > -----Original Message-----
> > > From: Mary Vardigan [mailto:vardigan at umich.edu]
> > > Sent: Tuesday, October 11, 2005 4:07 PM
> > > To: Katherine McNeill-Harman; jgager at umich.edu; DDI-ADG
> > > Subject: RE: [DDI-ADG] Latest Spreadsheet
> > >
> > >
> > >
> > > Kate, J, and others,
> > >
> > >
> > >
> > > I hesitate to put in my two cents since I haven't been as involved
> > > in this lately and may have missed some critical information, but I
> > > was under the same impression as Kate that Module 3 was designed to
> > > hold data values inline and not point to an external file. I know we
> > > are really pressed for time, though, so rather than discuss this
> > > over email or in a phone call, perhaps Sanda and I can raise it
> > > during the SRG meeting next week and get clarification there. We
> > > will then report back after the meeting. Does this work?
> > >
> > >
> > >
> > > Mary
> > >
> > >
> > > _____
> > >
> > >
> > > From: ddi-adg-bounces at icpsr.umich.edu
> > > [mailto:ddi-adg-bounces at icpsr.umich.edu] On Behalf Of Katherine
> > > McNeill-Harman
> > > Sent: Tuesday, October 11, 2005 2:29 PM
> > > To: jgager at umich.edu; 'DDI-ADG'
> > > Subject: RE: [DDI-ADG] Latest Spreadsheet
> > >
> > >
> > >
> > > Comments w/in (others, please comment as well; the most significant
> > > item starts w/a *** below):
> > >
> > > At 12:59 PM 10/11/2005 -0400, J Gager wrote:
> > >
> > >
> > >
> > > Kate -
> > >
> > > Thanks for your comments. Please see responses below. In general,
> > > I didn't think any change was significant enough to warrant further
> > > discussion. If anyone is still uncomfortable with these changes
> > > after this discussion, then we can schedule another meeting, but
> > > time is very very short for me, and the write ups for this aggregate
> > > piece are far more complicated and time consuming than I anticipated
> > > (there are a lot of details that need to be clearly explained).
> > >
> > > J
> > >
> > > -----Original Message-----
> > > From: Katherine McNeill-Harman [mailto:mcneillh at MIT.EDU]
> > > Sent: Tuesday, October 11, 2005 12:03 PM
> > > To: jgager at umich.edu; DDI-ADG
> > > Subject: Re: [DDI-ADG] Latest Spreadsheet
> > >
> > > J and others,
> > >
> > > Have a couple of questions/concerns about this new sheet; would be
> > > interested in others' opinions:
> > >
> > > 1) Can you explain a bit the reasons for changing the descriptions
> > > at the top of each module sheet? I don't care that we use exactly
> > > the words I drafted, but yours seem to have different meaning and I
> > > want to make sure we're all on the same page. Namely,
> > >
> > > - For modules 2 and 3, you seem to be emphasizing that it's for a
> > > "single nCube structure"--can you expand upon what you mean by that?
> > >
> > >
> > >
> > > What is meant by this for module 2 is that the file cannot contain
> > > multiple cubes. For instance if there were 2 cubes, say population
> > > by region, gender, and age (cube 1) and population by region and
> > > gender (cube 2), the combination of these 2 cubes in the data file
> > > would look something like this.
> > >
> > > MN M 50- 5300
> > > MN M 50+ 6700
> > > MN M 12000
> > > MN F 50- 6800
> > > MN F 50+ 5000
> > > MN F 11800
> > >
> > > Module 2 does not support this. Its intention is to describe a file
> > > where all rows describe the same cube data.
> > >
> > >
> > >
> > > The same thing applies for module 3, since it is grouped by nCube.
> > > You would use module 3 to describe a single nCube at a time, and
> > > not a mix of nCubes and non cubed data.
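
The single-cube restriction described above can be illustrated with a minimal sketch. It assumes a whitespace-delimited file whose last field is the measure and whose preceding fields are cube coordinates; the function names are hypothetical and are not part of DDI or of the proposal. Counting coordinate fields is only a heuristic: two different cubes with the same number of dimensions would not be told apart this way.

```python
from collections import defaultdict

# Rows from the example above: whitespace-delimited, measure value last.
MIXED_ROWS = [
    "MN M 50- 5300",
    "MN M 50+ 6700",
    "MN M 12000",
    "MN F 50- 6800",
    "MN F 50+ 5000",
    "MN F 11800",
]


def group_by_dimension_count(rows):
    """Group rows by how many coordinate fields precede the final value."""
    groups = defaultdict(list)
    for row in rows:
        fields = row.split()
        if not fields:
            continue  # skip blank lines
        *coords, value = fields
        groups[len(coords)].append((tuple(coords), int(value)))
    return dict(groups)


def is_single_cube(rows):
    """True when every row has the same number of coordinate fields."""
    return len(group_by_dimension_count(rows)) <= 1


if __name__ == "__main__":
    groups = group_by_dimension_count(MIXED_ROWS)
    for n_dims, cells in sorted(groups.items()):
        print(f"{n_dims} dimensions: {len(cells)} rows")
    print("single cube:", is_single_cube(MIXED_ROWS))
```

Under these assumptions the six rows split into a three-dimension group (region, gender, age) and a two-dimension group (region, gender), which is the mixed case that module 2 is said not to support.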
> > >
> > >
> > > That's clearer; however, in module 2, how would one treat, e.g.,
> > > a spreadsheet file containing multiple sheets with a different cube
> > > on each sheet?
> > >
> > >
> > >
> > >
> > >
> > >
> > > - Also, I'd like to ask you to consider putting back some of the
> > > wording I'd written for module 3 that makes it clear that the
> > > metadata and data are in one single DDI file; I don't think that
> > > comes across in your phrasing.
> > >
> > >
> > >
> > > I think it is too constricting. It does allow for all to be in one
> > > file, but don't we want to allow the data to also be used this way
> > > in a separate file?
> > >
> > >
> > > ***I believe that it's only the former, that if it lives in a
> > > separate file it would be under module 1 or 2. Others, please
> > > confirm.
> > >
> > >
> > >
> > >
> > >
> > >
> > > - Lastly, for module 1, I believe it'd be helpful to include some
> > > explicit reference to the fact that the external file contains no
> > > metadata (to distinguish it from module 2).
> > >
> > >
> > >
> > > That may be misleading, since we are saying that attributes can
> > > exist in there, which are technically metadata. The distinction
> > > lies in the fact that module 1 data files do not state any of the
> > > cube coordinate values in them.
> > >
> > >
> > > I can live w/that if others don't have any other suggestions.
> > >
> > >
> > >
> > >
> > >
> > >
> > > 2) I'm not sure about including only the additions in the first
> > > sheet, as I think the other fields provide helpful context. Might
> > > there be a way to distinguish the new fields (as you had done, e.g.,
> > > w/color) while still keeping all of them?
> > >
> > >
> > >
> > > It was really an issue of time and of focusing attention. The model
> > > was incomplete to start with, and the effort of fleshing out all
> > > existing things from the tag library, and putting in definitions for
> > > them is more work than it would be worth (in my opinion).
> > >
> > >
> > > Understand. I'd still lean the other way but will happily go w/the
> > > group consensus.
> > >
> > >
> > >
> > >
> > >
> > >
> > > 3) Your change to module 1, while I have no objections, seems to be
> > > significant enough that we should discuss it as a group. I don't
> > > quite understand the purpose of this. Is this something you think
> > > you could explain/we could discuss in more detail over email (or
> > > maybe phone)?
> > >
> > >
> > >
> > > The purpose of the change is to basically not change what is already
> > > in place. In speaking with Wendy, she pointed out how important it
> > > is to many people marking up data files to do so in the order in
> > > which the data occurs in the file. So there may be a mix of cubed
> > > and non cubed data. Furthermore, module 1 did not allow for any
> > > non cubed data (everything was grouped into an nCube container). The
> > > change simply replaced the inclusion of a data item into an nCube by
> > > containership, with inclusion by reference. The concept we
> > > initially had is still there, just represented differently.
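
The difference between inclusion by containership and inclusion by reference can be sketched roughly as follows. This is an illustration only: the class and field names (`DataItem`, `NCube`, `ncube_ref`) are hypothetical and are not DDI element names or the structure in the spreadsheet. It assumes each data item is listed in file order and optionally carries the ID of the nCube it belongs to.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NCube:
    """A cube declared once and pointed at by ID (hypothetical structure)."""
    cube_id: str
    dimensions: List[str]


@dataclass
class DataItem:
    """One field in the record layout, listed in the order it occurs."""
    name: str
    ncube_ref: Optional[str] = None  # hypothetical: nCube ID, None if non cubed


def items_in_cube(layout, cube_id):
    """Collect the items that reference a cube, wherever they sit in the file."""
    return [item.name for item in layout if item.ncube_ref == cube_id]


cube = NCube("pop_by_region_gender", ["region", "gender"])

# Cubed and non cubed items are interleaved in file order; no container
# element forces the cubed items to be grouped together.
layout = [
    DataItem("record_id"),
    DataItem("pop_mn_m", ncube_ref=cube.cube_id),
    DataItem("source_note"),
    DataItem("pop_mn_f", ncube_ref=cube.cube_id),
]

if __name__ == "__main__":
    print(items_in_cube(layout, cube.cube_id))
```

With containership, the two population items would have to be nested inside the cube and the non cubed items could not appear at all; with a reference, the layout keeps the file's own ordering and the cube membership is still recoverable.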
> > >
> > >
> > > Sounds OK to me; I'll leave others to comment.
> > >
> > >
> > >
> > >
> > >
> > >
> > > 4) Plus a couple of other questions about the elements:
> > >
> > > -- in describing the attribute location choice, you refer separately
> > > to a data file vs. a spreadsheet. I understand what you're trying
> > > to do, but am a little concerned about the mutually-exclusive manner
> > > in which they're described (b/c in other places we use the term "data
> > > file" to include all sorts of formats, including spreadsheets, and
> > > think it should still keep that broad meaning). So I'd suggest
> > > changing the terms to say something like "fixed-format/delimited
> > > data file" and "spreadsheet data file" to distinguish the types to
> > > clarify that we consider them both data files.
> > >
> > > - In module 2 F18, what do you mean by "the structure describes all
> > > data and meta data for the cube"--that sounds to me more like
> > > module 3.
> > >
> > > - Module 3 F19; the notes contain a question; can that be deleted
> > > or should it be moved to G19?
> > >
> > >
> > >
> > > I will make these corrections.
> > >
> > >
> > >
> > > When I agreed to cancelling today's meeting, I didn't realize that
> > > you'd have such significant changes, so if it's best to discuss
> > > these over the phone, maybe we can arrange another call.
> > >
> > >
> > >
> > > Time is VERY critical. We are presenting this to the SRG in one
> > > week, and need to send this out ASAP. I think the important thing
> > > is that we have something for the group to work with. I just don't
> > > have the time to finish the proposals AND meet.
> > >
> > >
> > >
> > > Kate
> > >
> > > P.S. Plus a couple of typos
> > >
> > > - Module 1, F25, should be "measurement"--also applies to M2 F31
> > >
> > > - Module 1, F20, ID should be capitalized
> > >
> > > - Module 2, F20, should be "coordinates"
> > >
> > >
> > >
> > > Will fix.
> > >
> > >
> > >
> > > At 09:29 AM 10/11/2005 -0400, J Gager wrote:
> > >
> > >
> > >
> > > All -
> > >
> > >
> > >
> > > Here is the latest spreadsheet. Note there are a few significant
> > > changes that stemmed from a long discussion Wendy and I had.
> > >
> > >
> > >
> > > The first is the Logical sheet. I have gone back to just including
> > > the new fields. I felt it best to do this, since we weren't
> > > changing any existing fields, and I want the focus to only be on
> > > these additions.
> > >
> > >
> > >
> > > The second is the Physical Sheets - I have changed the name of these
> > > to Record Layout, since that is what we are truly representing.
> > >
> > >
> > >
> > > Finally, I have changed module 1, to allow for data items to exist
> > > outside of nCubes. Basically what I have done is create a way to
> > > reference an nCube and its attached attributes. The basic concept
> > > that we had originally is still there, it is just less deviant from
> > > the original, and oft used structure.
> > >
> > >
> > >
> > > Please let me know of any structural issues ASAP as the samples and
> > > write up are based on this.
> > >
> > >
> > >
> > > J
> > >
> > > _______________________________________________
> > >
> > > DDI-ADG mailing list
> > >
> > > DDI-ADG at icpsr.umich.edu
> > >
> > > http://www.icpsr.umich.edu/mailman/listinfo/ddi-adg
> > >
> > >
> > >
> > > ___________________________________________
> > >
> > > Katherine McNeill-Harman
> > >
> > > Data Services Librarian
> > >
> > > Dewey Library for Management and Social Sciences
> > >
> > > Massachusetts Institute of Technology
> > >
> > > 77 Massachusetts Avenue, E53-100
> > >
> > > Cambridge, MA 02139
> > >
> > > mcneillh at mit.edu
> > >
> > > 617-253-0787
> > >
> > >
> > >
> >
> >Wendy L. Thomas Phone: +1 612.624.4389
> >Data Access Core Director Fax: +1 612.626.8375
> >Minnesota Population Center Email: wlt at pop.umn.edu
> >University of Minnesota
> >50 Willey Hall
> >225 19th Avenue South
> >Minneapolis, MN 55455
>
___________________________________________
Katherine McNeill-Harman
Data Services Librarian
Dewey Library for Management and Social Sciences
Massachusetts Institute of Technology
77 Massachusetts Avenue, E53-100
Cambridge, MA 02139
mcneillh at mit.edu
617-253-0787