[DDI-ADG] Latest Spreadsheet
Mary Vardigan
vardigan at umich.edu
Wed Oct 12 11:03:04 EDT 2005
Yes, I understand it better now also.
Mary
-----Original Message-----
From: ddi-adg-bounces at icpsr.umich.edu
[mailto:ddi-adg-bounces at icpsr.umich.edu] On Behalf Of Ilona Einowski
Sent: Wednesday, October 12, 2005 10:55 AM
To: 'Katherine McNeill-Harman'; jgager at umich.edu; 'Wendy Thomas'
Cc: 'DDI-ADG'
Subject: RE: [DDI-ADG] Latest Spreadsheet
Thank you, Kate, for pursuing this clarification...NOW it makes sense!
Ilona
-----Original Message-----
From: Katherine McNeill-Harman [mailto:mcneillh at MIT.EDU]
Sent: Wednesday, October 12, 2005 7:03 AM
To: jgager at umich.edu; ilona_e at berkeley.edu; 'Wendy Thomas'
Cc: 'DDI-ADG'
Subject: RE: [DDI-ADG] Latest Spreadsheet
For others' information, here's the response I got from J on the issue
of how to describe module 3:
"The point I was trying to avoid is that module 3 *has* to exist in the
same instance as the meta-data. Although it most likely often will, it
may not be the case. I believe the model is constructed such that one
could have a separate XML instance for each component (imagine the
course of the life cycle of a study). One could have defined the
logical structure, and everything else in one instance when designing
the study. Some time later, the data may actually be collected, at
which point they may create another instance (with module 3 containing
the data), which references the earlier file containing the meta data.
So they are separate files. The key to module 3 is that the data
doesn't exist external to the DDI instance describing the physical
structure. Nonetheless, I am not worried about this point being lost
on the SRG. I will make sure that the final description accurately
describes our intent."
Does this explain it to others? I guess I understand it a bit more, but
that does still seem to say that the data has to exist in the same
instance as at least part of the metadata (the physical structure), even
if you end up w/other DDI files applying to the study.
I just want to make sure that everyone is comfortable with what we're
sending.
Kate
At 05:24 PM 10/11/2005 -0400, Katherine McNeill-Harman wrote:
>Understood--and others, don't be confused by the order of responses; I
>believe J was responding to my question about a spreadsheet that
>contains multiple sheets, not to Ilona's comment about Module 3.
>
>And I see that J also sent out a "final" version of the package to us;
>I sent out a separate email directly to him suggesting that we stick
>w/what seems to be our collective understanding of module 3 referring
>to a single combined ddi/data file, and that--given the time
>pressure--it's late to be recommending a change. So I hope that
>feedback is taken and incorporated into the truly final version sent on.
>
>Kate
>
>At 05:03 PM 10/11/2005 -0400, J Gager wrote:
>>No, module 2 describes a spreadsheet containing data, but it is a
>>little more limited than what you described. Wendy and I discussed
>>the case you described below, and it is out of scope of our group's
>>work, but it is on the radar for the SRG. This sort of information
>>would be better suited to the gross file description.
>>
>>Hypothetically, the spreadsheet you described below could contain a
>>section of data that could be used in module 2; however, the table
>>must contain data for only 1 nCube (see my response to Kate earlier for
>>clarification on this).
>>
>>-----Original Message-----
>>From: Ilona Einowski [mailto:ilona_e at berkeley.edu]
>>Sent: Tuesday, October 11, 2005 5:01 PM
>>To: 'Katherine McNeill-Harman'; 'Wendy Thomas'; jgager at umich.edu
>>Cc: 'DDI-ADG'
>>Subject: RE: [DDI-ADG] Latest Spreadsheet
>>
>>
>>OK...and here is my 2cents....
>>
>>I thought Module 3 was an example of a spreadsheet where the whole
>>shebang - table titles, row labels, column headers, data cells,
>>footnotes, etc were represented....
>>
>>Did I miss the boat on this????
>>
>>Ilona
>>
>>-----Original Message-----
>>From: ddi-adg-bounces at icpsr.umich.edu
>>[mailto:ddi-adg-bounces at icpsr.umich.edu] On Behalf Of Katherine
>>McNeill-Harman
>>Sent: Tuesday, October 11, 2005 1:37 PM
>>To: Wendy Thomas; jgager at umich.edu
>>Cc: 'DDI-ADG'
>>Subject: RE: [DDI-ADG] Latest Spreadsheet
>>
>>Based on J's response, I guess I would wonder when the data would not
>>be in the same DDI instance of the meta data. You would have two DDI
>>metadata files and only one would have the data? I'm having a hard
>>time conceptualizing this. Know we're short on time, but think this
>>is a basic thing we should try to agree on quickly if possible (i.e.
>>to get on the same page ourselves about our recommendation as opposed
>>to waiting to talk to the SRG).
>>
>>Kate
>>
>>At 03:21 PM 10/11/2005 -0500, Wendy Thomas wrote:
>> >Module 1 describes data that resides in an external file of data
>> >only.
>> >Module 2 describes data that resides in an external file that has
>> >both data and some level of metadata (category labels, title line,
>> >etc.).
>> >Module 3 describes data that resides in the metadata; there is no
>> >external file.
>> >
>> >At least that's the way it reads to me.
>> >
>> >wendy
>> >
>> >
>> >
>> >
>> >On Tue, 11 Oct 2005, J Gager wrote:
>> >
>> > > Module 3 *is* designed to hold data inline. The point I was
>> > > trying to make is that I am not sure we want to say the data
>> > > always has to be in the same DDI instance of the meta data.
>> > > Module 3 does not support any external data files.
>> > >
>> > > -----Original Message-----
>> > > From: Mary Vardigan [mailto:vardigan at umich.edu]
>> > > Sent: Tuesday, October 11, 2005 4:07 PM
>> > > To: Katherine McNeill-Harman; jgager at umich.edu; DDI-ADG
>> > > Subject: RE: [DDI-ADG] Latest Spreadsheet
>> > >
>> > >
>> > >
>> > > Kate, J, and others,
>> > >
>> > >
>> > >
>> > > I hesitate to put in my two cents since I haven't been as
>> > > involved in this lately and may have missed some critical
>> > > information, but I was under the same impression as Kate that
>> > > Module 3 was designed to hold data values inline and not point to
>> > > an external file. I know we are really pressed for time, though,
>> > > so rather than discuss this
>> > > over email or in a phone call, perhaps Sanda and I can raise it
>> > > during the SRG meeting next week and get clarification there. We
>> > > will then report back after the meeting. Does this work?
>> > >
>> > >
>> > >
>> > > Mary
>> > >
>> > >
>> > > _____
>> > >
>> > >
>> > > From: ddi-adg-bounces at icpsr.umich.edu
>> > > [mailto:ddi-adg-bounces at icpsr.umich.edu] On Behalf Of Katherine
>> > > McNeill-Harman
>> > > Sent: Tuesday, October 11, 2005 2:29 PM
>> > > To: jgager at umich.edu; 'DDI-ADG'
>> > > Subject: RE: [DDI-ADG] Latest Spreadsheet
>> > >
>> > >
>> > >
>> > > Comments w/in (others, please comment as well; the most
>> > > significant item starts w/a *** below):
>> > >
>> > > At 12:59 PM 10/11/2005 -0400, J Gager wrote:
>> > >
>> > >
>> > >
>> > > Kate -
>> > >
>> > > Thanks for your comments. Please see responses below. In
>> > > general, I didn't think any change was significant enough to
>> > > warrant further discussion. If anyone is still uncomfortable
>> > > with these changes after this discussion, then we can schedule
>> > > another meeting, but time is very very short for me, and the
>> > > write ups for this aggregate piece are far more complicated and
>> > > time consuming than I anticipated (there are a lot of details
>> > > that need to be clearly explained).
>> > >
>> > > J
>> > >
>> > > -----Original Message-----
>> > >
>> > > From: Katherine McNeill-Harman [mailto:mcneillh at MIT.EDU]
>> > >
>> > > Sent: Tuesday, October 11, 2005 12:03 PM
>> > >
>> > > To: jgager at umich.edu; DDI-ADG
>> > >
>> > > Subject: Re: [DDI-ADG] Latest Spreadsheet
>> > >
>> > > J and others,
>> > >
>> > > Have a couple of questions/concerns about this new sheet; would
>> > > be interested in others' opinions:
>> > >
>> > > 1) Can you explain a bit the reasons for changing the
>> > > descriptions at the top of each module sheet? I don't care that
>> > > we use exactly the words I drafted, but yours seem to have
>> > > different meaning and I want to make sure we're all on the same
>> > > page. Namely,
>> > >
>> > > - For modules 2 and 3, you seem to be emphasizing that it's for a
>> > > "single nCube structure"--can you expand upon what you mean by
>> > > that?
>> > >
>> > >
>> > >
>> > > What is meant by this for module 2 is that the file cannot
>> > > contain multiple cubes. For instance if there were 2 cubes, say
>> > > population by region, gender, and age (cube 1) and population by
>> > > region and gender (cube 2), the combination of these 2 cubes in
>> > > the data file would look something like this.
>> > >
>> > > MN M 50- 5300
>> > >
>> > > MN M 50+ 6700
>> > >
>> > > MN M 12000
>> > >
>> > > MN F 50- 6800
>> > >
>> > > MN F 50+ 5000
>> > >
>> > > MN F 11800
>> > >
>> > > Module 2 does not support this. Its intention is to describe a
>> > > file where all rows describe data for the same cube.
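
To make the single-cube restriction concrete, here is a minimal Python sketch (not anything from the DDI specification) that flags a whitespace-delimited file like the one above as mixing cubes. It assumes the measure is the last field on each row and that rows from different cubes differ in how many coordinate fields precede it; the function names are hypothetical.

```python
from collections import Counter


def coordinate_counts(lines):
    """Count how many rows carry each number of coordinate fields.

    Assumption: fields are whitespace-delimited and the last field on
    each row is the measure; everything before it is a cube coordinate.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        counts[len(fields) - 1] += 1
    return dict(counts)


def is_single_cube(lines):
    """True when every non-blank row has the same number of coordinates."""
    return len(coordinate_counts(lines)) <= 1


# The mixed file from the message: cube 1 is region x gender x age,
# cube 2 is region x gender.
mixed = [
    "MN M 50- 5300",
    "MN M 50+ 6700",
    "MN M 12000",
    "MN F 50- 6800",
    "MN F 50+ 5000",
    "MN F 11800",
]

print(coordinate_counts(mixed))
print(is_single_cube(mixed))
```

This check is only a heuristic: two cubes with the same number of dimensions would pass it, so it can show that a file is mixed but cannot prove that it is not.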
>> > >
>> > >
>> > >
>> > > The same thing applies for module 3, since it is grouped by
>> > > nCube. You would use module 3 to describe a single nCube at a
>> > > time, and not a mix of nCubes and non-cubed data.
>> > >
>> > >
>> > > That's clearer; however, in module 2, how would one treat,
>> > > e.g., a spreadsheet file containing multiple sheets with a
>> > > different cube on each sheet?
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > - Also, I'd like to ask you to consider putting back some of the
>> > > wording I'd written for module 3 that makes it clear that the
>> > > metadata and data are in one single DDI file; I don't think that
>> > > comes across in your phrasing.
>> > >
>> > >
>> > >
>> > > I think it is too constricting. It does allow for all to be in
>> > > one file, but don't we want to allow the data to also be used
>> > > this way in a separate file?
>> > >
>> > >
>> > > ***I believe that it's only the former, that if it lives in a
>> > > separate file it would be under module 1 or 2. Others, please
>> > > confirm.
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > - Lastly, for module 1, I believe it'd be helpful to include some
>> > > explicit reference to the fact that the external file contains no
>> > > metadata (to distinguish it from module 2).
>> > >
>> > >
>> > >
>> > > That may be misleading, since we are saying that attributes can
>> > > exist in there, which are technically metadata. The distinction
>> > > lies in the fact that module 1 data files do not state any of the
>> > > cube coordinate values in them.
>> > >
>> > >
>> > > I can live w/that if others don't have any other suggestions.
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > 2) I'm not sure about including only the additions in the first
>> > > sheet, as I think the other fields provide helpful context.
>> > > Might there be a way to distinguish the new fields (as you had
>> > > done, e.g., w/color) while still keeping all of them?
>> > >
>> > >
>> > >
>> > > It was really an issue of time and of focusing attention. The
>> > > model was incomplete to start with, and the effort of fleshing
>> > > out all existing things from the tag library, and putting in
>> > > definitions for them, is more work than it would be worth (in my
>> > > opinion).
>> > >
>> > >
>> > > Understand. I'd still lean the other way but will happily go
>> > > w/the group consensus.
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > 3) Your change to module 1, while I have no objections, seems to
>> > > be significant enough that we should discuss it as a group. I
>> > > don't quite understand the purpose of this. Is this something
>> > > you think you could explain/we could discuss in more detail over
>> > > email (or maybe phone)?
>> > >
>> > >
>> > >
>> > > The purpose of the change is to basically not change what is
>> > > already in place. In speaking with Wendy, she pointed out how
>> > > important it is to many people marking up data files to do so in
>> > > the order in which the data occurs in the file. So there may be
>> > > a mix of cubed and non-cubed data. Furthermore, module 1 did
>> > > not allow for any non-cubed data (everything was grouped into an
>> > > nCube container). The change simply replaced the inclusion of a
>> > > data item in an nCube by containership with inclusion by
>> > > reference. The concept we initially had is still there, just
>> > > represented differently.
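
The shift from containership to inclusion by reference can be pictured with a small Python sketch. It is illustrative only: the class names, field names, and sample items are hypothetical and are not taken from the spreadsheet or the DDI tag library. Each data item keeps its position in the record layout and optionally points at an nCube by ID, so cubed and non-cubed items can be interleaved in file order.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NCube:
    """A cube identified by ID; it does not own its data items."""
    cube_id: str


@dataclass
class DataItem:
    """One item in the record layout, listed in file order.

    ncube_ref is None for non-cubed data, otherwise the ID of the
    nCube the item belongs to (inclusion by reference).
    """
    name: str
    ncube_ref: Optional[str] = None


def items_in_cube(layout: List[DataItem], cube: NCube) -> List[str]:
    """Names of the items that reference the given cube, in file order."""
    return [item.name for item in layout if item.ncube_ref == cube.cube_id]


pop = NCube("pop_by_region_gender")
layout = [
    DataItem("record_id"),                        # non-cubed item
    DataItem("pop_male", ncube_ref=pop.cube_id),
    DataItem("notes"),                            # non-cubed item
    DataItem("pop_female", ncube_ref=pop.cube_id),
]

print(items_in_cube(layout, pop))
```

With containership, the two population items would have to sit together inside the cube; with references, the layout list can follow the order of the file and the cube's members are recovered by lookup.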
>> > >
>> > >
>> > > Sounds OK to me; I'll leave others to comment.
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > 4) Plus a couple of other questions about the elements:
>> > >
>> > > -- in describing the attribute location choice, you refer
>> > > separately to a data file vs. a spreadsheet. I understand what
>> > > you're trying to do, but am a little concerned about the
>> > > mutually-exclusive manner in which they're described (b/c in
>> > > other places we use the term "data file" to include all sorts of
>> > > formats, including spreadsheets, and think it should still keep
>> > > that broad meaning). So I'd suggest changing the terms to say
>> > > something like "fixed-format/delimited data file" and
>> > > "spreadsheet data file" to distinguish the types and to clarify
>> > > that we consider them both data files.
>> > >
>> > > - In module 2 F18, what do you mean by "the structure describes
>> > > all data and meta data for the cube"--that sounds to me more like
>> > > module 3.
>> > >
>> > > - Module 3 F19; the notes contain a question; can that be
>> > > deleted or should it be moved to G19?
>> > >
>> > >
>> > >
>> > > I will make these corrections.
>> > >
>> > >
>> > >
>> > > When I agreed to cancelling today's meeting, I didn't realize
>> > > that you'd have such significant changes, so if it's best to
>> > > discuss these over the phone, maybe we can arrange another call.
>> > >
>> > >
>> > >
>> > > Time is VERY critical. We are presenting this to the SRG in one
>> > > week, and need to send this out ASAP. I think the important
>> > > thing is that we have something for the group to work with. I
>> > > just don't have the time to finish the proposals AND meet.
>> > >
>> > >
>> > >
>> > > Kate
>> > >
>> > > P.S. Plus a couple of typos
>> > >
>> > > - Module 1, F25, should be "measurement"--also applies to M2 F31
>> > >
>> > > - Module 1, F20, ID should be capitalized
>> > >
>> > > - Module 2, F20, should be "coordinates"
>> > >
>> > >
>> > >
>> > > Will fix.
>> > >
>> > >
>> > >
>> > > At 09:29 AM 10/11/2005 -0400, J Gager wrote:
>> > >
>> > >
>> > >
>> > > All -
>> > >
>> > >
>> > >
>> > > Here is the latest spreadsheet. Note there are a few significant
>> > > changes that stemmed from a long discussion Wendy and I had.
>> > >
>> > >
>> > >
>> > > The first is the Logical sheet. I have gone back to just
>> > > including the new fields. I felt it best to do this, since we
>> > > weren't changing any existing fields, and I want the focus to
>> > > be only on these additions.
>> > >
>> > >
>> > >
>> > > The second is the Physical Sheets - I have changed the name of
>> > > these to Record Layout, since that is what we are truly
>> > > representing.
>> > >
>> > >
>> > >
>> > > Finally, I have changed module 1, to allow for data items to
>> > > exist outside of nCubes. Basically what I have done is create a
>> > > way to reference an nCube and its attached attributes. The basic
>> > > concept that we had originally is still there; it just deviates
>> > > less from the original, oft-used structure.
>> > >
>> > >
>> > >
>> > > Please let me know of any structural issues ASAP as the samples
>> > > and write up are based on this.
>> > >
>> > >
>> > >
>> > > J
>> > >
>> > > _______________________________________________
>> > >
>> > > DDI-ADG mailing list
>> > >
>> > > DDI-ADG at icpsr.umich.edu
>> > >
>> > > http://www.icpsr.umich.edu/mailman/listinfo/ddi-adg
>> > >
>> > >
>> > >
>> > > ___________________________________________
>> > >
>> > > Katherine McNeill-Harman
>> > >
>> > > Data Services Librarian
>> > >
>> > > Dewey Library for Management and Social Sciences
>> > >
>> > > Massachusetts Institute of Technology
>> > >
>> > > 77 Massachusetts Avenue, E53-100
>> > >
>> > > Cambridge, MA 02139
>> > >
>> > > mcneillh at mit.edu
>> > >
>> > > 617-253-0787
>> > >
>> > >
>> > >
>> >
>> >Wendy L. Thomas Phone: +1 612.624.4389
>> >Data Access Core Director Fax: +1 612.626.8375
>> >Minnesota Population Center Email: wlt at pop.umn.edu
>> >University of Minnesota
>> >50 Willey Hall
>> >225 19th Avenue South
>> >Minneapolis, MN 55455
>>
>