[DDI-ADG] Concerns about nCube spec?
Julie Linden
julie.linden at yale.edu
Fri Mar 19 12:37:50 EST 2004
Thanks again for your comments, Wendy. I don't feel that it's a wet
blanket at all -- instead, it helps to clarify where the AGT group can
best focus its efforts. I particularly like your suggestion:

> Basically I suggest that the group brainstorm functionality that
> they'd like to have, relationships that need to be defined etc. and
> work from that to determine what the current model doesn't handle or
> what it could handle better.

...and would be very happy to see your list of desired refinements.
Do others think this is a good way to start? If so, perhaps we can move
the conversation over to ezBoard.
thanks,
Julie
On Fri, 19 Mar 2004, Wendy Thomas wrote:
> Actually, I discussed this with the SRG yesterday during our weekly
> meeting. This is really a storage issue and not an aggregate data issue.
> Any data can be stored in a variety of formats not currently defined
> within DDI. Even I-lin (the author of the statement in question) agreed
> that it has no more to do with aggregate data description than it has to
> do with microdata description, it's just that aggregate data was the
> context within which he first heard about it.
>
> There are a number of things about the basic aggregate description model
> that can be improved. I can certainly try to come up with a quick list as
> we've been using it for a few years now and have run into a lot of
> refinements we'd like to have. Basically I suggest that the group
> brainstorm functionality that they'd like to have, relationships that need
> to be defined etc. and work from that to determine what the current model
> doesn't handle or what it could handle better. Also, there should be a
> Manual for Proposal Development coming out from Mary, Tom and SRG within a
> month which should also help the group frame the discussion and work.
>
> I'd really say let the description of other storage methods sit for a bit.
> The group already has three big chunks to deal with (aggregate, geography,
> time) and believe me, having made a stab at describing generic 2 and 3
> dimensional storage (from whence we have the element <blsht> "basic layer
> sheet" or supply your own vowels) you can get a sense of the framer's
> state of mind by the end of the day. It's really possible that the
> discussions of the SRG over the next few months will provide a better
> framework for dealing with the question of storage description.
>
> Don't intend to throw a wet blanket on this. If you all feel that this has
> priority over the other areas, go for it. You should just be aware that
> given the length of the review process and the date of Version 3.0 you
> would need to have any proposals described and ready to begin the review
> process by September 2004 for inclusion in Version 3.0 (assuming it moves
> smoothly through review). I think it would be good to have dealt with at
> least 2 of the 3 areas of this group in Version 3.0.
>
> Wendy
>
>
>
> On Fri, 19 Mar 2004, Julie Linden wrote:
>
> > Thanks for the response. I'm just trying to understand the issues here; I
> > apologize to others for whom these issues may be very clear and familiar.
> > Would one approach for our group be to try to figure out how to describe
> > these other types of storage systems that the DDI does not currently cover?
> > And if so, would we try to do so within the DDI framework? How would our
> > efforts fit into the work that the Structural Reform Group is doing -- do
> > we need to wait for the SRG to get further along in its work? If the DDI is
> > going to end up as more modular -- and maybe that's not the right word or
> > even the right concept, so I hope someone will correct me! -- then
> > aggregate data could potentially be described by the appropriate aggregate
> > data "module" -- e.g. there would be one module for fixed / delimited
> > records, one module for bundled arrays, one for CUBE storage, etc?
> >
> > Thanks
> > Julie
> >
> > At 01:14 PM 3/17/2004 -0600, Wendy Thomas wrote:
> > >Hi,
> > >
> > >I could be wrong but I believe this is addressing the issue that aggregate
> > >data is frequently held in 2 and 3 dimensional storage systems
> > >(spreadsheets, layered spreadsheets) or bundled as data objects where a
> > >"cell" contains an array of items in a fixed order. The locMap does not
> > >address any new types of storage and DDI only addresses fixed and delimited
> > >records. This is really a separate issue from aggregate data description
> > >as any type of file could be stored in these alternative formats. So what
> > >the locMap supplies is the link to the data item (cell) description by
> > >giving you its matrix (nCube) number and cell coordinates. Currently the
> > >phyLoc line can only provide a pointer to a fixed format or delimited
> > >file. This is one of the problems for the "European aggregate" data. CBS
> > >was using CUBE storage (3 dimensional) and I know that Jostein had
> > >described their data storage system as one of those using bundled arrays.
> > >
> > >NHGIS is making use of the current aggregate description to search for
> > >data items and tables, create the table template on the fly and populate
> > >it with data from a fixed format data file containing multiple nCubes per
> > >record of data. NESSTAR uses it for describing and manipulating files of
> > >single nCubes for multiple locations.
> > >
> > >Wendy Thomas
> > >
> > >On Wed, 17 Mar 2004, Julie Linden wrote:
> > >
> > > > Hello everyone,
> > > >
> > > > I've been thinking about how to address the "aggregate" part of our
> > > > Working Group's charge. The "Possible Configuration of DDI Working Groups"
> > > > document that was distributed at the Expert Committee meeting in October
> > > > states: "While considerable time and effort have already gone into the
> > > > creation of an aggregate/tabular extension to the existing DDI
> > > > specification (nCubes), there is concern that the aggregate model may be
> > > > overly complex. The group needs to take a fresh look at this issue."
> > > >
> > > > As someone who is just beginning to get familiar with how the current DDI
> > > > handles aggregate data, it's hard for me to begin envisioning how it could
> > > > be simplified or overhauled. I thought that perhaps a starting point would
> > > > be to review what concerns have been raised. I read through the Structural
> > > > Reform Group's postings on ezboard, and found one comment that suggests a
> > > > concern, but doesn't spell it out:
> > > >
> > > > "Logical Physical file format mapping: How are the logical concepts in
> > > > the DDI mapped to the underlying physical files? What kinds of physical
> > > > file formats are there (rectangular, cards, SPSS, STATA, SAS, Census
> > > > aggregate data, European aggregate data)? Should DDI even be tackling this
> > > > question? There is an existing difference of opinion already regarding
> > > > this in the nCubes specification."
> > > >
> > > > Can someone on this group describe the issues/concerns explicitly?
> > > >
> > > > thanks,
> > > > Julie
> > > >
> > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > DDI-ADG mailing list
> > > > DDI-ADG at icpsr.umich.edu
> > > > http://www.icpsr.umich.edu/mailman/listinfo/ddi-adg
> > > >
> > >
> > >Wendy L. Thomas Phone: +1 612.624.4389
> > >Data Access Core Director Fax: +1 612.626.8375
> > >Minnesota Population Center Email: wlt at pop.umn.edu
> > >University of Minnesota
> > >537 Heller Hall
> > >271 19th Avenue South
> > >Minneapolis, MN 55455
> >
>