[DDI-ADG] More on aggregate data.

Sanda Ionescu sandai at icpsr.umich.edu
Fri Aug 26 15:16:20 EDT 2005


Hi, all.

First of all, thank you, Jostein, for your messages - I think they're 
really helpful in moving us along.
While we talk about aggregate data, I think it is important to keep in mind 
the modular structure we envisage for Version 3.0.
A desirable scenario for covering aggregate data might be to end up with 
three different modules:
1) a module documenting the logical structure of the data (provided for in 
the Logical Product - nCube package in the V 3.0 spreadsheet) and including 
dimension and cube descriptions.
2) a module primarily designed for data exchange, containing both data and 
some metadata, modeled after the SDMX specification - I particularly liked 
the example I sent yesterday, extracted from the Generic Sample, although I 
am not sure what level of validation we would actually need.
3) finally, a module describing the physical structure of an external data 
file that we (the archive) might choose to describe and distribute in a 
legacy format (like Census data, etc.). This would be an (improved?) version 
of the Phys. Rec. Structure Package in the V 3.0 spreadsheet.
Obviously, there will be links (cross-references) between the modules, 
particularly between 1) and 3) and between 1) and 2).

With these three modules, producers or distributors would have the 
flexibility to use any combination of data and metadata they would find 
suitable to their purposes, and the data could sit either within or outside 
the DDI instance.
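
As a rough illustration of how the three modules and their cross-references 
might fit together, here is a minimal Python sketch. All class and field 
names (LogicalCube, ExchangeDataSet, PhysicalFileDescription, cube_ref) and 
the sample values are hypothetical; none of them come from the V 3.0 
spreadsheet or from SDMX. The sketch only shows modules 2) and 3) each 
pointing back to module 1), and a check that those references resolve.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class LogicalCube:
    """Module 1 (hypothetical): logical structure, described by dimensions."""
    cube_id: str
    dimensions: Dict[str, List[str]]  # dimension name -> category codes


@dataclass
class ExchangeDataSet:
    """Module 2 (hypothetical): inline data plus a reference to the cube."""
    cube_ref: str
    observations: Dict[Tuple[str, ...], float] = field(default_factory=dict)


@dataclass
class PhysicalFileDescription:
    """Module 3 (hypothetical): describes an external legacy data file."""
    cube_ref: str
    file_uri: str
    delimiter: str = ","


def unresolved_refs(cubes, others):
    """Return every cube_ref that does not match a known cube_id."""
    known = {c.cube_id for c in cubes}
    return [o.cube_ref for o in others if o.cube_ref not in known]


# Made-up example: one cube, described once, referenced from both other modules.
cube = LogicalCube("pop_by_sex_year", {"sex": ["M", "F"], "year": ["2000", "2001"]})
inline = ExchangeDataSet("pop_by_sex_year", {("M", "2000"): 100.0})
external = PhysicalFileDescription("pop_by_sex_year", "pop.csv")
dangling = unresolved_refs([cube], [inline, external])
```

Because the data carrier (inline or external) only holds a reference, the 
same logical description could serve either combination.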

Some questions:

Module 1) -- what do we need to add to make it more functional (and 
SDMX-compatible)? J ??
Also, while I'm looking at the above-mentioned spreadsheet, I notice that 
variables are described twice, once as "variables" and once as "variable 
dimensions". I think that's probably a mistake -- in this module we only 
need to describe "dimensions."

Module 2) -- I fully agree with Jostein's remark that time variables need 
to be accounted for as "dimensions". Other than that, what other 
changes/adjustments do we need? And, I'm sure others will agree, even if we 
adopt a structure similar to SDMX, we might want tag names that are more 
suggestive of their contents. (J, I'm afraid we might need to rely on you 
to provide an outline of this section, when we agree on what goes in.)

Module 3) -- Right now the LocMap only provides for identifying cells in a 
flat delimited file. Do we want to add anything here?
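
To make the question concrete, here is a minimal Python sketch of what 
identifying a cell in a flat delimited file amounts to, assuming a location 
map that pairs each combination of dimension categories with a row and 
column position. The map layout, the function name read_cell, and the 
sample data are assumptions for illustration only, not the actual LocMap 
definition.

```python
import csv
import io
from typing import Dict, Tuple

# Hypothetical location map: cell coordinates -> (row, column), both 0-based.
loc_map: Dict[Tuple[str, str], Tuple[int, int]] = {
    ("M", "2000"): (0, 0),
    ("M", "2001"): (0, 1),
    ("F", "2000"): (1, 0),
    ("F", "2001"): (1, 1),
}


def read_cell(text: str, coords: Tuple[str, str], delimiter: str = ",") -> str:
    """Look up the cell for the given coordinates in delimited text."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    row, col = loc_map[coords]
    return rows[row][col]


# Made-up file content: rows are sex (M, F), columns are year (2000, 2001).
sample = "100,110\n120,130\n"
value = read_cell(sample, ("F", "2001"))
```

Anything beyond a row/column lookup (fixed-width fields, multiple records 
per cell) would need additional location information.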

Sanda.



Sanda Ionescu,
Research Associate
Inter-university Consortium for Political and Social Research (ICPSR)
The University of Michigan
P.O. Box 1248
Ann Arbor, MI 48106

Phone: (734) 615-7890
Fax: (734) 615-7890
        (734) 647-8200


