[DDI-ADG] nCube elements
Mary Vardigan
maryv at icpsr.umich.edu
Tue Aug 30 07:49:27 EDT 2005
Jostein,
I hope you will be on the call today so we can discuss this further -- we
gave the operator your number, so they should be calling you.
I missed the call when this was discussed so have just written what I
thought was a summary of the discussion at this point, but I may not be
capturing the sense of the group correctly. This seems very important to
pin down, so we can really use your help.
Mary
At 04:52 AM 8/30/2005, Jostein Ryssevik wrote:
>At 16:20 29.08.2005 -0400, Mary Vardigan wrote:
>
>>
>><measure> 4.4.13 Measure
>>
>>
>> * Optional
>> * Repeatable
>> * Attributes:
>> <http://webapp.icpsr.umich.edu/cocoon/DDI-LIBRARY/?element-definition=codeBook>ID,
>> xml:lang, source, varRef, aggrMeth, measUnit, scale, origin, additivity
>>
>>Description: The element measure indicates the measurement features of
>>the cell content: type of aggregation used, measurement unit, and
>>measurement scale. An origin point is recorded for anchored scales, to be
>>used in determining relative movement along the scale. Additivity
>>indicates whether an aggregate is a stock (like the population at a given
>>point in time) or a flow (like the number of births or deaths over a
>>certain period of time). The non-additive flag is to be used for measures
>>that for logical reasons cannot be aggregated to a higher level - for
>>instance, data that only make sense at a certain level of aggregation,
>>like a classification. Two nCubes may be identical except for their
>>measure - for example, a count of persons by age and percent of persons
>>by age. Measure is an empty element that includes the following
>>attributes: "varRef" is an IDREF; "aggrMeth" indicates the type of
>>aggregation method used, for example 'sum', 'average', 'count';
>>"measUnit" records the measurement unit, for example 'km', 'miles', etc.;
>>"scale" records unit of scale, for example 'x1', 'x1000'; "origin"
>>records the point of origin for anchored scales; "additivity" records type
>>of additivity such as 'stock', 'flow', 'non-additive'.
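To make the attribute list concrete, here is a minimal Python sketch (not
taken from the DDI specification) that builds an empty measure element
carrying the attributes described above. The ID values "M1" and "V3" and the
unit "persons" are made up for illustration; the attribute names and the
remaining example values come from the description.

```python
import xml.etree.ElementTree as ET

# Hypothetical values: "M1", "V3" and "persons" are invented for this sketch.
# The attribute names follow the description of the measure element.
measure = ET.Element("measure", {
    "ID": "M1",              # identifier of this measure element
    "varRef": "V3",          # IDREF pointing back to a var element
    "aggrMeth": "sum",       # aggregation method: 'sum', 'average', 'count'
    "measUnit": "persons",   # measurement unit
    "scale": "x1000",        # unit of scale
    "additivity": "stock",   # 'stock', 'flow' or 'non-additive'
})

# measure is an empty element, so it has no children and no text.
xml_text = ET.tostring(measure, encoding="unicode")
print(xml_text)
```

The "origin" attribute is left out here because it would only apply to an
anchored scale.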
>>
>>We are assuming that this element will be replaced by a "container" of
>>attributes or characteristics that will apply at all levels, some of
>>which J will be supplying from SDMX. Thus, the existing attributes like
>>"Aggregation Method" will become part of this larger set of
>>characteristics. "Measure" is a particularly problematic term because
>>Nesstar uses it in a different way to mean the variable itself.
>
>Let me explain why Nesstar is using the measure element in this way, and
>why I think it makes sense to think twice before we make radical changes to
>this construct.
>
>In the nCube element, the dmns- and measure-elements play similar roles.
>Both point back to var-elements in the variable description section and
>in this way indicate how concrete variables are used to construct a
>multidimensional table. The dmns-elements list the variables that
>establish the dimensionality of the table/cube, and the measure-elements
>list the variables that populate the cells of the cube. This can be a
>single variable, or multiple variables in a multi-measure-cube. In
>addition, both elements hold a series of attributes that add specific
>cube-related variable information that is missing in the
>variable-elements. For dimensions, the most important is cohort, which is
>used to describe which parts of a variable/classification are actually
>used in the dimension. For measures, the most important attributes relate
>to the logical and mathematical properties of the measure, like
>aggregation method, additivity etc. In OO terms you could say that measure
>as well as dmns inherit from their var-elements and add a few more
>attributes that are specific to the role the variables play in the cube.
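The "inherits from var" reading can be sketched in Python as below. This is
only an illustration under simplifying assumptions: the class and field names
are invented, cohort is reduced to a plain list of category labels, and only
two of the measure attributes are shown.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Var:
    """A variable from the variable description section."""
    id: str
    label: str

@dataclass
class Dmns(Var):
    """A variable in the dimension role, plus cube-specific information."""
    cohort: List[str]   # which categories of the variable are actually used

@dataclass
class Measure(Var):
    """A variable in the measure role, plus cube-specific information."""
    aggr_meth: str      # e.g. 'sum', 'average', 'count'
    additivity: str     # 'stock', 'flow' or 'non-additive'

@dataclass
class NCube:
    dmns: List[Dmns] = field(default_factory=list)
    measures: List[Measure] = field(default_factory=list)

# A hypothetical multi-measure cube: two dimensions, two measures.
cube = NCube(
    dmns=[
        Dmns("V1", "Gender", cohort=["female", "male"]),
        Dmns("V2", "Age group", cohort=["0-17", "18-64", "65+"]),
    ],
    measures=[
        Measure("V3", "Population", aggr_meth="count", additivity="stock"),
        Measure("V4", "Income", aggr_meth="average", additivity="non-additive"),
    ],
)
```

Both roles remain ordinary variables (isinstance(cube.measures[0], Var) is
true); the role only adds attributes on top.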
>
>Please note that this use of the terms (as well as the logic) is fully
>compliant with the way the concepts dimension and measure are used in the
>OLAP, data warehouse and data mining communities.
>
>One reason it is easy to forget that a measure really is a variable
>derives from the relationship between crosstabs and cubes. If you use SPSS
>to create a crosstab from micro data, you only specify the dimension
>variables, but you are still creating a cube. The reason is that the
>measure variable, which in this case is the counts/frequencies of the
>sample or population that the micro data describe, is implicit (by
>crossing gender and age in a census, you get the population count for age-
>and gender-groups). Population is not a defined variable in the micro
>dataset, but it is still a variable in statistical terms. This can easily
>be seen if, instead of running a standard crosstab, you run an
>old-fashioned crossbreak and add a variable like income as a "summary"
>variable (and ask for the aggregation method "mean"). You will then get
>another table/cube displaying mean income for age- and gender-groups. The
>cube has the same dimensionality as the previous one, but another measure
>variable.
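The crosstab/crossbreak comparison can be reproduced in a few lines of plain
Python. This is a sketch on made-up micro data, not SPSS output; the function
name cube and the four records are assumptions for illustration only.

```python
from collections import defaultdict

# Made-up micro data: one record per person.
people = [
    {"gender": "F", "age": "young", "income": 100},
    {"gender": "F", "age": "young", "income": 300},
    {"gender": "M", "age": "young", "income": 200},
    {"gender": "M", "age": "old", "income": 400},
]

def cube(records, dims, measure=None):
    """Aggregate records into cells keyed by the dimension variables.

    With measure=None the implicit measure is the count of records
    (a plain crosstab); otherwise the cell holds the mean of the named
    variable (a crossbreak with a summary variable).
    """
    cells = defaultdict(list)
    for rec in records:
        key = tuple(rec[d] for d in dims)
        cells[key].append(1 if measure is None else rec[measure])
    if measure is None:
        return {k: sum(v) for k, v in cells.items()}
    return {k: sum(v) / len(v) for k, v in cells.items()}

counts = cube(people, ("gender", "age"))                         # implicit measure
mean_income = cube(people, ("gender", "age"), measure="income")  # explicit measure
```

Both results have exactly the same cell keys, so the dimensionality is
unchanged; only the variable populating the cells differs.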
>
>So, please do not make radical changes to the measure-attribute that will
>prevent us from meeting this very basic and standard requirement.
>
>All the best,
>Jostein
>
Mary Vardigan
Assistant Director
Inter-university Consortium for Political and Social Research (ICPSR)
University of Michigan
P.O. Box 1248, Ann Arbor, MI 48106-1248
Phone: 734-615-7908
Fax: 734-647-8200
www.icpsr.umich.edu