[DDI-ADG] nCube elements

Mary Vardigan maryv at icpsr.umich.edu
Tue Aug 30 07:49:27 EDT 2005


Jostein,

I hope you will be on the call today so we can discuss this further -- we 
gave the operator your number, so they should be calling you.

I missed the call when this was discussed, so I have just written what I 
thought was a summary of the discussion at this point, but I may not be 
capturing the sense of the group correctly. This seems very important to 
pin down, so we can really use your help.

Mary

At 04:52 AM 8/30/2005, Jostein Ryssevik wrote:
>At 16:20 29.08.2005 -0400, Mary Vardigan wrote:
>
>>
>><measure> 4.4.13  Measure
>>
>>
>>    * Optional
>>    * Repeatable
>>    * Attributes: ID 
>> <http://webapp.icpsr.umich.edu/cocoon/DDI-LIBRARY/?element-definition=codeBook>, 
>> xml:lang, source, varRef, aggrMeth, measUnit, scale, origin, additivity
>>
>>Description: The element measure indicates the measurement features of 
>>the cell content: type of aggregation used, measurement unit, and 
>>measurement scale. An origin point is recorded for anchored scales, to be 
>>used in determining relative movement along the scale. Additivity 
>>indicates whether an aggregate is a stock (like the population at a given 
>>point in time) or a flow (like the number of births or deaths over a 
>>certain period of time). The non-additive flag is to be used for measures 
>>that for logical reasons cannot be aggregated to a higher level - for 
>>instance, data that only make sense at a certain level of aggregation, 
>>like a classification. Two nCubes may be identical except for their 
>>measure - for example, a count of persons by age and percent of persons 
>>by age. Measure is an empty element that includes the following 
>>attributes: "varRef" is an IDREF; "aggrMeth" indicates the type of 
>>aggregation method used, for example 'sum', 'average', 'count'; 
>>"measUnit" records the measurement unit, for example 'km', 'miles', etc.; 
>>"scale" records unit of scale, for example 'x1', 'x1000'; "origin" 
>>records the point of origin for anchored scales; "additivity" records type 
>>of additivity such as 'stock', 'flow', 'non-additive'.
>>
>>We are assuming that this element will be replaced by a "container" of 
>>attributes or characteristics that will apply at all levels, some of 
>>which J will be supplying from SDMX. Thus, the existing attributes like 
>>"Aggregation Method" will become part of this larger set of 
>>characteristics. "Measure" is a particularly problematic term because 
>>Nesstar uses it in a different way to mean the variable itself.
>
>Let me explain why Nesstar is using the measure element in this way, and 
>why I think it makes sense to think twice before we make radical changes to 
>this construct.
>
>In the nCube element, the dmns- and measure-elements play similar 
>roles. Both point back to var-elements in the variable description 
>section and in this way indicate how concrete variables are used to 
>construct a multidimensional table. The dmns-elements list the variables 
>that establish the dimensionality of the table/cube, and the 
>measure-elements list the variables that populate the cells of the 
>cube. This can be a single variable, or multiple variables in a 
>multi-measure-cube. In addition, both elements hold a series of 
>attributes that add specific cube-related variable information that is 
>missing in the var-elements. For dimensions, the most important is 
>cohort, which is used to describe which parts of a 
>variable/classification are actually used in the dimension. For measures, 
>the most important attributes relate to the logical and mathematical 
>properties of the measure, like aggregation method, additivity etc. In 
>OO terms you could say that measure as well as dmns inherit from their 
>var-elements and add a few more attributes that are specific to the role 
>the variables play in the cube.
>
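As a rough illustration of the relationship described in the paragraph above, the following Python sketch models dimensions and measures as references to variables that add cube-specific attributes. The class names, field names, variable IDs and the sample values are hypothetical and chosen only for illustration; only the ideas of varRef, cohort, aggregation method and additivity come from the discussion itself, and this is not DDI markup.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Var:
    """A variable as documented in the variable description section."""
    id: str
    label: str


@dataclass
class Dmns:
    """A dimension: points back to a Var and adds cube-specific detail."""
    var_ref: str                                      # ID of the referenced Var
    cohort: List[str] = field(default_factory=list)   # categories actually used


@dataclass
class Measure:
    """A measure: points back to a Var and adds its aggregation properties."""
    var_ref: str       # ID of the referenced Var
    aggr_meth: str     # e.g. 'sum', 'average', 'count'
    additivity: str    # e.g. 'stock', 'flow', 'non-additive'


@dataclass
class NCube:
    dmns: List[Dmns]
    measures: List[Measure]

    def resolve(self, variables: Dict[str, Var]) -> Dict[str, List[str]]:
        """Look up the labels of the variables playing each role in the cube."""
        return {
            "dimensions": [variables[d.var_ref].label for d in self.dmns],
            "measures": [variables[m.var_ref].label for m in self.measures],
        }


# Hypothetical variables and cube, for illustration only.
variables = {
    v.id: v
    for v in [Var("V1", "Gender"), Var("V2", "Age group"), Var("V3", "Income")]
}
cube = NCube(
    dmns=[Dmns("V1"), Dmns("V2", cohort=["18-39", "40-64"])],
    measures=[Measure("V3", aggr_meth="average", additivity="non-additive")],
)
roles = cube.resolve(variables)
print(roles)
```

Here the dimension and measure objects carry no description of their own; each borrows it from the variable it references and adds only what is specific to its role in the cube.
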
>Please note that this use of the terms (as well as the logic) is fully 
>consistent with the way the concepts dimension and measure are used in the 
>OLAP, data warehouse and data mining communities.
>
>One reason it is easy to forget that a measure really is a variable lies 
>in the relationship between crosstabs and cubes. If you use SPSS to create 
>a crosstab from micro data, you specify only the dimension variables, 
>but you are still creating a cube. The reason is that the measure 
>variable, which in this case is the count/frequency of the sample or 
>population that the micro data describe, is implicit (by crossing gender 
>and age in a census, you get the population count for age- and 
>gender-groups). Population is not a defined variable in the micro dataset, 
>but it is still a variable in statistical terms. This can easily be seen 
>if, instead of running a standard crosstab, you run an old-fashioned 
>crossbreak and add a variable like income as a "summary" variable (and ask 
>for the aggregation method "mean"). You will then get another table/cube 
>displaying mean income for age- and gender-groups. The cube has the same 
>dimensionality as the previous one, but another measure variable.
>
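The crosstab versus crossbreak point above can be sketched in a few lines of Python. This is a minimal illustration, not SPSS or Nesstar behaviour: the records, the field names and the build_cube helper are all made up for the example. Crossing the same two dimension variables yields an implicit count measure when no summary variable is given, and a mean-income measure when one is.

```python
from collections import defaultdict

# Hypothetical micro data: one record per person.
records = [
    {"gender": "F", "age": "18-39", "income": 30000},
    {"gender": "F", "age": "18-39", "income": 34000},
    {"gender": "F", "age": "40-64", "income": 41000},
    {"gender": "M", "age": "18-39", "income": 28000},
    {"gender": "M", "age": "40-64", "income": 45000},
    {"gender": "M", "age": "40-64", "income": 39000},
]


def build_cube(rows, dims, summary=None):
    """Aggregate rows into cells keyed by the dimension values.

    With no summary variable the measure is the implicit count of rows
    (a plain crosstab); with one, the measure is the mean of that variable.
    """
    cells = defaultdict(list)
    for row in rows:
        key = tuple(row[d] for d in dims)
        cells[key].append(1 if summary is None else row[summary])
    if summary is None:
        return {key: sum(values) for key, values in cells.items()}
    return {key: sum(values) / len(values) for key, values in cells.items()}


dims = ("gender", "age")
count_cube = build_cube(records, dims)                          # implicit measure
mean_income_cube = build_cube(records, dims, summary="income")  # explicit measure

# Same cells (same dimensionality), different measure variable.
print(count_cube.keys() == mean_income_cube.keys())
print(count_cube[("F", "18-39")], mean_income_cube[("F", "18-39")])
```

Both cubes share exactly the same cell keys; only the variable that fills the cells differs.
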
>So, please do not make radical changes to the measure-element that will 
>prevent us from meeting this very basic and standard requirement.
>
>All the best,
>Jostein
>

Mary Vardigan
Assistant Director
Inter-university Consortium for Political and Social Research (ICPSR)
University of Michigan
P.O. Box 1248, Ann Arbor, MI 48106-1248
Phone: 734-615-7908
Fax: 734-647-8200
www.icpsr.umich.edu 


