[DDI-ADG] nCube elements

Jostein Ryssevik Jostein.Ryssevik at nsd.uib.no
Tue Aug 30 08:11:26 EDT 2005


Mary,

Unfortunately I will not be on the call today (due to a dentist
appointment). I will, however, be happy to answer any questions that you
may have by mail later this evening.

Regards,
Jostein


At 07:49 30.08.2005 -0400, Mary Vardigan wrote:
>Jostein,
>
>I hope you will be on the call today so we can discuss this further -- we 
>gave the operator your number, so they should be calling you.
>
>I missed the call when this was discussed so have just written what I 
>thought was a summary of the discussion at this point, but I may not be 
>capturing the sense of the group correctly. This seems very important to 
>pin down, so we can really use your help.
>
>Mary
>
>At 04:52 AM 8/30/2005, Jostein Ryssevik wrote:
>>At 16:20 29.08.2005 -0400, Mary Vardigan wrote:
>>
>>>
>>><measure> 4.4.13  Measure
>>>
>>>
>>>    * Optional
>>>    * Repeatable
>>>    * Attributes: 
>>> <http://webapp.icpsr.umich.edu/cocoon/DDI-LIBRARY/?element-definition=codeBook>ID, 
>>> xml:lang, source, varRef, aggrMeth, measUnit, scale, origin, additivity
>>>
>>>Description: The element measure indicates the measurement features of 
>>>the cell content: type of aggregation used, measurement unit, and 
>>>measurement scale. An origin point is recorded for anchored scales, to 
>>>be used in determining relative movement along the scale. Additivity 
>>>indicates whether an aggregate is a stock (like the population at a 
>>>given point in time) or a flow (like the number of births or deaths over 
>>>a certain period of time). The non-additive flag is to be used for 
>>>measures that for logical reasons cannot be aggregated to a higher level 
>>>- for instance, data that only make sense at a certain level of 
>>>aggregation, like a classification. Two nCubes may be identical except 
>>>for their measure - for example, a count of persons by age and percent 
>>>of persons by age. Measure is an empty element that includes the 
>>>following attributes: "varRef" is an IDREF; "aggrMeth" indicates the 
>>>type of aggregation method used, for example 'sum', 'average', 'count'; 
>>>"measUnit" records the measurement unit, for example 'km', 'miles', 
>>>etc.; "scale" records unit of scale, for example 'x1', 'x1000'; "origin" 
>>>records the point of origin for anchored scales; "additivity" records
>>>type of additivity such as 'stock', 'flow', 'non-additive'.
>>>
>>>We are assuming that this element will be replaced by a "container" of 
>>>attributes or characteristics that will apply at all levels, some of 
>>>which J will be supplying from SDMX. Thus, the existing attributes like 
>>>"Aggregation Method" will become part of this larger set of 
>>>characteristics. "Measure" is a particularly problematic term because 
>>>Nesstar uses it in a different way to mean the variable itself.
>>
>>Let me explain why Nesstar is using the measure element in this way, and
>>why I think it makes sense to think twice before we make radical changes
>>to this construct.
>>
>>In the nCube element, the dmns- and measure-elements play similar
>>roles. Both point back to var-elements in the variable description
>>section and in this way indicate how concrete variables are used to
>>construct a multidimensional table. The dmns-elements list the variables
>>that establish the dimensionality of the table/cube, and the
>>measure-elements list the variables that populate the cells of the cube.
>>This can be a single variable, or multiple variables in a
>>multi-measure cube. In addition, both elements hold a series of
>>attributes that add cube-specific variable information that is missing
>>from the var-elements. For dimensions, the most important is cohort,
>>which describes which parts of a variable/classification are actually
>>used in the dimension. For measures, the most important attributes
>>relate to the logical and mathematical properties of the measure, such
>>as aggregation method and additivity. In OO terms you could say that
>>measure as well as dmns inherit from their var-elements and add a few
>>more attributes that are specific to the role the variables play in the
>>cube.
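
As a rough illustration of the structure described above, here is a minimal Python sketch. It is not DDI syntax and not how Nesstar implements it. The class names, the variable IDs (V1 to V4) and the sample values are hypothetical; only the attribute names (varRef, cohort, aggrMeth, additivity) come from the element description. The link to the variable is modelled as a reference by ID rather than as class inheritance, since varRef is an IDREF.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Var:
    """A variable from the variable description section (hypothetical model)."""
    id: str
    label: str


@dataclass
class Dmns:
    """A dimension: points at a variable and adds cube-specific attributes."""
    var_ref: str                        # corresponds to varRef (an IDREF)
    cohort: Optional[List[str]] = None  # which categories are actually used


@dataclass
class Measure:
    """A measure: points at a variable and adds its mathematical properties."""
    var_ref: str      # corresponds to varRef (an IDREF)
    aggr_meth: str    # e.g. 'sum', 'average', 'count'
    additivity: str   # 'stock', 'flow' or 'non-additive'


@dataclass
class NCube:
    dmns: List[Dmns]
    measures: List[Measure]

    def dimension_labels(self, variables: Dict[str, Var]) -> List[str]:
        # Resolve each dimension reference back to its variable.
        return [variables[d.var_ref].label for d in self.dmns]

    def measure_labels(self, variables: Dict[str, Var]) -> List[str]:
        # Resolve each measure reference back to its variable.
        return [variables[m.var_ref].label for m in self.measures]


# Invented variables, for illustration only.
variables = {v.id: v for v in [
    Var("V1", "Gender"),
    Var("V2", "Age group"),
    Var("V3", "Population"),
    Var("V4", "Income"),
]}

# A multi-measure cube: two dimensions, two measures.
cube = NCube(
    dmns=[Dmns("V1"), Dmns("V2", cohort=["0-39", "40+"])],
    measures=[
        Measure("V3", aggr_meth="count", additivity="stock"),
        # Treating a mean as non-additive is an assumption made for this sketch.
        Measure("V4", aggr_meth="average", additivity="non-additive"),
    ],
)

print(cube.dimension_labels(variables))
print(cube.measure_labels(variables))
```

The point of the sketch is only that dimensions and measures have the same shape: a pointer to a variable plus a few attributes that matter only for the role that variable plays in the cube.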
>>
>>Please note that this use of the terms (as well as the logic) is fully
>>compliant with the way the concepts dimension and measure are used in the
>>OLAP, data warehouse and data mining community.
>>
>>One reason it is easy to forget that a measure really is a variable lies
>>in the relationship between crosstabs and cubes. If you use SPSS to
>>create a crosstab from micro data, you specify only the dimension
>>variables, but you are still creating a cube. The reason is that the
>>measure variable, which in this case is the counts/frequencies of the
>>sample or population that the micro data describe, is implicit (by
>>crossing gender and age in a census, you get the population count for
>>age and gender groups). Population is not a defined variable in the
>>micro dataset, but it is still a variable in statistical terms. This can
>>easily be seen if, instead of running a standard crosstab, you run an
>>old-fashioned crossbreak and add a variable like income as a "summary"
>>variable (and ask for the aggregation method "mean"). You will then get
>>another table/cube displaying mean income for age and gender groups. The
>>cube has the same dimensionality as the previous one, but a different
>>measure variable.
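
The crosstab argument above can be sketched in a few lines of Python. This is an illustration only, not SPSS or Nesstar behaviour. The records, the income figures and the helper name build_cube are invented. The same two dimension variables produce two cubes that differ only in their measure: an implicit count in the first case, mean income in the second.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical micro data: one record per person (values are invented).
persons = [
    {"gender": "F", "age_group": "0-39", "income": 300},
    {"gender": "F", "age_group": "0-39", "income": 500},
    {"gender": "F", "age_group": "40+", "income": 700},
    {"gender": "M", "age_group": "0-39", "income": 400},
    {"gender": "M", "age_group": "40+", "income": 600},
    {"gender": "M", "age_group": "40+", "income": 800},
]


def build_cube(records, dimensions, measure=None, aggr_meth="count"):
    """Aggregate records into cells keyed by the dimension values.

    With measure=None the measure is the implicit population count,
    as in a plain crosstab.
    """
    cells = defaultdict(list)
    for rec in records:
        key = tuple(rec[d] for d in dimensions)
        cells[key].append(1 if measure is None else rec[measure])
    if aggr_meth == "count":
        return {key: len(values) for key, values in cells.items()}
    if aggr_meth == "mean":
        return {key: mean(values) for key, values in cells.items()}
    raise ValueError(f"unsupported aggregation method: {aggr_meth}")


dims = ("gender", "age_group")

# Standard crosstab: only dimensions are named, the count is implicit.
count_cube = build_cube(persons, dims)

# "Crossbreak" with income as the summary variable and mean as the method.
mean_income_cube = build_cube(persons, dims, measure="income", aggr_meth="mean")

for cell in sorted(count_cube):
    print(cell, count_cube[cell], mean_income_cube[cell])
```

Both dictionaries have exactly the same keys, that is, the same dimensionality. Only the variable that fills the cells and its aggregation method differ.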
>>
>>So, please do not make radical changes to the measure-attribute that will 
>>prevent us from meeting this very basic and standard requirement.
>>
>>All the best,
>>Jostein
>>
>
>Mary Vardigan
>Assistant Director
>Inter-university Consortium for Political and Social Research (ICPSR)
>University of Michigan
>P.O. Box 1248, Ann Arbor, MI 48106-1248
>Phone: 734-615-7908
>Fax: 734-647-8200
>www.icpsr.umich.edu



