[DDI-ADG] nCube elements
Jostein Ryssevik
Jostein.Ryssevik at nsd.uib.no
Tue Aug 30 08:11:26 EDT 2005
Mary,
Unfortunately I will not be on the call today (due to a dentist
appointment). I will, however, be happy to answer any questions you may
have by mail later this evening.
Regards,
Jostein
At 07:49 30.08.2005 -0400, Mary Vardigan wrote:
>Jostein,
>
>I hope you will be on the call today so we can discuss this further -- we
>gave the operator your number, so they should be calling you.
>
>I missed the call when this was discussed so have just written what I
>thought was a summary of the discussion at this point, but I may not be
>capturing the sense of the group correctly. This seems very important to
>pin down, so we can really use your help.
>
>Mary
>
>At 04:52 AM 8/30/2005, Jostein Ryssevik wrote:
>>At 16:20 29.08.2005 -0400, Mary Vardigan wrote:
>>
>>>
>>><measure> 4.4.13 Measure
>>>
>>>
>>> * Optional
>>> * Repeatable
>>> * Attributes:
>>> <http://webapp.icpsr.umich.edu/cocoon/DDI-LIBRARY/?element-definition=codeBook>ID,
>>> xml:lang, source, varRef, aggrMeth, measUnit, scale, origin, additivity
>>>
>>>Description: The element measure indicates the measurement features of
>>>the cell content: type of aggregation used, measurement unit, and
>>>measurement scale. An origin point is recorded for anchored scales, to
>>>be used in determining relative movement along the scale. Additivity
>>>indicates whether an aggregate is a stock (like the population at a
>>>given point in time) or a flow (like the number of births or deaths over
>>>a certain period of time). The non-additive flag is to be used for
>>>measures that for logical reasons cannot be aggregated to a higher level
>>>- for instance, data that only make sense at a certain level of
>>>aggregation, like a classification. Two nCubes may be identical except
>>>for their measure - for example, a count of persons by age and percent
>>>of persons by age. Measure is an empty element that includes the
>>>following attributes: "varRef" is an IDREF; "aggrMeth" indicates the
>>>type of aggregation method used, for example 'sum', 'average', 'count';
>>>"measUnit" records the measurement unit, for example 'km', 'miles',
>>>etc.; "scale" records unit of scale, for example 'x1', 'x1000'; "origin"
>>>records the point of origin for anchored scales; "additivity" records
>>>type of additivity such as 'stock', 'flow', 'non-additive'.
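[Editorial illustration] A minimal sketch of what such an empty measure element might look like when built programmatically. The attribute names are taken from the description above; the values, including the "M1" and "V3" identifiers and the "persons" unit, are assumptions made up for the example and do not come from any real codebook.

```python
import xml.etree.ElementTree as ET

# Attribute names follow the element description; all values are
# hypothetical and chosen only to illustrate the shape of the element.
measure = ET.Element(
    "measure",
    {
        "ID": "M1",              # hypothetical element ID
        "varRef": "V3",          # hypothetical IDREF to a var element
        "aggrMeth": "sum",       # type of aggregation
        "measUnit": "persons",   # measurement unit
        "scale": "x1000",        # unit of scale
        "additivity": "stock",   # 'stock', 'flow' or 'non-additive'
    },
)

# measure has no children, so it serializes as an empty element.
measure_xml = ET.tostring(measure, encoding="unicode")
print(measure_xml)
```

The element carries no content of its own; everything it says about the cell values is expressed through attributes.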
>>>
>>>We are assuming that this element will be replaced by a "container" of
>>>attributes or characteristics that will apply at all levels, some of
>>>which J will be supplying from SDMX. Thus, the existing attributes like
>>>"Aggregation Method" will become part of this larger set of
>>>characteristics. "Measure" is a particularly problematic term because
>>>Nesstar uses it in a different way to mean the variable itself.
>>
>>Let me explain why Nesstar is using the measure element in this way, and
>>why I think it makes sense to think twice before we make radical changes
>>to this construct.
>>
>>In the nCube element, the dmns- and measure-elements play similar
>>roles. Both point back to var-elements in the variable description
>>section and in this way indicate how concrete variables are used to
>>construct a multidimensional table. The dmns-elements list the variables
>>that establish the dimensionality of the table/cube, and the
>>measure-elements list the variables that populate the cells of the cube.
>>This can be a single variable, or multiple variables in a
>>multi-measure-cube. Both elements also hold a series of attributes that
>>add cube-specific variable information that is missing in the
>>var-elements. For dimensions the most important is cohort, which
>>describes which parts of a variable/classification are actually used in
>>the dimension. For measures, the most important attributes relate to the
>>logical and mathematical properties of the measure, like aggregation
>>method, additivity etc. In OO terms you could say that measure as well
>>as dmns inherit from their var-elements and add a few more attributes
>>that are specific to the role the variables play in the cube.
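[Editorial illustration] The object-oriented analogy above can be sketched as follows. This takes the "inherits from" wording literally and is not how DDI or Nesstar implement anything; the class names, field names, IDs and labels are all hypothetical. The stock and flow examples (population, births) are borrowed from the element description quoted earlier in the thread.

```python
from dataclasses import dataclass

@dataclass
class Var:
    """A variable as described in the variable description section."""
    id: str
    label: str

@dataclass
class Dmns(Var):
    """A variable in the role of a cube dimension."""
    cohort: str = ""            # which part of the classification is used

@dataclass
class Measure(Var):
    """A variable in the role of a cube measure (cell content)."""
    aggr_meth: str = "count"    # aggregation method
    additivity: str = "stock"   # 'stock', 'flow' or 'non-additive'

@dataclass
class NCube:
    dmns: list       # variables that span the cube
    measures: list   # variables that populate the cells

# A hypothetical multi-measure cube: two dimensions, two measures.
cube = NCube(
    dmns=[Dmns("V1", "Gender"), Dmns("V2", "Age", cohort="18 and over")],
    measures=[
        Measure("V3", "Population", aggr_meth="count", additivity="stock"),
        Measure("V4", "Births", aggr_meth="sum", additivity="flow"),
    ],
)
```

Both roles are still variables (instances of Var); each only adds the attributes that matter for its role in the cube.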
>>
>>Please note that this use of the terms (as well as the logic) is fully
>>compliant with the way the concepts dimension and measure are used in the
>>OLAP, data warehouse and data mining community.
>>
>>One reason it is easy to forget that a measure really is a variable
>>derives from the relationship between crosstabs and cubes. If you use
>>SPSS to create a crosstab from micro data you only specify the dimension
>>variables, but you are still creating a cube. The reason is that the
>>measure variable, which in this case is the counts/frequencies of the
>>sample or population that the micro-data describe, is implicit (by
>>crossing gender and age in a census, you get the population count for
>>age- and gender-groups). Population is not a defined variable in the
>>microdataset, but it is still a variable in statistical terms. This can
>>easily be seen if, instead of running a standard crosstab, you run an
>>old-fashioned crossbreak and add a variable like income as a "summary"
>>variable (and ask for the aggregation method "mean"). You will then get
>>another table/cube displaying mean income for age- and gender-groups.
>>The cube has the same dimensionality as the previous one, but another
>>measure variable.
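[Editorial illustration] The crosstab versus crossbreak point above can be sketched in a few lines of plain Python. The records, the field names and the build_cube helper are all hypothetical; this is not SPSS or Nesstar code, only a small model of the idea that the same dimensions can carry an implicit count measure or an explicit summary measure.

```python
from collections import defaultdict

# Hypothetical microdata: one record per person.
records = [
    {"gender": "F", "age": "18-39", "income": 300},
    {"gender": "F", "age": "18-39", "income": 500},
    {"gender": "M", "age": "18-39", "income": 400},
    {"gender": "M", "age": "40+", "income": 600},
    {"gender": "F", "age": "40+", "income": 700},
    {"gender": "F", "age": "40+", "income": 900},
]

def build_cube(rows, dims, measure=None, aggr="count"):
    """Aggregate rows into cells keyed by their dimension values."""
    cells = defaultdict(list)
    for row in rows:
        key = tuple(row[d] for d in dims)
        # With no explicit measure, store a placeholder per record so
        # that the cell can still be counted (the implicit measure).
        cells[key].append(1 if measure is None else row[measure])
    if aggr == "count":
        return {k: len(v) for k, v in cells.items()}
    if aggr == "mean":
        return {k: sum(v) / len(v) for k, v in cells.items()}
    raise ValueError(f"unsupported aggregation: {aggr}")

dims = ("gender", "age")
# Plain crosstab: only dimensions are given, the measure is the count.
count_cube = build_cube(records, dims)
# Crossbreak: same dimensions, income as summary variable, mean as method.
mean_cube = build_cube(records, dims, measure="income", aggr="mean")
```

Both results have exactly the same cell keys; only the values in the cells, that is the measure, differ.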
>>
>>So, please do not make radical changes to the measure-attribute that will
>>prevent us from meeting this very basic and standard requirement.
>>
>>All the best,
>>Jostein
>>
>
>Mary Vardigan
>Assistant Director
>Inter-university Consortium for Political and Social Research (ICPSR)
>University of Michigan
>P.O. Box 1248, Ann Arbor, MI 48106-1248
>Phone: 734-615-7908
>Fax: 734-647-8200
>www.icpsr.umich.edu