[DDI-SRG] In-line data / defined range of values / data display format

Joachim Wackerow joachim.wackerow at gesis.org
Wed Aug 15 08:41:25 EDT 2007


In the discussion with Larry Hoyle about a SAS converter I noticed 
several things. Now I have three questions:


DataSet, in-line data

A data matrix can be represented in DataSet in either orientation. A 
rectangular data file normally has the variables as columns and the 
cases as rows (or records). A similar representation is possible in 
DataSet, but then every data item needs its own VariableReference, 
which makes the markup rather wordy. For the transposed version of the 
data matrix only one VariableReference per variable is necessary, and 
the Value element is repeated.

It would be helpful for an application to be able to tell the two cases 
apart, so the chosen orientation should be indicated in DataSet. 
Otherwise the application has to detect the representation itself, 
which is error-prone. Both approaches can even be mixed within one 
representation, which wouldn't make sense.
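
To illustrate why an explicit indicator helps, here is a minimal Python 
sketch. The dictionaries and the "layout" key are hypothetical stand-ins 
for the two orientations, not actual DDI markup; the variable names and 
values are made up for illustration.

```python
# Hypothetical stand-ins for the two DataSet layouts (not actual DDI markup).
# "layout" is an assumed indicator of the kind proposed above.
case_wise = {
    "layout": "case",
    # One list per case; every data item carries its variable reference.
    "records": [
        [("age", 34), ("bmi", 22.5)],
        [("age", 51), ("bmi", 31.0)],
    ],
}
variable_wise = {
    "layout": "variable",
    # One reference per variable; the values are repeated.
    "columns": {"age": [34, 51], "bmi": [22.5, 31.0]},
}


def to_columns(dataset):
    """Return {variable name: list of values} for either declared layout."""
    if dataset["layout"] == "variable":
        return dataset["columns"]
    if dataset["layout"] == "case":
        columns = {}
        for record in dataset["records"]:
            for name, value in record:
                columns.setdefault(name, []).append(value)
        return columns
    raise ValueError(f"unknown layout: {dataset['layout']!r}")
```

With the indicator present, the reader dispatches on it once and does 
not have to inspect the content to guess the orientation.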


Defined range of values, CodeScheme / CategoryScheme

Variables with interval or ratio measurement can have ranges of data 
values that have different code values but share the same category 
label.

Example:
BMI
     low-<18.5 =  "Underweight"
     18.5-24.9 =  "Normal weight"
     25-29.9 =  "Overweight"
     30-high =  "Obesity"

For such a variable it would make sense to define ranges of values 
associated with the same category. A derived variable with a related 
recode would then not be necessary.

As far as I understand, we currently have no way to represent this 
approach.

A solution would be in Code of CodeScheme to have Range as a choice for 
Value.
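
As a rough sketch of what an application could do with such a Range, 
here is the BMI example in Python. The names BMI_RANGES and 
category_for are hypothetical. I assume half-open intervals (lower 
bound included, upper bound excluded) so that values between 24.9 and 
25 are not left without a category; the example above does not say how 
the boundaries should be treated.

```python
import math

# Each entry: (lower bound, inclusive; upper bound, exclusive; label).
# The half-open boundaries are an assumption, not part of the example.
BMI_RANGES = [
    (-math.inf, 18.5, "Underweight"),
    (18.5, 25.0, "Normal weight"),
    (25.0, 30.0, "Overweight"),
    (30.0, math.inf, "Obesity"),
]


def category_for(value, ranges=BMI_RANGES):
    """Return the label of the range containing value, or None."""
    for low, high, label in ranges:
        if low <= value < high:
            return label
    return None
```

The original variable keeps its measured values; the category is looked 
up from the ranges, so no derived variable and no recode are needed.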


Data Display Format

A display format is needed, for example, for variables whose values are 
proportions (percent). Another example would be currency. For these 
types of variables no CodeScheme and no CategoryScheme is necessary. 
How can we define this? It seems to be an attribute of the variable 
itself in LogicalProduct. Is this also a missing feature? 
We have a data format in PhysicalInstance, but that is the format of 
the stored data itself.
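
To make the distinction concrete, here is a small Python sketch in which 
the stored value stays unchanged and only its rendering depends on a 
display format attached to the variable. The function name, the format 
names "percent" and "currency", the stored-as-fraction convention and 
the EUR suffix are all assumptions for illustration.

```python
def display(value, display_format):
    """Render a stored numeric value according to a display format.

    The format names are hypothetical; proportions are assumed to be
    stored as fractions (0.125 means 12.5 percent).
    """
    if display_format == "percent":
        return f"{value * 100:.1f}%"
    if display_format == "currency":
        # Currency unit is an assumption for illustration only.
        return f"{value:,.2f} EUR"
    return str(value)
```

The stored data format would still describe 0.125 as a plain decimal 
number; the display format only says how to present it.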

Any comments?

Achim

