[DDI-SRG] Missing value range and interval measurement
Joachim Wackerow
joachim.wackerow at gesis.org
Fri May 16 05:11:13 EDT 2008
Hi Sigbjoern,
Yes, you are right: missingValue="<=0 9 20..25 >=99" would be possible
to define. But as you assumed, that was not the intention, because it
works against good machine-actionability. It should not be necessary to
parse the content of a field.
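To illustrate what such parsing would demand of every consuming tool, here is a minimal Python sketch. It assumes a mini-syntax inferred only from the example string in this thread ("<=N", ">=N", "A..B", and single numbers); nothing in DDI defines this syntax, and the names parse_missing and is_missing are hypothetical.

```python
import re

# Hypothetical mini-syntax, inferred from the example in this thread only:
#   "<=N"  range with an upper limit, ">=N" range with a lower limit,
#   "A..B" closed range, "N" single value.
_NUM = r"-?\d+(?:\.\d+)?"
_TOKEN = re.compile(
    rf"^(?:<=(?P<hi>{_NUM})|>=(?P<lo>{_NUM})"
    rf"|(?P<a>{_NUM})\.\.(?P<b>{_NUM})|(?P<v>{_NUM}))$"
)


def parse_missing(spec):
    """Return a list of (low, high) bounds; None means unbounded."""
    ranges = []
    for token in spec.split():
        m = _TOKEN.match(token)
        if m is None:
            raise ValueError(f"unrecognised token: {token!r}")
        if m["hi"] is not None:
            ranges.append((None, float(m["hi"])))
        elif m["lo"] is not None:
            ranges.append((float(m["lo"]), None))
        elif m["a"] is not None:
            ranges.append((float(m["a"]), float(m["b"])))
        else:
            v = float(m["v"])
            ranges.append((v, v))
    return ranges


def is_missing(value, ranges):
    """True if value falls inside any of the parsed missing ranges."""
    return any(
        (lo is None or value >= lo) and (hi is None or value <= hi)
        for lo, hi in ranges
    )
```

Every tool reading the attribute would need an equivalent of this, and they would all have to agree on the token grammar. That is the interoperability cost of packing structure into a single field.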
The content of this field should be restricted more tightly by a regular
expression, to numbers and a restricted set of strings.
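As a rough sketch of such a restriction, the following Python pattern accepts only a space-separated list of numbers and tokens from a fixed set of strings. The tokens in ALLOWED_STRINGS are placeholders chosen for illustration; which strings would actually be permitted is not decided here, and the function name is hypothetical.

```python
import re

# Assumption: these string tokens are placeholders for illustration only;
# the set of strings that would really be allowed is still open.
ALLOWED_STRINGS = ("SYSMIS", "NA")

_NUMBER = r"-?\d+(?:\.\d+)?"
_STRINGS = "|".join(re.escape(s) for s in ALLOWED_STRINGS)
_ITEM = rf"(?:{_NUMBER}|{_STRINGS})"

# One item, optionally followed by further items separated by single spaces.
MISSING_VALUE = re.compile(rf"{_ITEM}(?: {_ITEM})*")


def is_valid_missing_value(text):
    """True if text is a plain list of numbers and allowed strings."""
    return MISSING_VALUE.fullmatch(text) is not None
```

With a pattern like this, a list such as "9 98 99" is accepted, while range expressions such as "<=0" or "20..25" are rejected and would need a dedicated structure instead.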
We should add something for the missing value range in 3.1. I'll file a bug.
I like the idea of introducing missing value schemes. This should be
considered in the related discussion.
Achim
> In addition to SPSS, NSDstat and Nesstar Publisher use missing ranges.
> There are also unbounded missing ranges with either a lower limit
> or an upper limit (e.g. <= 0 or >= 99). It was possible to represent
> this in DDI 2 using the <invalrng> element. I guess you can represent
> this in the missingValue attribute of r:RepresentationType if you do it
> like this: missingValue="<=0 9 20..25 >=99"
> This will define any value less than or equal to 0 as missing.
> The value 9 will be missing.
> Any value in the range 20 to 25 will be missing.
> And any value greater than or equal to 99 will be missing.
>
> The definition of the missingValue attribute does not prevent you from
> defining it like this, but I guess it was not the way it was intended to
> be used so it should probably not be used that way.
>
> In SAS/Stata and SPSS (unless you use a range) there is a limit on how
> many missing values you can define; in DDI 3.0 there is no limit. I think
> one should try to be compatible with the market-leading statistical
> packages for social science. For a future version of Nesstar Publisher
> we are planning to add compatibility for the SAS/Stata type of missing
> (at present they are recoded on import); we currently have SPSS compatibility.
> The user will, for each variable, have to choose which missing scheme to
> use, SAS type or SPSS type. Maybe this is the way to go for DDI as well?
> I.e. you have to define which missing scheme to use (SAS or SPSS). Are
> there other missing schemes used in other statistical packages that
> aren't a subset of the SAS or SPSS schemes and are more extensive than them?
>
> Sigbjoern
>
>> Joachim Wackerow wrote:
>>
>>> Currently I'm working again on the SPSS converter.
>>>
>>> I'm wondering how to express in DDI a missing value range of a variable
>>> with an interval measurement level.
>>>
>>> For example the variable temperature. A missing value range is 20-25
>>> Celsius. The values are expressed as floating numbers like 20.17 etc.,
>>> which are not all known in advance. This can be expressed in SPSS. In
>>> DDI I don't see a way. Am I missing something?
>>>
>>> For variables with ordinal measurement (i.e. with categories like
>>> occupation) a workaround can be used for expressing a missing value
>>> range in DDI. Each category in the missing value range must be defined
>>> as missing. This can produce some overhead. Category entries with just
>>> the missing definition can be produced without labels depending on the
>>> definition in files of the statistical packages.
>>>
>>> Any ideas? I think we already discussed a similar issue?
>>>
>>> Achim
>>> _______________________________________________
>>> DDI-SRG mailing list
>>> DDI-SRG at icpsr.umich.edu
>>> http://www.icpsr.umich.edu/mailman/listinfo/ddi-srg
>>>
>>>
>>>
>>
>>
>
--
GESIS - German Social Science Infrastructure Services
http://www.gesis.org/en/