[DDI-SRG] Missing value range and interval measurement
Sigbjoern Revheim
sigbjorn.revheim at nsd.uib.no
Fri May 16 04:35:04 EDT 2008
Pascal Heus wrote:
> Achim:
> This is an issue I raised by email a couple of weeks ago. It is currently
> not supported. What I do in DExT is to assume all the values are
> discrete and I generate a missing value for every number in the range. I
> agree that this may cause some serious overhead and I may put a limit in
> the next version of the SPSSReader (like a default of 200 values that can
> be changed by the application if needed).
> The continuous variable case was also pointed out by Guido Gay (working
> on the R SPSS-DDI converter) and I don't think there is much we can do
> about it for now. We should file this as a bug for 3.1. This however
> seems to be an SPSS-specific issue; I don't think Stata or SAS support
> missing value ranges (?). What about other packages?
> best
> *P
>
In addition to SPSS, NSDstat and Nesstar Publisher use missing ranges.
There are also unbounded missing ranges that have only a lower limit
or only an upper limit (e.g. <= 0 or >= 99). It was possible to represent
this in DDI 2 using the <invalrng> element. I guess you can represent
this in the missingValue attribute of r:RepresentationType if you do it
like this: missingValue="<=0 9 20..25 >=99"
This will define any value less than or equal to 0 as missing.
The value 9 will be missing.
Any value in the range 20 to 25 will be missing.
And any value greater than or equal to 99 will be missing.
The definition of the missingValue attribute does not prevent you from
defining it like this, but I guess this is not how it was intended to be
used, so it should probably not be used that way.
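
To make the proposed syntax concrete, here is a minimal sketch of how a reader could interpret such a string. It assumes the token forms shown above ("<=N", ">=N", "A..B" and a single number) separated by whitespace; none of this is defined by DDI, and the names MissingRange, parse_missing and is_missing are made up for illustration. Note that in a real XML instance the "<" would have to be written as "&lt;"; the sketch works on the attribute value after XML parsing.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MissingRange:
    low: Optional[float]   # None means no lower bound
    high: Optional[float]  # None means no upper bound

    def contains(self, value: float) -> bool:
        if self.low is not None and value < self.low:
            return False
        if self.high is not None and value > self.high:
            return False
        return True


# Assumed token grammar (hypothetical, not part of any DDI schema).
_NUM = r"-?\d+(?:\.\d+)?"
_TOKEN = re.compile(
    rf"<=(?P<ub>{_NUM})"
    rf"|>=(?P<lb>{_NUM})"
    rf"|(?P<lo>{_NUM})\.\.(?P<hi>{_NUM})"
    rf"|(?P<single>{_NUM})"
)


def parse_missing(spec: str) -> List[MissingRange]:
    """Turn e.g. '<=0 9 20..25 >=99' into a list of closed ranges."""
    ranges = []
    for token in spec.split():
        m = _TOKEN.fullmatch(token)
        if m is None:
            raise ValueError(f"unrecognised token: {token!r}")
        if m.group("ub") is not None:
            ranges.append(MissingRange(None, float(m.group("ub"))))
        elif m.group("lb") is not None:
            ranges.append(MissingRange(float(m.group("lb")), None))
        elif m.group("lo") is not None:
            ranges.append(MissingRange(float(m.group("lo")), float(m.group("hi"))))
        else:
            v = float(m.group("single"))
            ranges.append(MissingRange(v, v))  # single value as a degenerate range
    return ranges


def is_missing(value: float, ranges: List[MissingRange]) -> bool:
    return any(r.contains(value) for r in ranges)


ranges = parse_missing("<=0 9 20..25 >=99")
print(is_missing(20.17, ranges))
```

Because the ranges are tested by comparison rather than enumerated, a floating-point value such as 20.17 is handled without listing every value in advance.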
In SAS/Stata and SPSS (unless you use a range) there is a limit on how many
missing values you can define; in DDI 3.0 there is no limit. I think
one should try to be compatible with the market-leading statistical
packages for social science. For a future version of Nesstar Publisher
we are planning to add compatibility for the SAS/Stata type of missing (at
present they are recoded on import); we currently have SPSS compatibility.
The user will, for each variable, have to choose which missing scheme to
use, SAS type or SPSS type. Maybe this is the way to go for DDI as well?
I.e. you have to define which missing scheme to use (SAS or SPSS). Are
there other missing schemes used in other statistical packages that
aren't a subset of the SAS or SPSS schemes and are more extensive than them?
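
As a rough illustration of the per-variable choice, the following sketch models a variable's missing definition with an explicit scheme and checks it against that scheme's limits. The limits are my understanding of the packages, not something taken from DDI: SPSS allows up to three discrete values, or one range plus one discrete value; SAS/Stata use special codes such as ".", ".a" to ".z" (and "._" in SAS). The names MissingScheme and VariableMissing are hypothetical.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class MissingScheme(Enum):
    SPSS = "spss"  # numeric values: discrete and/or one range
    SAS = "sas"    # SAS/Stata style special missing codes


# Assumed code syntax: "." alone, or "." plus one letter or underscore.
_CODE = re.compile(r"\.[A-Za-z_]?")


@dataclass
class VariableMissing:
    scheme: MissingScheme
    discrete: List[float] = field(default_factory=list)
    value_range: Optional[Tuple[float, float]] = None
    codes: List[str] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if the definition does not fit the chosen scheme."""
        if self.scheme is MissingScheme.SPSS:
            if self.codes:
                raise ValueError("special missing codes need the SAS scheme")
            # Assumed SPSS limits: three discrete values, or a range plus one.
            limit = 1 if self.value_range is not None else 3
            if len(self.discrete) > limit:
                raise ValueError(f"at most {limit} discrete value(s) allowed")
        else:
            if self.discrete or self.value_range is not None:
                raise ValueError("numeric missing values need the SPSS scheme")
            bad = [c for c in self.codes if not _CODE.fullmatch(c)]
            if bad:
                raise ValueError(f"not a special missing code: {bad}")


# A range plus one discrete value fits the SPSS scheme.
VariableMissing(MissingScheme.SPSS, discrete=[9], value_range=(20.0, 25.0)).validate()
```

A scheme that is not a subset of either would need its own enum member and its own branch in validate.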
Sigbjoern
> Joachim Wackerow wrote:
>
>> Currently I'm working again on the SPSS converter.
>>
>> I'm wondering how to express in DDI a missing value range of a variable
>> with an interval measurement level.
>>
>> For example, take the variable temperature. A missing value range is
>> 20-25 Celsius. The values are expressed as floating-point numbers like
>> 20.17 etc., which are not all known in advance. This can be expressed in
>> SPSS. In DDI I don't see a way. Am I missing something?
>>
>> For variables with ordinal measurement (i.e. with categories like
>> occupation) a workaround can be used for expressing a missing value
>> range in DDI. Each category in the missing value range must be defined
>> as missing. This can produce some overhead. Category entries with just
>> the missing definition can be produced without labels depending on the
>> definition in files of the statistical packages.
>>
>> Any ideas? I think we already discussed a similar issue?
>>
>> Achim
>> _______________________________________________
>> DDI-SRG mailing list
>> DDI-SRG at icpsr.umich.edu
>> http://www.icpsr.umich.edu/mailman/listinfo/ddi-srg
>>
>>
>>
>