[DDI-SRG] FW: stats question please

Wendy Thomas wlt at pop.umn.edu
Wed Mar 25 17:59:28 EDT 2009


I think we should make it 3.2 given the related issues.

wlt

On Wed, 25 Mar 2009, Pascal Heus wrote:

> Unless we need to agree on this one quickly, it could be pushed to 3.2.
> *P
>
> Wendy Thomas wrote:
>> This is being added to the resolved bug list? Making this repeatable has
>> ramifications for the reference to summary statistics held in another
>> physical instance because Statistics is not identifiable. It didn't need
>> to be because it was the only one in the file.
>>
>> wendy
>>
>>
>> On Wed, 25 Mar 2009, Pascal Heus wrote:
>>
>>
>>> You can still do this if you want; we're just adding
>>> repeatability, a default weight, and a descriptive field (not possible
>>> today).
>>> cheers
>>> *P
>>>
>>> Wendy Thomas wrote:
>>>
>>>> That assumes you want to bundle by summary statistic type. Others may
>>>> want to bundle by variable in the way that a statistical package
>>>> produces summary statistics
>>>>
>>>> for each variable provide  count, median, mean, standard deviation, etc.
>>>>
>>>> This is definitely an issue for 3.2, as providing the flexibility to
>>>> handle both situations could become problematic.
>>>>
>>>> Right now the only thing we are preventing is bundling on one
>>>> dimension or the other, we're not preventing the expression of the
>>>> summary statistic.
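[Editor's sketch] The two bundling dimensions Wendy describes can be illustrated with a flat list of summary statistics regrouped either way. This is a minimal sketch, not DDI syntax: the variable names, values, and function names below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical flat records: (variable, statistic type, value).
# The values are made up purely for illustration.
records = [
    ("AGE", "count", 1200),
    ("AGE", "mean", 41.5),
    ("INCOME", "count", 1180),
    ("INCOME", "mean", 52000.0),
]


def bundle_by_variable(rows):
    """Group like a statistical package: all statistics for each variable."""
    bundles = defaultdict(dict)
    for variable, stat_type, value in rows:
        bundles[variable][stat_type] = value
    return dict(bundles)


def bundle_by_statistic(rows):
    """Group by summary statistic type: one statistic across all variables."""
    bundles = defaultdict(dict)
    for variable, stat_type, value in rows:
        bundles[stat_type][variable] = value
    return dict(bundles)


by_variable = bundle_by_variable(records)
by_statistic = bundle_by_statistic(records)
print(by_variable)
print(by_statistic)
```

Both groupings carry the same statistics; only the nesting differs, which is the flexibility question being deferred to 3.2.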
>>>>
>>>> At least we won't lack things to argue over in the future :)
>>>>
>>>> wendy
>>>>
>>>> On Wed, 25 Mar 2009, Pascal Heus wrote:
>>>>
>>>>
>>>>> Wendy:
>>>>> The reason we suggest to have <Statistics> repeatable is that you can
>>>>> then create sets, point to a default weight once for all, and include
>>>>> descriptive information.
>>>>>
>>>>> For example:
>>>>> <Statistics>
>>>>>   <Description>Frequencies / Unweighted statistics</Description>
>>>>>   <VariableStatistics>...</VariableStatistics>
>>>>>   ...
>>>>> </Statistics>
>>>>> <Statistics>
>>>>>   <Description>Statistics using household weight</Description>
>>>>>   <DefaultWeightVariableReference>...</DefaultWeightVariableReference> (or <DefaultWeightUsedReference>)
>>>>>   <VariableStatistics>...</VariableStatistics>
>>>>>   ...
>>>>> </Statistics>
>>>>> <Statistics>
>>>>>   <Description>Statistics using panel weight</Description>
>>>>>   <DefaultWeightVariableReference>...</DefaultWeightVariableReference> (or <DefaultWeightUsedReference>)
>>>>>   <VariableStatistics>...</VariableStatistics>
>>>>>   ...
>>>>> </Statistics>
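[Editor's sketch] The repeatable sets proposed above could be generated along the following lines. This is a rough illustration only: the element names follow the proposal in this message rather than any published DDI schema, and the `variable` attribute, the variable names, and the weight names are assumptions.

```python
import xml.etree.ElementTree as ET


def build_statistics_set(description, variable_names, default_weight=None):
    """Build one <Statistics> set as sketched in the proposal.

    Element names mirror the email's proposal, not a released schema.
    """
    stats = ET.Element("Statistics")
    ET.SubElement(stats, "Description").text = description
    if default_weight is not None:
        # Declared once per set instead of once per variable statistic.
        ref = ET.SubElement(stats, "DefaultWeightVariableReference")
        ref.text = default_weight
    for name in variable_names:
        # The "variable" attribute is a placeholder for a real reference.
        ET.SubElement(stats, "VariableStatistics", {"variable": name})
    return stats


variables = ["AGE", "INCOME"]  # hypothetical variable names
sets = [
    build_statistics_set("Frequencies / Unweighted statistics", variables),
    build_statistics_set("Statistics using household weight", variables,
                         "HH_WEIGHT"),
    build_statistics_set("Statistics using panel weight", variables,
                         "PANEL_WEIGHT"),
]
for statistics_set in sets:
    print(ET.tostring(statistics_set, encoding="unicode"))
```

Each set names its weight once, so the individual variable statistics inside it do not have to redeclare it.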
>>>>>
>>>>> best
>>>>> *P
>>>>>
>>>>> arofan.gregory wrote:
>>>>>
>>>>>> Wendy:
>>>>>>
>>>>>> It's the not-bundling part that we didn't like. As stated earlier,
>>>>>> I'll let Pascal explore this one with you, as I feel he has a better
>>>>>> handle on the issue.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Arofan
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Wendy Thomas [mailto:wlt at pop.umn.edu]
>>>>>> Sent: Tuesday, March 24, 2009 6:30 PM
>>>>>> To: arofan.gregory
>>>>>> Cc: 'Mary Vardigan'; 'Sanda Ionescu'; 'TIC list'
>>>>>> Subject: RE: [DDI-SRG] FW: stats question please
>>>>>>
>>>>>> I thought that in essence it is repeatable, as I can provide more
>>>>>> than one summary statistic referencing the same variable but a
>>>>>> different weight. Yep...just checked. Under Statistics I have an
>>>>>> unbounded number of VariableStatistic. Each VariableStatistic can
>>>>>> have one weight, but I can have multiple VariableStatistic items
>>>>>> referencing the same variable. Of course that doesn't bundle them,
>>>>>> but you would need to either redeclare the weight for the category
>>>>>> statistics or reference the variable and then have a separate
>>>>>> repeatable structure that identified weight and statistic value and
>>>>>> category statistics using that weight.
>>>>>>
>>>>>> wendy
>>>>>>
>>>>>>
>>>>>> On Tue, 24 Mar 2009, arofan.gregory wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Wendy:
>>>>>>>
>>>>>>> No - there wasn't any approach that worked that I could see. They've
>>>>>>> started using "bootstrap weights" at Stats Can, so this will be an
>>>>>>> on-going issue, and they have to describe the data as they receive
>>>>>>> it, so inventing a variable isn't something they can do.
>>>>>>>
>>>>>>> As I understand the requirement, what is needed is to have several
>>>>>>> sets of statistics per variable - the fix we envisioned was to make
>>>>>>> the "Statistics" element repeatable. I should probably turn you over
>>>>>>> to Pascal, as I believe he understands the issue better than I do.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Arofan
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: ddi-srg-bounces at icpsr.umich.edu
>>>>>>> [mailto:ddi-srg-bounces at icpsr.umich.edu] On Behalf Of Wendy Thomas
>>>>>>> Sent: Tuesday, March 24, 2009 2:56 PM
>>>>>>> To: arofan.gregory
>>>>>>> Cc: 'Mary Vardigan'; 'Sanda Ionescu'; 'TIC list'
>>>>>>> Subject: Re: [DDI-SRG] FW: stats question please
>>>>>>>
>>>>>>> Different sets of statistics are easy, as you can simply provide
>>>>>>> multiple summary statistics for the same variable. The question is
>>>>>>> when you use multiple weights to create a statistic. I recommended
>>>>>>> the use of a variable that reflects the generated weight, using a
>>>>>>> generation instruction to describe the process. Then that weight can
>>>>>>> be referred to from the summary statistics. Did you have another
>>>>>>> approach?
>>>>>>>
>>>>>>> wendy
>>>>>>>
>>>>>>>
>>>>>>> On Tue, 24 Mar 2009, arofan.gregory wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Wendy:
>>>>>>>>
>>>>>>>> We're here in Ottawa working with the Canadian RDC people (Stats Can
>>>>>>>> data) and they have exactly the same issue - they need to have
>>>>>>>> different weights for a single study, and therefore different sets
>>>>>>>> of statistics.
>>>>>>>>
>>>>>>>> This is *very* common in Stats Canada data, apparently.
>>>>>>>>
>>>>>>>> I'm happy to show you how I think it could be fixed.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> Arofan
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: ddi-srg-bounces at icpsr.umich.edu
>>>>>>>> [mailto:ddi-srg-bounces at icpsr.umich.edu] On Behalf Of Wendy Thomas
>>>>>>>> Sent: Monday, March 23, 2009 11:26 AM
>>>>>>>> To: Sanda Ionescu
>>>>>>>> Cc: Mary Vardigan; TIC list
>>>>>>>> Subject: Re: [DDI-SRG] FW: stats question please
>>>>>>>>
>>>>>>>> We require the ability to make this machine actionable. We are
>>>>>>>> creating variable representations of measures, concatenated measures
>>>>>>>> providing unique identifiers and links. We are no longer just
>>>>>>>> documenting data.
>>>>>>>>
>>>>>>>> Wendy
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, 23 Mar 2009, Sanda Ionescu wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Wendy,
>>>>>>>>>
>>>>>>>>> It is for case 2) - a combination of weights.
>>>>>>>>> But ... I don't know...
>>>>>>>>> Creating a "fake" weight in DDI only (not in the data) seems a
>>>>>>>>> little far-fetched.
>>>>>>>>> And really, I'm not sure we should be going that way.
>>>>>>>>> Aren't we just documenting data?
>>>>>>>>> Why resort to creating artifices like that - assigning fake values,
>>>>>>>>> creating imaginary weights?
>>>>>>>>> And again, I think this makes it so much more difficult to
>>>>>>>>> automate - both direct markup, and conversions.
>>>>>>>>>
>>>>>>>>> Sanda.
>>>>>>>>>
>>>>>>>>> Sanda Ionescu
>>>>>>>>> ICPSR
>>>>>>>>> University of Michigan
>>>>>>>>> P.O. Box 1248
>>>>>>>>> Ann Arbor, MI 48106
>>>>>>>>>
>>>>>>>>> Phone, Fax: 734-615-7890
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Wendy Thomas [mailto:wlt at pop.umn.edu]
>>>>>>>>> Sent: Monday, March 23, 2009 12:48 PM
>>>>>>>>> To: Sanda Ionescu
>>>>>>>>> Subject: Re: FW: stats question please
>>>>>>>>>
>>>>>>>>> Please clarify
>>>>>>>>>
>>>>>>>>> If they mean that a variable can have more than one weight, a
>>>>>>>>> separate statistic is created for each weight used.
>>>>>>>>>
>>>>>>>>> If they mean they use a combination of weights, then DDI 3.0
>>>>>>>>> requires they create a variable (not expressed in the data set
>>>>>>>>> physical content) that provides the expression of that multiple
>>>>>>>>> weighting through its generation instruction.
>>>>>>>>>
>>>>>>>>> For example: If I had a standard weight A and also applied weight
>>>>>>>>> variable X and weight variable Y in calculating the statistic, then
>>>>>>>>> I would need to create a variable Z using the generation
>>>>>>>>> instruction "StandardWeightA * variableX * variableY".
>>>>>>>>>
>>>>>>>>> It may not be how they want to express it, but this is the way
>>>>>>>>> it's done.
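[Editor's sketch] The derived-weight approach can be shown numerically: multiply the three weights case by case to obtain Z, then compute the statistic with Z as its single weight. A minimal sketch assuming simple per-case multiplication as in the generation instruction above; the data values and function names are invented for illustration.

```python
def composite_weight(standard_weight_a, variable_x, variable_y):
    """Per-case product, mirroring StandardWeightA * variableX * variableY."""
    return [a * x * y
            for a, x, y in zip(standard_weight_a, variable_x, variable_y)]


def weighted_mean(values, weights):
    """Weighted mean of the values using a single weight variable."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical three-case data set.
values = [10.0, 20.0, 30.0]
standard_weight_a = [1.0, 1.0, 2.0]
variable_x = [1.0, 2.0, 1.0]
variable_y = [2.0, 1.0, 1.0]

# Z exists only in the metadata description, not in the data file.
variable_z = composite_weight(standard_weight_a, variable_x, variable_y)
print(variable_z)                         # [2.0, 2.0, 2.0]
print(weighted_mean(values, variable_z))  # 20.0
```

The statistic then references only Z, which keeps the one-weight-per-statistic rule intact while the generation instruction records how Z was derived.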
>>>>>>>>>
>>>>>>>>> Wendy
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>  On Mon, 23 Mar 2009, Sanda Ionescu wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Hi, Wendy.
>>>>>>>>>> This is a different issue.
>>>>>>>>>> Our researchers/analysis specialists seem to think that it is
>>>>>>>>>> necessary to be able to reference multiple weights for a statistic.
>>>>>>>>>> Please see exchange below.
>>>>>>>>>> Could you submit this as a request for a change to TIC?
>>>>>>>>>> Thanks
>>>>>>>>>> Sanda.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Felicia LeClere
>>>>>>>>>> Sent: Monday, March 23, 2009 10:14 AM
>>>>>>>>>> To: Sanda Ionescu
>>>>>>>>>> Subject: Re: stats question please
>>>>>>>>>>
>>>>>>>>>> It happens...so it is necessary to be able to have more than one
>>>>>>>>>> weight
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Sanda Ionescu
>>>>>>>>>> To: Lynette Hoelter; Felicia LeClere
>>>>>>>>>> CC: Mary Vardigan
>>>>>>>>>> Sent: Mon Mar 23 09:23:52 2009
>>>>>>>>>> Subject: RE: stats question please
>>>>>>>>>>
>>>>>>>>>> It's about DDI.
>>>>>>>>>>
>>>>>>>>>> Version 3 only allows 1 weight to be defined per statistical value.
>>>>>>>>>>
>>>>>>>>>> And we need to know if it is necessary to ask for the weight to
>>>>>>>>>> become repeatable (so that one can link to more than one).
>>>>>>>>>>
>>>>>>>>>> This is not about whether it's LIKELY or not (because, yes, it is
>>>>>>>>>> UNLIKELY) but rather is it necessary to cover for that possibility
>>>>>>>>>> too, however unlikely it may be?
>>>>>>>>>>
>>>>>>>>>> Sanda.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> From: Lynette Hoelter
>>>>>>>>>> Sent: Friday, March 20, 2009 4:22 PM
>>>>>>>>>> To: Sanda Ionescu; Felicia LeClere
>>>>>>>>>> Cc: Mary Vardigan
>>>>>>>>>> Subject: RE: stats question please
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> In theory, I don't see why not, because the weights just multiply
>>>>>>>>>> the values for the cases by some fraction, so you could multiply by
>>>>>>>>>> two.
>>>>>>>>>>
>>>>>>>>>> Why?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ....................................................
>>>>>>>>>> Lynette F. Hoelter, Ph.D.
>>>>>>>>>> Dir., Instructional Resources & Development, ICPSR
>>>>>>>>>> Institute for Social Research
>>>>>>>>>> University of Michigan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> From: Sanda Ionescu
>>>>>>>>>> Sent: Friday, March 20, 2009 3:13 PM
>>>>>>>>>> To: Felicia LeClere; Lynette Hoelter
>>>>>>>>>> Cc: Mary Vardigan
>>>>>>>>>> Subject: stats question please
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Felicia, Lynette,
>>>>>>>>>>
>>>>>>>>>> In Variable Statistics is it POSSIBLE to obtain ONE value (say,
>>>>>>>>>> mean, or mode, etc.) by applying TWO different weights - not
>>>>>>>>>> sequentially, but at the same time?
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>>
>>>>>>>>>> Sanda.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> Wendy L. Thomas                          Phone: +1 612.624.4389
>>>>>>>>> Data Access Core Director         Fax:   +1 612.626.8375
>>>>>>>>> Minnesota Population Center              Email: wlt at pop.umn.edu
>>>>>>>>> University of Minnesota
>>>>>>>>> 50 Willey Hall
>>>>>>>>> 225 19th Avenue South
>>>>>>>>> Minneapolis, MN 55455
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> DDI-SRG mailing list
>>>>>>> DDI-SRG at icpsr.umich.edu
>>>>>>> http://www.icpsr.umich.edu/mailman/listinfo/ddi-srg
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>>>
>>>>> Pascal Heus
>>>>>
>>>>> Director, Metadata Technology
>>>>>
>>>>> http://www.metadatatechnology.com
>>>>>
>>>>>
>>>>> ============================================================
>>>>>
>>>>> This email is intended only for the person to whom it is addressed
>>>>> and/or otherwise authorized personnel. The information contained herein
>>>>> and attached is confidential and the property of Metadata Technology
>>>>> Ltd.. If you are not the intended recipient, please be advised that
>>>>> viewing this message and any attachments, as well as copying,
>>>>> forwarding, printing, and disseminating any information related to this
>>>>> email is prohibited, and that you should not take any action based on
>>>>> the content of this email and/or its attachments. If you received this
>>>>> message in error, please contact the sender and destroy all copies of
>>>>> this email and any attachment. Please note that the views and opinions
>>>>> expressed herein are solely those of the author and do not necessarily
>>>>> reflect those of the company. While antivirus protection tools have been
>>>>> employed, you should check this email and attachments for the presence
>>>>> of viruses. No warranties or assurances are made in relation to the
>>>>> safety and content of this email and attachments. Metadata Technology
>>>>> Ltd. accepts no liability for any damage caused by any virus transmitted
>>>>> by or contained in this email and attachments. No liability is accepted
>>>>> for any consequences arising from this email.
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>
>>
>>
>
>

Wendy L. Thomas                          Phone: +1 612.624.4389
Data Access Core Director		 Fax:   +1 612.626.8375
Minnesota Population Center              Email: wlt at pop.umn.edu
University of Minnesota
50 Willey Hall
225 19th Avenue South
Minneapolis, MN 55455

