[DDI-SRG] FW: stats question please

Pascal Heus pascal.heus at metadatatechnology.com
Wed Mar 25 17:47:26 EDT 2009


Unless we need to agree on this one quickly, it could be pushed to 3.2.
*P

Wendy Thomas wrote:
> This is being added to the resolved bug list? Making this repeatable has 
> ramifications for the reference to summary statistics held in another 
> physical instance because Statistics is not identifiable. It didn't need 
> to be because it was the only one in the file.
>
> wendy
>
>
> On Wed, 25 Mar 2009, Pascal Heus wrote:
>
>   
>> You can still do this if you want; we're just adding
>> repeatability, a default weight and a descriptive field (not possible
>> today).
>> cheers
>> *P
>>
>> Wendy Thomas wrote:
>>     
>>> That assumes you want to bundle by summary statistic type. Others may
>>> want to bundle by variable, the way a statistical package produces
>>> summary statistics: for each variable, provide count, median, mean,
>>> standard deviation, etc.
>>>
>>> This is definitely an issue for 3.2, as providing the flexibility to
>>> handle both situations could become problematic.
>>>
>>> Right now the only thing we are preventing is bundling on one
>>> dimension or the other, we're not preventing the expression of the
>>> summary statistic.
>>>
>>> At least we won't lack things to argue over in the future :)
>>>
>>> wendy
>>>
>>> On Wed, 25 Mar 2009, Pascal Heus wrote:
>>>
>>>       
>>>> Wendy:
>>>> The reason we suggest to have <Statistics> repeatable is that you can
>>>> then create sets, point to a default weight once for all, and include
>>>> descriptive information.
>>>>
>>>> For example:
>>>> <Statistics>
>>>>   <Description>Frequencies / Unweighted statistics</Description>
>>>>   <VariableStatistics>...</VariableStatistics>
>>>>   ...
>>>> </Statistics>
>>>> <Statistics>
>>>>   <Description>Statistics using household weight</Description>
>>>>   <DefaultWeightVariableReference/> (or <DefaultWeightUsedReference/>)
>>>>   <VariableStatistics>...</VariableStatistics>
>>>>   ...
>>>> </Statistics>
>>>> <Statistics>
>>>>   <Description>Statistics using panel weight</Description>
>>>>   <DefaultWeightVariableReference/> (or <DefaultWeightUsedReference/>)
>>>>   <VariableStatistics>...</VariableStatistics>
>>>>   ...
>>>> </Statistics>
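
A minimal sketch of the bundling idea in the example above, assuming Python's standard xml.etree.ElementTree and no DDI namespaces. The "Wrapper" parent element, the "variable" attribute, and the IDs (V1, HH_WEIGHT, PANEL_WEIGHT) are hypothetical placeholders; DefaultWeightVariableReference is the name proposed in the example, not a confirmed schema element.

```python
import xml.etree.ElementTree as ET


def build_statistics(description, variable_ids, default_weight=None):
    """Build one <Statistics> bundle (simplified, no DDI namespaces)."""
    stats = ET.Element("Statistics")
    ET.SubElement(stats, "Description").text = description
    if default_weight is not None:
        # Element name taken from the proposal in the example above.
        ET.SubElement(stats, "DefaultWeightVariableReference").text = default_weight
    for var_id in variable_ids:
        # Hypothetical simplification: the variable ID is kept as an attribute.
        ET.SubElement(stats, "VariableStatistics", {"variable": var_id})
    return stats


# Hypothetical wrapper element and IDs, for illustration only.
root = ET.Element("Wrapper")
root.append(build_statistics("Frequencies / Unweighted statistics", ["V1", "V2"]))
root.append(build_statistics("Statistics using household weight", ["V1", "V2"], "HH_WEIGHT"))
root.append(build_statistics("Statistics using panel weight", ["V1", "V2"], "PANEL_WEIGHT"))


def default_weights(parent):
    """Return (description, default weight or None) for each bundle."""
    return [
        (s.findtext("Description"), s.findtext("DefaultWeightVariableReference"))
        for s in parent.findall("Statistics")
    ]


for description, weight in default_weights(root):
    print(description, "->", weight)
```

Each bundle declares its weight once, so the individual VariableStatistics entries do not have to repeat it; the unweighted bundle simply has no default weight.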
>>>>
>>>> best
>>>> *P
>>>>
>>>> arofan.gregory wrote:
>>>>         
>>>>> Wendy:
>>>>>
>>>>> It's the not-bundling part that we didn't like. As stated earlier,
>>>>> I'll let Pascal explore this one with you, as I feel he has a better
>>>>> handle on the issue.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Arofan
>>>>>
>>>>> -----Original Message-----
>>>>> From: Wendy Thomas [mailto:wlt at pop.umn.edu]
>>>>> Sent: Tuesday, March 24, 2009 6:30 PM
>>>>> To: arofan.gregory
>>>>> Cc: 'Mary Vardigan'; 'Sanda Ionescu'; 'TIC list'
>>>>> Subject: RE: [DDI-SRG] FW: stats question please
>>>>>
>>>>> I thought that in essence it is repeatable, as I can provide more than
>>>>> one summary statistic referencing the same variable but a different
>>>>> weight. Yep...just checked. Under Statistics I have an unbounded number
>>>>> of VariableStatistic. Each VariableStatistic can have one weight, but I
>>>>> can have multiple VariableStatistic items referencing the same variable.
>>>>> Of course that doesn't bundle them, but you would need to either
>>>>> redeclare the weight for the category statistics or reference the
>>>>> variable and then have a separate repeatable structure that identified
>>>>> weight and statistic value and category statistics using that weight.
>>>>>
>>>>> wendy
>>>>>
>>>>>
>>>>> On Tue, 24 Mar 2009, arofan.gregory wrote:
>>>>>
>>>>>
>>>>>           
>>>>>> Wendy:
>>>>>>
>>>>>> No - there wasn't any approach that worked that I could see. They've
>>>>>> started using "bootstrap weights" at Stats Can, so this will be an
>>>>>> on-going issue, and they have to describe the data as they receive it,
>>>>>> so inventing a variable isn't something they can do.
>>>>>>
>>>>>> As I understand the requirement, what is needed is to have several
>>>>>> sets of statistics per variable - the fix we envisioned was to make the
>>>>>> "Statistics" element repeatable. I should probably turn you over to
>>>>>> Pascal, as I believe he understands the issue better than I do.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Arofan
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: ddi-srg-bounces at icpsr.umich.edu
>>>>>> [mailto:ddi-srg-bounces at icpsr.umich.edu] On Behalf Of Wendy Thomas
>>>>>> Sent: Tuesday, March 24, 2009 2:56 PM
>>>>>> To: arofan.gregory
>>>>>> Cc: 'Mary Vardigan'; 'Sanda Ionescu'; 'TIC list'
>>>>>> Subject: Re: [DDI-SRG] FW: stats question please
>>>>>>
>>>>>> Different sets of statistics are easy, as you can simply provide
>>>>>> multiple summary statistics for the same variable. The question is
>>>>>> when you use multiple weights to create a statistic. I recommended the
>>>>>> use of a variable that reflects the generated weight, using a
>>>>>> generation instruction to describe the process. Then that weight can be
>>>>>> referred to from the summary statistics. Did you have another approach?
>>>>>>
>>>>>> wendy
>>>>>>
>>>>>>
>>>>>> On Tue, 24 Mar 2009, arofan.gregory wrote:
>>>>>>
>>>>>>
>>>>>>             
>>>>>>> Wendy:
>>>>>>>
>>>>>>> We're here in Ottawa working with the Canadian RDC people (Stats Can
>>>>>>> data) and they have exactly the same issue - they need to have
>>>>>>> different weights for a single study, and therefore different sets of
>>>>>>> statistics.
>>>>>>>
>>>>>>> This is *very* common in Stats Canada data, apparently.
>>>>>>>
>>>>>>> I'm happy to show you how I think it could be fixed.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Arofan
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: ddi-srg-bounces at icpsr.umich.edu
>>>>>>> [mailto:ddi-srg-bounces at icpsr.umich.edu] On Behalf Of Wendy Thomas
>>>>>>> Sent: Monday, March 23, 2009 11:26 AM
>>>>>>> To: Sanda Ionescu
>>>>>>> Cc: Mary Vardigan; TIC list
>>>>>>> Subject: Re: [DDI-SRG] FW: stats question please
>>>>>>>
>>>>>>> We require the ability to make this machine actionable. We are
>>>>>>> creating variable representations of measures, concatenated measures
>>>>>>> providing unique identifiers and links. We are no longer just
>>>>>>> documenting data.
>>>>>>>
>>>>>>> Wendy
>>>>>>>
>>>>>>>
>>>>>>> On Mon, 23 Mar 2009, Sanda Ionescu wrote:
>>>>>>>
>>>>>>>
>>>>>>>               
>>>>>>>> Wendy,
>>>>>>>>
>>>>>>>> It is for case 2) - a combination of weights.
>>>>>>>> But ... I don't know...
>>>>>>>> Creating a "fake" weight in DDI only (not in the data) seems a
>>>>>>>> little far-fetched.
>>>>>>>> And really, I'm not sure we should be going that way.
>>>>>>>> Aren't we just documenting data?
>>>>>>>> Why resort to creating artifices like that - assigning fake values,
>>>>>>>> creating imaginary weights?
>>>>>>>> And again, I think this makes it so much more difficult to
>>>>>>>> automate - both direct markup, and conversions.
>>>>>>>>
>>>>>>>> Sanda.
>>>>>>>>
>>>>>>>> Sanda Ionescu
>>>>>>>> ICPSR
>>>>>>>> University of Michigan
>>>>>>>> P.O. Box 1248
>>>>>>>> Ann Arbor, MI 48106
>>>>>>>>
>>>>>>>> Phone, Fax: 734-615-7890
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Wendy Thomas [mailto:wlt at pop.umn.edu]
>>>>>>>> Sent: Monday, March 23, 2009 12:48 PM
>>>>>>>> To: Sanda Ionescu
>>>>>>>> Subject: Re: FW: stats question please
>>>>>>>>
>>>>>>>> Please clarify
>>>>>>>>
>>>>>>>> If they mean that a variable can have more than one weight, a
>>>>>>>> separate statistic is created for each weight used.
>>>>>>>>
>>>>>>>> If they mean they use a combination of weights, then DDI 3.0
>>>>>>>> requires they create a variable (not expressed in the data set
>>>>>>>> physical content) that provides the expression of that multiple
>>>>>>>> weighting through its generation instruction.
>>>>>>>>
>>>>>>>> For example: if I had a standard weight A and also applied weight
>>>>>>>> variable X and weight variable Y in calculating the statistic, then I
>>>>>>>> would need to create a variable Z using the generation instruction
>>>>>>>> "StandardWeightA * variableX * variableY".
>>>>>>>>
>>>>>>>> It may not be how they want to express it, but this is the way
>>>>>>>> it's done.
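
A minimal sketch of what the generation instruction "StandardWeightA * variableX * variableY" amounts to per case, assuming plain Python lists as stand-ins for the weight variables. The case values are invented for illustration, and weighted_mean is just one example of a statistic that could then reference Z as its single weight.

```python
import math


def composite_weight(*weight_columns):
    """Multiply per-case weights element-wise, e.g. Z = A * X * Y."""
    lengths = {len(col) for col in weight_columns}
    if len(lengths) != 1:
        raise ValueError("all weight columns must cover the same cases")
    return [math.prod(case) for case in zip(*weight_columns)]


def weighted_mean(values, weights):
    """Weighted mean: sum(v * w) / sum(w)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Invented illustration data: three cases, three weight variables.
standard_weight_a = [1.0, 2.0, 1.0]
variable_x = [2.0, 1.0, 1.0]
variable_y = [1.0, 1.0, 3.0]
values = [10.0, 20.0, 30.0]

# Z is the derived variable the statistic would reference as its one weight.
z = composite_weight(standard_weight_a, variable_x, variable_y)
print(z)
print(weighted_mean(values, z))
```

The derived variable exists only in the metadata description; the data file itself still carries just A, X and Y.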
>>>>>>>>
>>>>>>>> Wendy
>>>>>>>>
>>>>>>>>
>>>>>>>>  On Mon, 23 Mar 2009, Sanda Ionescu wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>                 
>>>>>>>>> Hi, Wendy.
>>>>>>>>> This is a different issue.
>>>>>>>>> Our researchers/analysis specialists seem to think that it is
>>>>>>>>> necessary to be able to reference multiple weights for a statistic.
>>>>>>>>> Please see exchange below.
>>>>>>>>> Could you submit this as a request for a change to TIC?
>>>>>>>>> Thanks
>>>>>>>>> Sanda.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Felicia LeClere
>>>>>>>>> Sent: Monday, March 23, 2009 10:14 AM
>>>>>>>>> To: Sanda Ionescu
>>>>>>>>> Subject: Re: stats question please
>>>>>>>>>
>>>>>>>>> It happens...so it is necessary to be able to have more than one
>>>>>>>>> weight
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Sanda Ionescu
>>>>>>>>> To: Lynette Hoelter; Felicia LeClere
>>>>>>>>> CC: Mary Vardigan
>>>>>>>>> Sent: Mon Mar 23 09:23:52 2009
>>>>>>>>> Subject: RE: stats question please
>>>>>>>>>
>>>>>>>>> It's about DDI.
>>>>>>>>>
>>>>>>>>> Version 3 only allows 1 weight to be defined per statistical value.
>>>>>>>>>
>>>>>>>>> And we need to know if it is necessary to ask for the weight to
>>>>>>>>> become repeatable (so that one can link to more than one).
>>>>>>>>> This is not about whether it's LIKELY or not (because, yes, it is
>>>>>>>>> UNLIKELY) but rather is it necessary to cover for that possibility
>>>>>>>>> too, however unlikely it may be?
>>>>>>>>>
>>>>>>>>> Sanda.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> From: Lynette Hoelter
>>>>>>>>> Sent: Friday, March 20, 2009 4:22 PM
>>>>>>>>> To: Sanda Ionescu; Felicia LeClere
>>>>>>>>> Cc: Mary Vardigan
>>>>>>>>> Subject: RE: stats question please
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> In theory, I don't see why not, because the weights just multiply
>>>>>>>>> the values for the cases by some fraction, so you could multiply by
>>>>>>>>> two.
>>>>>>>>>
>>>>>>>>> Why?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ....................................................
>>>>>>>>> Lynette F. Hoelter, Ph.D.
>>>>>>>>> Dir., Instructional Resources & Development, ICPSR
>>>>>>>>> Institute for Social Research
>>>>>>>>> University of Michigan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> From: Sanda Ionescu
>>>>>>>>> Sent: Friday, March 20, 2009 3:13 PM
>>>>>>>>> To: Felicia LeClere; Lynette Hoelter
>>>>>>>>> Cc: Mary Vardigan
>>>>>>>>> Subject: stats question please
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Felicia, Lynette,
>>>>>>>>>
>>>>>>>>> In Variable Statistics is it POSSIBLE to obtain ONE value (say,
>>>>>>>>> mean, or mode, etc.) by applying TWO different weights - not
>>>>>>>>> sequentially, but at the same time?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> Sanda.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>> Wendy L. Thomas                          Phone: +1 612.624.4389
>>>>>>>> Data Access Core Director         Fax:   +1 612.626.8375
>>>>>>>> Minnesota Population Center              Email: wlt at pop.umn.edu
>>>>>>>> University of Minnesota
>>>>>>>> 50 Willey Hall
>>>>>>>> 225 19th Avenue South
>>>>>>>> Minneapolis, MN 55455
>>>>>>>>
>>>>>>>>
>>>>>>>>                 
>>>>>>>
>>>>>>>
>>>>>>>               
>>>>>> _______________________________________________
>>>>>> DDI-SRG mailing list
>>>>>> DDI-SRG at icpsr.umich.edu
>>>>>> http://www.icpsr.umich.edu/mailman/listinfo/ddi-srg
>>>>>>
>>>>>>
>>>>>>             
>>>>>
>>>>>
>>>>>
>>>>>           
>>>> --
>>>>
>>>> Pascal Heus
>>>>
>>>> Director, Metadata Technology
>>>>
>>>> http://www.metadatatechnology.com
>>>>
>>>>
>>>> ============================================================
>>>>
>>>> This email is intended only for the person to whom it is addressed
>>>> and/or otherwise authorized personnel. The information contained herein
>>>> and attached is confidential and the property of Metadata Technology
>>>> Ltd.. If you are not the intended recipient, please be advised that
>>>> viewing this message and any attachments, as well as copying,
>>>> forwarding, printing, and disseminating any information related to this
>>>> email is prohibited, and that you should not take any action based on
>>>> the content of this email and/or its attachments. If you received this
>>>> message in error, please contact the sender and destroy all copies of
>>>> this email and any attachment. Please note that the views and opinions
>>>> expressed herein are solely those of the author and do not necessarily
>>>> reflect those of the company. While antivirus protection tools have been
>>>> employed, you should check this email and attachments for the presence
>>>> of viruses. No warranties or assurances are made in relation to the
>>>> safety and content of this email and attachments. Metadata Technology
>>>> Ltd. accepts no liability for any damage caused by any virus transmitted
>>>> by or contained in this email and attachments. No liability is accepted
>>>> for any consequences arising from this email.
>>>>
>>>>
>>>>         
>>>
>>>       
>>
>>
>>     
>
>
>   


-- 

Pascal Heus

Director, Metadata Technology

http://www.metadatatechnology.com


