[DDI-ADG] Version 3.0 approach to controlled vocabularies

Wendy Thomas wlt at pop.umn.edu
Fri Jul 29 12:04:22 EDT 2005


Ok, this is not going to get into the mechanics of the process, but here
is what we have settled on in terms of approach. J, please correct
anything that looks odd to you.

In general our approach is that Version 3.0 must have much stronger typing
than earlier versions in order to make it more machine processable. This
includes a wider and more consistent use of controlled vocabularies. The
SRG has gone through Version 2.0 and identified all locations of current
controlled vocabularies and locations where the comments indicate that
controlled vocabularies need to be created. (see attachment) We have also
paid attention to which new or additional fields need controlled
vocabularies due to the structure of Version 3.0.

We are also aware that controlled vocabularies require more flexibility
than other elements of the model in order to reflect new developments,
odd-ball data etc. Because of this tension between the need for both
strong typing and flexibility, we will most likely rely on controlled
vocabularies stored external to the schema. (This is not an all-or-nothing
approach; there will obviously be some controlled vocabs that will be
internal to the schema since they are not dynamic in nature. Yes | No
comes to mind.)

Our greatest concern lies in the way controlled vocabularies are developed
and maintained. There needs to be a separate, expedited process for
managing and maintaining these vocabulary lists. One thought, in terms of
getting them going, was to put out a base list to the expert committee
after the October meeting. This would include all the known locations for
controlled language, preliminary lists of words and definitions, and any
issues regarding the list that came out of the working groups.

The focus on these should be the definitions (are they clear, exhaustive,
mutually exclusive, etc.). We don't want to get into a fight over specific
terms. T1, T2, T3, etc., while not intuitive, become the default option if we
get too involved in a game of semantics. The discussions between the SRG
and the Comparative WG bore this out. If I say po-tae-to and you say
po-tah-to but we have the same definition, we are capable of mapping one
to the other and THAT is what's important.
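
A minimal sketch of that mapping idea, assuming each vocabulary is simply a
table of term and definition. The terms, definitions, and the map_terms
helper below are hypothetical illustrations, not taken from any DDI list or
schema; the point is only that matching on definitions lets two groups keep
their own labels (or fall back to T1, T2, ...).

```python
# Hypothetical vocabularies: each maps a local term to its definition.
# Terms and definitions are illustrative only, not from any DDI list.
vocab_a = {
    "po-tae-to": "starchy tuber of Solanum tuberosum",
    "T2": "edible root of Daucus carota",
}
vocab_b = {
    "po-tah-to": "starchy tuber of Solanum tuberosum",
    "carrot": "edible root of Daucus carota",
    "T9": "a term with no counterpart",
}


def map_terms(source, target):
    """Map each source term to the target term that shares its definition.

    Terms whose definition has no match in the target map to None.
    """
    # Invert the target vocabulary so it can be looked up by definition.
    by_definition = {definition: term for term, definition in target.items()}
    return {
        term: by_definition.get(definition)
        for term, definition in source.items()
    }


mapping = map_terms(vocab_a, vocab_b)
print(mapping)
```

This only works if the definitions really are identical, which is one more
reason to spend the review effort on the definitions rather than the labels.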

All of this means the DDI needs a process for recommending additions to a
controlled vocabulary list that can be reviewed and processed quickly. It
also needs to accept responsibility for keeping these lists up and
available.

The detailed contents of these lists can be hashed out after the October
SRG meeting with representatives of the working groups. What is important
at this point is to a) make sure that all fields requiring controlled
vocabulary are identified, b) provide justification if it's not clear, and
c) provide a preliminary list with terms (or T1, T2, etc.) and definitions.

As you've already noted, many of these controlled vocabularies fall into
the interest areas of multiple Working Groups so that no one working group
is going to end up making the final definitive list.

Wendy


On Fri, 29 Jul 2005, Mary Vardigan wrote:

> Hello, everyone. I talked with Wendy earlier this week about the SRG's
> strategy for handling controlled vocabularies in Version 3.0. Wendy, could
> you please send something to the whole group about this?
>
> Thanks,
> Mary
>
> Mary Vardigan
> Director, Collection Delivery
> Inter-university Consortium for Political and Social Research (ICPSR)
> University of Michigan
> P.O. Box 1248, Ann Arbor, MI 48106-1248
> Phone: 734-615-7908
> Fax: 734-647-8200
> www.icpsr.umich.edu
>
> _______________________________________________
> DDI-ADG mailing list
> DDI-ADG at icpsr.umich.edu
> http://www.icpsr.umich.edu/mailman/listinfo/ddi-adg
>

Wendy L. Thomas                          Phone: +1 612.624.4389
Data Access Core Director                Fax:   +1 612.626.8375
Minnesota Population Center              Email: wlt at pop.umn.edu
University of Minnesota
50 Willey Hall
225 19th Avenue South
Minneapolis, MN 55455
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CONTROLLEDVOCAB.doc
Type: application/octet-stream
Size: 33280 bytes
Desc: 
Url : http://www.icpsr.umich.edu/pipermail/ddi-adg/attachments/20050729/bb318c9a/CONTROLLEDVOCAB-0001.obj

