[DDI-SRG] DDI 3.0 simple data dictionary
Wendy Thomas
wlt at pop.umn.edu
Mon Dec 22 14:58:08 EST 2008
Mike,
Depending on the current storage structure of your metadata, you could
probably write a simple Perl script, similar to the one you have, to create
a very basic data dictionary. I've attached a sample showing how what you
provided would be entered in DDI 3.0. I do recommend DDI 3.0 for those
starting in DDI at this point because of the added features. I am familiar
with a number of data collections within the DOJ as well as the need to
link these to geographic files. If you expand your use of DDI at all, you
will quickly find that these features come in handy.
A quick walkthrough of the attached XML file will help you get oriented.
First, published DDI instances are always wrapped in a DDIInstance to make
them consistently recognizable. While I routinely list all the schemas
available in DDI, the only ones needed here are instance, studyunit,
conceptualcomponent, logicalproduct, physicaldataproduct,
physicalinstance, archive, and reusable (along with the support files for
XHTML and Dublin Core). If you are using this only for internal purposes
you could skip the instance and the archive sections, but this doesn't
really save you much.
The required information in the StudyUnit consists of a brief citation
(title required), universe reference, abstract, and purpose. The last
two can be very brief. You need to declare the universe in a universe
scheme, at least the top level. At the end of the instance is a brief
identification of the responsible agency, referred to as the archive. It
contains a reference to itself in the organization scheme. Once again this
can be very brief, but as this is standard information it could be produced
directly from the Perl script.
The body of the instance is broken into three parts:
LogicalProduct
Describes the record of data as a whole. As this is a simple record, it's
pretty brief. The LogicalRecord within the DataRelationship section serves
as the link between the information on the physical layout of the data
record and the intellectual content.
VariableScheme contains the intellectual information on each variable. In
your file you have declared them all to be strings, so all are text
representations. Each variable has a Name, Label, and TextRepresentation
with the maximum field length. Pretty basic. Other options are numeric,
coded categories, date, etc.
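As a rough illustration of how a script could emit the VariableScheme
described above, here is a minimal sketch in Python (standing in for the
Perl script). It assumes the metadata is already available as
(name, label, maximum length) tuples; the function name
build_variable_scheme and the sequential V1, V2, ... ids are choices made
for this example, not part of DDI.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared in the attached instance.
L_NS = "ddi:logicalproduct:3_0"
R_NS = "ddi:reusable:3_0"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

ET.register_namespace("l", L_NS)
ET.register_namespace("r", R_NS)


def build_variable_scheme(fields, scheme_id="VS_1"):
    """Build an l:VariableScheme from (name, label, max_length) tuples.

    Hypothetical helper: every field is treated as a text variable,
    mirroring the attached sample where all variables are strings.
    """
    scheme = ET.Element(
        f"{{{L_NS}}}VariableScheme",
        {"isMaintainable": "true", "id": scheme_id},
    )
    for number, (name, label, max_length) in enumerate(fields, start=1):
        variable = ET.SubElement(
            scheme,
            f"{{{L_NS}}}Variable",
            {
                "isVersionable": "true",
                "id": f"V{number}",
                "isTemporal": "false",
                "isGeographic": "false",
                "isWeight": "false",
            },
        )
        ET.SubElement(variable, f"{{{R_NS}}}Name", {XML_LANG: "en"}).text = name
        ET.SubElement(variable, f"{{{R_NS}}}Label", {XML_LANG: "en"}).text = label
        representation = ET.SubElement(variable, f"{{{L_NS}}}Representation")
        ET.SubElement(
            representation,
            f"{{{L_NS}}}TextRepresentation",
            {"maxLength": str(max_length)},
        )
    return scheme


scheme = build_variable_scheme(
    [("CERT", "FDIC Certificate Number", 5), ("BRNUM", "Office Number", 4)]
)
print(ET.tostring(scheme, encoding="unicode"))
```

The output is only the VariableScheme fragment; it would still need to be
placed inside the LogicalProduct of a full instance.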
PhysicalDataProduct
This describes the physical layout in two steps. First is the physical
structure where you find the link to the LogicalRecord. Once again this is
a simple file so there is a single physical record segment and the default
delimiter is identified.
The RecordLayoutScheme identifies the RecordLayout, links it to the
PhysicalRecordSegment within the PhysicalStructure, indicates the
character set of the file (ASCII) and the ArrayBase (1), and then lists
each data item in the file. Each item consists of a reference to the
variable, its array position, and its width.
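The list of data items is equally mechanical to generate. The sketch
below, again in Python rather than Perl, assumes the variables are
numbered V1, V2, ... in file order and that you have a list of field
widths; build_data_items is a hypothetical helper name, and the array
positions simply count up from the ArrayBase.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared in the attached instance.
P_NS = "ddi:physicaldataproduct:3_0"
R_NS = "ddi:reusable:3_0"

ET.register_namespace("p", P_NS)
ET.register_namespace("r", R_NS)


def build_data_items(widths, array_base=1):
    """Return one p:DataItem element per field width.

    Assumes variable ids follow the V1, V2, ... pattern used in the
    attached sample and that fields appear in the same order as the
    variables.
    """
    items = []
    for offset, width in enumerate(widths):
        item = ET.Element(f"{{{P_NS}}}DataItem")
        reference = ET.SubElement(
            item, f"{{{P_NS}}}VariableReference", {"isReference": "true"}
        )
        ET.SubElement(reference, f"{{{R_NS}}}ID").text = f"V{offset + 1}"
        location = ET.SubElement(item, f"{{{P_NS}}}PhysicalLocation")
        # Array positions start at the declared ArrayBase.
        ET.SubElement(location, f"{{{P_NS}}}ArrayPosition").text = str(
            array_base + offset
        )
        ET.SubElement(location, f"{{{P_NS}}}Width").text = str(width)
        items.append(item)
    return items


for data_item in build_data_items([5, 4, 5]):
    print(ET.tostring(data_item, encoding="unicode"))
```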
PhysicalInstance is simply a reference to the location of the physical data
file that the metadata describes, linking to the PhysicalStructure by
referencing the RecordLayout.
This may seem a bit convoluted, but consider that you may have multiple
copies of a data file (multiple PhysicalInstances of the same file), or
different formats of the data (different PhysicalStructures), all
pointing back to the same intellectual description of the data. This
structure allows you to copy and reformat your data and keep it all linked
to the common description of the data.
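Because everything hangs together through these references, a generated
file is worth checking for references that point nowhere. The following
Python sketch assumes, as in the attached file, that every reference is
an r:ID child element and every target carries an id attribute. The
function name unresolved_references and the trimmed sample document
(including the deliberately broken RL_9 reference) are made up for this
example, and this is not a substitute for schema validation.

```python
import xml.etree.ElementTree as ET

R_ID = "{ddi:reusable:3_0}ID"


def unresolved_references(root):
    """Return the r:ID values that match no id attribute in the tree."""
    declared = {el.get("id") for el in root.iter() if el.get("id") is not None}
    referenced = {el.text.strip() for el in root.iter(R_ID) if el.text}
    return sorted(referenced - declared)


# Trimmed, hypothetical fragment: PI_1 resolves through RL_1 and PS_1 to
# LR_1, while PI_2 points at a record layout that does not exist.
SAMPLE = """
<root xmlns:r="ddi:reusable:3_0" xmlns:l="ddi:logicalproduct:3_0"
      xmlns:p="ddi:physicaldataproduct:3_0"
      xmlns:pi="ddi:physicalinstance:3_0">
  <l:LogicalRecord id="LR_1"/>
  <p:PhysicalStructure id="PS_1">
    <p:LogicalRecordReference><r:ID>LR_1</r:ID></p:LogicalRecordReference>
  </p:PhysicalStructure>
  <p:RecordLayout id="RL_1">
    <p:PhysicalStructureReference><r:ID>PS_1</r:ID></p:PhysicalStructureReference>
  </p:RecordLayout>
  <pi:PhysicalInstance id="PI_1">
    <pi:RecordLayoutReference><r:ID>RL_1</r:ID></pi:RecordLayoutReference>
  </pi:PhysicalInstance>
  <pi:PhysicalInstance id="PI_2">
    <pi:RecordLayoutReference><r:ID>RL_9</r:ID></pi:RecordLayoutReference>
  </pi:PhysicalInstance>
</root>
"""

print(unresolved_references(ET.fromstring(SAMPLE.strip())))  # ['RL_9']
```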
If you haven't joined the DDI User group yet, you should, as there will be
training sessions in the US announced over the coming year. Also, please
contact me with any further questions you have. I have asked Achim, the
author of the software you mentioned, to contact you upon his return from
vacation (mid January). I hope this helped clarify what you needed for a
basic data dictionary in DDI 3.0.
Wendy Thomas
Chair, DDI Technical Implementation Committee
Wendy L. Thomas Phone: +1 612.624.4389
Data Access Core Director Fax: +1 612.626.8375
Minnesota Population Center Email: wlt at pop.umn.edu
University of Minnesota
50 Willey Hall
225 19th Avenue South
Minneapolis, MN 55455
-------------- next part --------------
<?xml version="1.0"?>
<ddi:DDIInstance xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="ddi:instance:3_0 instance.xsd" xmlns:ddi="ddi:instance:3_0" xmlns:r="ddi:reusable:3_0" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:dce="ddi:dcelements:3_0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:a="ddi:archive:3_0" xmlns:g="ddi:group:3_0" xmlns:cm="ddi:comparative:3_0" xmlns:c="ddi:conceptualcomponent:3_0" xmlns:d="ddi:datacollection:3_0" xmlns:l="ddi:logicalproduct:3_0" xmlns:p="ddi:physicaldataproduct:3_0" xmlns:ds="ddi:dataset:3_0" xmlns:pi="ddi:physicalinstance:3_0" xmlns:m1="ddi:physicaldataproduct/ncube/normal:3_0" xmlns:m2="ddi:physicaldataproduct/ncube/tabular:3_0" xmlns:m3="ddi:physicaldataproduct/ncube/inline:3_0" xmlns:s="ddi:studyunit:3_0" xmlns:pr="ddi:profile:3_0" isMaintainable="true" id="datadictionary" version="1.0" versionDate="2008-12-19" agency="mpc.umn.ddi" urn="urn:ddi:3.0:Instance=datadictionary:mpc.umn.ddi[1.0]">
<s:StudyUnit isMaintainable="true" id="WLT_DD" version="2.0" versionDate="2008-12-19">
<r:Citation>
<r:Title>Sample Data Dictionary</r:Title>
</r:Citation>
<s:Abstract isIdentifiable="true" id="ABS_1"><r:Content xml:lang="en">Limited data dictionary</r:Content> </s:Abstract>
<r:UniverseReference isReference="true"><r:ID>U1</r:ID></r:UniverseReference>
<s:Purpose isIdentifiable="true" id="PUR_1"><r:Content xml:lang="en">To show it can be done</r:Content></s:Purpose>
<c:ConceptualComponent isMaintainable="true" id="CC">
<c:UniverseScheme isMaintainable="true" id="UScheme">
<c:Universe isVersionable="true" id="U1">
<c:HumanReadable xml:lang="en">BLAHBLAHBLAH in the United States</c:HumanReadable>
</c:Universe>
</c:UniverseScheme>
</c:ConceptualComponent>
<l:LogicalProduct isMaintainable="true" id="LP_1">
<l:DataRelationship isIdentifiable="true" id="DR_1"><r:Description>Single logical record</r:Description>
<l:LogicalRecord isIdentifiable="true" id="LR_1" hasLocator="false">
<l:VariablesInRecord allVariablesInLogicalProduct="true"></l:VariablesInRecord></l:LogicalRecord>
</l:DataRelationship>
<l:VariableScheme isMaintainable="true" id="VS_1">
<l:Variable isVersionable="true" id="V1" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">CERT</r:Name>
<r:Label xml:lang="en">FDIC Certificate Number</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="5">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V2" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">BRNUM</r:Name>
<r:Label xml:lang="en">Office Number</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="4">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V3" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">STCNTYBR</r:Name>
<r:Label xml:lang="en">State and County Number (Branch)</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="5">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V4" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">CBSA_METROB</r:Name>
<r:Label xml:lang="en">Core Based Statistical Areas (Branch)</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="5">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V5" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">RSSDID</r:Name>
<r:Label xml:lang="en">FRB ID Number</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="8">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V6" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">DOCKET</r:Name>
<r:Label xml:lang="en">OTS Docket Number</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="8">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V7" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">NAME</r:Name>
<r:Label xml:lang="en">Institution Name</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="72">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V8" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">NAMEFULL</r:Name>
<r:Label xml:lang="en">Institution Name</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="72">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V9" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">RSSDHCR</r:Name>
<r:Label xml:lang="en">FRB ID Number (Bank Holding Company)</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="8">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V10" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">NAMEHCR</r:Name>
<r:Label xml:lang="en">Name of regulatory high hold (BHC)</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="95">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V11" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">HCTMULT</r:Name>
<r:Label xml:lang="en">Multi-Bank Holding Company flag</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="8">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V12" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">HCTNONE</r:Name>
<r:Label xml:lang="en">No Bank Holding Company flag</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="8">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V13" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">HCTONE</r:Name>
<r:Label xml:lang="en">One Bank Holding Company flag</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="8">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V14" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">STALPHCR</r:Name>
<r:Label xml:lang="en">State Code(BHC)</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="2">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V15" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">CITYHCR</r:Name>
<r:Label xml:lang="en">City (Bank Holding Company)</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="25">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V16" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">UNIT</r:Name>
<r:Label xml:lang="en">Unit Bank flag</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="8">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V17" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">REGAGNT</r:Name>
<r:Label xml:lang="en">Primary Federal Regulator</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="5">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V18" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">INSAGNT1</r:Name>
<r:Label xml:lang="en">Primary Insurance Fund</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="5">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V19" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">OAKAR</r:Name>
<r:Label xml:lang="en">OAKAR flag</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="8">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
<l:Variable isVersionable="true" id="V20" isTemporal="false" isGeographic="false" isWeight="false">
<r:Name xml:lang="en">CHRTAGNT</r:Name>
<r:Label xml:lang="en">Charter Agent Code</r:Label>
<l:Representation>
<l:TextRepresentation maxLength="5">
</l:TextRepresentation>
</l:Representation>
</l:Variable>
</l:VariableScheme>
<!-- continues through remaining variables -->
</l:LogicalProduct>
<p:PhysicalDataProduct isMaintainable="true" id="PD_1">
<p:PhysicalStructureScheme isMaintainable="true" id="PSS_1">
<p:PhysicalStructure isVersionable="true" id="PS_1">
<p:LogicalProductReference isReference="true"><r:ID>LP_1</r:ID></p:LogicalProductReference>
<p:DefaultDelimiter>Comma</p:DefaultDelimiter>
<p:GrossRecordStructure isIdentifiable="true" id="GR_1" numberOfPhysicalSegments="1">
<p:LogicalRecordReference isReference="true"><r:ID>LR_1</r:ID></p:LogicalRecordReference>
<p:PhysicalRecordSegment isIdentifiable="true" id="PHYS_1" segmentOrder="1" hasSegmentKey="false">
</p:PhysicalRecordSegment>
</p:GrossRecordStructure>
</p:PhysicalStructure>
</p:PhysicalStructureScheme>
<p:RecordLayoutScheme isMaintainable="true" id="RLS_1">
<p:RecordLayout isIdentifiable="true" id="RL_1">
<p:PhysicalStructureReference isReference="true" lateBound="false"><r:ID>PS_1</r:ID><p:PhysicalRecordSegmentUsed>PHYS_1</p:PhysicalRecordSegmentUsed></p:PhysicalStructureReference>
<p:CharacterSet>ASCII</p:CharacterSet>
<p:ArrayBase>1</p:ArrayBase>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V1</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>1</p:ArrayPosition><p:Width>5</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V2</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>2</p:ArrayPosition><p:Width>4</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V3</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>3</p:ArrayPosition><p:Width>5</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V4</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>4</p:ArrayPosition><p:Width>5</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V5</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>5</p:ArrayPosition><p:Width>8</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V6</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>6</p:ArrayPosition><p:Width>8</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V7</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>7</p:ArrayPosition><p:Width>72</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V8</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>8</p:ArrayPosition><p:Width>72</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V9</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>9</p:ArrayPosition><p:Width>8</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V10</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>10</p:ArrayPosition><p:Width>95</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V11</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>11</p:ArrayPosition><p:Width>8</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V12</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>12</p:ArrayPosition><p:Width>8</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V13</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>13</p:ArrayPosition><p:Width>8</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V14</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>14</p:ArrayPosition><p:Width>2</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V15</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>15</p:ArrayPosition><p:Width>25</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V16</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>16</p:ArrayPosition><p:Width>8</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V17</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>17</p:ArrayPosition><p:Width>5</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V18</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>18</p:ArrayPosition><p:Width>5</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V19</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>19</p:ArrayPosition><p:Width>8</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference isReference="true"><r:ID>V20</r:ID></p:VariableReference>
<p:PhysicalLocation><p:ArrayPosition>20</p:ArrayPosition><p:Width>5</p:Width>
</p:PhysicalLocation>
</p:DataItem>
<!-- continues through remaining variables -->
</p:RecordLayout>
</p:RecordLayoutScheme>
</p:PhysicalDataProduct>
<pi:PhysicalInstance isMaintainable="true" id="PI_1">
<pi:RecordLayoutReference isReference="true"><r:ID>RL_1</r:ID></pi:RecordLayoutReference>
<pi:DataFileIdentification isIdentifiable="true" id="FID_1">
<pi:Location>DOJ</pi:Location>
<pi:URI>filename.dat</pi:URI>
</pi:DataFileIdentification>
</pi:PhysicalInstance>
<a:Archive isMaintainable="true" id="ARCH">
<a:ArchiveSpecific>
<a:ArchiveOrganizationReference isReference="true"><r:ID>ORG_OWNER</r:ID></a:ArchiveOrganizationReference>
</a:ArchiveSpecific>
<a:OrganizationScheme isMaintainable="true" id="OS_1">
<a:Organization isVersionable="true" id="ORG_OWNER">
<a:OrganizationName>Minnesota Population Center</a:OrganizationName>
<a:Nickname>mpc.umn.ddi</a:Nickname>
</a:Organization>
</a:OrganizationScheme>
</a:Archive>
</s:StudyUnit>
</ddi:DDIInstance>