[DDI-SRG] Data formats: locale and language
Joachim Wackerow
joachim.wackerow at gesis.org
Wed Dec 12 08:42:44 EST 2007
Looking at the SAS formats I realized that we would need additional
information like locale and/or language for specific formats.
For example, some date formats use a string representation of the "day in
the week", with strings like "SUN" or "MON" in the data file. This can be
represented by a generic format, but a definition of the language used
would also be necessary.
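A minimal sketch of this idea in Python, assuming a hypothetical lookup
table keyed by language code (the table, the function name and the German
abbreviations are illustrative and not part of any DDI or SAS definition):

```python
# Hypothetical lookup table: weekday abbreviations per language,
# listed Monday first so that list index + 1 is the ISO weekday number.
WEEKDAY_ABBREVIATIONS = {
    "en": ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"],
    "de": ["MO", "DI", "MI", "DO", "FR", "SA", "SO"],
}


def parse_weekday(token, language):
    """Return the ISO weekday number (1 = Monday ... 7 = Sunday)."""
    names = WEEKDAY_ABBREVIATIONS[language]
    # list.index raises ValueError if the token is unknown in this language.
    return names.index(token.strip().upper()) + 1


print(parse_weekday("SUN", "en"))
print(parse_weekday("SO", "de"))
```

The format itself ("abbreviated weekday name") stays generic; only the
language attribute selects which strings are valid.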
The same applies to dates like 10.12.2007, which corresponds to 2007-12-10
in ISO format when read in Germany, but to 2007-10-12 when read in the USA.
With a generic format, additional information about the locale would be
necessary. The alternative would be to have a specific format definition
for each variation. But then the information that the format is locale
dependent is lost.
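As a rough illustration, the generic-format-plus-locale approach could be
sketched as follows. The mapping from locale name to field order is an
assumption made for this example only:

```python
from datetime import datetime

# Hypothetical mapping: the same generic "numeric date with dots" format,
# resolved to a concrete field order by locale.
DATE_PATTERNS = {
    "de_DE": "%d.%m.%Y",  # day.month.year
    "en_US": "%m.%d.%Y",  # month.day.year
}


def to_iso_date(text, locale_name):
    """Parse a locale-dependent date string and return it in ISO 8601 form."""
    pattern = DATE_PATTERNS[locale_name]
    return datetime.strptime(text, pattern).date().isoformat()


print(to_iso_date("10.12.2007", "de_DE"))
print(to_iso_date("10.12.2007", "en_US"))
```

The same input string yields 2007-12-10 for the German locale and
2007-10-12 for the US locale, which is exactly the ambiguity the locale
attribute is meant to resolve.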
Reading numeric or monetary values with embedded grouping (or thousands)
separator and decimal separator is another candidate for localization.
We have already explicit elements for decimal and grouping separators.
But an alternative way would be to use a generic numeric format with a locale.
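A small sketch of the locale-based variant, assuming a hypothetical table
that maps a locale name to its grouping and decimal separators (explicit
separator elements would simply supply the same pair directly):

```python
from decimal import Decimal

# Hypothetical table: (grouping separator, decimal separator) per locale.
SEPARATORS = {
    "de_DE": (".", ","),
    "en_US": (",", "."),
}


def parse_number(text, locale_name):
    """Parse a number with locale-specific separators into a Decimal."""
    grouping, decimal_sep = SEPARATORS[locale_name]
    # Drop the grouping separator first, then normalize the decimal separator.
    normalized = text.replace(grouping, "").replace(decimal_sep, ".")
    return Decimal(normalized)


print(parse_number("1.234,56", "de_DE"))
print(parse_number("1,234.56", "en_US"))
```

Both calls return the same value, 1234.56, from differently written
strings.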
The locale and language information should be kept in the same place where
the data format is defined. Both can be seen as attributes of the data format.
In general I think both ways can make sense: definition of a specific
format by a name (for a related type) and definition of a generic format
with attributes like decimal separator.
SPSS has no NLS support; SAS has NLS support, but also old-style fixed
definitions; SQL also has both. When both ways of definition are
available, the work of describing the formats seems to be easier. The
mapping table and the applications using the mapping table become more
complicated. But doing formats without NLS seems to be a bad choice.
Achim