[DDI-SRG] Data formats: locale and language
I-Lin Kuo
ikuoikuo at gmail.com
Wed Dec 12 09:33:12 EST 2007
Hi Joachim,
While I understand the intent, I'm not sure that localization is
sufficient or the right solution.
First, the ISO date + locale example is not correct. ISO 8601 is locale
neutral, and its elements are arranged from most to least significant
(year, month, day). If 12 represents the month, then the US example
2007-10-12 is year-day-month and therefore not an ISO date.
Second, the DateFormatStandardName + locale (= ISO + US) scheme of
identifying formats is not expressive enough to cover 10-DEC-2007 and all
the other date variations that might occur, unless we greatly
expand the set of allowed format identifiers (e.g. ORA-US). If we do allow
nonstandard formats, do the identifiers then mean anything? ORA-US to me means
Oracle date format, US, but it might not mean that to someone else. I would
vote for specifying dates with a pattern such as YYYY-MM-DD rather than with a name.
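To illustrate why a pattern says more than a name, here is a minimal Python sketch. The pattern strings and the parse_date helper are hypothetical, made up for this example, and are not part of any DDI proposal; the mapping to strptime directives is just one possible reading of each pattern.

```python
from datetime import date, datetime

# Hypothetical pattern names mapped to Python strptime directives.
# Each pattern states the element order itself, so no locale is needed
# to tell year-month-day from year-day-month.
PATTERN_TO_STRPTIME = {
    "YYYY-MM-DD": "%Y-%m-%d",
    "YYYY-DD-MM": "%Y-%d-%m",
    "DD.MM.YYYY": "%d.%m.%Y",
    # %b matches the abbreviated month name of the process's time locale
    # (English abbreviations in the default "C" locale).
    "DD-MON-YYYY": "%d-%b-%Y",
}


def parse_date(text: str, pattern: str) -> date:
    """Parse text according to an explicit pattern (illustrative helper)."""
    return datetime.strptime(text, PATTERN_TO_STRPTIME[pattern]).date()


# The same calendar day written four ways, each identified by its pattern.
for text, pattern in [
    ("2007-12-10", "YYYY-MM-DD"),
    ("2007-10-12", "YYYY-DD-MM"),
    ("10.12.2007", "DD.MM.YYYY"),
    ("10-DEC-2007", "DD-MON-YYYY"),
]:
    print(pattern, "->", parse_date(text, pattern).isoformat())
```

Note that the DD-MON-YYYY case still depends on the language of the month abbreviation, which is the one place where a language attribute would remain useful.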
In general, I favor specific markup for specific cases rather than a general
approach of localization. For money and currency, I would simply prefer
@unitOfCurrency, @decimalDelimiter, and @thousandsDelimiter to solve the
problem rather than a more general localization approach, if for no
other reason than that the country is no longer sufficient to specify
whether the currency is marks or euros.
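As a rough sketch of what the explicit attributes buy us, the Python below reads a monetary string using only the three values named above. The parse_amount function and its parameter names are hypothetical and only mirror the attribute names; the sample values are invented.

```python
from decimal import Decimal


def parse_amount(text, decimal_delimiter, thousands_delimiter, unit_of_currency):
    """Read a monetary string using explicit delimiters (illustrative helper).

    The parameters mirror @decimalDelimiter, @thousandsDelimiter and
    @unitOfCurrency; no locale lookup is involved.
    """
    # Drop the grouping delimiter first, then normalise the decimal delimiter.
    cleaned = text.replace(thousands_delimiter, "").replace(decimal_delimiter, ".")
    return Decimal(cleaned), unit_of_currency


# Same country, same delimiters, different currency: only the explicit
# unit distinguishes marks from euros.
print(parse_amount("1.234,56", ",", ".", "DEM"))
print(parse_amount("1.234,56", ",", ".", "EUR"))
```

A locale such as de_DE would give the same delimiters in both cases but could not say which currency was meant.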
The other reason I don't favor the localization approach is that for the
data format concern, I see date, number, and currency as the only issues.
The other items on the list at
http://en.wikipedia.org/wiki/Internationalization_and_localization are all
already covered.
On Dec 12, 2007 7:42 AM, Joachim Wackerow <joachim.wackerow at gesis.org>
wrote:
> Looking at the SAS formats I realized that we would need additional
> information like locale and/or language for specific formats.
>
> For example some date formats like a string representation of "day in
> the week". Assuming strings like "SUN" or "MON" in the data file. This
> can be represented by a generic format, but additionally a definition of
> the used language would be necessary.
>
> Similar with dates like 10.12.2007 (in Germany in ISO format 2007-12-10,
> in USA in ISO format 2007-10-12); using a generic format an additional
> information about the locale would be necessary. The alternative would
> be to have a specific format definition for each variation. But then the
> information is lost, that the format is locale dependent.
>
> Reading numeric or monetary values with embedded grouping (or thousands)
> separator and decimal separator is another candidate for localization.
> We have already explicit elements for decimal and grouping separators.
> But an alternative way would be to use a generic numeric format with a
> locale.
>
> The locale and language information should stay at the same place where
> the data format is defined. Both can be seen as attributes of data format.
>
> In general I think both ways can make sense: definition of a specific
> format by a name (for a related type) and definition of a generic format
> with attributes like decimal separator.
>
> SPSS has no NLS support, SAS has NLS support, but also old style fixed
> definitions, SQL has also both. When both ways of definitions are
> available, the work of describing the formats seems to be easier. The
> mapping table and the applications using the mapping table are getting
> more complicated. But doing formats without NLS seems to be a bad choice.
>
> Achim
--
I-Lin Kuo