#CLDRKit
ATTENTION! This is still in alpha stage and absolutely unusable in anything other than a playground project!
CLDRKit aims to provide full CPLocale support for the Cappuccino Web Framework using the CLDR locale data fron the CLDR - Unicode Common Locale Data Repository. CLDR data is maintained by maintained by Unicode consortium and automatically parsed by CLDR kit. So every update by Unicode is automatically usable without any manual interaction!
Either obtain a system CPLocale instance or create one manually:
@import <CLDRKit.j>
var germanLocale = [CPLocale localeWithLocaleIdentifier:@"de_DE"],
frLocale = [[CPLocale alloc] initWithLocaleIdentifier:@"fr_FR"],
usLocale = [CPLocale localeWithLocaleIdentifier:@"en_US"];
The resulting CPLocale instance provides all the basic functionality to query locales specific data defined by CLDR. E.g.:
[germanLocale objectForKey:CPLocaleIdentifier] --> @"de_DE"
[germanLocale objectForKey:CPLocaleLanguageCode] --> @"de"
[germanLocale objectForKey:CPLocaleCountryCode] --> @"DE"
[frLocale displayNameForKey:CPLocaleLanguageCode value:@"fr_FR"] --> @"français"
[frLocale displayNameForKey:CPLocaleLanguageCode value:@"en_US"] --> @"anglais"
[frLocale displayNameForKey:CPLocaleCountryCode value:@"fr_FR"] --> @"France"
[frLocale displayNameForKey:CPLocaleCountryCode value:@"en_US"] --> @"États-Unis"
[frLocale displayNameForKey:CPLocaleIdentifier value:@"fr_FR"] --> @"français (France)"
[frLocale displayNameForKey:CPLocaleIdentifier value:@"en_US"] --> @"anglais (États-Unis)"
[germanLocale objectForKey:CPLocaleDecimalSeparator] --> @","
[usLocale objectForKey:CPLocaleDecimalSeparator] --> @"."
Cocoa provides a comprehensive support for i18n out of the box - for View and core classes. This means all numbers, dates and other locale specific date is displayd/parsed according to the current locale. Cappuccino does not yet implement this locale support although the basic classes (e.g. CPLocale) are available. IMHO this is caused by the ammount of work needed for each locale if this data is maintained manually. CLDRKit addresses this by automatically building locales from external data. Thus l18n will simply be a matter of the framework for every/locale ... l10n is another matter though ...
Please note that the github repository does contain preparsed plists for all locales. You don't have to generate them manually! I have to check licensing issues though
$ git clone https://github.com/krodelin/CLDRKit
$ cd CLDRKit
$ jake install
You'll need the following external components:
- A JRE environment
- A copy of the CLDR core data. At the time of writing this was Version 25 of core.zip.
- A copy of the CLDR tools. At the time of writing this was Version 25 of tools.zip. Please not that based on your environment additional components like ant and ICU4J might be needed.
Extract the archives and change the variable (CLDR_DIR, CLDR_TOOLS_FOLDER and ICU4J_FOLDER) in Ldml2JsonConverter to point to the according directories.
You might want to change CLDR_INITIAL_LOCALES in cldr.js to list a number of locales which should be included in the initial data (reducing the number of requests for demand loading).
- Create JSON data from XML. Please note that CLDR does provide pre-built JSON files. In my experience they are incomplete - in terms of supported locales and included data. So we'll build our own from the full-blown XML
a. Generate "main" data in directory
cldrusing$ ./Ldml2JsonConverter -d cldr -t main -r false -l modernb. Generate supplemental data in directorycldrusing$ ./Ldml2JsonConverter -d cldr -t supplemental -r false -l modern - Create locale plists from JSON in
Ressources/localesusing$ jake cldr.Ressources/localeswill contain the following files:root.plist: The data for the root locale- One
.plistfile per language: E.g.de.plistcontains data forde,de_AT,de_CHandde_DE. initial.plist: Core data needed by CLDRKit (e.g. available locales).- Optional/Debug
DEBUG.plist: The complete data of all locales- All files in JSON format. Not used by CLDRKit ... but this makes testing the generated files easier ...
The intended API is identical to the one used by NSLocale. The NSLocale blog enty by NSHipster provides some additial background (must read!).
Please note that some tests are either failing (still alpha) or generate warnings due to return types!
$ ojtest Test/*
CLDR uses a hierarchical model for locales. E.g. the locale en_US denotes a locale for the language english (en) and the region/country United States (US). Some locales even diffrentiate different scripts. All locale data is part of a tree whose root is calles root:
+-root
+-de
| +-AT
| |-CH
| +-DE
+-en
+-US
+-GB
root already defines all locale data in a very generic way. All of it's children only define the delta to their parent. E.g. en_US and en_GB only specify differences to the en locale. So to obtain the full data for en_US one has to start with rootand apply all the changes from en and US in this order.
This means two option for a framework like CLDRKit:
- Use pre-merged (during build time) locales
- Pro
- Each locale requested by the browser can (in theory) be fetched using a single HTTP Request.
- Con
- The deployment size would be huge! Having all (defined) locales pre-merged results in a locale DB with 100s of MB! This might be feasible for a desktop operating systems - for a WebApp it is not!
- CLDR only lists locales for which data (deltas) is known. E.g.
en_CNis unknown (english in China?) but not invalid! CLDR dictates to search up the tree (in this caseen) until a match is found. This also means that at least therootlocale will allways match and thus provide sensible defaults.
- Pro
- Use demand-loading during runtime
- Pro
- Much smaller filesize as each locale only stores the delta to it's parent
- Con
- Higher number of requests. Naïve Demand loading would create three requests for a locale
en_US(root.plist,en.plistanden_US.plist). - Not-Found Requests (404): Without the framework being aware which locale data is available you can only try to fetch a locale and react accordingly if you get a 404. E.g.
en_CHwould try to fetchen_CH.plistonly to realize (404) that it's not available.
- Higher number of requests. Naïve Demand loading would create three requests for a locale
- Pro
Even in full demand loading mode (i.e. CLDR_INITIAL_LOCALES is empty - in production it should at least contain widely used locales like en, es, fr, de. root is always included) CLDRKit uses two strategies to reduce the number of requests:
initial.plist: This plist contains meta-data - especially all the locales known to CLDRKit. This eliminates requests for non-existing locales:- Language locales: Instead of storing each region/country seperately CLDKit combined all locales for a language into one file. E.g.
de.plistcontains the data forde,de_AT,de_CHandde_DE.
So in production (e.g. CLDR_INITIAL_LOCALES = [ 'de.*', 'en.*', 'es.*', 'fr.*', 'pt.*' ]) requesting a locale for a country/region where the language is either german, english, spanish, french or portugese will only result in one request (fetching initial.plist). Any other requested locale will only result in one additional request per language!
The actual demand loading, caching and merging of CLDR data is completely implemented in CLDRDatabase.
var cldrData = [[CLDRDatabase sharedDatabase] mergedLocaleWithIdentifier:@"de_AT"];
cldrData will contain the fully merged data from the provided locale (based on the available data) which is then used by CPLocale.
- Some Cappuccino core classes do use
CPLocale(e.g.CPDatePicker) and would in theory benefit from full locale support automatically). Others (e.g.CPDateFormatter) don't and provide their own l18n capability ... which is sometimes broken (e.g.CPDateFormaterclaims support forfr,en,deandes... however the array fordeandesare empty thus resulting in no output at all for browsers with this locale!). A possible solution might be to refactor core classes to always useCPLocale. ExtendingCPLocalewith a sensible static data would make Cappuccino l18n enabled. For those needing full locale support CLDRKit could then provide additional locales (e.g. by implementingCPLocale#_platformLocaleAdditionalDescriptionForIdentifier:).
- Udo Schneider udo.schneider@homeaddress.de
Copyright 2014, Krodelin Software Solutions. All rights reserved.
This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.