POPFile Localization

This page describes how POPFile internationalization (I18N) and localization (L10N) work.

Current Languages

POPFile is currently translated into French, Brazilian Portuguese, and German. Translations also exist for Arabic, Bulgarian, Chinese (Simplified and Traditional), Czech, Danish, Dutch, British English, Finnish, French, German, Greek, Hebrew, Hungarian, Italian, Japanese, Korean, Norwegian, Polish, Iberian Portuguese, Russian, Slovak, Spanish, Swedish, Turkish, and Ukrainian, but these translations are not currently up to date. The POPFile Project is actively looking for people to translate POPFile into their local languages.

If you are interested in translating POPFile into a language you know, get the file English.msg from the latest POPFile release and translate the strings in it. Once done, submit it as a Patch and send a POPFile License Agreement to [email protected]

This page provides full details of how POPFile localization is done.

The L10N Scheme

POPFile is localized by translating strings found in the file English.msg which is located in the languages/ subdirectory. English.msg begins with the following:

# Identify the language and character set used for the interface
LanguageCode                            en
LanguageCharset                         ISO-8859-1
LanguageDirection                       ltr

# This is used to get the appropriate subdirectory for the manual
ManualLanguage                          en

A line that begins with # is a comment and is ignored by POPFile. Every other non-empty line consists of an identifier (e.g. LanguageCode) that contains no whitespace, then some whitespace, and finally the string associated with the identifier.
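
The line format described above can be sketched in a few lines of Python. This is only an illustration of the file format, not POPFile's own loader (POPFile itself is written in Perl); the function name parse_msg is made up for this example.

```python
def parse_msg(text):
    """Parse .msg-style text into a dict mapping identifier -> string."""
    strings = {}
    for line in text.splitlines():
        # Skip blank lines and comment lines that begin with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        # The identifier contains no whitespace; everything after the
        # first run of whitespace is the string associated with it.
        parts = line.split(None, 1)
        identifier = parts[0]
        value = parts[1].rstrip() if len(parts) > 1 else ""
        strings[identifier] = value
    return strings


sample = """\
# Identify the language and character set used for the interface
LanguageCode                            en
LanguageCharset                         ISO-8859-1
LanguageDirection                       ltr
"""

header = parse_msg(sample)
print(header["LanguageCharset"])
```

Splitting only on the first run of whitespace keeps any spaces inside the translated string intact.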

There are five special identifiers:

LanguageCode
  Meaning: The base language for the user interface in HTML terms; used to set the lang attribute on the <html> tag of the UI.
  Values: See http://www.w3.org/TR/html401/struct/dirlang.html#adef-lang for details.

LanguageCharset
  Meaning: Sets the character set for the UI; used in a <meta> tag that sets the Content-Type of the UI.
  Values: See http://www.w3.org/TR/html401/charset.html#h-5.2 for details.

LanguageDirection
  Meaning: The direction in which the language is read, e.g. English is left to right, Arabic is right to left; used to set the dir attribute on the <html> tag of the UI.
  Values: ltr, rtl; see http://www.w3.org/TR/html401/struct/dirlang.html#bidirection for details.

ManualLanguage
  Meaning: The subdirectory of manual that contains the manual for this language. If no translation exists, use en (English). Note that non-English manuals are not served locally, but come from http://popfile.sourceforge.net/manual/.
  Values: Subject to change; check http://popfile.sourceforge.net/manual/

Locale_Date
  Meaning: Two format strings that determine how message dates are formatted on the POPFile History screen. The first determines the format used for messages that are less than 7 days old, the second the format for older messages. The two format strings are separated by the '|' sign.
  Values: The format specifiers you can use here are described on the DateFormat page.
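
As a rough sketch of how these identifiers might be consumed, the Python below builds the opening HTML from LanguageCode, LanguageDirection and LanguageCharset, and picks one of the two Locale_Date formats by message age. The function names and the exact markup are assumptions made for illustration; POPFile's real templates may differ, and the placeholder format strings are not real DateFormat specifiers.

```python
from html import escape


def html_open(strings):
    """Build an <html> opening tag and a Content-Type <meta> tag."""
    lang = escape(strings["LanguageCode"], quote=True)
    direction = escape(strings["LanguageDirection"], quote=True)
    charset = escape(strings["LanguageCharset"], quote=True)
    return (
        f'<html lang="{lang}" dir="{direction}">\n'
        f"<head>\n"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">\n'
        f"</head>"
    )


def pick_date_format(locale_date, age_in_days):
    """Return the first format for messages under 7 days old, else the second."""
    recent, older = locale_date.split("|", 1)
    return recent if age_in_days < 7 else older


strings = {
    "LanguageCode": "en",
    "LanguageDirection": "ltr",
    "LanguageCharset": "ISO-8859-1",
    # Placeholder value; real values use the DateFormat specifiers.
    "Locale_Date": "RECENT_FORMAT|OLDER_FORMAT",
}

print(html_open(strings))
print(pick_date_format(strings["Locale_Date"], age_in_days=3))
```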

Some of the strings contain the character sequence %s. These are also format strings, and the %s will be replaced by another string at run-time. For example, the string named 'Bucket_Error2' is set to “Bucket named %s already exists” in the English version. When the user enters a bucket name on the Buckets page that already exists, the %s is replaced by that bucket name to produce a readable error message. A translation must therefore keep the %s, placed wherever the inserted text belongs in the translated sentence.
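
The substitution can be illustrated with a short Python sketch. The helper name fill and the bucket name "spam" are invented for the example; this is not POPFile's own code.

```python
def fill(template, value):
    """Replace the first %s in a localized string with a run-time value."""
    return template.replace("%s", value, 1)


message = fill("Bucket named %s already exists", "spam")
print(message)  # Bucket named spam already exists
```

Because only the %s placeholder is replaced, a translator is free to move it to a different position in the sentence.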

Tools and helpers for translators

To make it easier for you to translate the POPFile UI, you can change the html_test_language variable and use the script check.pl.

html_test_language

The POPFile option html_test_language (see OptionReference) can be set to 1 to make POPFile's UI show the identifiers used for each string instead of strings in a particular language. This can prove helpful if someone doing localization wants to discover the relationship between specific strings in an MSG file and the UI.

Example: With the parameter set to 1 the text “POPFile Control Center” changes to “Header_Title” which is an identifier that can be found in the MSG files.
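
The effect of the option can be pictured as a switch in the string lookup. The following Python sketch assumes a hypothetical ui_text helper and a one-entry string table; it only illustrates the behaviour described above, not how POPFile implements it.

```python
def ui_text(identifier, strings, test_language=False):
    """Return the identifier itself in test mode, else the localized string."""
    if test_language:
        return identifier
    return strings[identifier]


strings = {"Header_Title": "POPFile Control Center"}

print(ui_text("Header_Title", strings))                      # normal UI text
print(ui_text("Header_Title", strings, test_language=True))  # identifier shown
```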

check.pl

The languages directory also contains a small script that you can use to find out which strings still need to be translated, which strings are no longer used, and so on. The script is named “check.pl” and can be used like this:

perl check.pl language.msg

where “language” should be replaced by the language you are interested in.
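
The kind of comparison such a check involves can be sketched in Python as two set differences between the English strings and a translation. This is an assumption about the general idea only, not a description of what check.pl actually does; the identifier Obsolete_Example and the sample translation are invented for the example.

```python
def compare_catalogs(english, translation):
    """Return (missing, obsolete) identifier lists for a translation."""
    # Present in English.msg but absent from the translation.
    missing = sorted(set(english) - set(translation))
    # Present in the translation but no longer in English.msg.
    obsolete = sorted(set(translation) - set(english))
    return missing, obsolete


english = {
    "Header_Title": "POPFile Control Center",
    "Bucket_Error2": "Bucket named %s already exists",
}
translation = {
    "Header_Title": "(translated title)",
    "Obsolete_Example": "(string no longer used)",
}

missing, obsolete = compare_catalogs(english, translation)
print("Still to translate:", missing)
print("No longer used:", obsolete)
```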

Special Details for Japanese Handling

To filter Japanese emails correctly, POPFile handles them differently from emails in English or other languages. This is enabled when “Nihongo” is selected as User Interface language.

Encoding conversion

Emails in ISO-2022-JP encoding are converted into emails in EUC-JP encoding. This is because characters in EUC-JP are more suitable for text matching and parsing than those in ISO-2022-JP.
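
For illustration, the same conversion can be done with Python's built-in codecs. POPFile performs this step in Perl, so the snippet below is only a sketch of the idea; the function name and the sample text are chosen for the example.

```python
def iso2022jp_to_eucjp(raw):
    """Convert ISO-2022-JP encoded bytes to EUC-JP encoded bytes."""
    return raw.decode("iso2022_jp").encode("euc_jp")


original = "日本語のメール".encode("iso2022_jp")
converted = iso2022jp_to_eucjp(original)

# ISO-2022-JP is a 7-bit encoding that switches character sets with
# escape sequences; EUC-JP encodes these characters as 8-bit byte pairs.
print(original)
print(converted)
```

In ISO-2022-JP the bytes of Japanese characters fall in the ASCII range, so a byte-level pattern can match them by accident; in EUC-JP those characters use bytes with the high bit set, which keeps them distinct from ASCII text.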

Text::Kakasi

Japanese words are not separated by spaces. For example:

  • Japanesewordsarenotseparatedbyspaces

On its own, POPFile's classification would not work well on such text because it cannot recognize the individual words. So a Japanese language processing filter, Kakasi, is used to split Japanese sentences into individual words based on its dictionary. The Text::Kakasi Perl module provides POPFile with the interface to the Kakasi filter. The individual words are then passed to the classification engine.
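
To give a feel for dictionary-based word splitting, here is a toy Python sketch that uses greedy longest-match against a small word list, applied to the English example above. It is not Kakasi's algorithm and does not use Text::Kakasi; the function name and the dictionary are assumptions made purely for illustration.

```python
def split_words(text, dictionary):
    """Split unspaced text into words by greedy longest dictionary match."""
    words = []
    pos = 0
    longest = max(len(word) for word in dictionary)
    while pos < len(text):
        # Try the longest possible candidate first, then shorter ones.
        for size in range(min(longest, len(text) - pos), 0, -1):
            candidate = text[pos:pos + size]
            if candidate in dictionary:
                break
        else:
            # No dictionary word starts here: emit a single character.
            candidate = text[pos]
        words.append(candidate)
        pos += len(candidate)
    return words


dictionary = {"japanese", "words", "are", "not", "separated", "by", "spaces"}
tokens = split_words("japanesewordsarenotseparatedbyspaces", dictionary)
print(tokens)
```

Once the text is split this way, each word can be handed to a classifier just like a space-separated word in an English email.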

 
worlddomination.txt · Last modified: 2008/02/08 19:49 (external edit)

The content of this wiki is protected by the GNU Free Documentation License