Internationalization
From Omeka 1.5 on, site admins can pick the locale they want Omeka to use. When working on the core, plugins, or themes, the code must be internationalized to make display in different languages and locales possible.
Text
For most plugins and themes, making user-facing text translatable will be the
lion’s share of the internationalization work. All text strings that are
presented to the user and are not editable by the user should be translated.
This includes obvious suspects like text in paragraphs and other visible HTML
elements, but also less obvious places like the <title> element or title and alt
attributes.
Omeka uses one function for enabling text translation, the __ (double-underscore)
function. Anything that needs to be translated must be passed through the
double-underscore function.
Bare text
Before internationalization, a great deal of the user-facing text may be written directly in HTML or plain text, with no PHP code. Omeka’s translation works through a PHP function, so you need to introduce a PHP block.
Untranslatable
<p>Some text.</p>
Translatable
<p><?php echo __('Some text.'); ?></p>
Literal PHP strings
PHP strings that will end up being shown to the user also need to get translated. These strings are already in PHP code blocks, so the process is easy. Just wrap the double-underscore function around the string that’s already there.
Untranslatable
<?php
echo head(array(
    'title' => 'Page Title'
));
?>
Translatable
<?php
echo head(array(
    'title' => __('Page Title')
));
?>
Strings with variables
A common pattern in PHP is to write strings that directly contain variables. These need a slightly different approach to be translatable. The goal is to make translators only have to translate your string once, no matter what the particular values of the variables inside are.
To do this, you replace your variables with placeholders, and pass your
variables separately into the double-underscore function. (The placeholders
used are from PHP’s sprintf function.)
Single variable
The basic placeholder is %s. It’s used when your original string simply
contained one variable.
Untranslatable
<?php
echo "The site contains $numItems items.";
?>
Translatable
<?php
echo __('The site contains %s items.', $numItems);
?>
This will output the same way as the original, but translators will work
with the single string 'The site contains %s items.' instead of many
different ones for each possible number.
Multiple variables
The %s placeholder is fine for a string with only one variable. However,
with two or more, you need to account for the possibility that some
translations will need to reorder the variables, because their sentence
structure differs from English. With multiple variables, you must instead
use numbered placeholders like %1$s, %2$s, and so on.
Untranslatable
<?php
echo "Added $file to $item.";
?>
Translatable
<?php
echo __('Added %1$s to %2$s.', $file, $item);
?>
By using numbered placeholders, translators can reorder where the variables will appear in the string, without modifying the code to do so.
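The effect of reordering can be sketched with PHP's own sprintf, which is where the placeholders come from. This is only an illustration: the values of $file and $item are made up, and the second template is a hypothetical stand-in for a translator's string whose grammar puts the item first, not a real translation shipped with Omeka.

```php
<?php
// The English source string, as it would be passed to __().
$source = 'Added %1$s to %2$s.';

// Hypothetical stand-in for a translated string that needs the item first.
$reordered = '%2$s now contains %1$s.';

// Example values (assumptions for this sketch).
$file = 'photo.jpg';
$item = 'Item 12';

// The arguments are passed in the same order both times; only the template differs.
$english = sprintf($source, $file, $item);
$translated = sprintf($reordered, $file, $item);

echo $english, "\n";    // Added photo.jpg to Item 12.
echo $translated, "\n"; // Item 12 now contains photo.jpg.
```

Because each placeholder names its argument by position, the calling code never has to change when a translation moves the variables around.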
Dates and times
The other major thing you will often want to display differently for different locales is dates and times. Omeka comes pre-packaged with date formats for various locales.
Where translations run through one function, the double-underscore function,
dates and times similarly work with one function: format_date.
format_date automatically selects the right format based on the site’s
configured locale.
format_date takes two parameters. The first is the time you want to
display. The second, which is optional, is the format you want to use. If
you don’t pick a format, the default is an appropriate format for displaying
a date.
Time
There are two possible types for the time parameter for format_date:
integer and string. If you pass an integer, the time is interpreted as a
Unix timestamp. If you pass a string, the time/date is interpreted
according to the ISO 8601 standard (this will, among many other formats,
correctly parse the output from MySQL date and time columns).
Format
format_date uses Zend_Date internally, so the Zend documentation is
the place to go for an exhaustive list of available formats.
Format constants starting with DATE are used for displaying dates
without a specific time, ones starting with DATETIME are used for
date/time combinations, and ones starting with TIME are for times alone.
For each, there are FULL, LONG, MEDIUM, and SHORT variants.
Each variant will automatically use a format specific to the current
locale, including things like the proper order for dates and the correct
names of months.
The default format is Zend_Date::DATE_MEDIUM. This will display the
given date/time value as a date, with medium length. In the standard US
English locale, this looks like “May 31, 2013.” In a Brazilian locale, it
would instead look like “31/05/2013.”
Preparing Translation Files
Omeka reads translations from .mo files produced with GNU gettext. There are three steps
to the process. After the basic work described above is complete, you will need to:
Create a template file that includes all of the strings to translate
Create .po files that contain the actual translations
Compile .mo files that Omeka will use
The guide for these tasks below follows the practices used by the Omeka dev team. There are other tools and approaches that can accomplish the same tasks. The tools we use are:
ant build utility (along with a build.xml file described below)
Transifex client (requires Python)
podebug (requires Python)
Creating the template file
The simplest way to produce the template file is to follow the examples in Omeka. We begin with a template.base.pot file, which contains the basic format required to begin generating translations.
# Translation for the Simple Pages plugin for Omeka.
# Copyright (C) 2011 Roy Rosenzweig Center for History and New Media
# This file is distributed under the same license as the Omeka package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: SimplePages\n"
"Report-Msgid-Bugs-To: http://github.com/omeka/plugin-SimplePages/issues\n"
"POT-Creation-Date: 2012-01-09 21:49-0500\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
This file will be used to generate the template.pot file that is used as the template for translations.
template.pot files will begin with exactly the content shown above and then include pairs of msgids and
empty msgstrs. The msgids contain the English string of text to translate. The msgstrs will eventually
contain the actual translations.
The template.base.pot file is also helpful if your plugin uses strings of text that are not available for
the __() function described above. For example, if your records include a flag for
a permission such as allowed or required in the database, those strings need to be translated, but
might not appear directly in your plugin’s display. In such cases, the strings should be added to
template.base.pot below the last line:
msgid "allowed"
msgstr ""
msgid "required"
msgstr ""
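One situation where this might come up is sketched below, under stated assumptions. The $record array and its permission key are hypothetical, and the __() defined here is only a stand-in that returns its input unchanged, so the snippet can run outside Omeka. Inside Omeka the real __() is already defined and the stand-in is skipped.

```php
<?php
// Stand-in for running this sketch outside Omeka (assumption: no translation
// is performed here; Omeka's own __() would look the string up instead).
if (!function_exists('__')) {
    function __($string)
    {
        return $string;
    }
}

// Hypothetical record as loaded from the database; 'permission' holds a stored flag.
$record = array('title' => 'About', 'permission' => 'required');

// Only a variable is passed to __() here, so a string extractor scanning the
// source never sees the literal values 'allowed' or 'required'. Their msgids
// therefore have to be listed by hand in template.base.pot.
$label = __($record['permission']);

echo $label, "\n";
```

With the msgids listed in template.base.pot, translators still receive the flag values even though no literal string in the code contains them.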
If you have ant installed on your system, you can modify the following build.xml file.
<?xml version="1.0" encoding="UTF-8"?>
<project name="SimplePages" basedir=".">
    <property name="lang.dir" location="languages" />
    <property name="core.pot" location="../../application/languages/Omeka.pot" />
    <target name="update-pot" description="Update the translation template.">
        <property name="pot.file" location="${lang.dir}/template.pot"/>
        <property name="pot.base" location="${lang.dir}/template.base.pot"/>
        <tempfile property="pot.temp" suffix=".pot"/>
        <tempfile property="pot.duplicates" suffix="-duplicates.pot" />
        <!-- Start from the base template, then append strings extracted from the PHP sources. -->
        <copy file="${pot.base}" tofile="${pot.temp}"/>
        <apply executable="xgettext" relative="true" parallel="true" verbose="true">
            <arg value="--language=php"/>
            <arg value="--from-code=utf-8"/>
            <arg value="--keyword=__"/>
            <arg value="--flag=__:1:pass-php-format"/>
            <arg value="--add-comments=/"/>
            <arg value="--omit-header"/>
            <arg value="--join-existing"/>
            <arg value="-o"/>
            <arg file="${pot.temp}"/>
            <fileset dir="." includes="**/*.php **/*.phtml"
                excludes="tests/"/>
        </apply>
        <!-- Collect the strings that the plugin shares with the Omeka core template. -->
        <exec executable="msgcomm">
            <arg value="--omit-header" />
            <arg value="-o" />
            <arg file="${pot.duplicates}" />
            <arg file="${pot.temp}" />
            <arg file="${core.pot}" />
        </exec>
        <!-- Keep only the strings that are not among those shared with the core. -->
        <exec executable="msgcomm">
            <arg value="--unique" />
            <arg value="-o" />
            <arg file="${pot.temp}" />
            <arg file="${pot.temp}" />
            <arg file="${pot.duplicates}" />
        </exec>
        <move file="${pot.temp}" tofile="${pot.file}"/>
        <delete file="${pot.duplicates}" quiet="true" />
    </target>
    <target name="build-mo" description="Build the MO translation files.">
        <!-- Compile every .po file in the languages directory to a matching .mo file. -->
        <apply executable="msgfmt" dest="${lang.dir}" verbose="true">
            <arg value="-o"/>
            <targetfile />
            <srcfile />
            <fileset dir="${lang.dir}" includes="*.po"/>
            <mapper type="glob" from="*.po" to="*.mo"/>
        </apply>
    </target>
</project>
This file defines two ant commands. The first one that is important to us here is ant update-pot.
It will read template.base.pot and generate the template.pot file from the
strings that are wrapped in __(). template.pot will then contain all the msgids
to be translated.
You will want to double-check that you have found all of the strings that require localization. The podebug utility
can be helpful with this. It automatically generates .po files that contain pseudo-translations that will help you
spot any strings that are not being translated, but should be.
Creating .po files
The .po files contain the localizations. Each file is named for its language using the ISO 639-1 code, optionally followed by an underscore and a country code. For example, es.po will contain translations into Spanish, and es_CO.po will contain the more precise localization to Colombian Spanish.
Omeka uses the Transifex service to produce our translations. Other tools and services also exist to help you produce your translations, but we recommend using Transifex if possible, and setting up your plugin as a child project to Omeka. This will widen the pool of translators and languages for your project.
Compiling .mo files
Once you have created the .po files for your localizations, the final step is to compile them into
binary .mo files. The second command defined by the build.xml file used above, ant build-mo, will
perform this task for you.
All files (template.base.pot, template.pot, and all .po and .mo files) should be in a languages directory
at the top level of your plugin.