Oliver Nassar

Handling localization in French/English

November 02, 2010

A couple of months ago I started working at an agency called henderson bas kohn, and almost everything they build needs both a French and an English version.

Up until this point, I'd only done one specific type of localization. While working at KickApps, we internationalized the entire platform using databases of terms that users could customize. The catch, however, was that any given site was served in a single language: French or English or German or whatever.

Where I am now, though, it's a little different. Every site/app we build needs to be available in both languages, which means you need (at least) two different language/copy sources: one English, one French.

We tackle this in two ways. The first is by creating 'copy decks' for each language. For example, if you're looking at a FAQ page for a site, there will be two files called faq.en.xml and faq.fr.xml. Each contains separate copy for that language, and sometimes custom HTML. We load them through the PHP extension SimpleXML and spit the copy out in the right places.
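
As a rough sketch of what reading a copy deck can look like: the XML layout below (a faq root with item, question and answer elements) and the load_faq_items() helper are made up for illustration, since the real deck format isn't shown here. The deck is inlined as a string so the snippet runs on its own; with real files you would call simplexml_load_file() instead.

```php
<?php
// Hypothetical copy deck contents; the real faq.en.xml layout may differ.
$faqEn = <<<XML
<faq>
    <item>
        <question>How do I sign up?</question>
        <answer><![CDATA[Click <strong>Register</strong> at the top of the page.]]></answer>
    </item>
</faq>
XML;

// Hypothetical helper: turns a deck into an array of question/answer pairs.
function load_faq_items($xml)
{
    // LIBXML_NOCDATA merges CDATA sections into plain text nodes,
    // which lets a deck carry custom HTML inside an element.
    $deck = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA);
    if ($deck === false) {
        return array(); // malformed XML: nothing to show
    }

    $items = array();
    foreach ($deck->item as $item) {
        $items[] = array(
            'question' => (string) $item->question,
            'answer'   => (string) $item->answer,
        );
    }
    return $items;
}

$items = load_faq_items($faqEn);
echo $items[0]['question'], "\n";
echo $items[0]['answer'], "\n";
```

Casting each node to a string is what lets you echo the copy straight into a template.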

The second part of this, though, is a joint Apache-rewrite/PHP-constant fix. Whenever a request is made to the server, we run a rewrite through the .htaccess file that passes along a $_GET parameter called 'language', which contains either the value 'en' or 'fr'. We then store this string value in a constant called LANG.
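
The PHP half of that could look something like the sketch below. It assumes the rewrite has already put 'language' into the query string. The resolve_language() helper, the whitelist check and the fallback to 'en' are my own additions for illustration, not necessarily what runs in production.

```php
<?php
// Hypothetical helper: picks a supported language code out of the query
// parameters, falling back to a default when the value is missing or unknown.
function resolve_language(array $query, $default = 'en')
{
    $supported = array('en', 'fr');
    $requested = isset($query['language']) ? $query['language'] : $default;

    // Strict comparison so unexpected input (arrays, numbers) never matches.
    return in_array($requested, $supported, true) ? $requested : $default;
}

// Store the result once per request; constants can't be redefined later.
if (!defined('LANG')) {
    define('LANG', resolve_language($_GET));
}
```

Checking against a whitelist matters here because LANG ends up inside file names like faq.en.xml, so you don't want arbitrary request input flowing into it.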

Finally, whenever we need language-specific logic (for example, if we want to load faq.en.xml or faq.fr.xml, how do we know which to load?), we use a custom function:

// i18n: returns $en or $fr, whichever matches the LANG constant
function __($en, $fr)
{
    return ${LANG};
}

When we want a language-specific value returned, we write something like this:

$obj = simplexml_load_file('faq.' . __('en', 'fr') . '.xml');

Note that two parameters were passed in. The first is the response you want if the site is being requested in English; the second is the one you want if it's being requested in French. The function does nothing fancy: ${LANG} is a variable variable, so it evaluates to $en when LANG is 'en' and to $fr when LANG is 'fr', and LANG in turn comes from the $_GET parameter that was set earlier.
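
If the variable-variable trick feels too clever, here is a self-contained sketch that puts the function next to a more explicit equivalent. pick_explicit() is a hypothetical name, and LANG is hard-coded to 'fr' only so the snippet runs outside a real request.

```php
<?php
// Stand-in for the value that would normally come from the rewrite.
if (!defined('LANG')) {
    define('LANG', 'fr');
}

// The function from this post: ${LANG} looks up the local variable
// whose name is the value of LANG, i.e. $en or $fr.
function __($en, $fr)
{
    return ${LANG};
}

// Hypothetical explicit version that makes the same choice.
function pick_explicit($en, $fr)
{
    return LANG === 'fr' ? $fr : $en;
}

echo __('Hello', 'Bonjour'), "\n";
echo pick_explicit('Hello', 'Bonjour'), "\n";
```

Both return the same thing as long as LANG is 'en' or 'fr'. The explicit version is longer, but it doesn't depend on the parameter names matching the language codes.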

So those are the two ways we handle French/English localization. The approach is flexible enough for two languages, but you may run into issues once you get to three or more (for example, I wouldn't want to write out __('en', 'fr', 'de', 'es') just to get a language-specific string somewhere).
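
One possible way around that, sketched below, is to pass a single array keyed by language code instead of positional arguments. This isn't something we use; pick_translation() and its fallback behaviour are assumptions made for the example.

```php
<?php
// Stand-in for the value that would normally come from the rewrite.
if (!defined('LANG')) {
    define('LANG', 'fr');
}

// Hypothetical helper: returns the entry for the requested language,
// then the fallback language, then null if neither is present.
function pick_translation(array $values, $lang = LANG, $fallback = 'en')
{
    if (array_key_exists($lang, $values)) {
        return $values[$lang];
    }
    if (array_key_exists($fallback, $values)) {
        return $values[$fallback];
    }
    return null;
}

$greeting = pick_translation(array(
    'en' => 'Hello',
    'fr' => 'Bonjour',
    'de' => 'Hallo',
));
echo $greeting, "\n";
```

Adding a language then means adding a key where it's needed rather than changing the function signature and every call site.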