Time to get multilingual!
According to my Google Analytics and Xiti stats, most visitors to this yann.com site are now English speakers (or at least probably not fluent in French). That’s a consequence of my recent activity as an open-source WordPress plugin developer. So I guess I’ll have to start writing some content in what the French call “the language of Shakespeare” to satisfy that worldwide audience – OK, I don’t guarantee I can really write like Shakespeare did, but then again, who would understand anyway?
I thought that a natural subject for my first English-language post, as a way of inaugurating the new bilingual twist of this blog, would be running a multi-language WordPress!
Making WordPress multi-lingual
I’ve been digging into that subject lately, because I needed a sound multilingual architecture for my Post2Peer.com WordPress content-sharing project. So I thought about the best ways of going multilingual while staying as close as possible to the basic out-of-the-zip WP data structure. I spoke with my old friend Kyo, who’s been designing quite a few multi-language WordPress sites (some of them hosting Japanese, English, French and Spanish content in the same WordPress blog). However, he told me that none of the usual plug-ins he tried were satisfying to him or to other international developers, and that making those sites run as they should usually involved a lot of ugly tweaking of the WordPress insides. One of the problems is getting the right XML headers at the top of each page, depending on the actual content. Another issue is being able to switch between .po language files as a convenient way to organize the localization of each interface “the industrial way”.
My philosophy is to try and never touch the core (except to patch some bugs maybe), and do everything with plug-ins, while not changing the MySQL data structure either.
In the end, I found it simpler to write my own multi-lingual plugin to achieve what I needed. The trick I came up with is quite simple indeed (the plug-in is basically a one-liner function!).
Two-letter wonders
I had noticed that on many multi-language sites (and on most of Kyo’s projects), the URL structure reflects the language segmentation. For example, there would be a few “root directories” named with two-letter language codes, like /en/ for English, /ja/ for Japanese, /fr/ for French, and so forth.
So, without having to add anything to the data structure, we can use the existing WordPress segmentation tools based on the WP taxonomy (categories), custom URLs and/or sub-pages to arrange the content in such a way that the base URL always starts with the appropriate 2-letter language code. I then just needed to hook into the “locale” filter and inject a little regexp that looks at the current page’s URL and sets the locale accordingly. Et voilà! (Oops, that’s French again, sorry…)
The WordPress internal $locale variable, when set appropriately, governs the rest of the page configuration magic that will spit out the appropriate XML language headers, and load the appropriate l10n .po/.mo files if available.
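To make the idea concrete, here is a minimal sketch of the URL-to-locale matching described above. It is written in Python purely as a language-neutral illustration; in WordPress the same logic would live in a PHP function hooked to the “locale” filter, and this is not the actual plugin code. The prefix-to-locale table, the function name and the fr_FR default are all assumptions made for the example.

```python
import re

# Hypothetical mapping from 2-letter URL prefixes to WordPress-style locale codes.
LOCALES = {"en": "en_US", "fr": "fr_FR", "ja": "ja", "es": "es_ES"}

# Matches a first path segment made of exactly two lowercase letters: "/xx/" or "/xx".
PREFIX_RE = re.compile(r"^/([a-z]{2})(?:/|$)")


def locale_from_path(path, default="fr_FR"):
    """Return the locale implied by the first path segment, or the default.

    The default (fr_FR) is an assumption: it stands for the site's base language.
    """
    match = PREFIX_RE.match(path)
    if match is None:
        return default
    # Unknown 2-letter prefixes also fall back to the default locale.
    return LOCALES.get(match.group(1), default)
```

With this table, a request for /en/wordpress/some-post would resolve to en_US, while a path such as /english/ falls back to the default, because the first segment is not exactly two letters long.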
Organizing the planet
The only question that remains is the best way to organise the content’s URL structure in WP to get to what we need. My first bet was using hierarchical categories in such a way that a set of “root” categories would stand for the languages, and then we can add appropriate categories under each language. That’s the way I’ve been going for the post2peer.com directory site and for this yann.com blog.
This solution however is less than ideal without fixing some other issues involving hierarchical categories: a lot of popular WP plugins don’t work right with hierarchical cats. And there is a nasty documented bug that prevents WordPress from serving queries that specify a category + a tag the way it should. For the post2peer content-sharing site, I decided to patch the core with the documented bugfix for this issue, due for shipment with the next WP release. It did the trick nicely.
For whatever reason, the WordPress team also strongly recommends not using the %category% permalink parameter to organize your custom post URLs because of performance issues. I bypassed this advice, since I’m running all my sites with WP Super Cache. It seems OK after all.
The cool cat
One important trick to know when using the %category% placeholder to build the custom pretty URLs is that WordPress will always use the “oldest” category (i.e. the lowest category ID) when a post has multiple categories. So if you’re going to use nested categories, and/or tag posts with multiple cats, remember that you should create all the two-letter-slug language categories first, before anything else. This was quite tricky to achieve when converting this yann.com blog that already had a few “legacy” categories. But tweaking the database a bit to give the “en” root category a lower ID eventually did the trick. (If you have to do it, just change the term_id in the terms and term_taxonomy tables: hopefully, like me, you can find a vacant low ID left available from an early category deletion…)
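The “lowest ID wins” rule is easier to see with a small model. The sketch below is an illustration in Python, not WordPress code: the Category class and the category_path function are hypothetical, and prepending the ancestors’ slugs for nested categories is my assumption about how the %category% part ends up in the URL.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Category:
    term_id: int
    slug: str
    parent: Optional[int] = None  # term_id of the parent category, if any


def category_path(post_categories, all_categories):
    """Build the %category% part of a permalink as described above:
    pick the post's category with the lowest term_id, then prepend its ancestors."""
    by_id = {c.term_id: c for c in all_categories}
    chosen = min(post_categories, key=lambda c: c.term_id)
    parts = [chosen.slug]
    parent_id = chosen.parent
    # Walk up the hierarchy until a root category is reached.
    while parent_id is not None:
        parent = by_id[parent_id]
        parts.append(parent.slug)
        parent_id = parent.parent
    return "/".join(reversed(parts))


# Hypothetical data: "en" was created first, "geek" is a legacy category,
# and "wordpress" is a child of "en" created much later.
en = Category(1, "en")
geek = Category(3, "geek")
wordpress = Category(12, "wordpress", parent=1)
everything = [en, geek, wordpress]
```

In this model a post filed under both “wordpress” and “geek” gets the path geek, because 3 is lower than 12, and so loses its language prefix. A post filed only under “wordpress” gets en/wordpress. That is why the language categories need the lowest IDs.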
An added bonus when using %category% at the base of your URL structure is that you can get rid of the category prefix altogether (you still have to specify a prefix in the options admin page, but the page works without the prefix). Beware of “duplicate content” issues, though: having two different URLs serving the same page is considered bad SEO practice. To avoid redundant URLs, I again wrote a tiny function that makes WordPress always print out category URLs without the prefix.
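As a rough illustration of that last point, here is a sketch of the kind of rewriting involved: dropping the category base segment (the /category/ part) from a category URL so that only one form of the URL is ever printed. It is Python rather than the PHP a real plugin would use, the function name is hypothetical, and “category” as the base is an assumption standing for whatever is set in the options page.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_category_base(url, base="category"):
    """Remove a leading '/<base>/' segment from the path of a category URL.

    URLs whose path does not start with that segment are returned unchanged.
    """
    parts = urlsplit(url)
    prefix = "/" + base.strip("/") + "/"
    path = parts.path
    if path.startswith(prefix):
        path = "/" + path[len(prefix):]
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```

For example, http://example.com/category/en/wordpress/ would be printed as http://example.com/en/wordpress/, which is the same address the posts’ own permalinks start with.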
For the post2peer site, I also wrote a couple of custom plugins to replace the way tag clouds, searches or RSS work in order to make the site aware of “tags by category” (which in this case means “tags in different languages”). It’s too early to say if this way of doing things will be viable in the long run, but for now it seems quite OK.
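The “tags by category” idea boils down to counting tags only across the posts that belong to one language category. The sketch below shows that filtering step in Python; the post structure (a dict with "categories" and "tags") and the function name are assumptions made for the example, not how the actual plugins store or query anything.

```python
from collections import Counter


def tag_counts_by_language(posts, language):
    """Count tags only across posts filed under the given language category."""
    counts = Counter()
    for post in posts:
        if language in post["categories"]:
            counts.update(post["tags"])
    return counts
```

A tag cloud for the /en/ section would then be built from tag_counts_by_language(posts, "en"), so French-only tags never show up next to English content.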
Once everything is thoroughly tested, I might release a few of the other plugins I wrote to deal with what I call “language categories”. In the meantime, if you don’t want to use root categories for language segmentation in WordPress and go hierarchical, you can always use special pages with custom queries relying on tags, or any other means you will find to get the appropriate 2-letter codes at the beginning of your custom pretty URLs. That’s not the hard part. And use my yd-setup-locale plugin to do the locale-switching magic for you.
It works for me!
I’m sure there are other clever ways to go, so if you’d like to share other tricks to make WP run multi-language, please leave a comment… in whatever language you like.





