This page outlines how to configure Apache to automatically deliver pages in
multiple languages. It covers how to do this using automatic ("transparent")
content negotiation (RFC 2295), which is available with both Apache 1.3
and 2.x.
Overview
Content negotiation is designed to deliver not just multiple languages
but also multiple formats of a document, e.g. GIF or JPG, HTML or
PDF. This tends to make the Apache documentation complex and possibly
overwhelming for some. This document covers only how to configure
multiple languages automatically.
All modern browsers can be configured to negotiate for several
languages and to retrieve from the server the one most appropriate for
the user. When setting up your server you should be aware that many
users configure their browsers to accept ONLY their native
language. If none of the browser's acceptable languages is available
on your server, the user will by default see the server's error 406
("Not Acceptable") page. There are two workarounds for this problem,
covered below.
Prerequisites
You need to identify the ISO639 codes for the languages you intend to
support and the naming convention you intend to use. ISO639 codes are
listed in the table below. I recommend you use this file
naming convention:
<resource>.<langCode>.<filetype>
So if you are providing English, French and German throughout your site
you will likely have three top level index files named:
index.en.html
index.fr.html
index.de.html
Note that whilst ISO639 language codes bear a strong resemblance
to the ISO3166 country codes used within DNS, they are not the same.
Non-Latin languages: If you are providing languages that are NOT
based on the Latin (European) character set, such as Chinese, Japanese or
Thai, then every such page should declare its character set in a Content-Type header.
So for Thai:
<meta http-equiv="Content-Type" content="text/html; charset=tis-620">
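The character set can also be sent in the HTTP response header itself, from the server side. The sketch below follows the pattern in Apache's mod_mime documentation of giving the charset its own file extension. The .tis620 extension and a resulting file name such as index.th.tis620.html are assumptions made for illustration; they are not part of the naming convention recommended above.

```apache
# Map the language code extension to a Content-Language value.
AddLanguage th .th
# Assumed extension: files carrying ".tis620" are served with
# "charset=TIS-620" added to their Content-Type header.
AddCharset TIS-620 .tis620
```

Both directives can go in the main server configuration or, where overrides are permitted, in an .htaccess file.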
Server Setup
In every directory containing pages in multiple languages you will need
an .htaccess file that contains at minimum these two lines:
Options MultiViews
DirectoryIndex index
The Options directive tells the server to enable content negotiation for
this directory. Each time the server processes a non-qualified resource
name it will look for files carrying the language codes requested by the browser.
By 'non-qualified' we mean that the GET request does not specify the
complete file name, for example /index rather than /index.en.html.
DirectoryIndex index tells the server to look for a resource named 'index'
when the request ends with a trailing '/'. It is vital that this does
not specify a fully qualified file name such as index.html.
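As a hedged illustration of the setup described above, a complete .htaccess might look like the sketch below. The AddLanguage lines are an assumption: many stock Apache configurations already map these extensions to languages, in which case they can be dropped. The sketch also assumes the main server configuration permits these directives in .htaccess files (an AllowOverride setting covering Options, Indexes and FileInfo).

```apache
# Sketch of a per-directory .htaccess for English, French and German.
# "+MultiViews" adds negotiation without switching off other options
# already in effect; plain "Options MultiViews" would replace them.
Options +MultiViews

# Let a request for "/" negotiate among index.en.html, index.fr.html, ...
DirectoryIndex index

# Assumption: only needed if the server configuration does not already
# associate these file extensions with languages.
AddLanguage en .en
AddLanguage fr .fr
AddLanguage de .de
```

With this in place, a request for /index from a browser that prefers French would be answered with index.fr.html, provided that file exists.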
Writing your HTML
For multi-language support to work seamlessly and transparently across
your site you need to be careful about how you specify all inter-page links.
Specifically you must avoid links such as:
<A HREF="/index.html">Home</a>
If you use the recommended naming convention this link is already broken,
because no file named index.html exists.
The correct link to the site's home page should be specified as
<A HREF="/">Home</a> or <A HREF="/index">Home</a>.
The downside to this is that most HTML authoring tools (e.g. Dreamweaver)
will flag such links as broken.
There are two workarounds for the error 406 page shown when none of the
browser's languages is available.
If you are using version 2.x of Apache you can use the ForceLanguagePriority
directive to deliver the first language variant specified in LanguagePriority.
The only way of doing something similar with earlier versions is to trap the
error via an ErrorDocument directive and either show the user a specific page
or run a CGI-type script. Using the HTTP_REFERER and DOCUMENT_ROOT environment
variables it is relatively straightforward to show the user the page in a
language of your choosing.
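A minimal sketch of both workarounds, assuming English is the preferred fallback language. The script path /cgi-bin/lang-fallback.cgi is hypothetical and stands in for whatever page or script you provide. Use one approach or the other depending on your Apache version.

```apache
# Apache 2.x: order of preference when the browser's languages
# do not settle the choice.
LanguagePriority en fr de
# Prefer:   break ties using LanguagePriority instead of returning 300.
# Fallback: serve the first available LanguagePriority variant
#           instead of returning 406.
ForceLanguagePriority Prefer Fallback

# Apache 1.3: ForceLanguagePriority is not available, so trap the 406.
# The target below is a hypothetical script name.
# ErrorDocument 406 /cgi-bin/lang-fallback.cgi
```

With the 2.x settings, a browser that accepts only a language you do not provide would receive the English variant rather than the error page.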
ISO639 defines a series of two-letter codes for many known languages. This
standard is not considered stable, and several codes have changed; one example
is the code for Indonesian, which changed from 'in' to 'id'.
| aa | Afar | ab | Abkhazian | af | Afrikaans |
| am | Amharic | ar | Arabic | as | Assamese |
| ay | Aymara | az | Azerbaijani | ba | Bashkir |
| be | Byelorussian | bg | Bulgarian | bh | Bihari |
| bi | Bislama | bn | Bengali-Bangla | bo | Tibetan |
| br | Breton | ca | Catalan | co | Corsican |
| cs | Czech | cy | Welsh | da | Danish |
| de | German | dz | Bhutani | el | Greek |
| en | English | eo | Esperanto | es | Spanish |
| et | Estonian | eu | Basque | fa | Persian |
| fi | Finnish | fj | Fiji | fo | Faeroese |
| fr | French | fy | Frisian | ga | Irish |
| gd | Gaelic-Scots-Gaelic | gl | Galician | gn | Guarani |
| gu | Gujarati | ha | Hausa | he | Hebrew |
| hi | Hindi | hr | Croatian | hu | Hungarian |
| hy | Armenian | ia | Interlingua | id | Indonesian |
| ie | Interlingue | ik | Inupiak | in | Indonesian |
| is | Icelandic | it | Italian | iu | Inuktitut |
| iw | Hebrew | ja | Japanese | ji | Yiddish |
| jw | Javanese | ka | Georgian | kk | Kazakh |
| kl | Greenlandic | km | Cambodian | kn | Kannada |
| ko | Korean | ks | Kashmiri | ku | Kurdish |
| ky | Kirghiz | la | Latin | ln | Lingala |
| lo | Laothian | lt | Lithuanian | lv | Latvian-Lettish |
| mg | Malagasy | mi | Maori | mk | Macedonian |
| ml | Malayalam | mn | Mongolian | mo | Moldavian |
| mr | Marathi | ms | Malay | mt | Maltese |
| my | Burmese | na | Nauru | ne | Nepali |
| nl | Dutch | no | Norwegian | oc | Occitan |
| om | Oromo-Afan | or | Oriya | pa | Punjabi |
| pl | Polish | ps | Pashto-Pushto | pt | Portuguese |
| qu | Quechua | rm | Rhaeto-Romance | rn | Kirundi |
| ro | Romanian | ru | Russian | rw | Kinyarwanda |
| sa | Sanskrit | sd | Sindhi | sg | Sangho |
| sh | Serbo-Croatian | si | Singhalese | sk | Slovak |
| sl | Slovenian | sm | Samoan | sn | Shona |
| so | Somali | sq | Albanian | sr | Serbian |
| ss | Siswati | st | Sesotho | su | Sundanese |
| sv | Swedish | sw | Swahili | ta | Tamil |
| te | Telugu | tg | Tajik | th | Thai |
| ti | Tigrinya | tk | Turkmen | tl | Tagalog |
| tn | Setswana | to | Tonga | tr | Turkish |
| ts | Tsonga | tt | Tatar | tw | Twi |
| ug | Uighur | uk | Ukrainian | ur | Urdu |
| uz | Uzbek | vi | Vietnamese | vo | Volapuk |
| wo | Wolof | xh | Xhosa | yi | Yiddish |
| yo | Yoruba | za | Zhuang | zh | Chinese |
| zu | Zulu |
Online translation tools;