25

I'm trying to build a multilingual site.

I use this piece of code to detect the user's language. If you haven't chosen a language, it includes a language file based on HTTP_ACCEPT_LANGUAGE.

I don't know where it gets that value from, though:

session_start();

if (!isset($_SESSION['lang'])) {
   $_SESSION['lang'] = substr($_SERVER['HTTP_ACCEPT_LANGUAGE'], 0, 2);
}

elseif (isset($_GET['setLang']) && $_GET['setLang'] == 'en') $_SESSION['lang'] = "en";
elseif (isset($_GET['setLang']) && $_GET['setLang'] == 'sv') $_SESSION['lang'] = "sv";
elseif (isset($_GET['setLang']) && $_GET['setLang'] == 'pl') $_SESSION['lang'] = "pl";
elseif (isset($_GET['setLang']) && $_GET['setLang'] == 'fr') $_SESSION['lang'] = "fr";

include('languages/'.$_SESSION['lang'].'.php');

It works for me and includes the Polish language file. But is this code accurate? Or is there another way?

1
  • What if the header is Accept-Language: fr;q=0,en or something alike? Commented May 4, 2011 at 10:03

10 Answers 10

40

The browser generally sends an HTTP header named Accept-Language, which indicates which languages the user is willing to get.

For instance, this header can be:

Accept-Language: en-us,en;q=0.5

There is a notion of priority in it, by the way: the q value gives the weight of each language ;-)

In PHP, you can get this from the $_SERVER superglobal:

var_dump($_SERVER['HTTP_ACCEPT_LANGUAGE']);

which gives me:

string 'en-us,en;q=0.5' (length=14)

Now, you have to parse that ;-)


If I edit my browser's preferences to say "I want French; if you can't serve me French, give me English from the US; and if you can't give me that either, just give me English", the header will be:
Accept-Language: fr-fr,en-us;q=0.7,en;q=0.3

And, from PHP:

string 'fr-fr,en-us;q=0.7,en;q=0.3' (length=26)

For more information, you can take a look at section 14.4 of the HTTP RFC (RFC 2616).

You can probably find plenty of PHP code examples for parsing that header; for instance: Parse Accept-Language to detect a user's language
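
As a rough illustration of that parsing step, here is a minimal sketch. The names parse_accept_language() and pick_language(), as well as the $supported list, are made up for this example and are not part of PHP. It assumes simple language tags, ignores the `*` wildcard, and treats q=0 as "not acceptable".

```php
<?php
// Hypothetical helper: turn an Accept-Language value into [tag => q], best first.
function parse_accept_language(string $header): array
{
    $weights = [];
    foreach (explode(',', $header) as $part) {
        $pieces = explode(';', $part);
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '') {
            continue; // empty header or stray comma
        }
        $q = 1.0; // a tag without an explicit q value counts as 1
        foreach (array_slice($pieces, 1) as $param) {
            $param = trim($param);
            if (stripos($param, 'q=') === 0) {
                $q = (float) substr($param, 2);
            }
        }
        if ($q > 0) { // q=0 means the user does not want this language
            $weights[$tag] = $q;
        }
    }
    arsort($weights); // highest weight first
    return $weights;
}

// Hypothetical helper: first preferred language that the site supports.
function pick_language(string $header, array $supported, string $default): string
{
    foreach (array_keys(parse_accept_language($header)) as $tag) {
        $primary = explode('-', (string) $tag)[0]; // 'fr-fr' -> 'fr'
        if (in_array($primary, $supported, true)) {
            return $primary;
        }
    }
    return $default;
}
```

With the header shown above, pick_language('fr-fr,en-us;q=0.7,en;q=0.3', ['en', 'sv', 'pl'], 'sv') skips French because it is not in the supported list and returns 'en'.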

Have fun !


2 Comments

Is the most preferred language always the first parameter in the header?
Indeed, and the q value identifies the priority of each.
11

Here's the script I used for a bilingual site. It is meant to be used as the index.php of mysite.com. Based on the language preference of the user's browser, it redirects to the desired language version of the site, or to the default language version if the site is not available in the user's preferred language.

<?php
// List of available localized versions as 'lang code' => 'url' map
$sites = array(
    "en" => "http://en.mysite.com/",
    "bn" => "http://bn.mysite.com/",
);

// Get 2 char lang code (empty string if the header is missing)
$lang = strtolower(substr($_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? '', 0, 2));

// Set default language if a `$lang` version of site is not available
if (!isset($sites[$lang]))
    $lang = 'en';

// Finally redirect to desired location and stop executing the script
header('Location: ' . $sites[$lang]);
exit;
?>


10

I know there are already many good solutions, but I have found my own way to solve this problem.

<?php
$prefLocales = array_reduce(
    explode(',', $_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? ''),
    function ($res, $el) {
        // A locale without an explicit ";q=" gets the default weight of 1
        list($l, $q) = array_merge(explode(';q=', trim($el)), [1]);
        $res[$l] = (float) $q;
        return $res;
    },
    []
);
arsort($prefLocales);
/*
This turns a header like this
  string 'en-US,en;q=0.8,uk;q=0.6,ru;q=0.4' (length=32)
into an array like this
array (size=4)
  'en-US' => float 1
  'en' => float 0.8
  'uk' => float 0.6
  'ru' => float 0.4
*/

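The code converts the HTTP_ACCEPT_LANGUAGE string into an array with locales as keys and weights as values, sorted from the highest weight to the lowest. You can then walk through the keys one by one (for example with array_keys; note that array_shift would return the weight, not the locale) to find the best match with your site's locales.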
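
One possible way to do that matching is sketched below. The function name best_site_locale() and its parameters are hypothetical, and the sketch assumes the site's locales are written exactly like the keys of the weighted map (no 'en-US' to 'en' fallback).

```php
<?php
// Hypothetical helper: $prefLocales is a [locale => weight] map sorted best first,
// $siteLocales is the list of locales the site actually offers.
function best_site_locale(array $prefLocales, array $siteLocales, string $fallback): string
{
    foreach ($prefLocales as $locale => $weight) {
        if ($weight <= 0) {
            continue; // q=0 marks a locale the user does not want
        }
        if (in_array((string) $locale, $siteLocales, true)) {
            return (string) $locale;
        }
    }
    return $fallback; // nothing matched
}
```

For the example map from the comment block, a site that only offers 'uk' and 'ru' would get 'uk'.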

2 Comments

While this is a decent solution, I've downvoted because code-only answers are not cool. You have given no indication of why your answer is the best and, unless the reader already knows why your answer is a good one, this would leave them with more questions than answers.
@hiburn8 Fixed, please check.
5

You can use: Locale::acceptFromHttp().

Tries to find locale that can satisfy the language list that is requested by the HTTP "Accept-Language" header.
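
A minimal usage sketch follows. The wrapper name locale_from_header() and the 'en' fallback are assumptions made for this example; Locale itself comes from the intl extension, so the sketch falls back when that class is missing.

```php
<?php
// Hypothetical wrapper around Locale::acceptFromHttp() with a fallback value.
function locale_from_header(?string $header, string $fallback = 'en'): string
{
    // No header, or the intl extension is not loaded: nothing to negotiate with.
    if ($header === null || $header === '' || !class_exists('Locale')) {
        return $fallback;
    }
    $locale = Locale::acceptFromHttp($header); // a locale string, or false on failure
    return is_string($locale) && $locale !== '' ? $locale : $fallback;
}
```

It could be called as locale_from_header($_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? null). If only the language part is needed, Locale::getPrimaryLanguage() can extract it from the returned locale.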

4 Comments

Be aware that this function only returns languages that it considers valid. I had the problem in Egypt that it returned en_US for people who prefer ar-AR (which is not an official language variant, of course, but seems to be in use).
You also obviously need the Locale class, which is not part of the default PHP installation. You can get it with sudo apt-get install php-intl -y on Ubuntu, or by uncommenting extension=intl in php.ini on XAMPP.
Personally, I think that for the majority of basic use cases, substr($_SERVER['HTTP_ACCEPT_LANGUAGE'], 0, 2); makes more sense.
substr($_SERVER['HTTP_ACCEPT_LANGUAGE'], 0, 2); does not correctly parse: da, en-gb;q=0.7, en;q=0.8
1

Your code looks just fine. You might want to add a final else default choice if the visitor asks for a language you aren't providing. Also, if the visitor himself selects a language you should save that choice in a persistent cookie and check its value, giving it precedence over HTTP_ACCEPT_LANGUAGE.

As far as I can tell, YouTube does use HTTP_ACCEPT_LANGUAGE, but at the same time uses IP geolocation to suggest a change of language if the language of the visitor's country doesn't match it. Definitely annoying.

Just nitpicking: if you're gonna add languages to the list a switch() statement might be more readable.
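
To make the suggested order of precedence concrete, here is a rough sketch. The function name resolve_language(), the cookie name 'lang', and the parameter layout are assumptions for this example; only the setLang query parameter comes from the question. It uses a whitelist lookup rather than a switch(), which also prevents unexpected values from reaching the include.

```php
<?php
// Hypothetical helper: an explicit choice beats the saved cookie, which beats the header.
function resolve_language(array $get, array $cookie, string $header, array $supported, string $default): string
{
    $candidates = [
        $get['setLang'] ?? null,           // the visitor just picked a language
        $cookie['lang'] ?? null,           // a choice saved on an earlier visit
        strtolower(substr($header, 0, 2)), // first two letters of Accept-Language
    ];
    foreach ($candidates as $candidate) {
        // Only accept values from the list of languages the site provides
        if (is_string($candidate) && in_array($candidate, $supported, true)) {
            return $candidate;
        }
    }
    return $default; // the final "else"
}
```

A page could call resolve_language($_GET, $_COOKIE, $_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? '', ['en', 'sv', 'pl', 'fr'], 'en') and, when setLang was given, store the result with setcookie('lang', $lang, time() + 60 * 60 * 24 * 365, '/') before any output is sent.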


1

Here's a function for selecting the best out of a group of supported languages. It extracts languages from Accept-Language, then sorts the given array of languages according to their priority.

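function select_best_language($languages) {
    if (empty($_SERVER['HTTP_ACCEPT_LANGUAGE'])) return $languages[0];
    // Entries without a q value keep their header order and rank ahead of weighted ones
    $default_q = 100;
    $larr = [];
    foreach (explode(",", $_SERVER['HTTP_ACCEPT_LANGUAGE']) as $lqpair) {
        $lq = explode(";q=", trim($lqpair));
        // isset() rather than a truthiness test, so an explicit q=0 is kept
        $larr[$lq[0]] = isset($lq[1]) ? floatval($lq[1]) : $default_q--;
    }
    // Languages that are missing from the header sort last
    usort($languages, function ($a, $b) use ($larr) { return ($larr[$b] ?? -1) <=> ($larr[$a] ?? -1); });
    return $languages[0];
}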

$lang = select_best_language(['en','fr','it']);


0

Try this:

function getUserLanguage() {
    $langs = array();

    if (isset($_SERVER['HTTP_ACCEPT_LANGUAGE'])) {
        // break up string into pieces (languages and q factors)
        preg_match_all(
            '/([a-z]{1,8}(-[a-z]{1,8})?)\s*(;\s*q\s*=\s*(1|0\.[0-9]+))?/i',
            $_SERVER['HTTP_ACCEPT_LANGUAGE'],
            $lang_parse
        );

        if (count($lang_parse[1])) {
            // create a list like 'en' => 0.8
            $langs = array_combine($lang_parse[1], $lang_parse[4]);

            // set default to 1 for any without q factor
            foreach ($langs as $lang => $val) {
                if ($val === '') {
                    $langs[$lang] = 1;
                }
            }
            // sort list based on value
            arsort($langs, SORT_NUMERIC);
        }
    }
    //extract most important (first)
    reset($langs);
    $lang = key($langs);

    //if complex language simplify it
    if ($lang !== null && strpos($lang, '-') !== false) {
        list($lang) = explode('-', $lang);
    }

    return $lang;
}

3 Comments

This seems to have a typo or syntax error somewhere. I suppose the quotes are broken.
I have fixed this ;)
The regex does not match languages like sr-latn-rs or es-419. Also, the * language is not respected. I have changed it to /([a-z*\-0-9]+)\s*(;\s*q\s*=\s*(1|0\.[0-9]+))?/i and changed the 2nd argument of array_combine from $lang_parse[4] to $lang_parse[3].
0

This is also possible. It falls back to English if the file for the detected language is not available.

$lang = substr($_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? '', 0, 2);
(@include_once 'languages/'.$lang.'.php') or (@include_once 'languages/en.php'); 


0

I solved this issue for PHP 7.4+ with the code below. It keeps only the first two letters of each locale code (so 'en-US' becomes 'en') and produces a map like:

$map = [
    'en' => 1,
    'de' => 0.8,
    'uk' => 0.3
];

The code is as follows:


$header  = 'en-US,de-DE;q=0.8,uk;q=0.3';
$pattern = '((?P<code>[a-z-_A-Z]{2,5})([;q=]+?(?P<prio>0.\d+))?)';

preg_match_all($pattern, $header, $matches);
['code' => $codes, 'prio' => $values] = $matches;

$map = \array_combine(
    \array_map(fn(string $language) => strtolower(substr($language, 0, 2)), \array_values($codes)),
    \array_map(fn(string $value)    => empty($value) ? 1 : (float)$value, \array_values($values))
);

Please note that the regex is probably suboptimal, improvement suggestions are welcome. Named capture groups are always in the order of matching, so we can use array_combine safely.


-1
if(isset($_SERVER['HTTP_ACCEPT_LANGUAGE'])){
    $parts=explode(';',$_SERVER['HTTP_ACCEPT_LANGUAGE']);
    $langs=explode(',',$parts[0]);
    var_dump($langs);
}

4 Comments

Could you elaborate on what's going on?
Use this code and you will get an array of the accepted languages.
I wanted to point out that a good answer should include an explanation instead of only dropping code.
This code won't work at all. It first splits by semicolons, while languages are split by commas - and then splits the first result by commas. Thus, it will return only the languages that didn't have priorities set. For "en,fr,jp;q=0.9,it;q=0.8" it'll return ["en","fr","jp"].
