7

I want to detect the default encoding of the operating system's filesystem. For example, different language versions of Windows use different encodings (ISO-8859-1, MS950, Big5, GB2312, etc.). How can I detect the operating system's encoding in PHP? Any ideas? Thanks.

3
  • Have you checked the other questions here on SO regarding encoding identification? Look at this one for example: stackoverflow.com/questions/910793/… Or this one: stackoverflow.com/questions/505562/detect-file-encoding-in-php Commented Nov 30, 2011 at 14:04
  • I'm not sure the file system delegates an encoding... mb_list_encodings will return a list of supported encodings. Commented Nov 30, 2011 at 14:07
  • That is not the answer I want; that is a different question from mine. Commented Nov 30, 2011 at 14:09

4 Answers

1

Linux does not have a filesystem encoding: filenames are stored as binary strings and may contain almost anything (any byte except `/` and NUL). Interpreting them in a specific encoding is up to the application. Most often this will simply be UTF-8, but it depends on the 'viewer' of the filenames.

Accessing the filesystem on OS X will give you UTF-8 in normalization form D (decomposed characters).

Unfortunately, I cannot say what it is on Windows. Internally, filenames are stored as a variation of UTF-16, but when I access them through PHP on my machine the API uses CP-1252. This does depend on the language version of Windows.
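
To make the per-OS differences concrete, here is a minimal sketch of bringing a filename into one common form (UTF-8, recomposed when possible). The function name `filenameToUtf8` and its `$sourceEncoding` parameter are hypothetical and not from the answer; the caller has to supply the source encoding, for example `Windows-1252` on the machine described above. The recomposition step only runs if the intl extension is loaded.

```php
<?php
// Hypothetical helper: convert a filename read from disk to UTF-8.
// $sourceEncoding is an assumption supplied by the caller, e.g. 'Windows-1252'.
function filenameToUtf8(string $name, string $sourceEncoding = 'UTF-8'): string
{
    // Only convert when the source is not already UTF-8.
    if (strcasecmp($sourceEncoding, 'UTF-8') !== 0) {
        $name = mb_convert_encoding($name, 'UTF-8', $sourceEncoding);
    }

    // Names in normalization form D are decomposed; recompose them to
    // form C if the intl extension (Normalizer class) is available.
    if (class_exists('Normalizer')) {
        $normalized = Normalizer::normalize($name, Normalizer::FORM_C);
        if ($normalized !== false) {
            $name = $normalized;
        }
    }

    return $name;
}
```

Without intl the function still converts the encoding, it just leaves decomposed names as they are.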


0

Why not use mb_detect_encoding()?


0

Try

    print_r( explode(";", setlocale(LC_ALL, 0)));

You then need to convert the reported code page to an encoding name.
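
A rough sketch of that conversion step, assuming locale strings shaped like `English_United States.1252` (Windows) or `en_US.UTF-8` (Unix-like systems). The helper name `localeToEncoding` is hypothetical, and the `CP<number>` naming is an assumption about what your iconv build accepts.

```php
<?php
// Hypothetical helper: extract an encoding name from one locale string.
// Returns null when the locale carries no code set (e.g. "C").
function localeToEncoding(string $locale): ?string
{
    // Mixed locales look like "LC_CTYPE=English_United States.1252";
    // keep only the part after the last '='.
    $eq = strrpos($locale, '=');
    if ($eq !== false) {
        $locale = substr($locale, $eq + 1);
    }

    // The code set follows the last dot.
    $dot = strrpos($locale, '.');
    if ($dot === false) {
        return null;
    }
    $codeset = substr($locale, $dot + 1);

    // Drop a trailing modifier such as "@euro".
    $at = strpos($codeset, '@');
    if ($at !== false) {
        $codeset = substr($codeset, 0, $at);
    }
    if ($codeset === '') {
        return null;
    }

    // Windows reports a bare code page number; prefix it with "CP".
    return preg_match('/^\d+$/', $codeset) === 1 ? 'CP' . $codeset : $codeset;
}
```

You could feed it each element of `explode(";", setlocale(LC_ALL, '0'))` and pass the result to `iconv()` as the source encoding, after checking that it is not null.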


0

A filesystem doesn't have a single encoding; each filename can use a different one, so all you need is to find the right encoding for processing the filename string.

To detect a filename's encoding, you can just try converting the filename to every encoding in your list of known encodings and compare the original string with the converted one. If they are equal, that encoding is the one you are looking for.

I use iconv to convert a string to a given encoding. See the following code for an example.

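    function getActuallEncoding($text) {
        $encodingList = array('UTF-8', 'gb2312', 'ISO-8859-1', 'big5'); // Add more if you need.
        // mb_detect_encoding() returns false when nothing in the detect order matches.
        $detected = mb_detect_encoding($text, mb_detect_order(), true);
        if ($detected === false) return "UNKNOWN";
        foreach ($encodingList as $oneEncode) {
            // iconv() returns false (and raises a notice) when the conversion fails.
            $oneResult = @iconv($detected, $oneEncode, $text);
            // A byte-for-byte match means the conversion changed nothing.
            if ($oneResult !== false && $oneResult === $text) return $oneEncode;
        }
        return "UNKNOWN"; // Callers must handle this value explicitly.
    }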

Hope that helps.
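
As a variation on the same "try each encoding" idea (not the code above, and using mbstring's `mb_check_encoding()` instead of iconv), the sketch below returns every candidate in which the bytes are valid. The name `candidateEncodings` is hypothetical. It also shows the limit of this kind of check: single-byte encodings such as ISO-8859-1 accept any byte sequence, so a match only means "possible", not "certain".

```php
<?php
// Hypothetical helper: list the candidate encodings in which $text is a
// valid byte sequence. More than one candidate may match.
function candidateEncodings(string $text, array $encodingList): array
{
    $matches = array();
    foreach ($encodingList as $oneEncode) {
        // mb_check_encoding() validates the bytes without converting them.
        if (mb_check_encoding($text, $oneEncode)) {
            $matches[] = $oneEncode;
        }
    }
    return $matches;
}

// "\xE9" is "é" in ISO-8859-1 but an incomplete sequence in UTF-8.
$result = candidateEncodings("\xE9", array('UTF-8', 'ISO-8859-1'));
```

Ordering the list from strictest (UTF-8) to most permissive and taking the first match is one simple way to use the result.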
