UTF-8 / I18N Config for PHP5

2014-02-10 | #helper, #php

Getting UTF-8 up and running in PHP5 has become essential, but it has not really become more comfortable in the recent 5.x releases. Generally speaking, it's a bunch of headers and ini settings that you can define in more or less detail, the basic assumption being that all your source files are UTF-8 encoded (preferably without a byte order mark) to begin with.
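The "without byte order mark" part matters because a BOM at the top of a PHP file is sent as output before any header() call and typically triggers a "headers already sent" warning. A minimal sketch for checking a file, assuming a hypothetical helper named has_utf8_bom() (it is not part of PHP):

```php
<?php
// Hypothetical helper, not a PHP built-in: reports whether a file
// starts with the three-byte UTF-8 byte order mark (EF BB BF).
function has_utf8_bom($path)
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false; // unreadable file: treated as "no BOM" in this sketch
    }
    $firstBytes = fread($handle, 3);
    fclose($handle);

    return $firstBytes === "\xEF\xBB\xBF";
}
```

Calling has_utf8_bom(__FILE__) from a script is a quick way to check that script itself.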

The most basic setup would be:

header('Content-type: text/html; charset=UTF-8');

ini_set('default_charset', 'UTF-8');

ini_set('mbstring.internal_encoding', 'UTF-8');

ini_set('mbstring.http_output', 'UTF-8');

This tells the browser to expect UTF-8 and makes sure the multibyte (mbstring) functions treat strings as UTF-8 instead of mangling them.
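To see what the internal encoding changes, compare the byte-based string functions with their mbstring counterparts. The following is a small sketch, assuming the file is saved as UTF-8 and the mbstring extension is loaded. It uses mb_internal_encoding() rather than the ini directive, since mbstring.internal_encoding is deprecated from PHP 5.6 on:

```php
<?php
// Assumes this source file itself is saved as UTF-8.
ini_set('default_charset', 'UTF-8');
mb_internal_encoding('UTF-8');

$word = 'Grüße'; // 5 characters, 7 bytes in UTF-8 ("ü" and "ß" take 2 bytes each)

$bytes = strlen($word);     // byte count
$chars = mb_strlen($word);  // character count

$broken = substr($word, 0, 3);    // cuts "ü" in half, leaving invalid UTF-8
$clean  = mb_substr($word, 0, 3); // "Grü"

echo $bytes, "\n"; // 7
echo $chars, "\n"; // 5
echo $clean, "\n"; // Grü
```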

If you want the full package you'll have to set something more like this:

header('Content-type: text/html; charset=UTF-8');

ini_set('default_charset', 'UTF-8');

ini_set('mbstring.language', 'neutral');

ini_set('mbstring.internal_encoding', 'UTF-8');

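// Note: mbstring.encoding_translation is a per-directory setting (PHP_INI_PERDIR), so this call has no effect at runtime; set it in php.ini or .htaccess instead.
ini_set('mbstring.encoding_translation', 'on');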

ini_set('mbstring.http_input', 'auto');

ini_set('mbstring.http_output', 'UTF-8');

ini_set('mbstring.detect_order', 'auto');

ini_set('mbstring.substitute_character', 'none');

This mainly configures multibyte handling in a little more depth.
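Not every directive can be changed at runtime, and ini_set() signals that only by returning false. Here is a sketch that applies a list of settings and reports the ones that did not take; the function name apply_ini_settings() is made up for this example:

```php
<?php
// Hypothetical helper: applies each setting with ini_set() and returns
// the names of those that could not be changed at runtime.
function apply_ini_settings(array $settings)
{
    $failed = array();
    foreach ($settings as $name => $value) {
        // ini_set() returns the old value on success and false on failure.
        if (ini_set($name, $value) === false) {
            $failed[] = $name;
        }
    }
    return $failed;
}

$failed = apply_ini_settings(array(
    'default_charset'               => 'UTF-8',
    'mbstring.language'             => 'neutral',
    'mbstring.encoding_translation' => 'on',
    'mbstring.substitute_character' => 'none',
));

if ($failed) {
    echo "Not changeable at runtime: ", implode(', ', $failed), "\n";
}
```

Anything listed in $failed has to go into php.ini, the virtual host configuration or .htaccess instead.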

Additionally, you may want to look at setlocale() to make use of the locale data installed on the system, which helps with sorting strings according to language-specific rules as well as formatting currencies, numbers and so forth.
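A rough sketch of how that could look for German. The locale names are assumptions, since they differ between systems (on Linux, `locale -a` lists what is installed), which is why several candidates are passed and the return value is checked:

```php
<?php
// Locale names are system-specific; these candidates are assumptions.
// setlocale() tries them in order and returns false if none is available.
$locale = setlocale(LC_ALL, 'de_DE.UTF-8', 'de_DE.utf8', 'de_DE');

if ($locale === false) {
    echo "No German locale installed, keeping the default\n";
} else {
    // strcoll() compares strings using the collation rules of the current locale.
    $words = array('Zebra', 'Äpfel', 'Apfel');
    usort($words, 'strcoll');
    print_r($words);

    // localeconv() exposes number and currency formatting details.
    $info = localeconv();
    echo "Decimal point: ", $info['decimal_point'], "\n";
    echo "Currency symbol: ", $info['currency_symbol'], "\n";
}
```

Setting only the categories you need (for example LC_COLLATE and LC_MONETARY instead of LC_ALL) keeps number-to-string conversion elsewhere in the script unaffected.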