Mon, 17 Jul 2006
IS in UTF-8
Since Friday, our Information System has been running with UTF-8 support even at the application layer. The work that has taken up most of my working time is finally almost finished. Now we are fixing the parts of the system that do not run directly under Apache (cron jobs, etc.) and the minor glitches that survived our earlier testing.
We do not allow arbitrary characters everywhere, because some attributes must be kept in a form suitable for printing through TeX or for export to external systems, most of which are based on ISO 8859-2 or Windows-1250. We do allow almost all Latin-1 and Latin-2 characters in most applications, though.
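To illustrate the kind of check this implies, here is a minimal sketch in Python (not the Perl code the IS actually uses) that reports which characters of a string cannot be represented in the legacy encodings. The names `LEGACY_ENCODINGS`, `unencodable_chars` and `is_exportable` are hypothetical, and the rule "must be encodable in both ISO 8859-2 and Windows-1250" is an assumption made for the example; the real per-attribute rules are not described here.

```python
# Assumed target encodings, taken from the external systems named above.
LEGACY_ENCODINGS = ("iso8859_2", "cp1250")


def unencodable_chars(text, encodings=LEGACY_ENCODINGS):
    """Return the distinct characters of text (in order of first appearance)
    that at least one of the given encodings cannot represent."""
    bad = []
    for ch in text:
        for enc in encodings:
            try:
                ch.encode(enc)
            except UnicodeEncodeError:
                if ch not in bad:
                    bad.append(ch)
                break  # one failing encoding is enough to reject ch
    return bad


def is_exportable(text, encodings=LEGACY_ENCODINGS):
    """True if every character of text fits into all of the encodings."""
    return not unencodable_chars(text, encodings)


if __name__ == "__main__":
    for sample in ("Dvořák", "5 €", "Привет"):
        print(sample, is_exportable(sample), unencodable_chars(sample))
```

The euro sign shows why both encodings are checked: Windows-1250 has it, ISO 8859-2 does not, so a value containing it would survive one export and break the other.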
While it has been hard to convert the whole system to UTF-8, I must say that the UTF-8 support in Perl is well architected (and from what I have read, definitely better than in other scripting languages).