tumbledry

Twenty Years of Character Encoding Mismatches

So, I anticipated this problem when we gave Ess her full name: Esmé… but I didn’t quite realize the extent of the problems she’ll have with computers accurately displaying her name:

… my girlfriend’s surname contains an ‘é’. I have yet to see a year go by without receiving mail having ‘Ã©’ on the address label where the é should be.

We’re Dutch, and the é is part of our language, and even part of the legacy character encoding standard everyone used before Unicode’s widespread adoption. This is just a matter of code that works perfectly as long as all characters are part of the ASCII set, but fails on the characters that don’t conveniently match between UTF-8 (é) and ISO-8859-15 (Ã©).

I doubt these issues will go away within even, say, twenty years.

Sorry about this, Ess. You’re going to receive a lot of mail addressed to ‘EsmÃ© Micek’. And you’re going to learn character encoding the memorable way! I’ll be happy to teach you.
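For the first lesson, here’s a minimal Python sketch of the mismatch described above. It assumes the simplest failure: text is written out as UTF-8 and then read back as ISO-8859-15. The function names garble and repair are my own, made up for illustration, and real mail-merge software may well go wrong in other ways.

```python
def garble(text: str) -> str:
    """Encode as UTF-8, then misread those bytes as ISO-8859-15."""
    return text.encode("utf-8").decode("iso-8859-15")


def repair(garbled: str) -> str:
    """Undo the misreading: recover the original bytes, then decode as UTF-8."""
    return garbled.encode("iso-8859-15").decode("utf-8")


name = "Esmé Micek"
utf8_bytes = name.encode("utf-8")  # b'Esm\xc3\xa9 Micek': é becomes two bytes
garbled = garble(name)             # 'EsmÃ© Micek': each byte read as its own character
restored = repair(garbled)         # 'Esmé Micek'
```

Plain ASCII letters survive because both encodings give them the same single byte. Only é, which UTF-8 stores as the two bytes C3 A9, comes out the other side as two separate characters.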
