the corner office : tech blog

a tech blog, by Colin Pretorius

Encoding is never fun

I noticed some weird behaviour with my blog app. One of my entries contained a special character which was converted correctly when I ran the blog builder app from my IDE, but not when I ran it from the command line (or more correctly, Cygwin).

After some googling, I came across this SO post. The upshot is that Java's FileWriter is sucky: it just takes the default encoding from your environment, so you always want an OutputStreamWriter with the encoding explicitly set.
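For illustration, here's a minimal sketch of the explicit-encoding approach. The class and method names are made up for this example, and it writes to an in-memory ByteArrayOutputStream where real code would use a FileOutputStream, so that the resulting bytes are easy to inspect.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the blog app.
class ExplicitEncodingDemo {

    // Encodes text through an OutputStreamWriter with the charset spelled out,
    // so the result does not depend on the environment's default encoding.
    static byte[] encodeUtf8(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] encoded = encodeUtf8("caf\u00e9");
        // 'c', 'a', 'f' take one byte each, the accented e takes two in UTF-8.
        System.out.println(encoded.length); // prints 5
    }
}
```

Swapping the ByteArrayOutputStream for a FileOutputStream should give the same bytes on disk, whatever the shell or IDE thinks the default encoding is.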

Another subtlety is that passing a CharsetEncoder will cause encoding errors to be thrown as exceptions, whereas just specifying "UTF-8" as the argument for the encoding will cause them to be suppressed (the offending characters are silently replaced).
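A rough sketch of that difference, again with made-up class and method names. It assumes an unpaired surrogate as the "bad" input, since that is one of the few things UTF-8 cannot encode.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the blog app.
class StrictEncodingDemo {

    // A lone high surrogate is not valid text, so it cannot be encoded as UTF-8.
    static final String BAD = "a\uD800b";

    // Charset given by name: the writer substitutes a replacement for bad input.
    static byte[] lenient(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(bytes, "UTF-8")) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Charset given as a fresh CharsetEncoder: bad input is reported.
    // Returns true if writing the text raised a CharacterCodingException.
    static boolean strictRejects(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(bytes, StandardCharsets.UTF_8.newEncoder())) {
            w.write(text);
        } catch (CharacterCodingException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(new String(lenient(BAD), StandardCharsets.UTF_8));
        System.out.println(strictRejects(BAD));
    }
}
```

On the JDKs I'd expect this to run on, the lenient version quietly writes a '?' in place of the bad character, while the strict version reports the problem. Which one you want probably depends on whether you'd rather have a failed build or a mangled page.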

You can also specify -Dfile.encoding=UTF-8 on the command line, although who wants to do that?
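If you're curious what default a given environment actually hands to the JVM, a tiny check like the following should show it. The class name is invented for this sketch; run it once from the IDE and once from the shell and compare.

```java
import java.nio.charset.Charset;

// Hypothetical demo class, not part of the blog app.
class DefaultCharsetDemo {

    // The charset that FileWriter and friends fall back to when none is given.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println("default charset: " + defaultCharsetName());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
    }
}
```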

{2016.08.20 09:53}
