Dodgy Dave

Beware Unicode passwords!

Before you rush in and change all your passwords to strings of U+1F4A9 'PILE OF POO', here's a cautionary tale:

I worked on a 'secure email' client for a large US company and discovered, following some work on the UI, that the code which turns 'what you type' into 'what gets hashed' when setting a password was passing on only the first byte of the UTF-8 encoding of each character. So, for instance, an 8-character word in Arabic might have been squashed to 0xD8 0xD8 0xD8 0xD8 0xD8 0xD8 0xD8 0xD8 (many Arabic letters share that lead byte), and would match countless other words.
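To make the squashing concrete, here is a minimal Python sketch of that kind of bug. It is a reconstruction for illustration only, not the client's actual code: the function names are made up, and SHA-256 stands in for whatever hash the product really used.

```python
import hashlib


def buggy_password_bytes(password: str) -> bytes:
    """Hypothetical reconstruction of the bug: keep only the FIRST byte
    of each character's UTF-8 encoding and drop the rest."""
    return bytes(ch.encode("utf-8")[0] for ch in password)


def correct_password_bytes(password: str) -> bytes:
    """What should have been hashed: the full UTF-8 encoding."""
    return password.encode("utf-8")


def digest(data: bytes) -> str:
    # SHA-256 is an assumption here; the original hash is not known.
    return hashlib.sha256(data).hexdigest()


# Two different three-letter Arabic words, written as escapes so the
# code points are unambiguous. Every letter used lies in U+0621..U+063A,
# so each one encodes to two bytes starting with 0xD8.
word_a = "\u0628\u0627\u0628"  # beh, alef, beh
word_b = "\u062f\u0627\u0631"  # dal, alef, reh

# With the bug, both words collapse to the same three bytes.
print(buggy_password_bytes(word_a).hex(" "))
print(buggy_password_bytes(word_b).hex(" "))

# So their "hashes" collide, while the correct encodings do not.
print(digest(buggy_password_bytes(word_a)) == digest(buggy_password_bytes(word_b)))
print(digest(correct_password_bytes(word_a)) == digest(correct_password_bytes(word_b)))
```

Pure-ASCII passwords come through unharmed, because each ASCII character is a single byte in UTF-8. That is why a bug like this can sit unnoticed until somebody types a non-ASCII password.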

We were only saved from disaster when it emerged that there was a separate copy of the code used when verifying your password, and this was broken in a different way. The effect was that any password containing a non-ASCII character could never be verified after you'd set it.

So: Unicode - great. Programmers' general ability to write correct internationalized code - needs improvement.

