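1) The from_bytes() overload that takes a single const char* expects a null-terminated byte string, but your very second byte is '\0', so the conversion stops after the first byte. Use the overload that takes a pair of pointers instead: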
using namespace std;
// s is the u16string from the question
wstring_convert<codecvt_utf16<wchar_t, 0x10ffff, little_endian>,
                wchar_t> conv;
wstring ws = conv.from_bytes(
    reinterpret_cast<const char*>(&s[0]),
    reinterpret_cast<const char*>(&s[0] + s.size()));
wcout << ws << endl;
Tested with Visual Studio 2010 SP1 on Windows and clang++/libc++-svn on Linux.
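Point 1 can be checked without any conversion facet. The sketch below assumes a little-endian host (as the snippet above does); the helper names `bytes_seen_as_c_string` and `bytes_in_range` are made up for illustration.

```cpp
#include <cassert>   // for the usage checks mentioned below
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: how many bytes a reader that expects a
// null-terminated byte string sees in the raw storage of a u16string.
std::size_t bytes_seen_as_c_string(const std::u16string& s)
{
    // c_str() guarantees a trailing char16_t(), so strlen always stops.
    return std::strlen(reinterpret_cast<const char*>(s.c_str()));
}

// Hypothetical helper: how many bytes the two-pointer overload is given.
std::size_t bytes_in_range(const std::u16string& s)
{
    return s.size() * sizeof(char16_t);
}
```

On a little-endian machine, `bytes_seen_as_c_string(u"hello")` is 1, because 'h' is stored as 0x68 0x00, while `bytes_in_range(u"hello")` is 10.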
PS: this should be using char32_t to guarantee UCS4, of course. The wchar_t version produces UTF-16 where wchar_t is 16 bits wide.
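A minimal sketch of the char32_t variant mentioned in the PS. It assumes a little-endian host and a standard library that still ships `<codecvt>` (deprecated since C++17); the function name `utf16_to_ucs4` is hypothetical.

```cpp
#include <cassert>   // for the usage check mentioned below
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: treats the in-memory bytes of a u16string as
// UTF-16LE and decodes them to UCS4 (one char32_t per code point).
std::u32string utf16_to_ucs4(const std::u16string& s)
{
    std::wstring_convert<
        std::codecvt_utf16<char32_t, 0x10ffff, std::little_endian>,
        char32_t> conv;
    const char* first = reinterpret_cast<const char*>(s.data());
    const char* last  = reinterpret_cast<const char*>(s.data() + s.size());
    // Two-pointer overload: embedded zero bytes do not end the input.
    return conv.from_bytes(first, last);
}
```

For example, `assert(utf16_to_ucs4(u"hello") == U"hello");` should hold on a little-endian host, regardless of how wide wchar_t is on the platform.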
This is an awesome answer, and I salute you for knowing all this! I'd upvote it three more times if I could. Two follow-up questions, if I may: 1. What is the Maxcode parameter that you set to 0x10ffff? I notice that it is actually needed. 2. Good point about '\0' being the terminator of a const char*. That makes me wonder: what is the corresponding terminator for a char16_t*? Thanks again.
@ryaner Maxcode is just the limit on the acceptable character values. It's only needed here because the endianness/BOM handling indicator happens to be the third template parameter, which I think is a small design flaw. The terminating character for a null-terminated array of char16_t is char16_t(), aka u'\0'.
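Both points in this comment can be sketched in a few lines. The helpers `u16_length` and `rejected_by_maxcode` are hypothetical names, the second one assumes a little-endian host, and `<codecvt>` is deprecated since C++17, so treat this as an illustration only.

```cpp
#include <cassert>   // for the usage checks mentioned below
#include <codecvt>
#include <cstddef>
#include <locale>
#include <stdexcept>
#include <string>

// Length of a null-terminated char16_t array: counts code units up to
// (not including) the first char16_t(), i.e. u'\0'.
std::size_t u16_length(const char16_t* p)
{
    return std::char_traits<char16_t>::length(p);
}

// Hypothetical helper: true if decoding the UTF-16LE bytes of s fails
// because a character exceeds the Maxcode template argument.
template <unsigned long Maxcode>
bool rejected_by_maxcode(const std::u16string& s)
{
    std::wstring_convert<
        std::codecvt_utf16<char32_t, Maxcode, std::little_endian>,
        char32_t> conv;
    const char* first = reinterpret_cast<const char*>(s.data());
    try {
        conv.from_bytes(first, first + s.size() * sizeof(char16_t));
        return false;
    } catch (const std::range_error&) {
        // wstring_convert reports a failed conversion this way when no
        // error string was supplied to its constructor.
        return true;
    }
}
```

With these, `u16_length(u"a\0b")` is 1, and `rejected_by_maxcode<0x7f>(u"\u00e9")` is true while `rejected_by_maxcode<0x10ffff>(u"\u00e9")` is false.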