The issue is that, at some point, the Unicode symbol is being converted into a sequence of bytes using an encoding that does not support that character. The conversion then substitutes the replacement character, which happens to be a ? for this particular conversion.
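As a rough illustration of that kind of lossy conversion, here is a minimal Python 3 sketch using Python's own codecs. The snowman character is just an arbitrary example, and this is not a claim about what Tkinter does internally; it only shows how a ? can appear when an encoding lacks the character.

```python
# A character that has no representation in the 8-bit encodings below.
snowman = "\u2603"  # SNOWMAN, picked only as an example

# errors="replace" substitutes the codec's replacement marker
# instead of raising UnicodeEncodeError.
for encoding in ("ascii", "latin-1", "cp1252"):
    encoded = snowman.encode(encoding, errors="replace")
    print(encoding, encoded)

# A Unicode-capable encoding keeps the character intact.
utf8_bytes = snowman.encode("utf-8")
print(utf8_bytes)
```

Once the ? has been substituted, the original character cannot be recovered from the bytes, which is why the conversion has to be fixed at the point where it happens.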
The core of Tk is Unicode-aware and at least the initial stage of scripting will be using UTF-8; the character is (well, almost certainly) getting through from the keyboard and Windows correctly. What happens then is that the character is conveyed to the Python layer; I don't know that part of Tkinter very well, but it is where I suspect the problem is (e.g., if the wrong type of string is being generated). In other words, it smells like it might be a subtle Tkinter bug. (By comparison, Tcl's internal notion of strings is entirely Unicode-aware, which I rely on in my code rather a lot and have done for many years. This definitely has some trade-offs, and I know that Python's choice among those trade-offs was different.)
You can check further by seeing what exact type of string you've got. It should be a Unicode string or you'll be forever having problems with this sort of thing (some platforms and deployments must natively deal with far more than 256 characters).
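One way to do that check, sketched for Python 3 (on Python 2 the corresponding types would be `unicode` and `str`). The helper name `describe` is made up for this example; you would pass it whatever your widget or event hands you, such as the result of an Entry's `get()` or an event's `char`.

```python
def describe(value):
    """Return the type name and a code-point (or byte) dump of a string-like value."""
    if isinstance(value, bytes):
        # Byte string: show each byte in hex.
        units = ["0x%02X" % b for b in bytearray(value)]
    else:
        # Text string: show each character's Unicode code point.
        units = ["U+%04X" % ord(ch) for ch in value]
    return "%s: %s" % (type(value).__name__, " ".join(units))


print(describe("\u2603"))  # a text string holding the real character
print(describe(b"?"))      # a byte string where the character was already lost
```

If the dump shows U+003F (or byte 0x3F) where you typed the symbol, the character was replaced before your code ever saw it.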
Thanks, this is a great start. So any Unicode issue here is really a problem with how Tkinter is using Tcl/Tk; ultimately the UTF-8 string is getting dropped somewhere after it gets to Tkinter?
Hi, I realized that if this answer is accepted, it might discourage others from pointing out other places where a question mark could be inserted. For example, I read somewhere in the Tcl/Tk documentation that if there is a problem finding the font for a given character, then a question mark will be used instead.
@BiagioArobba I'm guessing that the problem is at the point where the string out of Tk is turned into a Python string, and that it is either using the wrong encoding or converting to the wrong type of string at that point. I've only ever very briefly skimmed the implementation of Tkinter (I'm a Tk maintainer, not a Tkinter one) and not this specific bit, so I really don't know what's going on. I just know what the problem smells like.
@BiagioArobba And the replacement where Tk can't render a character is not necessarily ?; that really depends on the platform. (On Windows, in my experience it's typically a hollow box.) The ? is more common as a replacement character when converting from Unicode to an 8-bit character set ( is what you get going the other direction). Which isn't actually a Tk operation; Tk is internally Unicode-aware (but only in the BMP that's a standing bug)
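To make the two directions concrete, a small Python 3 sketch using Python's codecs (again only an illustration, not Tk's own code path; the sample bytes are arbitrary):

```python
# 8-bit data -> Unicode: an undecodable byte becomes U+FFFD.
raw = b"caf\xe9"  # 0xE9 is outside ASCII
text = raw.decode("ascii", errors="replace")
print(repr(text))

# Unicode -> 8-bit: an unencodable character becomes "?".
back = text.encode("ascii", errors="replace")
print(back)
```

So seeing a ? rather than U+FFFD suggests the loss happened while going from Unicode to bytes, not while decoding.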