Your server encoding seems to be UTF8. I suspect your client_encoding does not match, which might give you a wrong impression of what you are dealing with. Check with:

SHOW client_encoding;   -- in your actual session

The rest of the tool chain has to be in sync, too. When using PuTTY, for instance, make sure the terminal agrees with the rest: Change settings... -> Window -> Translation -> Remote character set = UTF-8.

As for your first question, you already have the best solution: a handful of umlauts are best replaced with a chain of nested replace() calls.

As you also seem to know already, single-character replacements are more efficient with a single translate() call.
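
To make the two options concrete, here is a minimal Python sketch that generates both kinds of expression from a mapping. The helper names (sql_literal, build_replace_chain, build_translate), the column name street and the sample mappings are assumptions for illustration, not part of the original question. The sketch only builds SQL text and trusts the column name as a plain identifier.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal by doubling single quotes.

    Assumes standard_conforming_strings is on, so backslashes are literal.
    """
    return "'" + value.replace("'", "''") + "'"


def build_replace_chain(column: str, mapping: dict) -> str:
    """Nest one replace() call per mapping entry, innermost first.

    Suitable when a character maps to a multi-character string.
    """
    expr = column
    for src, dst in mapping.items():
        expr = f"replace({expr}, {sql_literal(src)}, {sql_literal(dst)})"
    return expr


def build_translate(column: str, mapping: dict) -> str:
    """Build a single translate() call for one-to-one character mappings."""
    for src, dst in mapping.items():
        if len(src) != 1 or len(dst) != 1:
            raise ValueError("translate() maps single characters only")
    src_chars = "".join(mapping.keys())
    dst_chars = "".join(mapping.values())
    return f"translate({column}, {sql_literal(src_chars)}, {sql_literal(dst_chars)})"


# Hypothetical sample mappings.
umlauts = {"ä": "ae", "ö": "oe", "ü": "ue"}
accents = {"é": "e", "è": "e", "à": "a"}

replace_expr = build_replace_chain("street", umlauts)
translate_expr = build_translate("street", accents)
```

Here replace_expr becomes replace(replace(replace(street, 'ä', 'ae'), 'ö', 'oe'), 'ü', 'ue') and translate_expr becomes translate(street, 'éèà', 'eea'). Passing the umlaut mapping to build_translate raises ValueError, which mirrors why multi-character replacements need replace().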

I'm not so sure about the client/server encoding mismatch. In my experience, those kinds of mapping failures usually give "character has no equivalent" errors. "Invalid byte sequence" sounds more like the client_encoding is set to UTF8, but the client program is still sending ANSI data.

@NickBarnes: Well, the client sends data encoded according to the client_encoding. client_encoding goes both ways. So we are probably talking about the same thing here.

Not quite. The server interprets the client's data according to the client_encoding, but the client can send whatever it wants. For example, if I fire up psql in Windows, it defaults to WIN1252. If I run SET client_encoding TO 'UTF8' and SELECT '', I get an "invalid byte sequence" error. psql has no idea that anything has changed; it's still sending its data as ANSI.

@NickBarnes: Good point. My description in the previous comment was not quite correct. When using PuTTY, for instance, make sure the terminal agrees with the rest: Change settings... -> Window -> Translation -> Remote character set: UTF-8

@Erwin: That's the point. I am using SuperPutty and had to change the character set to UTF-8. I thought that would solve the problem, but when I select an address with an umlaut in the street name, the umlaut still isn't displayed correctly. Instead I am getting instead of . However, my problem is solved. Thank y'all!
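
The two failure modes discussed in these comments can be reproduced outside the database. The following is a minimal Python sketch, assuming a client that really sends Windows-1252 bytes and a sample value "Müller" (both hypothetical). It is an illustration of the byte-level mismatch, not a description of what psql or PuTTY do internally.

```python
def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes form a valid UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "M\u00fcller"  # "Müller", hypothetical sample value

# Case 1: the client sends Windows-1252 bytes, but client_encoding says UTF8.
# The single byte 0xFC is not valid UTF-8, hence "invalid byte sequence".
win1252_bytes = text.encode("cp1252")          # b'M\xfcller'
server_accepts = decodes_as_utf8(win1252_bytes)  # False

# Case 2: the server sends UTF-8, but the terminal renders it as Windows-1252.
# No error is raised; the umlaut shows up as two wrong characters instead.
utf8_bytes = text.encode("utf-8")              # b'M\xc3\xbcller'
displayed = utf8_bytes.decode("cp1252")        # 'M' + U+00C3 + U+00BC + 'ller'

print(win1252_bytes, server_accepts)
print(utf8_bytes, len(text), len(displayed))
```

Case 1 corresponds to the "invalid byte sequence" error; case 2 corresponds to an umlaut that is stored correctly but displayed as two other characters when the terminal's character set does not match.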

postgresql - Replace characters with multi-character strings - Stack O...

postgresql encoding replace diacritics

Use a string constant with Unicode escapes, written with the U& prefix:

UPDATE mytable
SET    myfield = regexp_replace(myfield, U&'\0050', U&'\0060', 'g');

You can also use the PostgreSQL-specific escape-string form E'\u0050'. It works on older versions than the Unicode escape form does, but the Unicode escape form is preferred on newer versions. This should show what's going on (with standard_conforming_strings on, the default since PostgreSQL 9.1):

regress=> SELECT '\u0050', E'\u0050', U&'\0050';
 ?column? | ?column? | ?column? 
----------+----------+----------
 \u0050   | P        | P
(1 row)

Replace unicode characters in PostgreSQL - Stack Overflow

postgresql unicode replace sql-update

It should work with the "characters corresponding to that code" unless some client or other layer in the food chain mangles your code!

Also, use translate() or replace() for this simple job; both are much faster than regexp_replace(). translate() is also good for several single-character replacements at once. And avoid empty updates with a WHERE clause: that is faster still, and avoids table bloat and additional VACUUM cost.

UPDATE mytable
SET    myfield  = translate(myfield, 'P', '`')  -- actual characters
WHERE  myfield <> translate(myfield, 'P', '`');

If you keep running into problems, use the Unicode escape syntax @mvp provided:

UPDATE mytable
SET   myfield =  translate(myfield, U&'\0050', U&'\0060')
WHERE myfield <> translate(myfield, U&'\0050', U&'\0060');
