Character Taxonomy

High Private Use Surrogates (High_PU_Surrogates) — U+DB80 ‐ U+DBFF

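These are high (leading) surrogates: the first half of the UTF-16 surrogate pairs that encode the private use planes 15 and 16 (U+F0000 to U+10FFFF).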

You should not use these code points as characters on their own. UTF-16 cannot represent them as characters, because it reserves these values for surrogate pairs.
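As a quick illustration, the sketch below checks whether Python's standard codecs accept a lone code point from this block. The helper name `can_encode` is made up for this example, and the behaviour shown assumes a current CPython 3 release.

```python
# A lone high private use surrogate, written with an escape sequence.
lone = "\udb80"


def can_encode(text, encoding):
    """Return True if `text` can be encoded with the strict error handler."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# The lone surrogate is rejected by the standard Unicode codecs.
print(can_encode(lone, "utf-16"))
print(can_encode(lone, "utf-8"))

# A real private use character from plane 15 encodes without trouble.
print(can_encode("\U000F0000", "utf-16"))
```

The first two calls report a failure and the last one succeeds, which is the practical meaning of "cannot represent them".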

Note: you may see such values in UTF-16 files, but there they are code units, not code points. Paired with a low surrogate, they are used to represent code points between U+10000 and U+10FFFF.
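The relationship between code units and code points can be sketched with the standard UTF-16 arithmetic. The function name `to_surrogate_pair` is only an illustrative choice, not part of any library.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary code point into its UTF-16 (high, low) code units."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF need a surrogate pair")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


# First code point of plane 15: its high surrogate is the start of this block.
high, low = to_surrogate_pair(0xF0000)
print(f"U+{high:04X} U+{low:04X}")

# Cross-check against Python's own UTF-16 encoder (big-endian, no BOM).
encoded = chr(0xF0000).encode("utf-16-be")
print(encoded.hex())
```

Every code point from U+F0000 to U+10FFFF yields a high code unit inside U+DB80 to U+DBFF, which is why values from this block show up in UTF-16 data.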

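Note: Python strings can hold lone surrogate code points. The surrogateescape error handler relies on this for very specific cases (unknown encoding, mixed encodings, data coming from the system), although it produces low surrogates (U+DC80 to U+DCFF) rather than code points from this block.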
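A minimal sketch of the surrogateescape round trip follows. The byte string is an invented example of data in an unknown encoding. Note that the escaped byte lands in U+DC80 to U+DCFF, the range the handler uses, not in the high private use surrogates.

```python
# Bytes that are not valid UTF-8 (0xFF can never appear in UTF-8).
raw = b"report_\xff.txt"

# Decoding with surrogateescape keeps the bad byte as a lone surrogate.
text = raw.decode("utf-8", errors="surrogateescape")
escaped = text[7]
print(f"U+{ord(escaped):04X}")

# Encoding with the same handler restores the original bytes exactly.
restored = text.encode("utf-8", errors="surrogateescape")
print(restored == raw)
```

This is the design goal of the handler: undecodable bytes survive a decode and encode cycle unchanged, at the cost of a string that holds lone surrogates.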
