I first tried typing in a Unicode character, encoding it in UTF-8, and decoding it back. Python happily gives back the original character. I took a look at the encoded string; it is b'\xe6\x88\x91'. I don't understand what this is. It looks like 3 hex numbers.
Then I did some research and found that the CJK set starts at 4E00, so now I want Python to show me what that character looks like. How do I do that? Do I need to convert 4E00 into something like the form above?
The text b'\xe6\x88\x91' is the representation of the bytes that make up the UTF-8 encoding of the Unicode code point \u6211, which is the character 我. Each \x escape is one byte written in hex, so what you see is the three bytes of that encoding. There is no need to convert anything, other than decoding the bytes to a Unicode string with .decode('utf-8').
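As a rough illustration of both directions, here is a minimal sketch assuming Python 3 (the b'...' output in the question suggests that). The variable names are made up for the example. It decodes the bytes from the question, then builds the character at 4E00 directly from its code point with the built-in chr(), which is one way to see it without hand-building any bytes.

```python
# The bytes from the question are the UTF-8 encoding of U+6211.
encoded = b'\xe6\x88\x91'
decoded = encoded.decode('utf-8')
print(decoded)                      # 我
print(hex(ord(decoded)))            # 0x6211

# The other direction: start from a code point, not from bytes.
# chr(0x4E00) is equivalent to writing the literal '\u4e00'.
first_cjk = chr(0x4E00)
print(first_cjk)                    # 一
print(first_cjk.encode('utf-8'))    # b'\xe4\xb8\x80'
```

Whether the characters display correctly depends on the terminal's encoding and font; the string values themselves are the same either way.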