How to input unicode character and get its numeric value

Harshil.Chokshi posted in Dev

Greg Tzikas

I am trying to take a file and remove all characters that are not Greek. We found the Unicode values for the alphabet, 880 - 1023, and were able to print the correct characters with a simple print(unichr(880)) line. The problem comes when running this code:

greek ='ÏÎ' 
for c in greek:
    if(unichr(c) >= 880 and unichr(c) <= 1023):
        print(c)

Is there a way to enter any letter or symbol and get back its Unicode value? We have tested with values inside and outside the Greek range and still get the same error: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

tdelaney

You have several problems. Assuming this is Python 2 (there is no unichr in Python 3, so you'd get a different error), your first problem is that you didn't create a unicode string in the first place.

>>> greek ='ÏÎ' 
>>> len(greek)
4

These aren't 2 unicode characters. They are 4 single-byte characters that happen to be the UTF-8 encoding of those 2 unicode characters. Instead, do

greek =u'ÏÎ'
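To make the bytes-versus-characters point concrete, here is a minimal sketch in Python 3 (an assumption on my part, since the question appears to use Python 2). In Python 3 every str is already unicode, so the split is between str and bytes. The variable names are only for illustration.

```python
# Python 3 sketch: str holds code points, bytes holds the encoded form.
text = "\u00cf\u00ce"            # the two characters from the question
encoded = text.encode("utf-8")   # what a Python 2 byte string would contain

print(len(text))        # number of characters
print(len(encoded))     # number of UTF-8 bytes
print(hex(encoded[0]))  # first byte, the one named in the error message

# Decoding the bytes gives back the original two characters.
decoded = encoded.decode("utf-8")
```

The first byte is 0xc3, which matches the byte the UnicodeDecodeError complains about.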

Next, these are not the droids, I mean Greek characters, you think they are.

>>> ord(greek[0])
207

These are code page characters in the 128-255 range, which is outside the range you are looking for. Did you want these instead?

>>> greek = u'Ϊΐ'
>>> ord(greek[0])
938

Finally, unichr goes the wrong way: it converts ordinals to characters, but you want to go the other way, which is what ord does. So,

>>> for c in greek:
...     if ord(c) >= 880 and ord(c) <= 1023:
...         print(c)
... 
Ϊ
ΐ
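Since the original goal was to strip everything that isn't Greek, the loop above can be turned into a small filter. This is only a sketch, written for Python 3 (where chr replaces unichr), and the names is_greek, keep_greek, GREEK_FIRST and GREEK_LAST are made up for this example. It uses the 880 - 1023 range from the question, which is the Greek and Coptic block U+0370 to U+03FF.

```python
# Python 3 sketch: keep only characters whose code point is in the
# range from the question (880..1023, i.e. U+0370..U+03FF).
GREEK_FIRST = 880
GREEK_LAST = 1023


def is_greek(ch):
    """Return True if the single character ch falls in the range."""
    return GREEK_FIRST <= ord(ch) <= GREEK_LAST


def keep_greek(text):
    """Return text with every character outside the range removed."""
    return "".join(ch for ch in text if is_greek(ch))


# Mixed input: ASCII, two Greek characters, and the two Latin-1
# characters from the question.
sample = "abc \u03aa\u0390 \u00cf\u00ce 123"
print(keep_greek(sample))
```

For a file, you would presumably read it as text first, for example with open(path, encoding="utf-8") where path is your own file name, and pass the contents to keep_greek. Note that this also drops spaces and newlines, so add those to the test if you want to keep them.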


Edited on 2021-02-26
