I am using Python 3.4 on Windows 7. My program generates some numbers (range 0-255), converts them to characters with chr() and builds a string. Now I want to save the contents of this string to a text file, but I get the following error:
UnicodeEncodeError: 'charmap' codec can't encode character '\x8e' in position 6: character maps to <undefined>
Please note that the length of the string is variable and any and all codes (0-255) can occur.
Sample code:
file = open('somefiliename.txt', 'w')
file.write(result) #result being the string variable containing ascii chars.
file.close()
I can print the result string and there is no error using print(result). But it is not saving to a file.
result = ''
for y in range(4):
    for x in range(4):
        result += chr(matrix[x, y])
print(result)
The code is pretty long, so I have included only the pertinent part above. matrix is a two-dimensional (4x4) numpy array that stores the numbers.
I can reproduce this on Windows 7 with simple code like this:
>>> s = ''
>>> for i in range(256):
...     s += chr(i)
...
>>>
>>> f = open('a.txt','w')
>>> f.write(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 129-160: character maps to <undefined>
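The failing range corresponds to the characters '\x80' through '\x9f'. The reported positions are shifted by one because the '\n' at index 10 is expanded to '\r\n' before the text is encoded.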
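To see exactly which code points are rejected, a short check like the one below can help. It is only a sketch: it assumes the default file encoding is cp1252, as the traceback indicates, and the variable name unencodable is made up for illustration.

```python
# Collect every code point in 0-255 that the cp1252 codec cannot encode.
# Assumption: cp1252 is the default file encoding, as the traceback suggests.
unencodable = []
for i in range(256):
    try:
        chr(i).encode('cp1252')
    except UnicodeEncodeError:
        unencodable.append(i)

print([hex(i) for i in unencodable])
```

On a standard CPython build this lists the 32 code points 0x80 to 0x9f; everything in 0x00-0x7f and 0xa0-0xff encodes without error.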
The issue occurs because you are opening the file with the default encoding, which on this system is cp1252 (as the traceback shows), and that codec cannot encode these characters. If you really want to write those characters to the file, open it with the utf8 encoding, and also set the newline argument to '' (the reason is explained below). Example:
>>> f = open('a.txt','w',encoding="utf8",newline='')
>>> f.write(s)
257
>>> f.close()
Those using Python 2.x can use codecs.open() to open the file with a specific encoding.
Also, in Python 3.x you would run into issues when reading this file back: the character with ASCII value 13 (carriage return, '\r') would come back as '\n'. This is because in Python 3.x, if we do not specify the newline argument of the open() function (which means it is None), universal newlines mode is used, which converts all of '\r\n', '\r' and '\n' to '\n'. From the documentation:
newline controls how universal newlines works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'. It works as follows:
On input, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If it is '', universal newline mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.
On output, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
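The input-side rules quoted above can be checked with a small sketch. The file name newline_demo.txt and the sample bytes are made up for illustration; the file is written in binary mode so that all three line-ending styles reach the disk untouched.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'newline_demo.txt')

# Write raw bytes so no newline translation happens on the way in.
with open(path, 'wb') as f:
    f.write(b'a\rb\r\nc\n')

# newline=None (the default): every line ending is translated to '\n'.
with open(path, 'r', encoding='utf8') as f:
    translated = f.read()

# newline='': line endings are returned exactly as stored.
with open(path, 'r', encoding='utf8', newline='') as f:
    untranslated = f.read()

print(repr(translated))    # 'a\nb\nc\n'
print(repr(untranslated))  # 'a\rb\r\nc\n'
```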
In your case, you should specify the newline='' argument both when writing and when reading the file.
Example of reading -
>>> f = open('a.txt','r',newline='',encoding='utf8')
>>> x = f.read()
>>> print(x)
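As a rough end-to-end check, the following sketch writes all 256 characters and reads them back, with and without newline='' on the read side. The file name roundtrip.txt is hypothetical, and a temporary directory is used so nothing else is overwritten.

```python
import os
import tempfile

# All 256 characters, built the same way as in the reproduction.
s = ''.join(chr(i) for i in range(256))

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'roundtrip.txt')

# Write with an explicit encoding and no newline translation.
with open(path, 'w', encoding='utf8', newline='') as f:
    f.write(s)

# Read back with newline='' so '\r' and '\n' are returned untouched.
with open(path, 'r', encoding='utf8', newline='') as f:
    x = f.read()

# Read back with the default newline=None for comparison.
with open(path, 'r', encoding='utf8') as f:
    y = f.read()

print(x == s)       # True: the round trip is lossless
print(y == s)       # False: '\r' was translated
print(repr(y[13]))  # '\n' where '\r' used to be
```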