Problem Opening A Text Document - Unicode Error
Solution 1:
(unicode error) 'unicodeescape' codec can't decode bytes in position 2-4: truncated \UXXXXXXXX escape
This probably means that the file you are trying to read is not in the encoding that open() expects. Apparently open() expects some Unicode encoding (most likely UTF-8 or UTF-16), but your file is not encoded like that.
You should not normally use plain open() for reading text files, as it is impossible to correctly read a text file (unless it's pure ASCII) without specifying an encoding.
Use codecs instead:
import codecs
fileObj = codecs.open( "someFile", "r", "utf-8" )
u = fileObj.read() # Returns a Unicode string from the UTF-8 bytes in the file
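In Python 3 the built-in open() also accepts an encoding argument, so the same idea works without codecs. A minimal sketch, assuming a hypothetical file name (someFile.txt) that the snippet creates itself so it can run standalone:

```python
import os
import tempfile

# Hypothetical file, created in a temporary directory only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "someFile.txt")

# Write some non-ASCII text as UTF-8.
with open(path, "w", encoding="utf-8") as f:
    f.write("café\n")

# Read it back, naming the encoding explicitly instead of
# relying on the platform default.
with open(path, "r", encoding="utf-8") as f:
    text = f.read()

print(repr(text))
```

Naming the encoding on both the write and the read keeps the result independent of the platform's default encoding.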
Solution 2:
Change that to
# for Python 2.5+
import sys
try:
    d = open("p0901aus.txt", "w")
except Exception, ex:
    print "Unsuccessful."
    print ex
    sys.exit(0)
# for Python 3
import sys
import codecs
try:
    d = codecs.open("p0901aus.txt", "w", "utf-8")
except Exception as ex:
    print("Unsuccessful.")
    print(ex)
    sys.exit(0)
The mode argument is case-sensitive: use a lowercase "w", not "W". I do not want to hit you with all the Python syntax at once, but it will be useful for you to know how to display which exception was raised, and this is one way to do it.
Also, you are opening the file for writing, not reading. Is that what you wanted?
If there is already a document named p0901aus.txt, and you want to read it, do this:
# for Python 2.5+
import sys
try:
    d = open("p0901aus.txt", "r")
    print "Awesome, I opened p0901aus.txt. Here is what I found there:"
    for l in d:
        print l
except Exception, ex:
    print "Unsuccessful."
    print ex
    sys.exit(0)
# for Python 3+
import sys
import codecs
try:
    d = codecs.open("p0901aus.txt", "r", "utf-8")
    print("Awesome, I opened p0901aus.txt. Here is what I found there:")
    for l in d:
        print(l)
except Exception as ex:
    print("Unsuccessful.")
    print(ex)
    sys.exit(0)
You can of course use codecs in Python 2.5 as well, and your code will be higher quality ("correct") if you do. Python 3 appears to treat the Byte Order Mark as something between a curiosity and line noise, which is a bummer.
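On the Byte Order Mark: when a UTF-8 file begins with one, decoding with "utf-8" leaves it in the text as a leading U+FEFF character, while the "utf-8-sig" codec drops it. A small sketch of the difference, using a hypothetical file (bom_example.txt) created only for the demonstration:

```python
import codecs
import os
import tempfile

# Hypothetical file, created in a temporary directory only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "bom_example.txt")

# Write a UTF-8 file that starts with a Byte Order Mark.
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF8 + "hello".encode("utf-8"))

# Plain "utf-8" keeps the BOM as a leading U+FEFF character.
with open(path, "r", encoding="utf-8") as f:
    with_bom = f.read()

# "utf-8-sig" strips the BOM if one is present.
with open(path, "r", encoding="utf-8-sig") as f:
    without_bom = f.read()

print(repr(with_bom))
print(repr(without_bom))
```

If the files may or may not carry a BOM, reading with "utf-8-sig" handles both cases, since it only removes the mark when it is actually there.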
Solution 3:
import csv
data = csv.reader(open('c:\x\list.csv'))
for row in data:
    print(row)
print('ready')
Brings up "(unicode error)'unicodeescape' codec can't decode bytes in position 2-4: truncated \xXX escape"
Try c:\\x\\list.csv
instead of c:\x\list.csv
This is Python 3 code.
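The error comes from the string literal itself: in an ordinary Python 3 string, \x starts a hexadecimal escape and \U starts a Unicode escape, so a Windows path containing them fails before the file is ever opened. A short sketch of three spellings that avoid this; the path is only the example from above and is never opened here:

```python
# Three ways to write the same Windows path without triggering escape processing.
escaped = "c:\\x\\list.csv"   # double each backslash
raw = r"c:\x\list.csv"        # raw string: backslashes are kept literally
forward = "c:/x/list.csv"     # forward slashes, which Windows also accepts in paths

print(escaped)
print(raw)
print(escaped == raw)
```

The raw-string form is usually the least error-prone for pasted paths, with the caveat that a raw string cannot end in a single backslash.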