SQLAlchemy Result for UTF-8 Column Is of Type 'str', Why?
Solution 1:
If you want the data converted automatically, you should specify the charset when you create the engine:
create_engine('mysql+mysqldb:///mydb?charset=utf8')
Setting use_unicode alone won't tell SQLAlchemy which charset to use.
Solution 2:
To convert from a UTF-8 bytestring to a unicode object, you need to decode:
utf_8_field.decode('utf8')
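As a minimal sketch of that decode step: the value below is a hypothetical stand-in for a bytestring handed back by the driver, not data from the question. The b'' and u'' prefixes are used so the snippet reads the same on Python 2.7 and Python 3.3+.

```python
# Hypothetical value standing in for a bytestring returned by the driver:
# the two bytes that encode U+00E9 (e with acute accent) in UTF-8.
utf_8_field = b'\xc3\xa9'

# Decoding turns the two-byte bytestring into a one-character text string.
text = utf_8_field.decode('utf8')

assert len(utf_8_field) == 2
assert len(text) == 1
assert text == u'\xe9'
```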
Also, when executing a raw SELECT through .execute, SQLAlchemy has no metadata to work out that your query is returning UTF-8 data, so it is not converting this information to unicode for you.
In other words, convert_unicode only works if you use the SQLAlchemy SQL expression API or the ORM functionality.
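If you stay with raw SELECT statements, one option is to decode the rows yourself. The helper below is a rough sketch only: decode_row and the sample row are hypothetical names made up for illustration, not part of SQLAlchemy, and it assumes every bytestring in the row really is UTF-8 text.

```python
def decode_row(row, encoding='utf8'):
    """Return a tuple with every bytestring in `row` decoded to text.

    Hypothetical helper: values that are not bytestrings (ints, None, ...)
    are passed through unchanged.
    """
    return tuple(
        value.decode(encoding) if isinstance(value, bytes) else value
        for value in row
    )


# Hypothetical row, shaped like what a raw SELECT might hand back.
raw_row = (1, b'caf\xc3\xa9', None)
decoded = decode_row(raw_row)
```

Binary columns would need to be skipped explicitly, since this sketch decodes any bytestring it finds.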
EDIT: As pointed out, your data is not even UTF-8 encoded; 0xe9 as the first byte of a UTF-8 sequence would indicate a character between \u9000 and \u9fff, which are CJK unified ideographs, while you said it was a latin-1 character, whose UTF-8 encoding would start with 0xc3. This is probably ISO-8859-1 (latin-1) or similar instead:
>>> u'é'.encode('ISO-8859-1')
'\xe9'
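The byte-level reasoning above can be checked with nothing but the standard codecs. This is only a sketch; the variable names are made up for illustration, and the three-byte sequence is an arbitrary example of a sequence led by 0xe9.

```python
raw = b'\xe9'

# A lone 0xe9 is a lead byte with no continuation bytes,
# so it is not valid UTF-8 on its own.
try:
    raw.decode('utf8')
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# The same byte is a complete character in ISO-8859-1 (latin-1).
as_latin1 = raw.decode('iso-8859-1')

# UTF-8 encodes that character as two bytes, the first being 0xc3.
as_utf8 = as_latin1.encode('utf8')

# A three-byte UTF-8 sequence led by 0xe9 lands in U+9000..U+9FFF.
cjk = b'\xe9\x80\x80'.decode('utf8')

assert valid_utf8 is False
assert as_latin1 == u'\xe9'
assert as_utf8 == b'\xc3\xa9'
assert 0x9000 <= ord(cjk) <= 0x9fff
```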
The conclusion then is to tell SQLAlchemy to connect with a different character set, using the charset=utf8 parameter, as pointed out by @mata.