Python 2.7.2: plistlib with iTunes XML
I'm reading an iTunes-generated XML playlist with plistlib. The XML has a UTF-8 header. When I read the XML with plistlib, I get both unicode (e.g., 'Name': u'Don\u2019t You Remember
Solution 1:
Wow, this is really weird behaviour. I would even say that this non-uniform behaviour is a bug in the 2.x implementation of plistlib. The plistlib in Python 3 always returns unicode strings, which is much better.
But you have to live with it :) So the answer to your question is yes: you should always protect yourself when reading a string from a plist.
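def safe_unicode(s):
    # Already unicode: return it unchanged.
    if isinstance(s, unicode):
        return s
    # Byte string: decode as UTF-8, replacing undecodable bytes with U+FFFD.
    return s.decode('utf-8', errors='replace')

value = safe_unicode(info['Name'])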
I added errors='replace' just in case the string is not UTF-8 encoded. You'll get a bunch of \ufffd characters if it cannot be decoded. If you would rather get an exception, just leave it out and use s.decode('utf-8').
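Since a parsed playlist is a nested structure of dicts and lists, the same check can be applied to the whole result in one pass. The following is only a minimal sketch and not part of plistlib: normalize is a hypothetical helper name, and the sample track dictionary is made-up data that imitates the mixed output described in the question. It tests for bytes (an alias for str on Python 2.7) instead of unicode, so the same code also runs on Python 3.

```python
def safe_unicode(s, encoding='utf-8'):
    """Return s as text: decode byte strings, leave everything else untouched."""
    if isinstance(s, bytes):  # bytes is an alias for str on Python 2.7
        return s.decode(encoding, 'replace')
    return s


def normalize(obj):
    """Recursively decode every byte string in a parsed plist structure."""
    if isinstance(obj, dict):
        return dict((safe_unicode(k), normalize(v)) for k, v in obj.items())
    if isinstance(obj, list):
        return [normalize(item) for item in obj]
    # Numbers, booleans, dates and text fall through unchanged.
    return safe_unicode(obj)


# Made-up sample that mimics the mixed unicode / byte-string values.
track = {
    'Name': u'Don\u2019t You Remember',
    'Artist': b'Adele',
    'Track ID': 1234,
}
clean = normalize(track)
```

After this, every string in clean is text, so comparisons and string formatting behave consistently. On Python 3 the plist data type is also returned as bytes, so there this helper would decode binary values as well; it is meant for the Python 2 situation described here.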
Update:
I also tried the same file with ElementTree:
from xml.etree import ElementTree as et
tree = et.parse('test.plist')
map(lambda x: x.text, tree.findall('dict/dict/dict')[1].findall('string'))
This gave me:
[u'Don\u2019t You Remember',
'Adele',
'21',
'Pop',
'MPEG audio file',
'7130C888606FB153',
'File',
'file://localhost/D:/music/Adele/21/04%20-%20Don%E2%80%99t%20You%20Remember.mp3']
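So unicode and byte strings are mixed here too :-/ Judging by this output, ASCII-only values come back as byte strings, and only the value containing a non-ASCII character comes back as unicode.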
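For comparison, the claim that Python 3 returns text consistently can be checked with a small inline plist. This is a sketch under stated assumptions: it needs Python 3.4 or later (for plistlib.loads), and the two-entry dictionary is a made-up stand-in for a real iTunes track entry.

```python
import plistlib

# Hypothetical minimal plist, written inline so the example is self-contained.
PLIST = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<plist version="1.0"><dict>'
    b'<key>Name</key><string>Don\xe2\x80\x99t You Remember</string>'
    b'<key>Artist</key><string>Adele</string>'
    b'</dict></plist>'
)

info = plistlib.loads(PLIST)  # Python 3.4+ only

# Record the type name of each value to see whether they are uniform.
types = {key: type(value).__name__ for key, value in info.items()}
```

Here both the ASCII-only value and the value with the curly apostrophe come back as str, so no per-value type check is needed on Python 3.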