Skip to content Skip to sidebar Skip to footer

Python - Transliterate German Umlauts To Diacritic

I have a list of unicode file paths in which I need to replace all umlauts with an English diacritic. For example, I would ü with ue, ä with ae and so on. I have defined a dict

Solution 1:

The replace method returns a new string, it does not modify the original string.

So you would need

file = file.replace(key, value)

instead of just file.replace(key, value).


Note also that you could use the translate method to do all the replacements at once, instead of using a for-loop:

In [20]: umap = {ord(key):unicode(val) for key, val in umlautDictionary.items()}

In [21]: umap
Out[21]: {196: u'Ae', 214: u'Oe', 220: u'Ue', 228: u'ae', 246: u'oe', 252: u'ue'}

In [22]: print(u'ÄÖ'.translate(umap))
AeOe

So you could use

umap = {ord(key):unicode(val) for key, valin umlautDictionary.items()}
for filename in filePathsList:
    filename = filename.translate(umap)
    print(filename)

Solution 2:

Replace the line

file.replace(key, value)

with:

file = file.replace(key, value)

This is because strings are immutable in Python.

Which means that file.replace(key, value) returns a copy of file with replacements made.

Post a Comment for "Python - Transliterate German Umlauts To Diacritic"