Lowercase First Element Of Tuple In List Of Tuples
Solution 1:
So your data structure is [([str], str)]: a list of tuples where each tuple is (list of strings, string). It's important to deeply understand what that means before you try to pull data out of it.
That means that for item in documents iterates over the list of tuples, where item is each tuple in turn.
That means that item[0] is the list in each tuple.
That means that for item in documents: for s in item[0]: will iterate through each string inside that list. Let's try that!
[s.lower() for item in documents for s in item[0]]
This should give, from your example data:
[u'a', u'p', u'i', u'o', u'a', u'm', ...]
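To see the flattening on something small, here is a minimal sketch. The sample data is made up to match the [([str], str)] shape (the 'heart' category and the uppercase letters are assumptions, not your real data), and it is written for Python 3, so the strings print without the u prefix.

```python
# Made-up sample data in the same shape as the question: [([str], str)]
documents = [
    (["A", "P", "I"], "cancer"),
    (["O", "A", "M"], "heart"),  # 'heart' is an invented category
]

# The outer loop walks the tuples; the inner loop walks the list in slot 0.
flat = [s.lower() for item in documents for s in item[0]]
print(flat)  # ['a', 'p', 'i', 'o', 'a', 'm']
```

Note that the category strings are dropped entirely here, since only item[0] is ever read.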
If you're trying to keep the tuple format, you could do:
[([s.lower() for s in item[0]], item[1]) for item in documents]
# or perhaps more readably
[([s.lower() for s in lst], val) for lst, val in documents]
Both these statements give:
[([u'a', u'p', u'i', u'o', u'a', u'm', ...], 'cancer'), ... ]
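As a quick check of the tuple-preserving version, here is a small sketch on made-up data (the capitalised category names are an assumption, chosen only to show that the second element is passed through unchanged). It builds new tuples and new lists, so the original documents list is left as it was.

```python
# Made-up sample data; categories are capitalised on purpose.
documents = [
    (["A", "P", "I"], "Cancer"),
    (["O", "A", "M"], "Heart"),
]

# Unpack each tuple, lowercase only the strings in the list,
# and rebuild the tuple with the category left alone.
lowered = [([s.lower() for s in lst], val) for lst, val in documents]
print(lowered)
# [(['a', 'p', 'i'], 'Cancer'), (['o', 'a', 'm'], 'Heart')]

# The comprehension creates new lists, so the input is not modified.
print(documents[0][0])  # ['A', 'P', 'I']
```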
Solution 2:
You are close. You are looking for a construction like this:
[([s.lower() for s in ls], cat) for ls, cat in documents]
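This essentially combines these two, neither of which works on its own because each ends up calling lower() on a list rather than on the strings inside it: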
[[x.lower() for x in element] for element in documents],
[(x.lower(), y) for x,y in documents]
Solution 3:
Try this:
documents = [([word.lower() for word in corpus.words(fileid)], category)
             for category in corpus.categories()
             for fileid in corpus.fileids(category)]
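To try this without the real corpus, here is a rough sketch using a hypothetical stand-in class. TinyCorpus, its constructor, and the sample files are all invented for illustration; the only thing it is meant to mirror is the three calls used above (categories(), fileids(category), words(fileid)).

```python
class TinyCorpus:
    """Hypothetical stand-in exposing only the three methods used above."""

    def __init__(self, files):
        # files maps fileid -> (category, list of words)
        self._files = files

    def categories(self):
        return sorted({cat for cat, _ in self._files.values()})

    def fileids(self, category):
        return [fid for fid, (cat, _) in self._files.items() if cat == category]

    def words(self, fileid):
        return self._files[fileid][1]


# Invented sample files, one per category.
corpus = TinyCorpus({
    "doc1": ("cancer", ["A", "Pill"]),
    "doc2": ("heart", ["One", "Beat"]),
})

# Lowercase the words while the list of tuples is being built.
documents = [([word.lower() for word in corpus.words(fileid)], category)
             for category in corpus.categories()
             for fileid in corpus.fileids(category)]
print(documents)
# [(['a', 'pill'], 'cancer'), (['one', 'beat'], 'heart')]
```

Lowercasing at construction time means there is never a mixed-case copy of documents to clean up afterwards.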
Solution 4:
Tuples are immutable. However, the first element of each of your tuples is a list, and lists are mutable, so you can modify that list's contents without changing which list the tuple holds:
documents = [(...what you originally posted...) ... etc. ...]
for d in documents:
    # to lowercase all strings in the list
    # trailing '[:]' is important, need to modify list in place using slice
    d[0][:] = [w.lower() for w in d[0]]
    # or to just lower-case the first element of the list (which is what you asked for)
    d[0][0] = d[0][0].lower()
You can't just call lower() on a string and have it get updated - lower() returns a new string. So to modify the string to be the lowercased version, you have to assign over it. This would not be possible if the string were itself a tuple member, but since the string you are modifying is in a list in the tuple, you can modify the list contents without modifying the tuple's ownership of the list.
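The difference between changing the list's contents and replacing the tuple's element can be checked directly. This is a minimal sketch on made-up data; the names original_list and rebinding_error are introduced here only for the demonstration.

```python
# Made-up sample data in the [([str], str)] shape.
documents = [(["A", "P", "I"], "cancer")]
d = documents[0]
original_list = d[0]  # keep a reference to compare identity later

# Slice assignment replaces the contents of the existing list in place.
d[0][:] = [w.lower() for w in d[0]]
print(documents)              # [(['a', 'p', 'i'], 'cancer')]
print(d[0] is original_list)  # True: the tuple still holds the same list

# Assigning to the tuple slot itself is what the tuple forbids.
rebinding_error = None
try:
    d[0] = [w.upper() for w in d[0]]
except TypeError as exc:
    rebinding_error = exc
    print("TypeError:", exc)
```

The failed assignment leaves documents untouched, so the lowercased list from the slice assignment is still in place afterwards.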