How Can I Run A Set Method Over Lists In Terms Of Dictionary Keys / Values To Find Unique Items And List The Comparison Results?
I have a dictionary with values as lists of text values. (ID : [text values]) Below is an excerpt. data_dictionary = { 52384: ['text2015', 'webnet'], 18720: ['datascience'
Solution 1:
This is not a full solution to your problem, but I believe it solves most of it:
In [1]: data_dictionary = {
...: 52384: ['text2015', 'webnet'],
...: 18720: ['datascience', 'bigdata', 'links'],
...: 82465: ['biological', 'biomedics', 'datamining', 'datamodel', 'semantics'],
...: 73120: ['links', 'scientometrics'],
...: 22276: ['text2015', 'webnet'],
...: 97376: ['text2015', 'webnet'],
...: 43424: ['biological', 'biomedics', 'datamining', 'datamodel', 'semantics'],
...: 23297: ['links', 'scientometrics'],
...: 45233: ['webnet', 'conference', 'links']
...: }
In [2]: from itertools import combinations
...:
...: intersections = []
...:
...: for first, second in combinations(data_dictionary.items(), r=2):
...:     intersection = set(first[1]) & set(second[1])
...:     if intersection:
...:         intersections.append((first[0], second[0], list(intersection)))
...:
In [3]: intersections
Out[3]:
[(52384, 22276, ['webnet', 'text2015']),
(52384, 97376, ['webnet', 'text2015']),
(52384, 45233, ['webnet']),
(18720, 73120, ['links']),
(18720, 23297, ['links']),
(18720, 45233, ['links']),
(82465,
43424,
['semantics', 'datamodel', 'biological', 'biomedics', 'datamining']),
(73120, 23297, ['links', 'scientometrics']),
(73120, 45233, ['links']),
(22276, 97376, ['webnet', 'text2015']),
(22276, 45233, ['webnet']),
(97376, 45233, ['webnet']),
(23297, 45233, ['links'])]
It builds every pair of entries in data_dictionary, computes the intersection of the two value lists, and, when that intersection is not empty, appends a (key1, key2, intersection) tuple to the intersections list. Because sets are unordered, the order of the items inside each intersection may differ between runs.
I hope this gives you a quick start from which you can finish your task.
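As a minimal sketch, the pairing-and-intersection approach described above can be wrapped in a function. The name pairwise_intersections and the three-entry sample are illustrative only (the sample reuses a few entries from the question's dictionary). Two small changes are assumptions of this sketch rather than part of the original answer: each list is converted to a set once up front, and the shared items are sorted so the output order is stable.

```python
from itertools import combinations

def pairwise_intersections(data):
    """Return (key1, key2, shared_items) for every pair of keys whose lists overlap."""
    # Convert each list to a set once, instead of once per pair.
    as_sets = {key: set(values) for key, values in data.items()}
    results = []
    for (key1, set1), (key2, set2) in combinations(as_sets.items(), 2):
        shared = set1 & set2
        if shared:
            # sorted() gives a deterministic order; plain sets have none.
            results.append((key1, key2, sorted(shared)))
    return results

sample = {
    52384: ['text2015', 'webnet'],
    73120: ['links', 'scientometrics'],
    45233: ['webnet', 'conference', 'links'],
}
print(pairwise_intersections(sample))
# [(52384, 45233, ['webnet']), (73120, 45233, ['links'])]
```

Pairs come out in the dictionary's insertion order, which Python guarantees from version 3.7 onward.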
Solution 2:
Using the example from vishes_shell's answer above, I managed to get most of the desired output. To add the individual sums I would have to rerun the extract-sum method, which seems non-optimal, so I left that out of the solution while I think of a different approach.
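from itertools import combinations

# extract_sum, sum_dict and filename (presumably an open, writable file object)
# are defined elsewhere and are not shown here.
for first, second in combinations(data_dictionary.items(), r=2):
    intersection = set(first[1]) & set(second[1])
    if intersection:
        sum1 = extract_sum(first[0], sum_dict)
        sum2 = extract_sum(second[0], sum_dict)
        # Write the key with the smaller sum first.
        if sum1 < sum2:
            early = first[0]
            late = second[0]
        else:
            early = second[0]
            late = first[0]
        filename.write('%d , %d , %s' % (early, late, list(intersection)))
        filename.write('\n')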
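One possible way to avoid recomputing a sum for every pair is to look each ID's sum up in a dictionary that was built once beforehand. The sketch below assumes exactly that: sums is a hypothetical plain dict mapping each ID to a number, standing in for extract_sum and sum_dict, whose definitions are not shown in the answer. The function name ordered_overlaps and the sample values are invented for illustration.

```python
from itertools import combinations

def ordered_overlaps(data, sums):
    """Return (early, late, shared_items) rows, with the lower-sum key first."""
    rows = []
    for (key1, values1), (key2, values2) in combinations(data.items(), 2):
        shared = set(values1) & set(values2)
        if shared:
            # Same ordering rule as the loop above: key1 is "early" only if
            # its sum is strictly smaller; on a tie key2 comes first.
            if sums[key1] < sums[key2]:
                early, late = key1, key2
            else:
                early, late = key2, key1
            rows.append((early, late, sorted(shared)))
    return rows

sample = {
    52384: ['text2015', 'webnet'],
    73120: ['links', 'scientometrics'],
    45233: ['webnet', 'conference', 'links'],
}
sample_sums = {52384: 10, 73120: 3, 45233: 7}  # hypothetical values

for early, late, shared in ordered_overlaps(sample, sample_sums):
    print('%d , %d , %s' % (early, late, shared))
# 45233 , 52384 , ['webnet']
# 73120 , 45233 , ['links']
```

Returning rows instead of writing them directly keeps the file handling separate, so the same rows could later be extended with the individual sums if needed.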