
Grouping On Items In A List In Python

I have 60 records with two columns: 'skillsList' (a list of skills) and 'IdNo'. I want to find out how many IdNos have a skill in common. How can I do this in Python?

Solution 1:

You have to do it yourself. You can use a dictionary keyed by skill, with each entry initialized to zero. Then iterate over your records and increment a skill's entry each time you see it.
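A minimal sketch of that counting approach. The sample records and their layout (a list of dicts with 'IdNo' and 'skillsList' keys) are assumptions for illustration, since the question does not show the actual data:

```python
from collections import defaultdict

# Hypothetical sample records; the field names mirror the question.
records = [
    {"IdNo": 1, "skillsList": ["Training", "Powerpoint"]},
    {"IdNo": 2, "skillsList": ["Training"]},
    {"IdNo": 3, "skillsList": ["Powerpoint", "E-learning"]},
]

skill_counts = defaultdict(int)  # every skill implicitly starts at zero
for record in records:
    # set() guards against the same skill being listed twice on one record
    for skill in set(record["skillsList"]):
        skill_counts[skill] += 1

print(dict(skill_counts))
```

Any skill with a count of 2 or more is shared by at least two IdNos.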

Solution 2:

struct = [
    {'id': 1, 'skills': ['1', '2', '3']},
    # ... one dict per record
]
for el in struct:
    if '1' in el.get('skills', []):
        print('id %s has this skill' % el.get('id'))

Solution 3:

You can build an inverted index of skills: a dictionary in which each key is a skill name and each value is the set of IdNos that have that skill. That way you can also find out which IdNos have a given combination of skills.

The code would look like this:

skills = {}
with open('filename.txt') as f:
    for line in f:
        # each line: an IdNo followed by its skills, comma-separated
        items = [item.strip() for item in line.split(',')]
        idNo = items[0]
        skill_list = items[1:]
        for skill in skill_list:
            # create the set the first time a skill is seen, then add this IdNo
            skills.setdefault(skill, set()).add(idNo)

Now you have a skills dictionary that would look like this (the IdNos are strings because they were read from a file):

skills = {
    'Training': {'1', '2', '3'},
    'Powerpoint': {'1', '3', '4'},
    'E-learning': {'9', '10', '11'},
}


Now you can see that IdNos 1, 3 and 4 have Powerpoint as a skill. If you want to know which IdNos have both the 'Training' and 'Powerpoint' skills, you can take the intersection of the two sets.
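A minimal sketch of the "both skills" lookup, assuming the skills dictionary maps each skill to a set of IdNos as described above. The sample data is illustrative only:

```python
# Illustrative inverted index: skill name -> set of IdNos (as strings).
skills = {
    "Training": {"1", "2", "3"},
    "Powerpoint": {"1", "3", "4"},
    "E-learning": {"9", "10", "11"},
}

# Set intersection keeps only the IdNos present in both sets.
both = skills["Training"] & skills["Powerpoint"]
print(sorted(both))  # ['1', '3']
```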


If you want to know which IdNos have either the 'Training' or the 'Powerpoint' skill, you can take the union of the two sets.
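A minimal sketch of the "either skill" lookup, under the same assumption that the skills dictionary maps each skill to a set of IdNos. The sample data is illustrative only:

```python
# Illustrative inverted index: skill name -> set of IdNos (as strings).
skills = {
    "Training": {"1", "2", "3"},
    "Powerpoint": {"1", "3", "4"},
    "E-learning": {"9", "10", "11"},
}

# Set union collects every IdNo that appears in at least one of the sets.
either = skills["Training"] | skills["Powerpoint"]
print(sorted(either))  # ['1', '2', '3', '4']
```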

