Aggregate Column Values In Pandas Groupby As A Dict

This is a question I was asked in an interview. The input data has the following columns: language, product id, shelf id, rank. For instance, the input wou

Solution 1:

Setup

import pandas as pd

df = pd.read_csv('file.csv', header=None)
df.columns = ['Lang', 'product_id', 'shelf_id', 'rank_id']

df
      Lang     product_id  shelf_id  rank_id
0  English         742005      4560     10.2
1  English  6000075389352      4560     49.0
2   French      899883993      4560     32.0
3   French      731317391      7868     81.0

You can use df.groupby to group by Lang and shelf_id, then call apply on the grouped object to build a {product_id: rank_id} dictionary for each group:

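# Each group becomes one dict; the group keys form the index, which
# reset_index turns back into the Lang and shelf_id columns.
(df.groupby(['Lang', 'shelf_id'])
   .apply(lambda x: dict(zip(x['product_id'], x['rank_id'])))
   .reset_index(name='mapping'))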

      Lang  shelf_id                              mapping
0  English      4560  {742005: 10.2, 6000075389352: 49.0}
1   French      4560                    {899883993: 32.0}
2   French      7868                    {731317391: 81.0}
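
The same mapping can also be built by iterating over the groups directly instead of going through apply. The following is a minimal sketch rather than part of the original answer: it assumes the four sample rows shown in the Setup, builds the frame inline instead of reading file.csv, and the names rows and result are purely illustrative.

```python
import pandas as pd

# Sample data copied from the Setup section (built inline, no file.csv needed).
df = pd.DataFrame({
    'Lang': ['English', 'English', 'French', 'French'],
    'product_id': [742005, 6000075389352, 899883993, 731317391],
    'shelf_id': [4560, 4560, 4560, 7868],
    'rank_id': [10.2, 49.0, 32.0, 81.0],
})

# Iterating a groupby yields (key, sub-frame) pairs; with two grouping
# columns the key is a (Lang, shelf_id) tuple.
rows = [
    {
        'Lang': lang,
        'shelf_id': shelf,
        'mapping': dict(zip(group['product_id'], group['rank_id'])),
    }
    for (lang, shelf), group in df.groupby(['Lang', 'shelf_id'])
]

# One row per group, with the dict stored in an object column.
result = pd.DataFrame(rows)
print(result)
```

This trades a little verbosity for transparency: each output row is assembled explicitly, so it is easy to add further per-group fields to the dict literal if needed.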
