
How To Identify "keys" Of A Tuple/list Of 3-item Tuples?

Given a table of revenue values thus: A key point to note (and the core of my question) is that the brand name will almost always, but not always, contain the corresponding product

Solution 1:

You can use the fruits as keys and group the brands:

from collections import defaultdict
import csv

with open("in.csv") as f:
    r = csv.reader(f)
    next(r)  # skip header
    # fruits will be keys, values will be dicts
    # with brands as keys and running totals for rev as values
    d = defaultdict(lambda: defaultdict(int))
    for fruit, brand, rev in r:
        d[fruit][brand] += float(rev)

Which, using your input, outputs:

from pprint import pprint as pp

pp(dict(d))
{'Apple': defaultdict(<type 'int'>, {'CrunchApple': 1.7}),
 'Banana': defaultdict(<type 'int'>, {'BananaBrand': 4.0, 'OtherBrand': 3.2}),
 'Kiwi': defaultdict(<type 'int'>, {'NZKiwi': 1.2}),
 'Pear': defaultdict(<type 'int'>, {'PearShaped': 6.2})}

You can then subtract the expenses using the keys.
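As a rough sketch of that subtraction, assuming the expenses are available as a plain dict keyed by fruit: the `expenses` values below are invented for illustration, since the expenses table is not shown here, and `revenue` is just the grouped result written out as a literal.

```python
# Revenue totals per fruit and brand, matching the grouped output shown above.
revenue = {
    "Apple": {"CrunchApple": 1.7},
    "Banana": {"BananaBrand": 4.0, "OtherBrand": 3.2},
    "Kiwi": {"NZKiwi": 1.2},
    "Pear": {"PearShaped": 6.2},
}

# Hypothetical expenses keyed by fruit; these numbers are made up.
expenses = {"Apple": 1.0, "Banana": 5.0, "Kiwi": 0.5, "Pear": 2.0}

# Profit per fruit: revenue summed over all of its brands, minus its expenses.
# A fruit with no expenses entry is treated as having zero expenses.
profit = {
    fruit: sum(brands.values()) - expenses.get(fruit, 0.0)
    for fruit, brands in revenue.items()
}

for fruit, value in sorted(profit.items()):
    print("{}: {:.1f}".format(fruit, value))
```

Because the lookup goes through the fruit key, it does not matter whether a brand name happens to contain the product name.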

Using pandas, life is even easier: you can group by both columns and sum:

import pandas as pd

df = pd.read_csv("in.csv")

print(df.groupby(["A", "B"]).sum())

Output:

A      B
Apple  CrunchApple  1.7
Banana BananaBrand  4.0
       OtherBrand   3.2
Kiwi   NZKiwi       1.2
Pear   PearShaped   6.2

Or get the groups by fruit and brand:

groups = df.groupby(["A","B"])

print(groups.get_group(('Banana', 'OtherBrand')))

print(groups.get_group(('Banana', 'BananaBrand')))

Solution 2:

It seems to me that you want to group your data from the first table by product type. I suggest a dictionary where the key is the product type and the value is a list of tuples [(brand, revenue),(..., ...)].

Then, for each product type in the dictionary, you can easily pull out the list of brands for that product and, if needed, make a new dictionary containing lists of 3-tuples (brand, revenue, expenses).
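A minimal sketch of this idea, assuming the rows are already available as (product, brand, revenue) tuples. The `rows` list mirrors the sample data from Solution 1, while `brand_expenses` and its values are hypothetical, as the expenses data is not shown here.

```python
from collections import defaultdict

# Sample rows in (product, brand, revenue) form.
rows = [
    ("Apple", "CrunchApple", 1.7),
    ("Banana", "BananaBrand", 4.0),
    ("Banana", "OtherBrand", 3.2),
    ("Kiwi", "NZKiwi", 1.2),
    ("Pear", "PearShaped", 6.2),
]

# Hypothetical expenses per brand; these numbers are invented.
brand_expenses = {
    "CrunchApple": 0.9,
    "BananaBrand": 2.5,
    "OtherBrand": 1.1,
    "NZKiwi": 0.4,
    "PearShaped": 3.0,
}

# Product type -> list of (brand, revenue) tuples.
by_product = defaultdict(list)
for product, brand, rev in rows:
    by_product[product].append((brand, rev))

# Pull out the list of brands for one product.
banana_brands = [brand for brand, _ in by_product["Banana"]]

# New dictionary: product type -> list of (brand, revenue, expenses) 3-tuples.
# A brand without an expenses entry gets 0.0.
with_expenses = {
    product: [(brand, rev, brand_expenses.get(brand, 0.0)) for brand, rev in pairs]
    for product, pairs in by_product.items()
}

print(banana_brands)
print(with_expenses["Banana"])
```

Keying on the product type means the brand name never has to be parsed to work out which product it belongs to.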
