
Pandas Counting Occurrence Of List Contained In Column Of Lists

I have this Pandas DataFrame that has a column with lists:

>>> df = pd.DataFrame({'m': [[1,2,3], [5,3,2], [2,5], [3,8,1], [9], [2,6,3]]})
>>> df
m 0 [

Solution 1:

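You can use Series.apply (df.m is a Series, so this is Series.apply rather than DataFrame.apply) along with the built-in set.issubset method and then .sum(). In CPython these all operate at a lower level (normally C level) than their pure-Python equivalents do. Because set.issubset accepts any iterable, each list can be passed to it directly without first converting it to a set.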

subset_wanted = {2, 3}
count = df.m.apply(subset_wanted.issubset).sum()

I can't see a way to shave more time off that, short of writing a custom C-level function: the equivalent of a custom sum that checks each row for the subset and adds 0 or 1 accordingly. In the time it would take to write that, you could have run this version thousands upon thousands of times anyway.
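For anyone who wants to run this end to end, here is a minimal self-contained sketch using the DataFrame from the question. The names mask and count are only illustrative, and it writes df['m'] instead of df.m, which selects the same column.

```python
import pandas as pd

# DataFrame from the question: one column 'm' holding a list per row.
df = pd.DataFrame({'m': [[1, 2, 3], [5, 3, 2], [2, 5], [3, 8, 1], [9], [2, 6, 3]]})

subset_wanted = {2, 3}

# set.issubset accepts any iterable, so each list is passed in unchanged.
# The result is a boolean Series: True where the row contains both 2 and 3.
mask = df['m'].apply(subset_wanted.issubset)

# Summing booleans counts the True rows.
count = int(mask.sum())
print(count)  # prints 3
```

Keeping the boolean mask around is handy if you also want the matching rows themselves, e.g. df[mask].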

Solution 2:

Since you are looking for more of a set-like behavior:

df.m.apply(lambda x: set(x).intersection({2, 3}) == {2, 3}).sum()

Returns

3
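As a quick sanity check, the intersection test here and the issubset test from Solution 1 should flag exactly the same rows. The sketch below compares them on the question's data. The names wanted, via_intersection and via_subset are made up for illustration, and the comparison is done inside the lambda so that each row yields a plain boolean.

```python
import pandas as pd

# DataFrame from the question.
df = pd.DataFrame({'m': [[1, 2, 3], [5, 3, 2], [2, 5], [3, 8, 1], [9], [2, 6, 3]]})

wanted = {2, 3}

# Solution 2's idea: a row matches when its intersection with `wanted`
# is `wanted` itself, i.e. every wanted value is present in the row.
via_intersection = df.m.apply(lambda x: set(x).intersection(wanted) == wanted)

# Solution 1's idea: ask directly whether `wanted` is a subset of the row.
via_subset = df.m.apply(wanted.issubset)

# Both approaches should agree row by row.
same_rows = via_intersection.equals(via_subset)
count = int(via_intersection.sum())
print(same_rows, count)
```

The two are logically equivalent. The intersection form builds a new set for every row, so the issubset form does a little less work per row.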
