Deepcopy Pandas DataFrame Containing Python Objects (Such as Lists)

Need help understanding variable assignment, pointers, ... The following is reproducible.

import pandas as pd
df = pd.DataFrame({ 'listData': [ ['c', 'f', 'd', 'a', 'e

Solution 1:

When you run

df['listDataSort'] = df['listData']

all you do is copy the references to the lists into the new column. Only a shallow copy is made, so both columns point to the same list objects. Any in-place change to a list (for example, calling .sort() on it) therefore shows up in both columns.
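To see the aliasing concretely, here is a minimal sketch. The two-row sample data is an assumption chosen to mirror the output shown in this answer (the second row is assumed to hold integers); it is not recovered from the truncated question.

```python
import pandas as pd

# Assumed sample data, chosen to mirror the output shown in this answer.
df = pd.DataFrame({
    'listData': [
        ['c', 'f', 'd', 'a', 'e', 'b'],
        [5, 2, 1, 4, 3],
    ]
})

# Plain column assignment copies the references, not the lists themselves.
df['listDataSort'] = df['listData']

# Check whether both cells hold the very same list object.
same_object = df.at[0, 'listData'] is df.at[0, 'listDataSort']
print(same_object)

# Sort in place through one column, then look at the other column.
df.at[0, 'listDataSort'].sort()
print(df.at[0, 'listData'])
```

If the lists are shared, the identity check is true and the in-place sort is visible through listData as well, even though only listDataSort was touched.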

You can use a list comprehension with sorted, which returns a new list and leaves the original untouched. This is probably the easiest option for you.

df['listDataSort'] = [sorted(x) for x in df['listData']]
df

             listData        listDataSort
0  [c, f, d, a, e, b]  [a, b, c, d, e, f]
1     [5, 2, 1, 4, 3]     [1, 2, 3, 4, 5]

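Making a copy of the entire DataFrame is a little more complicated. df.copy(deep=True) copies the DataFrame's data, but it does not recursively copy Python objects stored in the cells, so the lists would still be shared. Calling copy.deepcopy on a whole DataFrame or column goes through that same pandas copy. I would recommend applying copy.deepcopy to each cell instead:

import copy
df2 = df.apply(lambda col: col.map(copy.deepcopy))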
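One way to confirm that a copy is truly independent is to mutate a list in the copy and then check the original. The sketch below uses assumed sample data and compares df.copy(deep=True), which according to the pandas documentation does not copy Python objects recursively, with an element-wise copy.deepcopy applied through Series.map. The helper name deepcopy_cells is hypothetical.

```python
import copy
import pandas as pd

# Assumed sample data for illustration.
df = pd.DataFrame({'listData': [['c', 'f', 'd'], [5, 2, 1]]})

def deepcopy_cells(frame):
    """Hypothetical helper: deep-copy every cell, column by column."""
    return frame.apply(lambda col: col.map(copy.deepcopy))

pandas_copy = df.copy(deep=True)   # new DataFrame, but the lists are shared
cell_copy = deepcopy_cells(df)     # new DataFrame holding new list objects

# Identity checks: is the list in each copy the same object as the original?
shares_lists = pandas_copy.at[0, 'listData'] is df.at[0, 'listData']
independent = cell_copy.at[0, 'listData'] is not df.at[0, 'listData']

# Mutate the element-wise copy and inspect the original afterwards.
cell_copy.at[0, 'listData'].append('z')
print(shares_lists, independent, df.at[0, 'listData'])
```

Copying cell by cell is slower than a plain df.copy(), so it is only worth doing when the cells actually hold mutable objects such as lists or dicts.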