
Split Rows In Pandas Dataframe

I'm stuck on how to split a pandas DataFrame by row. I have a DataFrame with a column where several values sit in a single cell, separated by \r\n, and I want each of those values on its own row.

Solution 1:

You can do:



   Color      Shape  Price
0  Green  Rectangle     10
1  Green   Triangle     10
2  Green   Octangle     10
3   Blue  Rectangle     15
4   Blue   Triangle     15
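
One way to arrive at a result like the one above is sketched here. It is not necessarily the code this answer originally used: it assumes the sample values that appear in the other answers, pandas 1.1 or newer (for the ignore_index argument), and a variable name, result, chosen purely for illustration.

```python
import pandas as pd

# Sample data assumed from the other answers: several shapes share one cell,
# separated by \r\n.
df = pd.DataFrame(
    {
        "Color": ["Green", "Blue"],
        "Shape": ["Rectangle\r\nTriangle\r\nOctangle", "Rectangle\r\nTriangle"],
        "Price": [10, 15],
    }
)

# Turn each cell into a list, then emit one row per list element.
# ignore_index=True (pandas 1.1+) renumbers the rows 0..n-1.
result = df.assign(Shape=df["Shape"].str.split("\r\n")).explode(
    "Shape", ignore_index=True
)
print(result)
```

Using assign leaves the original df untouched, which is handy if you still need the unsplit column afterwards.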

Solution 2:

This might not be the most efficient way to do it but I can confirm that it works with the sample df:

import pandas as pd

data = [['Green', 'Rectangle\r\nTriangle\r\nOctangle', 10], ['Blue', 'Rectangle\r\nTriangle', 15]]
df = pd.DataFrame(data, columns=['Color', 'Shape', 'Price'])

# Collect one record per shape (DataFrame.append was removed in pandas 2.0)
rows = []
for index, row in df.iterrows():
    for shape in row['Shape'].split('\r\n'):
        rows.append({'Color': row['Color'], 'Shape': shape, 'Price': row['Price']})

new_df = pd.DataFrame(rows, columns=['Color', 'Shape', 'Price'])


   Color      Shape  Price
0  Green  Rectangle     10
1  Green   Triangle     10
2  Green   Octangle     10
3   Blue  Rectangle     15
4   Blue   Triangle     15

Solution 3:

First, split Shape on whitespace, which gives you a list of shapes in each cell. Then use df.explode to unpack each list into a separate row per shape:

df["Shape"] = df.Shape.str.split()
df = df.explode("Shape")
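
As a self-contained sketch of those two steps, here is the same idea run on the sample values from the other answers (an assumption about what the real data looks like). The names exploded and renumbered are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Color": ["Green", "Blue"],
        "Shape": ["Rectangle\r\nTriangle\r\nOctangle", "Rectangle\r\nTriangle"],
        "Price": [10, 15],
    }
)

# With no argument, str.split() splits on any run of whitespace,
# which includes the \r\n separators.
df["Shape"] = df["Shape"].str.split()

# explode repeats Color and Price for every element of each list.
# It keeps the original index labels, so they repeat as well.
exploded = df.explode("Shape")

# Renumber the rows 0..n-1 if a clean index is wanted.
renumbered = exploded.reset_index(drop=True)
print(renumbered)
```

Keep in mind that splitting on whitespace would also break up any shape name that contains a space. If that can happen in your data, passing the separator explicitly, str.split('\r\n'), is the safer choice.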

Solution 4:

As commented, str.split() followed by explode does the job. If you are on a pandas version older than 0.25, which is when explode was added, you can split with expand=True and melt the result instead:

(pd.concat((df.Shape.str.split('\r\n', expand=True), df[['Color', 'Price']]), axis=1)
   .melt(id_vars=['Color', 'Price'], value_name='Shape')
   .dropna())


   Color  Price variable      Shape
0  Green     10        0  Rectangle
1   Blue     15        0  Rectangle
2  Green     10        1   Triangle
3   Blue     15        1   Triangle
4  Green     10        2   Octangle
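
Note that melt returns the rows grouped by split position rather than by source row, as the output above shows. If you want each color's shapes kept together, one option is to carry the original row number through the melt and sort on it afterwards. This is only a sketch under the same sample-data assumption. The helper columns row and position are hypothetical names introduced here and are not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Color": ["Green", "Blue"],
        "Shape": ["Rectangle\r\nTriangle\r\nOctangle", "Rectangle\r\nTriangle"],
        "Price": [10, 15],
    }
)

# One column per shape (named 0, 1, 2); shorter rows are padded with missing values.
parts = df["Shape"].str.split("\r\n", expand=True)

# Put the id columns next to the split columns and remember the source row.
wide = (
    pd.concat([df[["Color", "Price"]], parts], axis=1)
    .rename_axis("row")
    .reset_index()
)

long_df = (
    wide.melt(id_vars=["row", "Color", "Price"], var_name="position", value_name="Shape")
    .dropna(subset=["Shape"])           # drop the padding cells
    .sort_values(["row", "position"])   # restore the original grouping
    .drop(columns=["row", "position"])
    .reset_index(drop=True)
)
print(long_df)
```

This avoids explode entirely, so it should also suit older pandas installations, at the cost of a couple of temporary columns.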
