Sort Within Group Without Changing Group Order?
Solution 1:
You could create a temporary column that maps GroupB, GroupA and GroupC to 1, 2 and 3, so that the original (non-alphabetical) order of the groups is preserved, sort on that column, and then drop it. Answer #1 is more dynamic and works even if the group column values are not grouped together consecutively. For Answer #2, they must be consecutive (the inputs for Answer #1 and Answer #2 are ordered differently).
Updated Answer #1
(Per comment: the groups are not consecutive in rows, but we still want to order them by the first appearance of each group.) The code [l for l in enumerate((df['group'].unique()))] assigns a number to each group according to where that group first appears in the group column of the dataframe.
In[1]:
name group revenue
0 Name1 GroupB 1
3 Name4 GroupA 4
4 Name5 GroupA 5
8 Name7 GroupC 9
1 Name2 GroupB 2
2 Name3 GroupB 3
5 Name6 GroupA 6
6 Name7 GroupC 7
7 Name7 GroupC 8
dft = pd.DataFrame([l for l in enumerate((df['group'].unique()))], columns=['group_number','group'])
df = pd.merge(df, dft, how='left', on='group').sort_values(['group_number', 'revenue'], ascending = [True, False])
df
Out[1]:
name group revenue group_number
5 Name3 GroupB 3 0
4 Name2 GroupB 2 0
0 Name1 GroupB 1 0
6 Name6 GroupA 6 1
2 Name5 GroupA 5 1
1 Name4 GroupA 4 1
3 Name7 GroupC 9 2
8 Name7 GroupC 8 2
7 Name7 GroupC 7 2
To highlight what the enumerate line produces, here is the output of dft before the merge and sort:
dft = pd.DataFrame([l for l in enumerate((df['group'].unique()))], columns=['group_number','group'])
dft
Out[2]:
group_number group
0 0 GroupB
1 1 GroupA
2 2 GroupC
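As a possible alternative to building dft and merging, pd.factorize also numbers values by order of first appearance. The following is a minimal sketch under that assumption, not the answer's original method; the frame is rebuilt by hand from the In[1] display above, and the names codes, uniques and result are illustrative only.

```python
import pandas as pd

# Same rows as the In[1] display above, rebuilt by hand (groups not consecutive).
df = pd.DataFrame({
    'name': ['Name1', 'Name4', 'Name5', 'Name7', 'Name2',
             'Name3', 'Name6', 'Name7', 'Name7'],
    'group': ['GroupB', 'GroupA', 'GroupA', 'GroupC', 'GroupB',
              'GroupB', 'GroupA', 'GroupC', 'GroupC'],
    'revenue': [1, 4, 5, 9, 2, 3, 6, 7, 8],
})

# factorize assigns 0, 1, 2, ... in order of first appearance (sort=False by default).
codes, uniques = pd.factorize(df['group'])

result = (
    df.assign(group_number=codes)  # temporary ordering column
      .sort_values(['group_number', 'revenue'], ascending=[True, False])
      .drop(columns='group_number')  # remove the helper column again
)
print(result)
```

This keeps the original index labels, whereas the merge-based version resets them to 0..n-1.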
Answer #2
import pandas as pd
df = pd.DataFrame({'name': ['Name1','Name2','Name3','Name4','Name5','Name6', 'Name7', 'Name7', 'Name7'],
'group':['GroupB','GroupB','GroupB','GroupA','GroupA','GroupA','GroupC','GroupC','GroupC'],'revenue':[1,2,3,4,5,6,7,8,9]})
df['cs'] = (df['group'] != df['group'].shift(1)).cumsum()
df = df.sort_values(['cs', 'revenue'], ascending = [True, False])
df
Out[11]:
name group revenue cs
2 Name3 GroupB 3 1
1 Name2 GroupB 2 1
0 Name1 GroupB 1 1
5 Name6 GroupA 6 2
4 Name5 GroupA 5 2
3 Name4 GroupA 4 2
8 Name7 GroupC 9 3
7 Name7 GroupC 8 3
6 Name7 GroupC 7 3
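The shift/cumsum trick numbers runs of equal consecutive values, not groups, which is why Answer #2 needs each group's rows to be adjacent. A small sketch on made-up data (the groups series below is illustrative, not from the question) shows the same group receiving two different numbers when its rows are split:

```python
import pandas as pd

# Illustrative data: GroupB appears in two separate runs.
groups = pd.Series(['GroupB', 'GroupA', 'GroupA', 'GroupC', 'GroupB'])

# A new run starts wherever a value differs from the one before it;
# the cumulative sum of those "starts" labels each run.
run_id = (groups != groups.shift(1)).cumsum()

print(pd.DataFrame({'group': groups, 'cs': run_id}))
```

Here the first GroupB row gets cs 1 and the last gets cs 4, so sorting on cs would leave GroupB split in two.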
For either answer, then drop the temporary column (cs here; group_number in Answer #1):
df = df.drop('cs', axis=1)
Out[12]:
name group revenue
2 Name3 GroupB 3
1 Name2 GroupB 2
0 Name1 GroupB 1
5 Name6 GroupA 6
4 Name5 GroupA 5
3 Name4 GroupA 4
8 Name7 GroupC 9
7 Name7 GroupC 8
6 Name7 GroupC 7
Solution 2:
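Why use groupby at all? You could chain multiple sort_values calls to get the correct sort order. For example, using data similar to the linked question, to sort by revenue descending within each group while keeping the groups in ascending order, you could do the following. Note that this orders the groups alphabetically (GroupA, GroupB, GroupC) rather than by their original order of appearance, and that the second sort has to be stable for the revenue order from the first sort to survive.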
import pandas as pd
df = pd.DataFrame({'name': ['Name1','Name2','Name3','Name4','Name5','Name6', 'Name7', 'Name7', 'Name7'],
'group':['GroupB','GroupB','GroupB','GroupA','GroupA','GroupA','GroupC','GroupC','GroupC'],'revenue':[1,2,3,4,5,6,7,8,9]})
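# kind='mergesort' is a stable sort, so the revenue order from the first sort is kept within each group
df.sort_values(by='revenue', ascending=False).sort_values(by='group', kind='mergesort')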
Which would return:
name group revenue
5 Name6 GroupA 6
4 Name5 GroupA 5
3 Name4 GroupA 4
2 Name3 GroupB 3
1 Name2 GroupB 2
0 Name1 GroupB 1
8 Name7 GroupC 9
7 Name7 GroupC 8
6 Name7 GroupC 7
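If the groups should stay in their original order (GroupB, GroupA, GroupC) instead of being sorted alphabetically, one option is an ordered categorical as a temporary sort key. This is a sketch of that idea, not part of the original answer; the names order, _key and result are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Name1', 'Name2', 'Name3', 'Name4', 'Name5',
             'Name6', 'Name7', 'Name7', 'Name7'],
    'group': ['GroupB', 'GroupB', 'GroupB', 'GroupA', 'GroupA',
              'GroupA', 'GroupC', 'GroupC', 'GroupC'],
    'revenue': [1, 2, 3, 4, 5, 6, 7, 8, 9],
})

# unique() returns the groups in order of first appearance.
order = df['group'].unique()

# An ordered categorical sorts by category position, not alphabetically.
key = pd.Categorical(df['group'], categories=order, ordered=True)

result = (
    df.assign(_key=key)  # temporary sort key
      .sort_values(['_key', 'revenue'], ascending=[True, False])
      .drop(columns='_key')
)
print(result)
```

A single sort_values call on two keys also avoids relying on the stability of a second, chained sort.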