How to Add a Column to a Grouped Dataframe in Pandas
In data analysis, it is often necessary to group data and perform calculations on each group. Pandas supports this through its groupby method. One common task is to count how often each value of a column occurs within each group, and then add a column holding the total size of each group to the resulting dataframe.
Consider the dataframe df:
import pandas as pd
df = pd.DataFrame({'c': [1, 1, 1, 2, 2, 2, 2], 'type': ['m', 'n', 'o', 'm', 'm', 'n', 'n']})
To count the values of type for each c, we can use the value_counts function on the grouped dataframe:
g = df.groupby('c')['type'].value_counts().reset_index(name='t')
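As a quick check of what this step produces, the following sketch splits the one-liner into its two parts. The intermediate name counts is introduced here for illustration only; the column name t comes from the name argument of reset_index.

```python
import pandas as pd

df = pd.DataFrame({'c': [1, 1, 1, 2, 2, 2, 2],
                   'type': ['m', 'n', 'o', 'm', 'm', 'n', 'n']})

# value_counts on the grouped column returns a Series indexed by (c, type)
counts = df.groupby('c')['type'].value_counts()

# reset_index moves c and type back into columns and names the counts column 't'
g = counts.reset_index(name='t')
print(g)
```

Every row of df is counted exactly once, so the values in t add up to the number of rows in df.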
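The reset_index call turns the counts into a new dataframe g with one row per (c, type) pair and the count in column t. To add a column to g with the size of each group, we can use the transform function on g itself:
g['size'] = g.groupby('c')['t'].transform('sum')
transform applies a function to each group and returns a Series whose index matches the dataframe it was called on, here g, so the result can be assigned directly as a new column. Summing the per-type counts t within each c gives the number of rows in that group. Computing the transform on df instead, as in df.groupby('c')['type'].transform('size'), returns one value per row of df; assigning that to g matches on index labels rather than on c, and it gives the right answer for this data only because df is sorted by c and the first group has no repeated types. The resulting dataframe g will now look like this: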
   c type  t  size
0  1    m  1     3
1  1    n  1     3
2  1    o  1     3
3  2    m  2     4
4  2    n  2     4
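Because transform returns a Series indexed like the dataframe it was called on, and column assignment matches on index labels, it matters which dataframe the transform is computed on. The sketch below compares both choices on a small hypothetical dataset (df2 and the column name size_from_df are not from the original example) in which the first group contains a repeated type.

```python
import pandas as pd

# Hypothetical data: group 1 has a repeated type, so the counts table
# has fewer rows for c == 1 than df2 does.
df2 = pd.DataFrame({'c': [1, 1, 1, 2, 2],
                    'type': ['m', 'm', 'n', 'm', 'n']})

g2 = df2.groupby('c')['type'].value_counts().reset_index(name='t')

# Computed on df2: one value per original row, indexed 0..4
by_row = df2.groupby('c')['type'].transform('size')

# Computed on g2: one value per (c, type) row, indexed like g2
by_pair = g2.groupby('c')['t'].transform('sum')

g2['size_from_df'] = by_row  # aligned on index labels, not on the group key c
g2['size'] = by_pair
print(g2)
```

In this sketch the first row with c equal to 2 picks up the size of group 1 in size_from_df, while size stays consistent with c.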
This demonstrates a straightforward way to add a new column to a grouped dataframe based on the results of a groupby aggregation.
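An alternative that does not rely on index alignment at all is to compute the group sizes separately and join them on the group key. This is a minimal sketch of that variant, not the method described above; counts_with_group_size is a hypothetical helper name.

```python
import pandas as pd

def counts_with_group_size(df, group_col, value_col):
    """Hypothetical helper: per-(group, value) counts plus each group's row count."""
    counts = df.groupby(group_col)[value_col].value_counts().reset_index(name='t')
    # size() counts the rows of each group in the original dataframe
    sizes = df.groupby(group_col).size().reset_index(name='size')
    # A left merge on the group key attaches the matching size to every counts row
    return counts.merge(sizes, on=group_col, how='left')

df = pd.DataFrame({'c': [1, 1, 1, 2, 2, 2, 2],
                   'type': ['m', 'n', 'o', 'm', 'm', 'n', 'n']})
result = counts_with_group_size(df, 'c', 'type')
print(result)
```

The merge matches rows on the values of c, so the result does not depend on how either dataframe is ordered or indexed.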