Dropping Rows from a pandas DataFrame
In pandas, we often need to remove certain rows from a dataframe, either to clean the data or to focus on a specific subset. One straightforward way to do this is the DataFrame.drop method, which removes rows by their index labels.
To demonstrate the process, let's consider a dataframe df:
import pandas as pd
df = pd.DataFrame({'sales': [2.709, 6.590, 10.103, 15.915, 3.196, 7.907],
                   'discount': [None, None, None, None, None, None],
                   'net_sales': [2.709, 6.590, 10.103, 15.915, 3.196, 7.907],
                   'cogs': [2.245, 5.291, 7.981, 12.686, 2.710, 6.459]})
print(df)
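Now, suppose we want to drop the rows at certain positions, given as a list such as [1, 2, 4]. Because df uses the default integer index, each row's position is also its label, so the list can be passed to drop directly:
indices_to_drop = [1, 2, 4]
# Rows can also be removed by condition with a boolean mask. The filtered result
# is kept in a separate variable here: overwriting df would remove labels 2 and 3
# and make the drop call on [1, 2, 4] raise a KeyError.
conditions_to_drop = df['sales'] > 10
df_without_large_sales = df[~conditions_to_drop]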
The index parameter of drop takes row labels, so the call below removes the rows labelled 1, 2 and 4. drop returns a new dataframe rather than modifying df in place, so we assign the result back to df:
df = df.drop(index=indices_to_drop)
print(df)
With the default integer index, the rows labelled 0, 3 and 5 remain, and the printed dataframe looks roughly like this:
    sales  discount  net_sales    cogs
0   2.709      None      2.709   2.245
3  15.915      None     15.915  12.686
5   7.907      None      7.907   6.459
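When the index is not the default 0..n-1 (for example, when rows are labelled by report date), positions and labels no longer coincide, and passing positions to drop would fail or remove the wrong rows. The following is a minimal sketch of one way to handle that case, assuming a made-up frame df_dates whose date labels are purely illustrative: it translates positions into labels with df_dates.index[...] before calling drop.

```python
import pandas as pd

# Hypothetical frame indexed by report date instead of 0..n-1.
# The date labels are assumptions made for this illustration.
df_dates = pd.DataFrame(
    {"sales": [2.709, 6.590, 10.103, 15.915, 3.196, 7.907]},
    index=[20060331, 20060630, 20060930, 20061231, 20070331, 20070630],
)

positions_to_drop = [1, 2, 4]

# Indexing the Index object with a list of positions yields the
# corresponding labels, which is what drop(index=...) expects.
labels_to_drop = df_dates.index[positions_to_drop]
remaining = df_dates.drop(index=labels_to_drop)

print(remaining)
```

If some labels might not be present, drop raises a KeyError by default; passing errors='ignore' makes it skip the missing labels instead.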