Selecting Columns in Pandas DataFrames
Data manipulation often requires working with only a subset of a DataFrame's columns. Pandas offers several ways to select them: by name, by integer position, or by positions looked up from names.
Option 1: Using Column Names
To select columns by name, pass a list of column names inside the indexing brackets:
df1 = df[['a', 'b']]
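As a minimal sketch of name-based selection, the example below builds a small DataFrame and selects two columns from it. The sample data and the column names 'a', 'b', and 'c' are placeholders invented for illustration. It also shows that a single name in single brackets returns a Series rather than a DataFrame.

```python
import pandas as pd

# Hypothetical sample data; the column names are placeholders.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

# A list of names returns a DataFrame containing just those columns.
df1 = df[["a", "b"]]

# A single name (no inner list) returns a Series instead.
s = df["a"]

print(list(df1.columns))
print(type(s).__name__)
```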
Option 2: Using Numerical Indices
If you know the columns' integer positions, use the iloc indexer to select them. Positions are zero-based, and the end of a slice is excluded.
df1 = df.iloc[:, 0:2] # Select columns with indices 0 and 1
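The sketch below illustrates positional selection on a small made-up DataFrame (the data and column names are assumptions for the example). It shows both a slice of positions and an explicit list of positions, which is useful when the columns are not adjacent.

```python
import pandas as pd

# Hypothetical sample data; the column names are placeholders.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

# Slice: positions 0 and 1 (the end position 2 is excluded).
first_two = df.iloc[:, 0:2]

# List: positions 0 and 2, which are not adjacent.
picked = df.iloc[:, [0, 2]]

print(list(first_two.columns))
print(list(picked.columns))
```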
Alternative Option: Indexing Using a Dictionary
If column positions may change, look up each position from the column name with get_loc and pass the results to iloc. This assumes the column names are unique:
column_dict = {df.columns.get_loc(c): c for c in ['a', 'b']}  # Maps position -> name
df1 = df.iloc[:, list(column_dict.keys())]
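To show why looking positions up by name helps, the sketch below resolves the positions of two columns, reorders the DataFrame, and resolves them again. The sample data, the `wanted` list, and the variable names are illustrative assumptions. It relies on column names being unique, in which case get_loc returns a single integer.

```python
import pandas as pd

# Hypothetical sample data; the column names are placeholders.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})
wanted = ["a", "c"]

# Resolve each name to its current integer position.
positions = [df.columns.get_loc(name) for name in wanted]
df1 = df.iloc[:, positions]

# After reordering the columns, the positions differ,
# but the lookup still finds the same columns.
reordered = df[["c", "b", "a"]]
positions_after = [reordered.columns.get_loc(name) for name in wanted]
df2 = reordered.iloc[:, positions_after]

print(positions, positions_after)
print(list(df2.columns))
```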
Approaches to Avoid
The following approaches do not select columns and should be avoided:
df1 = df['a':'b']  # Slices rows, not columns
df1 = df.ix[:, 'a':'b']  # .ix was deprecated and has been removed since pandas 1.0
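Slicing by column label does work when it goes through the loc indexer, which is one possible replacement for the patterns above. The sketch below uses made-up sample data. Unlike positional slices, a label slice includes both endpoints.

```python
import pandas as pd

# Hypothetical sample data; the column names are placeholders.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

# Label-based slice over columns: both 'a' and 'b' are included.
df1 = df.loc[:, "a":"b"]

print(list(df1.columns))
```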
Preserving Original Data
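Note that, depending on the indexing method and the pandas version, the selection may be a copy or a view that shares data with the original DataFrame. A slice through iloc, for example, may return a view. If you need an independent copy of the selected columns, call the copy() method explicitly: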
df1 = df.iloc[:, 0:2].copy()
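A short sketch, using made-up sample data, of what the explicit copy guarantees: changes made to the copy do not reach the original DataFrame.

```python
import pandas as pd

# Hypothetical sample data; the column names are placeholders.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

# Take an independent copy of the first two columns.
df1 = df.iloc[:, 0:2].copy()

# Modify the copy only.
df1.loc[0, "a"] = 100

print(df1.loc[0, "a"])  # 100
print(df.loc[0, "a"])   # 1, the original is unchanged
```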