How do I replace blank values (whitespace) with NaN in a Pandas DataFrame?

Published on 2024-11-08

Replacing Blank Values (Whitespace) with NaN in Pandas

A common data-cleaning task is converting blank values (empty or whitespace-only strings) to NaN, so that Pandas treats them as missing data in methods such as isna(), dropna(), and fillna().

To achieve this, use the DataFrame.replace() method with regex=True. With that flag set, the first argument is interpreted as a regular expression and matched against the string values in the DataFrame. Here's how you can implement it:

import numpy as np
import pandas as pd

df = pd.DataFrame([
    [-0.532681, 'foo', 0],
    [1.490752, 'bar', 1],
    [-1.387326, 'foo', 2],
    [0.814772, 'baz', ' '],
    [-0.222552, '   ', 4],
    [-1.176781, 'qux', '  '],
], columns='A B C'.split(), index=pd.date_range('2000-01-01','2000-01-06'))

# Replace fields that contain only whitespace (or are empty) with NaN
print(df.replace(r'^\s*$', np.nan, regex=True))

# Output:
#                    A    B   C
# 2000-01-01 -0.532681  foo   0
# 2000-01-02  1.490752  bar   1
# 2000-01-03 -1.387326  foo   2
# 2000-01-04  0.814772  baz NaN
# 2000-01-05 -0.222552  NaN   4
# 2000-01-06 -1.176781  qux NaN
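
One way to confirm that the replacement did what was intended is to count the missing values per column afterwards. The following is a minimal sketch on a smaller, made-up frame (the sample values and the names cleaned and missing_per_column are illustrative, not part of the example above):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one numeric column, two columns with blank fields
df = pd.DataFrame({
    'A': [-0.532681, 0.814772, -0.222552],
    'B': ['foo', 'baz', '   '],
    'C': [0, ' ', 4],
})

# Replace empty or whitespace-only fields with NaN
cleaned = df.replace(r'^\s*$', np.nan, regex=True)

# Count the NaN values in each column after cleaning
missing_per_column = cleaned.isna().sum()
print(missing_per_column)
```

Here columns B and C each contain one blank field, so each should report one missing value, while column A should report none.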

Note that the pattern r'^\s*$' matches only fields that are empty or consist entirely of whitespace; values that merely contain spaces, such as 'foo bar', are left unchanged. If empty strings should be kept and only whitespace-only fields replaced, use r'^\s+$' instead. Keep the trailing $ in either case: without it, r'^\s*' matches at the start of every string.
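
The difference between the two anchored patterns can be seen in a small sketch. The Series below and the variable names blank_or_empty and blank_only are assumptions made for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical values: empty, whitespace-only, text with an inner space,
# and text padded with spaces
s = pd.Series(['', '   ', 'foo bar', ' baz '], dtype=object)

# r'^\s*$' treats both empty and whitespace-only strings as blank
blank_or_empty = s.replace(r'^\s*$', np.nan, regex=True)

# r'^\s+$' requires at least one whitespace character, so '' is kept
blank_only = s.replace(r'^\s+$', np.nan, regex=True)

print(blank_or_empty.isna().tolist())
print(blank_only.isna().tolist())
```

In both cases 'foo bar' and ' baz ' are kept as they are, because the anchors require the whole field to be whitespace.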
