Replacing Newlines With Spaces For Str Columns Through Pandas Dataframe

June 16, 2024

Given an example dataframe with the 2nd and 3rd columns of free text, e.g.

>>> import pandas as pd
>>> lol = [[1,2,'abc','foo\nbar'], [3,1, 'def\nhaha', 'love it\

Solution 1:

Use replace: first strip the leading and trailing whitespace, then replace each remaining \n with a space:

df = df.replace({r'\s+$': '', r'^\s+': ''}, regex=True).replace(r'\n', ' ', regex=True)
print(df)
   0  1         2        3
0  1  2       abc  foo bar
1  3  1  def haha  love it
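The chained replace above can be tried end to end with the sketch below. The sample frame is hypothetical (the question's original data is cut off, so these values are only illustrative), and `cleaned` is a name chosen for this example.

```python
import pandas as pd

# Hypothetical sample data (not the question's exact frame): two numeric
# columns followed by two free-text columns containing newlines.
df = pd.DataFrame([
    [1, 2, "abc", "foo\nbar"],
    [3, 1, "def\nhaha", "  love it\n"],
])

# Pass 1 strips leading and trailing whitespace; pass 2 turns each remaining
# newline into a space. Numeric cells are not touched by the regex replace.
cleaned = (
    df.replace({r"^\s+": "", r"\s+$": ""}, regex=True)
      .replace(r"\n", " ", regex=True)
)
print(cleaned)
```

Because `replace` returns a new frame, the original `df` is left as it was unless the result is assigned back.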

Solution 2:

You can use select_dtypes to select the columns of type object and then apply applymap to those columns.

Neither function has an inplace argument, so assigning the result back to the dataframe is a workaround for changing it:

# applymap works element by element; since pandas 2.1 it is deprecated in favor of DataFrame.map
strs = df.select_dtypes(include=['object']).applymap(lambda x: x.replace('\n', ' ').strip())
df[strs.columns] = strs
df
#    0  1         2        3
# 0  1  2       abc  foo bar
# 1  3  1  def haha  love it
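One caveat with the element-wise approach is that an object column may hold non-string values, in which case calling `x.replace` on every cell would raise an AttributeError. The sketch below guards against that and also picks `DataFrame.map` when it is available. The frame, the column names, and the helper `clean_cell` are all hypothetical, and the explicit `astype` is only there so that `select_dtypes(include=['object'])` selects the same columns regardless of how the installed pandas infers string columns.

```python
import pandas as pd

# Hypothetical frame: "note" mixes text with a non-string value.
df = pd.DataFrame({
    "id": [1, 3],
    "note": ["foo\nbar", 42],
    "text": ["def\nhaha", " love it\n"],
}).astype({"note": object, "text": object})  # assumption: force object dtype


def clean_cell(value):
    """Flatten newlines and strip str cells; pass anything else through."""
    if isinstance(value, str):
        return value.replace("\n", " ").strip()
    return value


subset = df.select_dtypes(include=["object"])
# Use DataFrame.map where it exists (pandas 2.1+), else fall back to applymap.
elementwise = subset.map if hasattr(subset, "map") else subset.applymap
df[subset.columns] = elementwise(clean_cell)
print(df)
```

The `isinstance` check keeps the integer in "note" unchanged while the text cells around it are cleaned.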

Solution 3:

Adding to the other nice answers, this is a vectorized version of your initial idea:

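columns = [2, 3]
# apply() works column by column and returns an aligned frame,
# so every cleaned value stays in its own row
df.iloc[:, columns] = df.iloc[:, columns].apply(
    lambda col: col.str.strip().str.replace('\n', ' '))

Details:

In [49]: df.iloc[:, columns] = df.iloc[:, columns].apply(
    ...:     lambda col: col.str.strip().str.replace('\n', ' '))

In [50]: df
Out[50]: 
   0  1         2        3
0  1  2       abc  foo bar
1  3  1  def haha  love it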
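A side benefit of the `.str` accessor is that missing values pass through instead of raising. The following sketch illustrates this on a hypothetical frame; the column names and `text_cols` are assumptions made for the example, and `regex=False` is passed so the newline is treated as a literal on every pandas version.

```python
import pandas as pd

# Hypothetical data with a missing value in one of the text columns.
df = pd.DataFrame({
    "id": [1, 3],
    "text": ["foo\nbar", None],
    "more": [" def\nhaha", "love it\n"],
})

text_cols = ["text", "more"]  # assumed column names for this sketch

# Each column is cleaned with vectorized string methods; the missing cell
# stays missing rather than causing an error.
df[text_cols] = df[text_cols].apply(
    lambda col: col.str.strip().str.replace("\n", " ", regex=False)
)
print(df)
```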

Solution 4:

You can use the following approach, which combines two regex replacements in a single call:

>>> df.replace({r'\A\s+|\s+\Z': '', '\n': ' '}, regex=True, inplace=True)
>>> df
   0  1         2        3
0  1  2       abc  foo bar
1  3  1  def haha  love it

Details:

'\A\s+|\s+\Z' -> '' will act like strip() removing all leading and trailing whitespace:
- \A\s+ - matches 1 or more whitespace symbols at the start of the string
- | - or
- \s+\Z - matches 1 or more whitespace symbols at the end of the string
'\n' -> ' ' will replace any newline with a space.
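The two patterns described above can be checked outside pandas with Python's `re` module. This is a small sketch rather than part of the original answer; `STRIP_PATTERN`, `clean`, and the sample strings are names and values made up for the illustration.

```python
import re

# Same alternation as above: whitespace at the very start or the very end.
STRIP_PATTERN = re.compile(r"\A\s+|\s+\Z")


def clean(text):
    """Strip both ends with the regex, then replace newlines with spaces."""
    return STRIP_PATTERN.sub("", text).replace("\n", " ")


samples = ["foo\nbar", "  def\nhaha", "love it\n", "\n\t padded \n"]
for s in samples:
    # On these inputs the regex pass agrees with str.strip().
    assert STRIP_PATTERN.sub("", s) == s.strip()
    print(repr(s), "->", repr(clean(s)))
```

Stripping first matters: a trailing newline is removed by the first pattern, so the second replacement does not leave a trailing space behind.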
