Keep Elements With Pattern In Pandas Series Without Converting Them To List
I have the following dataframe: df = pd.DataFrame(['Air type:1, Space kind:2, water', 'something, Space blu:3, somethingelse'], columns = ['A']) and I want to create a new column
Solution 1:
You can use pd.Series.str.findall here.
df['new'] = df['A'].str.findall(r'\w+:\w+')
                                 A               new
0            type:1, kind:2, water  [type:1, kind:2]
1  something, blu:3, somethingelse           [blu:3]
EDIT:
When a key contains multiple words (e.g. 'Air type:1'), \w+:\w+ keeps only the last word before the colon, so try
df['new'] = df['A'].str.findall(r'[^\s,][^:,]+:[^:,]+').str.join(', ')
                                       A                       new
0        Air type:1, Space kind:2, water  Air type:1, Space kind:2
1  something, Space blu:3, somethingelse               Space blu:3
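To see why the multi-word pattern is needed, here is a minimal sketch that runs both patterns on the first row of the question's data using Python's re module directly. The variable names are illustrative only and are not part of the original answer.

```python
import re

row = 'Air type:1, Space kind:2, water'

# \w+ cannot cross a space, so only the last word before the colon is kept.
single_word = re.findall(r'\w+:\w+', row)

# [^\s,] starts the match on a character that is neither whitespace nor a comma,
# the first [^:,]+ runs up to the colon, and the second [^:,]+ runs up to the
# next comma (or the end of the string).
multi_word = re.findall(r'[^\s,][^:,]+:[^:,]+', row)

print(single_word)  # ['type:1', 'kind:2']
print(multi_word)   # ['Air type:1', 'Space kind:2']
```

Note that the multi-word pattern needs at least two characters before the colon, so a one-letter key such as 'a:1' would not be matched.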
Solution 2:
You can use findall with join:
import pandas as pd
df = pd.DataFrame(["type:1, kind:2, water", "something, blu:3, somethingelse"], columns = ['A'])
df['new'] = df['A'].str.findall(r'[^\s:,]+:[^\s,]+').str.join(', ')
df['new']
# => 0    type:1, kind:2
# => 1             blu:3
The regex matches:
[^\s:,]+ - one or more chars other than whitespace, : and ,
: - a colon
[^\s,]+ - one or more chars other than whitespace and ,
The .str.join(', ') call concatenates the matches found in each row with a comma followed by a space, so the new column holds plain strings rather than lists.
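As a quick check of what each step returns, the sketch below reuses the Solution 2 pattern. The third row ("water only") is an assumption added for illustration, to show what happens when nothing matches.

```python
import pandas as pd

df = pd.DataFrame(
    ["type:1, kind:2, water", "something, blu:3, somethingelse", "water only"],
    columns=["A"],
)

pattern = r"[^\s:,]+:[^\s,]+"

matches = df["A"].str.findall(pattern)  # each element is a Python list
joined = matches.str.join(", ")         # each element is a plain string

print(matches.tolist())
# [['type:1', 'kind:2'], ['blu:3'], []]
print(joined.tolist())
# ['type:1, kind:2', 'blu:3', '']
```

Rows without any match come out as an empty list after findall and as an empty string after join.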