Keep Elements With Pattern In Pandas Series Without Converting Them To List
I have the following dataframe: df = pd.DataFrame(['Air type:1, Space kind:2, water', 'something, Space blu:3, somethingelse'], columns = ['A']) and I want to create a new column
Solution 1:
You can use pd.Series.str.findall here. Use a raw string for the pattern so that \w is not treated as a string escape.
df['new'] = df['A'].str.findall(r'\w+:\w+')
                                 A               new
0            type:1, kind:2, water  [type:1, kind:2]
1  something, blu:3, somethingelse           [blu:3]
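Since the question asks for the result not to be a list, here is a minimal sketch of the difference, assuming the single-word data shown in the output above. The variable names matches and joined are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    ["type:1, kind:2, water", "something, blu:3, somethingelse"],
    columns=["A"],
)

# str.findall returns one Python list of matches per row.
matches = df["A"].str.findall(r"\w+:\w+")

# str.join collapses each per-row list back into a single string.
joined = matches.str.join(", ")

print(matches.tolist())
print(joined.tolist())
```

Assigning joined to a new column instead of matches keeps the column as plain strings.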
EDIT:
When a key consists of multiple words, try this instead:
df['new'] = df['A'].str.findall(r'[^\s,][^:,]+:[^:,]+').str.join(', ')
                                       A                       new
0        Air type:1, Space kind:2, water  Air type:1, Space kind:2
1  something, Space blu:3, somethingelse               Space blu:3
Solution 2:
You can use findall with join:
import pandas as pd
df = pd.DataFrame(["type:1, kind:2, water", "something, blu:3, somethingelse"], columns = ['A'])
df['new'] = df['A'].str.findall(r'[^\s:,]+:[^\s,]+').str.join(', ')
df['new']
# => 0 type:1, kind:2
# => 1 blu:3
The regex matches:
[^\s:,]+ - one or more chars other than whitespace, : and ,
: - a colon
[^\s,]+ - one or more chars other than whitespace and ,
See the regex demo.
The .str.join(', ') call concatenates all the matches found in each row, separated by a comma and a space.
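As a rough check of how the two patterns on this page differ, the sketch below applies both to the multi-word strings from the question. It uses Python's re.findall directly, which is what Series.str.findall applies to each element; the names rows, multi_word and single_word are illustrative, not part of either answer.

```python
import re

rows = [
    "Air type:1, Space kind:2, water",
    "something, Space blu:3, somethingelse",
]

# Pattern from the EDIT in Solution 1: the key may contain spaces.
multi_word = re.compile(r"[^\s,][^:,]+:[^:,]+")
# Pattern from Solution 2: neither key nor value may contain whitespace.
single_word = re.compile(r"[^\s:,]+:[^\s,]+")

# Join the matches of each row with ", " so every result is a plain string.
multi_results = [", ".join(multi_word.findall(row)) for row in rows]
single_results = [", ".join(single_word.findall(row)) for row in rows]

print(multi_results)
print(single_results)
```

The Solution 2 pattern drops the leading words "Air" and "Space", so it fits only when each key is a single word. Note also that the first pattern needs at least two characters before the colon.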