Regex To Remove Specific Words In Python
I want to do some string manipulation using regex in Python. The input is +1223,+12_remove_me,+222,+2223_remove_me and the output should be +1223,+222. Output should only contain comma se
Solution 1:
Your regex seems incomplete, but you were on the right track. Note that a pipe symbol inside a character class is treated as a literal, so your [0-9|+] matches a digit, a |, or a + symbol.
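A quick way to confirm the character-class behaviour described above is to test the class on its own. This is a minimal sketch; the sample string 'a|1+b' is made up for illustration and does not come from the question.

```python
import re

# Inside [...] the pipe is not an alternation operator, just another character.
char_class = re.compile(r'[0-9|+]')

pipe_matches = bool(char_class.fullmatch('|'))
found = char_class.findall('a|1+b')  # hypothetical sample input

print(pipe_matches)  # True
print(found)         # ['|', '1', '+']
```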
You may use
,?\+\d+_[^,]+
See the regex demo
Explanation:
,? - an optional comma (optional because the "word" may be at the beginning of the string)
\+ - a literal +
\d+ - 1+ digits
_ - a literal underscore
[^,]+ - 1+ characters other than a comma
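The same breakdown can be kept next to the pattern itself by compiling it with re.VERBOSE, which ignores whitespace in the pattern and allows comments. This is only a sketch of an alternative spelling; the variable names are illustrative, and findall is used here just to show which pieces of the sample input the pattern picks up.

```python
import re

# Same pattern as above, spread over several lines with re.VERBOSE.
pattern = re.compile(r"""
    ,?      # optional leading comma
    \+      # literal plus sign
    \d+     # one or more digits
    _       # literal underscore
    [^,]+   # one or more characters other than a comma
""", re.VERBOSE)

matches = pattern.findall("+1223,+12_remove_me,+222,+2223_remove_me")
print(matches)  # [',+12_remove_me', ',+2223_remove_me']
```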
import re
p = re.compile(r',?\+\d+_[^,]+')
test_str = "+1223,+12_remove_me,+222,+2223_remove_me"
result = p.sub("", test_str)
print(result)
# => +1223,+222
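One caveat: when the very first item is the one being removed, there is no comma in front of it to consume, so the comma after it is left behind at the start of the result. A possible workaround, sketched below, is to trim stray commas from the ends after the substitution. The helper name remove_tagged and the second test string are assumptions made for this example, not part of the original answer.

```python
import re

PATTERN = re.compile(r',?\+\d+_[^,]+')

def remove_tagged(text):
    """Drop every '+digits_suffix' item, then trim a comma left at either end."""
    return PATTERN.sub("", text).strip(",")

# Without .strip(","), the second call would yield ",+222".
print(remove_tagged("+1223,+12_remove_me,+222,+2223_remove_me"))  # +1223,+222
print(remove_tagged("+12_remove_me,+222,+2223_remove_me"))        # +222
```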
Solution 2:
A non-regex approach would involve using str.split() and excluding items ending with _remove_me:
>>> s = "+1223,+12_remove_me,+222,+2223_remove_me"
>>> items = [item for item in s.split(",") if not item.endswith("_remove_me")]
>>> items
['+1223', '+222']
Or, if _remove_me can be present anywhere inside each item, use not in:
>>> items = [item for item in s.split(",") if "_remove_me" not in item]
>>> items
['+1223', '+222']
You can then use str.join() to join the items into a string again:
>>> ",".join(items)
'+1223,+222'
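The two filters in this solution only differ when the marker is not at the very end of an item. The sketch below uses a made-up item, +5_remove_me_later, that is not in the original input, purely to show the difference.

```python
# "+5_remove_me_later" is a hypothetical item added for illustration.
s = "+1223,+12_remove_me,+5_remove_me_later,+222"

# Suffix check: keeps items where the marker is followed by more text.
by_suffix = [item for item in s.split(",") if not item.endswith("_remove_me")]

# Substring check: drops any item containing the marker anywhere.
by_substring = [item for item in s.split(",") if "_remove_me" not in item]

print(",".join(by_suffix))     # +1223,+5_remove_me_later,+222
print(",".join(by_substring))  # +1223,+222
```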
Solution 3:
Solution 4:
You could perform this without a regex, just using string manipulation. The following can be written as a one-liner, but has been expanded for readability.
my_string = '+1223,+12_remove_me,+222,+2223_remove_me'  # define string
my_list = my_string.split(',')  # create a list of words
my_list = [word for word in my_list if '_remove_me' not in word]  # stop here if you want a list of words
output_string = ','.join(my_list)
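For reference, the one-liner form mentioned above might look like the following sketch. It passes a generator expression straight to str.join(), so no intermediate list is kept.

```python
my_string = '+1223,+12_remove_me,+222,+2223_remove_me'

# Split, filter, and re-join in a single expression.
output_string = ','.join(word for word in my_string.split(',') if '_remove_me' not in word)

print(output_string)  # +1223,+222
```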