Regex To Remove Specific Words In Python
I want to do some manipulation using regex in Python. The input is +1223,+12_remove_me,+222,+2223_remove_me and the output should be +1223,+222. Output should only contain comma se
Solution 1:
Your regex seems incomplete, but you were on the right track. Note that a pipe symbol inside a character class is treated as a literal, so your [0-9|+] matches a digit, a | or a + symbol.
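A quick way to confirm the character-class behaviour described above is to run the class against a short string. This is only an illustrative check; the test string "1|+a" and the variable names are made up for the example.

```python
import re

# Inside [...] the pipe is an ordinary character, not alternation.
char_class = re.compile(r"[0-9|+]")

# Hypothetical test string containing a digit, a pipe, a plus and a letter.
matches = char_class.findall("1|+a")
print(matches)  # ['1', '|', '+']
```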
You may use
,?\+\d+_[^,]+
Explanation:
,? - an optional , (if the "word" is at the beginning of the string, it should be optional)
\+ - a literal +
\d+ - 1+ digits
_ - a literal underscore
[^,]+ - 1+ characters other than ,
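The same breakdown can be written with re.VERBOSE so that each piece carries its own comment. The sketch below also shows one edge case worth knowing: when the removed item is the first one in the string, the comma that followed it stays behind, so stripping commas afterwards may be needed. The second input string and the variable names are assumptions made for illustration.

```python
import re

pattern = re.compile(r"""
    ,?      # optional leading comma
    \+      # literal plus sign
    \d+     # one or more digits
    _       # literal underscore
    [^,]+   # one or more characters other than a comma
""", re.VERBOSE)

cleaned = pattern.sub("", "+1223,+12_remove_me,+222,+2223_remove_me")
print(cleaned)  # +1223,+222

# Hypothetical input where the unwanted item comes first:
# the comma after it is not consumed by the pattern.
leading = pattern.sub("", "+12_remove_me,+222")
print(leading)             # ,+222
print(leading.strip(","))  # +222
```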
import re
p = re.compile(r',?\+\d+_[^,]+')
test_str = "+1223,+12_remove_me,+222,+2223_remove_me"
result = p.sub("", test_str)
print(result)
# => +1223,+222
Solution 2:
A non-regex approach would involve using str.split() and excluding items ending with _remove_me:
>>> s = "+1223,+12_remove_me,+222,+2223_remove_me"
>>> items = [item for item in s.split(",") if not item.endswith("_remove_me")]
>>> items
['+1223', '+222']
Or, if _remove_me can be present anywhere inside each item, use not in:
>>> items = [item for item in s.split(",") if "_remove_me" not in item]
>>> items
['+1223', '+222']
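The two filters only differ when the marker is not at the very end of an item. A small sketch, using a made-up input in which one item has extra text after _remove_me:

```python
# Hypothetical input: the second item has text after the marker.
sample = "+1223,+12_remove_me_later,+222,+2223_remove_me"
parts = sample.split(",")

# Drops only items that end with the marker.
by_suffix = [p for p in parts if not p.endswith("_remove_me")]
# Drops items that contain the marker anywhere.
by_substring = [p for p in parts if "_remove_me" not in p]

print(by_suffix)     # ['+1223', '+12_remove_me_later', '+222']
print(by_substring)  # ['+1223', '+222']
```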
You can then use str.join() to join the items into a string again:
>>> ",".join(items)
'+1223,+222'
Solution 3:
Solution 4:
You could perform this without a regex, just using string manipulation. The following can be written as a one-liner, but has been expanded for readability.
my_string = '+1223,+12_remove_me,+222,+2223_remove_me'  # define string
my_list = my_string.split(',')  # create a list of words
my_list = [word for word in my_list if '_remove_me' not in word]  # stop here if you want a list of words
output_string = ','.join(my_list)
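For reference, the one-liner form mentioned above might look like the following when wrapped in a small helper. The function name drop_marked and its marker and sep parameters are hypothetical, chosen only for this sketch.

```python
def drop_marked(text, marker="_remove_me", sep=","):
    """Return text with every sep-separated item containing marker removed."""
    return sep.join(item for item in text.split(sep) if marker not in item)


output_string = drop_marked("+1223,+12_remove_me,+222,+2223_remove_me")
print(output_string)  # +1223,+222
```

Because the string is split and rejoined, no stray separators are left behind, regardless of where the removed items sit.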