
Regex To Remove Specific Words In Python

I want to do some manipulation using regex in Python. The input is +1223,+12_remove_me,+222,+2223_remove_me and the output should be +1223,+222. Output should only contain comma se

Solution 1:

Your regex seems incomplete, but you were on the right track. Note that a pipe symbol inside a character class is treated as a literal, so your [0-9|+] matches a digit, a |, or a + symbol.
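To see the character-class behavior in isolation, here is a small sketch. The sample string "1|+a" is made up for illustration and is not from the question.

```python
import re

# Inside a character class, | is an ordinary character, not alternation.
char_class = re.compile(r'[0-9|+]')
matches = char_class.findall("1|+a")
print(matches)  # ['1', '|', '+']

# Dropping the pipe gives "a digit or a plus sign", which is likely what was meant.
plain_class = re.compile(r'[0-9+]')
plain_matches = plain_class.findall("1|+a")
print(plain_matches)  # ['1', '+']
```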

You may use

,?\+\d+_[^,]+

See the regex demo

Explanation:

  • ,? - an optional , (optional because the "word" may be at the beginning of the string, with no comma before it)
  • \+ - a literal +
  • \d+ - 1+ digits
  • _ - a literal underscore
  • [^,]+ - 1+ characters other than ,

Python demo:

import re
p = re.compile(r',?\+\d+_[^,]+')
test_str = "+1223,+12_remove_me,+222,+2223_remove_me"
result = p.sub("", test_str)
print(result)
# => +1223,+222
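One edge case may be worth checking: when the item to remove comes first, the optional comma matches nothing, so the comma that follows the item is left behind. The sketch below assumes that stripping stray commas from the ends of the result is acceptable; remove_tagged is a hypothetical helper name, not part of the original answer.

```python
import re

# Same pattern as in the answer above.
TAGGED_ITEM = re.compile(r',?\+\d+_[^,]+')

def remove_tagged(text):
    """Hypothetical helper: drop +<digits>_<suffix> items, then trim stray commas."""
    return TAGGED_ITEM.sub("", text).strip(",")

# The plain substitution leaves a leading comma when the first item is removed.
print(TAGGED_ITEM.sub("", "+12_remove_me,+222"))  # ,+222

print(remove_tagged("+12_remove_me,+222"))  # +222
print(remove_tagged("+1223,+12_remove_me,+222,+2223_remove_me"))  # +1223,+222
```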

Solution 2:

A non-regex approach would involve using str.split() and excluding items ending with _remove_me:

>>> s = "+1223,+12_remove_me,+222,+2223_remove_me"
>>> items = [item for item in s.split(",") if not item.endswith("_remove_me")]
>>> items
['+1223', '+222']

Or, if _remove_me can be present anywhere inside each item, use not in:

>>> items = [item for item in s.split(",") if "_remove_me" not in item]
>>> items
['+1223', '+222']

You can then use str.join() to join the items into a string again:

>>> ",".join(items)
'+1223,+222'

Solution 3:

In your case you need a regex with negation:

[^(_remove_me)]

Demo
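As written, [^(_remove_me)] is a negated character class: it matches any single character that is not one of (, ), _, r, e, m, o, v, rather than excluding the word _remove_me. A negative lookahead is one way to express "an item that does not contain _remove_me". The following is a sketch of that swapped-in technique, not the pattern from this answer; the variable names are illustrative.

```python
import re

s = "+1223,+12_remove_me,+222,+2223_remove_me"

# A negated character class excludes single characters, not a whole word.
class_matches = re.findall(r'[^(_remove_me)]', "+12_remove_me")
print(class_matches)  # ['+', '1', '2']

# Negative lookahead: start at the beginning of the string or right after a comma,
# refuse the item if _remove_me appears before the next comma, then take the item.
keep = re.compile(r'(?:^|(?<=,))(?![^,]*_remove_me)[^,]+')
kept = keep.findall(s)
print(kept)  # ['+1223', '+222']
print(",".join(kept))  # +1223,+222
```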

Solution 4:

You could perform this without a regex, just using string manipulation. The following can be written as a one-liner, but has been expanded for readability.

my_string = '+1223,+12_remove_me,+222,+2223_remove_me'  # define string
my_list = my_string.split(',')  # create a list of words
my_list = [word for word in my_list if '_remove_me' not in word]  # stop here if you want a list of words
output_string = ','.join(my_list)
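For reference, the one-liner form mentioned above might look like this; it is a sketch that passes a generator expression straight to str.join() instead of building the intermediate list.

```python
my_string = '+1223,+12_remove_me,+222,+2223_remove_me'

# Split on commas, keep the words without the marker, and join them back in one step.
output_string = ','.join(word for word in my_string.split(',') if '_remove_me' not in word)
print(output_string)  # +1223,+222
```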
