
Using Python To Remove All Lines Matching Regex

I'm attempting to remove all lines where my regex matches (the regex simply looks for any line that has yahoo in it). Each match is on its own line, so there's no need for the mu

Solution 1:

Use the fileinput module if you want to modify the original file in place:

import re
import fileinput
for line in fileinput.input(r'C:\temp\Scripts\remove.txt', inplace = True):
   if not re.search(r'\byahoo\b', line):
      print(line, end="")
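
Note that `\byahoo\b` only matches yahoo as a whole word, and the match is case-sensitive. A minimal sketch of the difference from a plain `yahoo` pattern, using made-up sample lines (not taken from the asker's file):

```python
import re

# Hypothetical sample lines, for illustration only.
lines = ["mail.yahoo.com\n", "yahoomail\n", "Yahoo news\n", "google.com\n"]

whole_word = re.compile(r'\byahoo\b')  # pattern from the answer above
anywhere = re.compile(r'yahoo')        # plain substring match

# Keep the lines that do NOT match, as the loop above does.
kept_whole_word = [line for line in lines if not whole_word.search(line)]
kept_anywhere = [line for line in lines if not anywhere.search(line)]

print(kept_whole_word)  # "yahoomail" survives: no word boundary after "yahoo"
print(kept_anywhere)    # "Yahoo news" survives both: the match is case-sensitive
```

If lines such as "yahoomail" or "Yahoo" should be removed as well, drop the `\b` anchors or pass `re.IGNORECASE`.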

Solution 2:

Here's a Python 3 variant of @Ashwini Chaudhary's answer that removes all lines containing a regex pattern from a given file:

#!/usr/bin/env python3
"""Usage: remove-pattern <pattern> <file>"""
import fileinput
import re
import sys

def main():
    pattern, filename = sys.argv[1:] # get pattern, filename from command-line
    matched = re.compile(pattern).search
    with fileinput.FileInput(filename, inplace=1, backup='.bak') as file:
        for line in file:
            if not matched(line):  # save lines that do not match
                print(line, end='')  # this goes to filename due to inplace=1

main()

It assumes locale.getpreferredencoding(False) == input_file_encoding; otherwise it might break on non-ASCII characters.
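
A minimal sketch of what "break" can look like, assuming for illustration a UTF-8 file that ends up being decoded as ASCII or Latin-1 (the sample text is made up):

```python
# Bytes as they would sit on disk in a UTF-8 encoded file.
data = "yahoo café\n".encode("utf-8")

# Decoding with a codec that cannot represent the bytes fails outright.
try:
    data.decode("ascii")
    failure = None
except UnicodeDecodeError as exc:
    failure = type(exc).__name__

# Decoding with the wrong 8-bit codec "works" but silently garbles the text.
garbled = data.decode("latin-1")

print(failure)        # name of the exception raised by the ASCII decode
print(repr(garbled))  # the accented character comes back as two wrong characters
```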

To make it work regardless of the current locale, or for input files that have a different encoding:

#!/usr/bin/env python3
import os
import re
import sys
from tempfile import NamedTemporaryFile

def main():
    encoding = 'utf-8'
    pattern, filename = sys.argv[1:]
    matched = re.compile(pattern).search
    with open(filename, encoding=encoding) as input_file:
        with NamedTemporaryFile(mode='w', encoding=encoding,
                                dir=os.path.dirname(filename),
                                delete=False) as outfile:
            for line in input_file:
                ifnot matched(line):
                    print(line, end='', file=outfile)
    os.replace(outfile.name, input_file.name)

main()
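
The same temp-file-then-replace idea can be wrapped in a function so it is easy to try on a scratch file before pointing it at real data. This is only a sketch: the name `remove_matching_lines` and the returned count are my additions, not part of the script above.

```python
import os
import re
import tempfile


def remove_matching_lines(filename, pattern, encoding='utf-8'):
    """Rewrite filename without the lines that match pattern.

    Returns the number of removed lines. Hypothetical helper, not from
    the original answer.
    """
    matched = re.compile(pattern).search
    removed = 0
    # Create the temp file next to the target so os.replace() stays on
    # one filesystem.
    directory = os.path.dirname(os.path.abspath(filename))
    with open(filename, encoding=encoding) as input_file:
        with tempfile.NamedTemporaryFile(mode='w', encoding=encoding,
                                         dir=directory,
                                         delete=False) as outfile:
            for line in input_file:
                if matched(line):
                    removed += 1
                else:
                    outfile.write(line)
    # Both files are closed here; swap the filtered copy into place.
    os.replace(outfile.name, filename)
    return removed
```

For example, `remove_matching_lines('remove.txt', r'\byahoo\b')` would rewrite remove.txt in the current directory and return how many lines were dropped.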

Solution 3:

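You have to read the file first, then write the filtered text back. Try something like:

import re

path = 'C:\\temp\\Scripts\\remove.txt'

# Read first: opening the file with 'w' would truncate it before it can be read.
with open(path, encoding="utf8") as inputfile:
    text = inputfile.read()

# Remove every line that contains "yahoo", including its trailing newline.
with open(path, 'w', encoding="utf8") as outputfile:
    outputfile.write(re.sub(r"(?m)^.*yahoo.*\n?", "", text))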
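
Before overwriting a real file, the whole-text substitution can be checked on an in-memory string. A minimal sketch, assuming the goal is to drop any line containing yahoo; the pattern, the function name `strip_yahoo_lines`, and the sample text are assumptions for illustration:

```python
import re

# MULTILINE makes ^ match at the start of every line; "." never crosses a
# newline, so each match covers exactly one line plus its line ending.
LINE_WITH_YAHOO = re.compile(r'^.*yahoo.*\n?', re.MULTILINE)


def strip_yahoo_lines(text):
    """Return text with every line containing 'yahoo' removed."""
    return LINE_WITH_YAHOO.sub('', text)


sample = "keep me\nhttp://yahoo.com\nalso keep\nyahoo at the end"
cleaned = strip_yahoo_lines(sample)
print(repr(cleaned))
```

The trailing `\n?` also handles a last line that has no newline at the end.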
