
Python Truncate Lines As They Are Read

I have an application that reads lines from a file and runs its magic on each line as it is read. Once the line is read and properly processed, I would like to delete the line from

Solution 1:

Remove all lines once you're done with them:

with open('myfile.txt', 'r+') as file:
    for line in file:
        processLine(line)
    file.truncate(0)

Remove each line independently:

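with open('myfile.txt') as file:
    lines = file.readlines()

try:
    for line in lines[::-1]:  # process lines in reverse order
        processLine(line)
        del lines[-1]  # remove the [last] line
finally:
    # write back whatever is still unprocessed, even after an exception
    with open('myfile.txt', 'w') as file:
        file.writelines(lines)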

You can leave only those lines that cause exceptions:

import fileinput
import sys

for line in fileinput.input(['myfile.txt'], inplace=True):
    try:
        processLine(line)
    except Exception:
        sys.stdout.write(line)  # it prints to 'myfile.txt'

In general, as other people have already said, what you are trying to do is a bad idea.

Solution 2:

You can't. It is just not possible with the way text files are implemented on current filesystems.

Text files are sequential, because the lines in a text file can be of any length. Deleting a particular line would mean rewriting the entire file from that point on.

Suppose you have a file with the following 4 lines:

'line1\nline2reallybig\nline3\nlast line'

To delete the second line you'd have to move the third and fourth lines to new positions on disk. The only way to do that is to store the third and fourth lines somewhere, truncate the file at the start of the second line, and rewrite the stored lines.
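
The store, truncate, and rewrite steps can be sketched roughly as below. This is only an illustration of why the operation is expensive, not a recommendation. `delete_line` is a hypothetical helper name, line numbers are assumed to be 0-based, and everything after the deleted line is assumed to fit in memory.

```python
def delete_line(path, line_number):
    """Delete the 0-based line `line_number` from the file at `path`.

    Hypothetical helper: the lines after the deleted one are read into
    memory, the file is cut where the deleted line starts, and the
    stored lines are written back.
    """
    with open(path, 'r+b') as f:
        # Skip the lines that stay untouched.
        for _ in range(line_number):
            f.readline()
        start = f.tell()   # byte offset where the line to delete begins
        f.readline()       # consume the line being deleted
        tail = f.read()    # store the lines that follow it
        f.seek(start)
        f.truncate()       # cut the file at the start of the deleted line
        f.write(tail)      # rewrite the stored lines
```

Calling `delete_line('myfile.txt', 1)` on the example file above would rewrite both 'line3' and 'last line', which is the cost this answer is describing.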

If you know the size of every line in the text file, you can truncate the file in any position using .truncate(line_size * line_number) but even then you'd have to rewrite everything after the line.

Solution 3:

You're better off keeping an index into the file so that you can start where you stopped last time, without destroying part of the file. Something like this would work:

try:
    for index, line in enumerate(file):
        processLine(line)
except Exception:
    # Failed, start from this line number next time.
    print(index)
    raise

Solution 4:

Truncating the file as you read it seems a bit extreme. What if your script has a bug that doesn't cause an error? In that case you'll want to restart at the beginning of your file.

How about having your script print the line number it breaks on and having it take a line number as a parameter so you can tell it which line to start processing from?
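
A minimal sketch of that idea, assuming 0-based line numbers. The names `run` and `process_line` are hypothetical, and `process_line` is only a placeholder for whatever the real script does with each line.

```python
import sys

def process_line(line):
    # Placeholder for the real per-line work (hypothetical).
    pass

def run(path, start=0, process=process_line):
    """Process lines of `path` from 0-based line `start` onward.

    Returns the number of the line that failed, or None if every
    line was processed.
    """
    with open(path) as f:
        for index, line in enumerate(f):
            if index < start:
                continue  # already handled in an earlier run
            try:
                process(line)
            except Exception as exc:
                # Report where to restart, then stop.
                print(f"failed on line {index}: {exc}", file=sys.stderr)
                return index
    return None
```

A command-line wrapper could then read the start line from `sys.argv` and pass it as `start`, so a rerun after a fix picks up at the line that was printed.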

Solution 5:

First of all, calling the operation truncate is probably not the best pick. If I understand the problem correctly, you want to delete everything up to the current position in the file. (I would expect truncate to cut everything from the current position up to the end of the file. This is how the standard Python truncate method works, at least if I Googled correctly.)

Second, I am not sure it is wise to modify the file while iterating over it with the for loop. Wouldn't it be better to save the number of lines processed and delete them after the main loop has finished, exception or not? The fileinput module supports in-place filtering, which means it should be fairly simple to drop the processed lines afterwards.
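
One way this might look, as a rough sketch rather than a tested solution: count the lines that were processed successfully and rewrite the file once in a `finally` block. The name `process_and_drop` is hypothetical, the sketch rewrites the file directly instead of using in-place filtering, and it assumes the whole file fits in memory.

```python
def process_and_drop(path, process):
    """Process each line of `path`, then keep only the unprocessed lines.

    Hypothetical helper: `process` is any callable taking one line.
    """
    with open(path) as f:
        lines = f.readlines()
    done = 0  # number of lines processed successfully
    try:
        for line in lines:
            process(line)
            done += 1
    finally:
        # Runs whether or not process() raised: drop the processed
        # lines and leave the rest in the file.
        with open(path, 'w') as f:
            f.writelines(lines[done:])
```

If `process` raises on some line, that line and everything after it remain in the file, and the exception still propagates to the caller.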

P.S. I don’t know Python, take this with a grain of salt.
