Adding Entries From Multiple Files In Python

I have a question: how do I add entries from 100 files (each file contains two columns) and then write them to a new file (which will also contain two columns)?

Solution 1:

This is very underspecified. It's not clear what your problem is.

Probably you'd do something like:

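entries = []
# "..." stands for the remaining file names; list them all in real code
for f in ["file1.txt", "file2.txt", ..., "file100.txt"]:
    with open(f) as fh:
        entries.extend(fh.readlines())  # extend keeps one flat list of lines
with open("output.txt", "w") as o:
    o.writelines(entries)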
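
The list of 100 names does not have to be typed out. A minimal sketch, assuming the files really are named file1.txt through file100.txt as the snippet above suggests (the helper name numbered_filenames is made up for illustration):

```python
def numbered_filenames(count, prefix="file", suffix=".txt"):
    """Return prefix1suffix ... prefix<count>suffix (assumed naming scheme)."""
    return ["%s%d%s" % (prefix, i, suffix) for i in range(1, count + 1)]


filenames = numbered_filenames(100)
print(filenames[0], filenames[-1], len(filenames))
```

If the real names follow a different pattern, only the prefix and suffix arguments need to change.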

Solution 2:

I wasn't sure whether you also needed a way to find all 100 files. If so, here is one approach that finds them, reads them all, and writes them to a single joined file:

from os import walk
from os.path import abspath

lines = []
for root, folders, files in walk('./path/'):
    for file in files:
        fh = open(abspath(root + '/' + file), 'rb')
        lines.append(fh.read())
        fh.close()
    # break if you only want the first level of your directory tree

o = open('output.txt', 'wb')
o.write(b'\n'.join(lines))  # files were read as bytes, so join with a bytes separator
o.close()

You could also do a "memory efficient" solution:

from os import walk
from os.path import abspath

o = open('output.txt', 'wb')

for root, folders, files in walk('./path/'):
    for file in files:
        fh = open(abspath(root + '/' + file), 'rb')
        for line in fh:  # readline() would return only the first line
            o.write(line)
            del line
        fh.close()
        del fh
    # break if you only want the first level of your directory tree

o.close()

Much of this is automated (I think) within Python, but lazy or not, if you can, remove objects from memory after closing the files and before reusing variable names, just in case.
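
On the "just in case" point: del only removes a name, so what really matters is that close() gets called. A with block does that on exit, including when the block is left through an exception. A small self-contained sketch (the temporary file and its contents are arbitrary):

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.txt")
with open(path, "w") as fh:
    fh.write("1 2\n3 4\n")

# The file object still exists after the block, but it is already closed.
closed_after_with = fh.closed

# The same holds when the block is left through an exception.
try:
    with open(path) as fh2:
        raise ValueError("simulated failure while reading")
except ValueError:
    pass
closed_after_error = fh2.closed

print(closed_after_with, closed_after_error)

# Clean up the temporary file and directory.
os.remove(path)
os.rmdir(tmp_dir)
```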

Solution 3:

A more scalable way, inspired by Torxed's approach:

from os import walk
from os.path import abspath

with open('output.txt', 'wb') as o:
    for root, folders, files in walk('./path/'):
        for filename in files:
            with open(abspath(root + '/' + filename), 'rb') as i:
                for line in i:
                    o.write(line)
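
If the files are only being concatenated byte for byte, the inner line loop can be handed to the standard library: shutil.copyfileobj copies between two open file objects in fixed-size chunks. A sketch under that assumption; the function name concatenate_tree and the temporary directory layout are made up for the demonstration, and the output file is deliberately kept outside the directory being walked so it is not read back in:

```python
import os
import shutil
import tempfile


def concatenate_tree(top, output_path):
    """Append every file found under `top` to `output_path` (hypothetical helper)."""
    with open(output_path, "wb") as out:
        for root, folders, files in os.walk(top):
            folders.sort()  # make the traversal order predictable
            for filename in sorted(files):
                with open(os.path.join(root, filename), "rb") as src:
                    shutil.copyfileobj(src, out)  # chunked copy, not line by line


# Demonstration with two small files in a temporary directory.
work = tempfile.mkdtemp()
source_dir = os.path.join(work, "path")
os.mkdir(source_dir)
for name, text in [("a.txt", b"1 2\n"), ("b.txt", b"3 4\n")]:
    with open(os.path.join(source_dir, name), "wb") as fh:
        fh.write(text)

output_file = os.path.join(work, "output.txt")
concatenate_tree(source_dir, output_file)

with open(output_file, "rb") as fh:
    result = fh.read()
print(result)

shutil.rmtree(work)  # remove the demonstration files
```

Sorting the file names is a choice made here so the output order is reproducible; os.walk itself makes no promise about the order of entries within a directory.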

Solution 4:

Do you want to chain them? I.e., do you want all lines of file 1, then all lines of file 2, ... Or do you want to merge them? Line 1 of file 1, line 1 of file 2, ...
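
To make the distinction concrete, here is a small sketch with in-memory lists standing in for the files (the sample lines are invented; zip_longest is the Python 3 name, Python 2 calls it izip_longest):

```python
from itertools import chain, zip_longest

# Two tiny "files", each a list of two-column lines.
file1 = ["a1 b1\n", "a2 b2\n"]
file2 = ["x1 y1\n", "x2 y2\n", "x3 y3\n"]

# Chaining: everything from file1, then everything from file2.
chained = list(chain(file1, file2))

# Merging: line 1 of each file, then line 2 of each file, and so on.
# zip_longest pads the shorter input with None, which is filtered out.
merged = [
    line
    for group in zip_longest(file1, file2)
    for line in group
    if line is not None
]

print(chained)
print(merged)
```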

For the first case:

from itertools import chain
filenames = ...
file_handles = [open(fn) for fn in filenames]

with open("output.txt", "w") as out_fh:
    # from_iterable walks the lines of each file in turn
    for line in chain.from_iterable(file_handles):
        out_fh.write(line)

for fh in file_handles:
    fh.close()

For the second case:

from itertools import zip_longest  # named izip_longest in Python 2
filenames = ...
file_handles = [open(fn) for fn in filenames]

with open("output.txt", "w") as out_fh:
    for lines in zip_longest(*file_handles, fillvalue=None):
        for line in lines:
            if line is not None:
                out_fh.write(line)

for fh in file_handles:
    fh.close()

Important: Never forget to close your files!
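
One way to guarantee that, sketched here with contextlib.ExitStack from the Python 3 standard library: every handle registered with the stack is closed when the with block ends, even if an error occurs halfway through. The function name merge_files and the sample file names are made up for illustration.

```python
import os
import shutil
import tempfile
from contextlib import ExitStack
from itertools import zip_longest


def merge_files(filenames, output_path):
    """Interleave the lines of `filenames` into `output_path` (hypothetical helper)."""
    with ExitStack() as stack, open(output_path, "w") as out_fh:
        # enter_context registers each handle so the stack closes it on exit.
        handles = [stack.enter_context(open(fn)) for fn in filenames]
        for lines in zip_longest(*handles):
            for line in lines:
                if line is not None:
                    out_fh.write(line)
    # Returned only so the demonstration can check that they are closed.
    return handles


# Demonstration with two small temporary files.
work = tempfile.mkdtemp()
names = []
for name, text in [("one.txt", "a1 b1\na2 b2\n"), ("two.txt", "x1 y1\n")]:
    full = os.path.join(work, name)
    with open(full, "w") as fh:
        fh.write(text)
    names.append(full)

output_file = os.path.join(work, "output.txt")
handles = merge_files(names, output_file)

with open(output_file) as fh:
    merged_text = fh.read()
all_closed = all(h.closed for h in handles)
print(merged_text)
print(all_closed)

shutil.rmtree(work)  # remove the demonstration files
```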

As @isedev pointed out, this approach is o.k. for 100 files, but since I open all handles up front, it won't work for thousands of files: operating systems limit how many files a process can have open at once.

If you want to overcome this problem, only option 1 (chaining) is reasonable...

filenames = ...

with open("output.txt", "w") as out_fh:
    for fn in filenames:
        with open(fn) as fh:
            for line in fh:
                out_fh.write(line)
