
Python Recursive Directory Reading

I wish to avoid os.walk. I am using a recursive function to read files and folders and store the files in a dictionary. I got rid of the os.chdir, but for some reason the function is now joi

Solution 1:

This seems to work for me:

import os

op = os.path

def fileRead(mydir):
    data = {}
    root = set()
    for i in os.listdir(mydir):
        path = op.join(mydir, i)
        print(path)
        if op.isfile(path):
            data.setdefault(i, set())
            root.add(op.relpath(mydir).replace("\\", "/"))
            data[i] = root
        else:
            data.update(fileRead(path))
    return data


d = fileRead(r"c:\python32\programas")  # raw string keeps the backslashes literal
print(d)

Still, I am not sure why you use the set root. I think the purpose is to keep all the directories when the same file name appears in more than one directory. But it doesn't work: each call to update replaces the values already stored for repeated keys (file names).
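To make the overwrite concrete, here is a minimal sketch with two hand-written dictionaries shaped like the ones fileRead returns. The names parent, child, overwritten and merged are made up for illustration and are not part of the original code.

```python
# Hypothetical per-directory results, shaped like what fileRead returns.
parent = {"copia.py": ["pack"]}
child = {"copia.py": ["pack/pack2"], "in_pack2.py": ["pack/pack2"]}

# dict.update replaces the value stored under a repeated key.
overwritten = dict(parent)
overwritten.update(child)
print(overwritten)  # {'copia.py': ['pack/pack2'], 'in_pack2.py': ['pack/pack2']}

# Merging key by key keeps every directory for a repeated file name.
merged = {name: list(dirs) for name, dirs in parent.items()}
for name, dirs in child.items():
    merged.setdefault(name, []).extend(dirs)
print(merged)  # {'copia.py': ['pack', 'pack/pack2'], 'in_pack2.py': ['pack/pack2']}
```

After update, 'pack' is gone from the copia.py entry, while the key-by-key merge keeps both directories.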

Here is working code using a defaultdict. You can do the same with an ordinary dictionary (as in your code), but with defaultdict you don't need to check whether a key has already been initialized:

import os
from collections import defaultdict
op = os.path

def fileRead(mydir):
    data = defaultdict(list)
    for i in os.listdir(mydir):
        path = op.join(mydir, i)
        print(path)
        if op.isfile(path):
            root = op.relpath(mydir).replace("\\", "/")
            data[i].append(root)
        else:
            for k, v in fileRead(path).items():
                data[k].extend(v)
    return data


d = fileRead(r"c:\python32\programas")  # raw string keeps the backslashes literal
print(d)
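For comparison, the ordinary-dictionary variant mentioned above might look roughly like this. It is only a sketch: the name file_read_plain is made up here, and, like the code above, it treats everything that is not a regular file as a directory to recurse into.

```python
import os


def file_read_plain(mydir):
    """Map each file name to the list of directories that contain it."""
    data = {}
    for name in os.listdir(mydir):
        path = os.path.join(mydir, name)
        if os.path.isfile(path):
            root = os.path.relpath(mydir).replace("\\", "/")
            # setdefault creates the list the first time a name is seen.
            data.setdefault(name, []).append(root)
        else:
            # Merge the subdirectory's results key by key, not with update().
            for k, v in file_read_plain(path).items():
                data.setdefault(k, []).extend(v)
    return data
```

The only difference from the defaultdict version is the explicit setdefault call wherever a key might not exist yet.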

Edit: Regarding the comment from @hughdbrown:

If you update data with data.update(fileRead(path).items()), you get this when calling fileRead("c:/python26/programas/pack") on my computer (now on Python 2.6):

c:/python26/programas/pack\copia.py
c:/python26/programas/pack\in pack.py
c:/python26/programas/pack\pack2
c:/python26/programas/pack\pack2\copia.py
c:/python26/programas/pack\pack2\in_pack2.py
c:/python26/programas/pack\pack2\pack3
c:/python26/programas/pack\pack2\pack3\copia.py
c:/python26/programas/pack\pack2\pack3\in3.py

defaultdict(<type 'list'>, {'in3.py': ['pack/pack2/pack3'], 'copia.py': ['pack/pack2/pack3'], 'in pack.py': ['pack'], 'in_pack2.py': ['pack/pack2']})

Note that a file name repeated in several directories (copia.py) shows only one of those directories, the deepest one. However, all the directories are listed when using:

for k, v in fileRead(path).items():
    data[k].extend(v)

c:/python26/programas/pack\copia.py
c:/python26/programas/pack\in pack.py
c:/python26/programas/pack\pack2
c:/python26/programas/pack\pack2\copia.py
c:/python26/programas/pack\pack2\in_pack2.py
c:/python26/programas/pack\pack2\pack3
c:/python26/programas/pack\pack2\pack3\copia.py
c:/python26/programas/pack\pack2\pack3\in3.py

defaultdict(<type 'list'>, {'in3.py': ['pack/pack2/pack3'], 'copia.py': ['pack', 'pack/pack2', 'pack/pack2/pack3'], 'in pack.py': ['pack'], 'in_pack2.py': ['pack/pack2']})
