
How To Read Multiple Dictionaries From A File In Python?

I am relatively new to Python. I am trying to read an ASCII file with multiple dictionaries in it. The file has the following format. {Key1: value1 key2: value2 ... } {Key1: va

Solution 1:

Provided the inner elements are valid JSON, the following could work. I dug through the source of the simplejson library and adapted it to suit your use case. An SSCCE (short, self-contained, correct example) is below.

import re
import simplejson

FLAGS = re.VERBOSE | re.MULTILINE | re.DOTALL
WHITESPACE = re.compile(r'[ \t\n\r]*', FLAGS)

def grabJSON(s):
    """Takes the largest bite of JSON from the string.
       Returns (object_parsed, remaining_string)
    """
    decoder = simplejson.JSONDecoder()
    obj, end = decoder.raw_decode(s)
    end = WHITESPACE.match(s, end).end()
    return obj, s[end:]

def main():
    with open("out.txt") as f:
        s = f.read()

    # Keep parsing until only whitespace (or nothing) is left.
    while True:
        obj, remaining = grabJSON(s)
        print(">", obj)
        s = remaining
        if not remaining.strip():
            break

if __name__ == "__main__":
    main()

With some similar JSON in out.txt, this will output something like:

> {'hello': ['world', 'hell', {'test': 'haha'}]}
> {'hello': ['world', 'hell', {'test': 'haha'}]}
> {'hello': ['world', 'hell', {'test': 'haha'}]}
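
The same idea can probably be done without a third-party dependency, since the standard library's json module also provides JSONDecoder.raw_decode. The following is a minimal sketch, assuming the text is a run of valid JSON values separated only by whitespace; iter_json_objects is a hypothetical helper name and the sample string is made up for illustration.

```python
import json


def iter_json_objects(text):
    """Yield each JSON value found back to back in text.

    Hypothetical helper: assumes the values are valid JSON and are
    separated only by whitespace.
    """
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # The standard-library raw_decode does not skip leading whitespace,
        # so advance past it manually before each call.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Made-up sample input: three JSON objects separated by whitespace.
sample = '{"hello": ["world", {"test": "haha"}]}\n{"a": 1} {"b": [2, 3]}\n'
objects = list(iter_json_objects(sample))
print(objects)
```

Because the function is a generator, a large input can be consumed one object at a time instead of building the whole list first.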

Solution 2:

Since the data in your input file isn't really in JSON or Python object literal format, you're going to need to parse it yourself. You haven't really specified what the allowable keys and values are in the dictionary, so the following only allows them to be alphanumeric character strings.

So given an input file named doc.txt with the following contents:

{key1: value1
 key2: value2
 key3: value3
}
{key4: value4
 key5: value5
}

The following reads and transforms it into a Python list of dictionaries composed of alphanumeric keys and values:

from pprint import pprint
import re

dictpat = r'\{((?:\s*\w+\s*:\s*\w+\s*)+)\}'  # note non-capturing (?:) inner group
itempat = r'(\s*(\w+)\s*:\s*(\w+)\s*)'  # which is captured in this expr

with open('doc.txt') as f:
    lod = [{group[1]:group[2] for group in re.findall(itempat, items)}
                                for items in re.findall(dictpat, f.read())]

pprint(lod)

Output:

[{'key1': 'value1', 'key2': 'value2', 'key3': 'value3'},
 {'key4': 'value4', 'key5': 'value5'}]

Solution 3:

You'll have to put it in a big list in order to get it to work, i.e.

[
    {key1:val1, key2:val2, key3:val3, ...keyN:valN}
    , {key1:val1, key2:val2, key3:val3, ...keyN:valN}
    , {key1:val1, key2:val2, key3:val3, ...keyN:valN}
    ...
]

If you can't change the data file format, I'm afraid you'll have to roll your own function to interpret the data.
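
As a rough illustration of what such a hand-rolled function might look like, here is a minimal sketch. It assumes one "key: value" pair per line, no nested braces, and no braces inside keys or values; parse_dicts is a hypothetical name, and the sample text is made up to match the brace-delimited layout described in the question.

```python
def parse_dicts(text):
    """Parse brace-delimited blocks of 'key: value' lines into dicts.

    Hypothetical helper. Assumes one pair per line, no nested braces,
    and no braces inside keys or values. Keys and values stay strings.
    """
    result = []
    current = None  # the dict being filled, or None when outside a block
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("{"):
            current = {}
            line = line[1:]
        closes = line.endswith("}")
        if closes:
            line = line[:-1]
        line = line.strip()
        if line and current is not None:
            # Split on the first ':' only, so values may contain colons.
            key, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"expected 'key: value', got {raw_line!r}")
            current[key.strip()] = value.strip()
        if closes and current is not None:
            result.append(current)
            current = None
    return result


# Made-up sample in the assumed layout.
sample = """{key1: value1
 key2: value2
 key3: value3
}
{key4: value4
 key5: value5
}
"""
parsed = parse_dicts(sample)
print(parsed)
```

A line-by-line parser like this is easier to extend than a single regular expression, for example to convert numeric values or to report the line number of a malformed entry.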

Solution 4:

import re

# Open in text mode so the str patterns below match on Python 3.
with open('doc.txt') as fl:
    text = fl.read()

# list(...) forces evaluation, since map() is lazy on Python 3.
result = list(map(
    lambda part: dict(
        re.match(
            r'^\s*(.*?)\s*:\s*(.*?)\s*$',  # splits on ':' ignoring surrounding whitespace
            line
        ).groups()
        for line in part.strip().split('\n')  # splits on '\n', each new line is a new key-value pair
    ),
    re.findall(
        r'\{(.*?)\}',  # inside of { ... }
        text,
        flags=re.DOTALL  # lets '.' match '\n' as well
    )
))
