How To Read Multiple Dictionaries From A File In Python?
Solution 1:
Provided the inner elements are valid JSON, the following could work. I dug up the source of the simplejson
library and adapted it to your use case. An SSCCE (short, self-contained, correct example) is below.
import re
import simplejson

FLAGS = re.VERBOSE | re.MULTILINE | re.DOTALL
WHITESPACE = re.compile(r'[ \t\n\r]*', FLAGS)

def grabJSON(s):
    """Takes the largest bite of JSON from the string.
    Returns (object_parsed, remaining_string)
    """
    decoder = simplejson.JSONDecoder()
    obj, end = decoder.raw_decode(s)
    # Skip whitespace after the parsed object so the next call starts on data.
    end = WHITESPACE.match(s, end).end()
    return obj, s[end:]

def main():
    with open("out.txt") as f:
        s = f.read()
    while True:
        obj, remaining = grabJSON(s)
        print(">", obj)
        s = remaining
        if not remaining.strip():
            break

if __name__ == "__main__":
    main()
With some similar JSON in out.txt, this will output something like:
> {'hello': ['world', 'hell', {'test': 'haha'}]}
> {'hello': ['world', 'hell', {'test': 'haha'}]}
> {'hello': ['world', 'hell', {'test': 'haha'}]}
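If installing simplejson is not an option, the standard library's json module offers the same JSONDecoder.raw_decode method. Below is a minimal sketch, assuming the file holds valid JSON objects back to back. The helper name iter_json_objects and the sample string are made up for this example.

```python
import json

def iter_json_objects(text):
    """Yield each JSON value found back to back in text."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # The stdlib raw_decode does not skip leading whitespace itself.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Hypothetical sample standing in for the contents of out.txt.
sample = '{"hello": ["world", {"test": "haha"}]}\n{"a": 1} {"b": 2}\n'
objects = list(iter_json_objects(sample))
print(objects)
```

Tracking an index instead of slicing the string avoids copying the remaining text after every object, which may matter for large files.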
Solution 2:
Since the data in your input file isn't really in JSON or Python object literal format, you're going to need to parse it yourself. You haven't really specified what the allowable keys and values are in the dictionary, so the following only allows them to be alphanumeric strings (anything matched by \w, which also permits underscores).
So given an input file named doc.txt with the following contents:
{key1: value1
key2: value2
key3: value3
}
{key4: value4
key5: value5
}
The following reads and transforms it into a Python list of dictionaries composed of alphanumeric keys and values:
from pprint import pprint
import re

dictpat = r'\{((?:\s*\w+\s*:\s*\w+\s*)+)\}'  # note non-capturing (?:) inner group
itempat = r'(\s*(\w+)\s*:\s*(\w+)\s*)'  # which is captured in this expr

with open('doc.txt') as f:
    lod = [{group[1]: group[2] for group in re.findall(itempat, items)}
           for items in re.findall(dictpat, f.read())]

pprint(lod)
Output:
[{'key1': 'value1', 'key2': 'value2', 'key3': 'value3'},
 {'key4': 'value4', 'key5': 'value5'}]
Solution 3:
You'll have to put it in one big list in order to get it to work, i.e.:
[
{key1:val1, key2:val2, key3:val3, ...keyN:valN}
, {key1:val1, key2:val2, key3:val3, ...keyN:valN}
, {key1:val1, key2:val2, key3:val3, ...keyN:valN}
...
]
If you can't change the data file format, I'm afraid you'll have to roll your own function to interpret the data.
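For comparison, once the file does hold a single list, one call reads everything. This is a sketch, assuming the keys and values are quoted strings so that the text is both valid JSON and a valid Python literal. The sample text stands in for the file contents and is not from the original question.

```python
import ast
import json

# Hypothetical file contents: one list wrapping all the dictionaries.
text = '''[
  {"key1": "value1", "key2": "value2"},
  {"key4": "value4"}
]'''

from_json = json.loads(text)           # parses the text as JSON
from_literal = ast.literal_eval(text)  # parses the same text as a Python literal
print(from_json)
```

Both calls reject unquoted keys such as key1, which is why the format in the question cannot be read this way without changing the file.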
Solution 4:
import re

with open('doc.txt') as fl:  # text mode, so the str patterns below can be used
    text = fl.read()

result = list(map(
    lambda part: dict(
        re.match(
            r'^\s*(.*?)\s*:\s*(.*?)\s*$',  # splits with ':' ignoring space symbols
            line
        ).groups()
        for line in part.strip().split('\n')  # splits with '\n', new line is a new key-value
    ),
    re.findall(
        r'\{(.*?)\}',  # inside of { ... }
        text,
        flags=re.DOTALL  # considering '\n'-symbols
    )
))
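The same idea can also be written as an explicit loop, which makes it easier to skip blank or malformed lines. This is a sketch, assuming the doc.txt layout shown in Solution 2 (one key: value pair per line, braces delimiting each dictionary). The name parse_dicts is made up for this example.

```python
import re

def parse_dicts(text):
    """Return a list of dicts parsed from '{key: value ...}' blocks."""
    dicts = []
    # DOTALL lets '.' match newlines, so each match spans one whole block.
    for block in re.findall(r'\{(.*?)\}', text, flags=re.DOTALL):
        current = {}
        for line in block.splitlines():
            if ':' not in line:
                continue  # skip blank or malformed lines
            key, value = line.split(':', 1)
            current[key.strip()] = value.strip()
        dicts.append(current)
    return dicts

# Sample text mirroring the doc.txt contents from Solution 2.
sample = "{key1: value1\nkey2: value2\nkey3: value3\n}\n{key4: value4\nkey5: value5\n}\n"
result = parse_dicts(sample)
print(result)
```

Splitting on the first ':' only means a value may itself contain colons, though a value containing '}' would still end the block early.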