Pyparsing: SetResultsName For Multiple Elements Get Combined
Solution 1:
Not documented that I know of, but I found something in pyparsing.py: I changed .setResultsName('model_definition') to .setResultsName('model_definition*') and they listed correctly!
Edit: it is documented, but it is a flag you pass to setResultsName:
setResultsName( string, listAllMatches=False ) - name to be given to tokens matching the element; if multiple tokens within a repetition group (such as ZeroOrMore or delimitedList) the default is to return only the last matching token - if listAllMatches is set to True, then a list of matching tokens is returned.
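To make the quoted behavior concrete, here is a minimal sketch comparing the default with listAllMatches=True inside a repetition. The name "num" and the input string are made up for illustration, and it assumes a pyparsing version that still accepts the camelCase method names used in this thread.

```python
from pyparsing import OneOrMore, Word, nums

# Default (listAllMatches=False): the name keeps only the last match
# made inside the repetition.
last_only = OneOrMore(Word(nums).setResultsName("num"))

# listAllMatches=True: every match is accumulated under the name.
all_matches = OneOrMore(Word(nums).setResultsName("num", listAllMatches=True))

last_result = last_only.parseString("1 2 3")
all_result = all_matches.parseString("1 2 3")

print(last_result["num"])            # the last matching token only
print(all_result["num"].asList())    # all matching tokens, in order
```

The positional tokens are the same in both results; only what is stored under the name differs.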
Solution 2:
Here is enough of your code to get things to work:
from pyparsing import *
# fake in the bare minimum to parse the given test strings
identifier = Word(alphas, alphanums)
integer = Word(nums)
function_call = identifier + '(' + Optional(delimitedList(identifier | integer)) + ')'
expression = function_call
model_definition = Group(identifier.setResultsName('random_variable_name') + '~' + expression)
sample = """
x ~ normal(mu, 1)
y ~ normal(mu2, 1)
"""
The trailing '*' is there in setResultsName for those cases where you use the short form of setResultsName: expr("name*") vs expr.setResultsName("name", listAllMatches=True). If you prefer calling setResultsName, then I would not use the '*' notation, but would pass the listAllMatches argument.
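As a quick check that the two spellings behave the same, the sketch below parses one string with each form and compares what ends up under the name. The name "w" and the sample text are placeholders, not part of the original question.

```python
from pyparsing import OneOrMore, Word, alphas

word = Word(alphas)

# Short form: calling the expression with a name that ends in '*'.
short_form = OneOrMore(word("w*"))

# Long form: explicit setResultsName call with the listAllMatches flag.
long_form = OneOrMore(word.setResultsName("w", listAllMatches=True))

text = "a b c"
short_words = short_form.parseString(text)["w"].asList()
long_words = long_form.parseString(text)["w"].asList()

print(short_words == long_words)
```

Either spelling is fine; the explicit flag tends to be easier to read when setResultsName is already being called by name.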
If you are getting names that step on each other, you may need to add a level of Grouping. Here is your solution using listAllMatches=True, by virtue of the trailing '*' notation:
model_definition1 = model_definition('model_definition*')
print(OneOrMore(model_definition1).parseString(sample).dump())
It returns this parse result:
[['x', '~', 'normal', '(', 'mu', '1', ')'], ['y', '~', 'normal', '(', 'mu2', '1', ')']]
- model_definition: [['x', '~', 'normal', '(', 'mu', '1', ')'], ['y', '~', 'normal', '(', 'mu2', '1', ')']]
[0]:
['x', '~', 'normal', '(', 'mu', '1', ')']
- random_variable_name: x
[1]:
['y', '~', 'normal', '(', 'mu2', '1', ')']
Here is a variation that does not use listAllMatches, but adds another level of Group:
model_definition2 = model_definition('model_definition')
print(OneOrMore(Group(model_definition2)).parseString(sample).dump())
gives:
[[['x', '~', 'normal', '(', 'mu', '1', ')']], [['y', '~', 'normal', '(', 'mu2', '1', ')']]]
[0]:
[['x', '~', 'normal', '(', 'mu', '1', ')']]
- model_definition: ['x', '~', 'normal', '(', 'mu', '1', ')']
- random_variable_name: x
[1]:
[['y', '~', 'normal', '(', 'mu2', '1', ')']]
- model_definition: ['y', '~', 'normal', '(', 'mu2', '1', ')']
- random_variable_name: y
In both cases, I see the full content being returned, so I don't quite understand what you mean by "if you return multiple, it fails to split out each child."
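For what it's worth, here is a sketch of how the first variant (the trailing '*') can be consumed by name rather than through dump(). It repeats the cut-down grammar from this answer so that it runs on its own. The names variable is added for illustration, and the grammar is still the "bare minimum" stand-in, not the full grammar from the question.

```python
from pyparsing import (Group, OneOrMore, Optional, Word,
                       alphanums, alphas, delimitedList, nums)

# Same cut-down grammar as above.
identifier = Word(alphas, alphanums)
integer = Word(nums)
function_call = identifier + '(' + Optional(delimitedList(identifier | integer)) + ')'
model_definition = Group(identifier.setResultsName('random_variable_name') + '~' + function_call)

sample = """
x ~ normal(mu, 1)
y ~ normal(mu2, 1)
"""

# The trailing '*' makes 'model_definition' collect every matched group.
result = OneOrMore(model_definition('model_definition*')).parseString(sample)

# Each collected group keeps its own 'random_variable_name'.
names = [md['random_variable_name'] for md in result['model_definition']]
print(names)
```

Because each model definition is wrapped in a Group, the inner random_variable_name values do not overwrite one another.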