1

I'm trying to make a parser to a string that contains one or more of the following patterns:

  1. -flag
  2. -flag object
  3. -flag object {nested}
  4. -flag {object {nested}}

For example I will use the following string:

-flag1 -flag2 object2 { -nested_flag1 nested_obj1 -nested_flag2 }

To parse it I use:

exp = '-flag1 -flag2 object2 { -nested_flag1 nested_obj1 -nested_flag2 }'
only_flag = '-' + Word(printables, excludeChars='-').setResultsName("flag")
flag_w_obj = only_flag + Optional("{") + Word(printables, excludeChars='-').setResultsName("object")
flag_w_obj_w_nested = flag_w_obj + originalTextFor(nestedExpr("{", "}")).setResultsName("nested_expr")

parser = flag_w_obj_w_nested | flag_w_obj | only_flag
parsed = parser.searchString(exp)

How can I evaluate the nested expression by the same rules to get the nested flags and objects?

My final desired result is to make a dict containing the data in the format:

{
  "flag1": null,
  "flag2": {
    "object1": {
      "nested_flag1": "nested_obj1",
      "nested_flag2": null
    }
  }
}

1 Answer 1

1

Most times, whenever you have an expression with nested contents, you will end up using a pyparsing Forward expression. A Forward allows you to refer to an expression that is not yet fully defined - it is a forward declaration. In your parser, the -flag args... form can itself contain flag nested within {}'s. Once the contents can be specified, then the "assignment" can be done using the <<= operator.

I also strongly recommend writing a short design of the parser before you begin actually writing any code (do this no matter what parsing library you plan on using). In parsing, a BNF (Backus-Naur Form) is the typical format to use. It does not have to be ultra-rigorous, but enough detail so that you can manually work through it to make sure you have the pieces right, and there are no ambiguities. Also good to write out a few example strings that you expect the parser to be able to process.

The following annotated code shows the BNF I wrote out for your problem, and the implementation of it into a pyparsing parser.

"""
BNF

flag_name := '-' alphanumeric...
flag_arg_word := alphanumeric...

flag_arg := flag_arg_word | '{' flag_expr... '}'
flag_expr := flag_name [flag_arg...]
"""

import pyparsing as pp

# define punctuation
LBRACE, RBRACE = map(pp.Suppress, "{}")

# recursive parser requires a forward declaraction
flag_expr = pp.Forward()

# implement BNF definitions
flag_name = pp.Word("-", pp.alphanums + "_")
flag_arg_word = pp.Word(pp.alphas, pp.alphanums + "_")
flag_arg = flag_arg_word | (LBRACE + flag_expr[...] + RBRACE)

# use '<<=' operator to define recursive expression
flag_expr <<= pp.Group(flag_name + pp.Group(flag_arg[...]))

# a command is 0 or more flag_exprs
cmd_expr = flag_expr[...]

# test it out
cmd = "-flag1 -flag2 object2 { -nested_flag1 nested_obj1 -nested_flag2 }"
parsed = cmd_expr.parseString(cmd)

# convert to a list and show with pprint
from pprint import pprint
pprint(parsed.asList(), width=12)

# runTests is very useful for running several test strings through a parser
cmd_expr.runTests("""\
-flag1 -flag2 object2 { -nested_flag1 nested_obj1 -nested_flag2 }
""")
Sign up to request clarification or add additional context in comments.

1 Comment

Is there a built-in method for converting the list to a dictionary as I showed above?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.