Python, анализирующий заключенные в скобки блоки

string и String идентичны во всех отношениях (кроме верхнего регистра "S"). Так или иначе нет никаких последствий производительности.

Нижний регистр string предпочтен в большинстве проектов из-за подсветки синтаксиса

30
задан Pedro Romano 19 October 2012 в 23:46
поделиться

7 ответов

Псевдокод:

For each string in the array:
    Find the first '{'. If there is none, leave that string alone.
    Init a counter to 0. 
    For each character in the string:  
        If you see a '{', increment the counter.
        If you see a '}', decrement the counter.
        If the counter reaches 0, break.
    Here, if your counter is not 0, you have invalid input (unbalanced brackets)
    If it is, then take the string from the first '{' up to the '}' that put the
     counter at 0, and that is a new element in your array.
24
ответ дан 27 November 2019 в 23:15
поделиться

Или эту версию pyparsing:

>>> from pyparsing import nestedExpr
>>> txt = "{ { a } { b } { { { c } } } }"
>>>
>>> nestedExpr('{','}').parseString(txt).asList()
[[['a'], ['b'], [[['c']]]]]
>>>
41
ответ дан 27 November 2019 в 23:15
поделиться

Я новичок в Python, так что не беспокойтесь, но вот реализация, которая работает:

def balanced_braces(args):
    parts = []
    for arg in args:
        if '{' not in arg:
            continue
        chars = []
        n = 0
        for c in arg:
            if c == '{':
                if n > 0:
                    chars.append(c)
                n += 1
            elif c == '}':
                n -= 1
                if n > 0:
                    chars.append(c)
                elif n == 0:
                    parts.append(''.join(chars).lstrip().rstrip())
                    chars = []
            elif n > 0:
                chars.append(c)
    return parts

t1 = balanced_braces(["{{ a } { b } { { { c } } } }"]);
print t1
t2 = balanced_braces(t1)
print t2
t3 = balanced_braces(t2)
print t3
t4 = balanced_braces(t3)
print t4

Вывод:

['{ a } { b } { { { c } } }']
['a', 'b', '{ { c } }']
['{ c }']
['c']
6
ответ дан 27 November 2019 в 23:15
поделиться

Разбор с использованием lepl (устанавливается с помощью $ easy_install lepl ):

from lepl import Any, Delayed, Node, Space

expr = Delayed()
expr += '{' / (Any() | expr[1:,Space()[:]]) / '}' > Node

print expr.parse("{{a}{b}{{{c}}}}")[0]

Вывод:

Node
 +- '{'
 +- Node
 |   +- '{'
 |   +- 'a'
 |   `- '}'
 +- Node
 |   +- '{'
 |   +- 'b'
 |   `- '}'
 +- Node
 |   +- '{'
 |   +- Node
 |   |   +- '{'
 |   |   +- Node
 |   |   |   +- '{'
 |   |   |   +- 'c'
 |   |   |   `- '}'
 |   |   `- '}'
 |   `- '}'
 `- '}'
5
ответ дан 27 November 2019 в 23:15
поделиться

Вы также можете проанализировать их все сразу, хотя я считаю, что {a} означает "a" , а не [" a "] немного странно. Если я правильно понял формат:

import re
import sys


_mbrack_rb = re.compile("([^{}]*)}") # re.match doesn't have a pos parameter
def mbrack(s):
  """Parse matching brackets.

  >>> mbrack("{a}")
  'a'
  >>> mbrack("{{a}{b}}")
  ['a', 'b']
  >>> mbrack("{{a}{b}{{{c}}}}")
  ['a', 'b', [['c']]]

  >>> mbrack("a")
  Traceback (most recent call last):
  ValueError: expected left bracket
  >>> mbrack("{a}{b}")
  Traceback (most recent call last):
  ValueError: more than one root
  >>> mbrack("{a")
  Traceback (most recent call last):
  ValueError: expected value then right bracket
  >>> mbrack("{a{}}")
  Traceback (most recent call last):
  ValueError: expected value then right bracket
  >>> mbrack("{a}}")
  Traceback (most recent call last):
  ValueError: unbalanced brackets (found right bracket)
  >>> mbrack("{{a}")
  Traceback (most recent call last):
  ValueError: unbalanced brackets (not enough right brackets)
  """
  stack = [[]]
  i, end = 0, len(s)
  while i < end:
    if s[i] != "{":
      raise ValueError("expected left bracket")
    elif i != 0 and len(stack) == 1:
      raise ValueError("more than one root")
    while i < end and s[i] == "{":
      L = []
      stack[-1].append(L)
      stack.append(L)
      i += 1
    stack.pop()
    stack[-1].pop()
    m = _mbrack_rb.match(s, i)
    if m is None:
      raise ValueError("expected value then right bracket")
    stack[-1].append(m.group(1))
    i = m.end(0)
    while i < end and s[i] == "}":
      if len(stack) == 1:
        raise ValueError("unbalanced brackets (found right bracket)")
      stack.pop()
      i += 1
  if len(stack) != 1:
    raise ValueError("unbalanced brackets (not enough right brackets)")
  return stack[0][0]


def main(args):
  if args:
    print >>sys.stderr, "unexpected arguments: %r" % args
  import doctest
  r = doctest.testmod()
  print r
  return r[0]

if __name__ == "__main__":
  sys.exit(main(sys.argv[1:]))
2
ответ дан 27 November 2019 в 23:15
поделиться

If you want to use a parser (lepl in this case), but still want the intermediate results rather than a final parsed list, then I think this is the kind of thing you were looking for:

>>> nested = Delayed()
>>> nested += "{" + (nested[1:,...]|Any()) + "}"
>>> split = (Drop("{") & (nested[:,...]|Any()) & Drop("}"))[:].parse
>>> split("{{a}{b}{{{c}}}}")
['{a}{b}{{{c}}}']
>>> split("{a}{b}{{{c}}}")
['a', 'b', '{{c}}']
>>> split("{{c}}")
['{c}']
>>> split("{c}")
['c']

That might look opaque at first, but it's fairly simple really :o)

nested is a recursive definition of a matcher for nested brackets (the "+" and [...] in the definition keep everything as a single string after it has been matched). Then split says match as many as possible ("[:]") of something that is surrounded by "{" ... "}" (which we discard with "Drop") and contains either a nested expression or any letter.

Finally, here's a lepl version of the "all in one" parser that gives a result in the same format as the pyparsing example above, but which (I believe) is more flexible about how spaces appear in the input:

>>> with Separator(~Space()[:]):
...     nested = Delayed()
...     nested += Drop("{") & (nested[1:] | Any()) & Drop("}") > list
...
>>> nested.parse("{{ a }{ b}{{{c}}}}")
[[['a'], ['b'], [[['c']]]]]
2
ответ дан 27 November 2019 в 23:15
поделиться

Раствор для очистки. Это вернет строку, заключенную в крайнюю скобку. Если возвращается None, совпадений не было.

def findBrackets( aString ):
   if '{' in aString:
      match = aString.split('{',1)[1]
      open = 1
      for index in xrange(len(match)):
         if match[index] in '{}':
            open = (open + 1) if match[index] == '{' else (open - 1)
         if not open:
            return match[:index]
3
ответ дан 27 November 2019 в 23:15
поделиться
Другие вопросы по тегам:

Похожие вопросы: