I have a Python string whose contents are shown below:

Using '/tmp' as temporary  location
GNU gdb (GDB) 8.3.0.20190826-git
Copyright (C) 2019 Free Software Foundation, Inc.
Type "show copying" and "show warranty" for details.

(gdb) #0  snp
#3 0x081fc9bc in main (argc=<optimized out>, argv=0xffffde44) at ../../../../../../.
(gdb) quit

jdebug version: 5.0.1
[File is compressed. This may take a moment...]

The only part I want to retrieve is everything from the first (gdb) through (gdb) quit.

That is, the final output I am looking for is:

(gdb) #0  snp
#3 0x081fc9bc in main (argc=<optimized out>, argv=0xffffde44) at ../../../../../../.
(gdb) quit

Here is the Python code that is not working:

import re

with open('st.txt', 'r') as file:
    data = file.read()
print(re.search(r'(gdb).*(gdb) quit', data))

Any idea how I can extract this string using the correct regular expression?

  • What are you getting for an output? It's hard to diagnose the problem when we only know it doesn't work. Commented Aug 20, 2020 at 4:31

2 Answers

Here is a solution without regex:

text = """Using '/tmp' as temporary  location
GNU gdb (GDB) 8.3.0.20190826-git
Copyright (C) 2019 Free Software Foundation, Inc.
Type "show copying" and "show warranty" for details.

(gdb) #0  snp
#3 0x081fc9bc in main (argc=<optimized out>, argv=0xffffde44) at ../../../../../../.
(gdb) quit

jdebug version: 5.0.1
[File is compressed. This may take a moment...]"""

s, e = '(gdb)', '(gdb) quit'

text[text.index(s) : text.rindex(e) + len(e)]

(gdb) #0  snp
#3 0x081fc9bc in main (argc=<optimized out>, argv=0xffffde44) at ../../../../../../.
(gdb) quit
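
Note that `str.index` and `str.rindex` raise `ValueError` when a marker is missing. As a minimal sketch, assuming you would rather get `None` in that case, the same slicing can be wrapped in a helper built on `str.find` and `str.rfind`. The name `extract_between` and the shortened `sample` string are hypothetical, not part of the answer above.

```python
def extract_between(text, start, end):
    """Return the span from the first `start` to the last `end`, inclusive, or None."""
    i = text.find(start)      # -1 when the start marker is absent
    if i == -1:
        return None
    j = text.rfind(end, i)    # last end marker at or after the start marker
    if j == -1:
        return None
    return text[i:j + len(end)]


# Shortened, made-up input for illustration only.
sample = "header\n(gdb) #0  snp\n#3 main\n(gdb) quit\nfooter"
print(extract_between(sample, "(gdb)", "(gdb) quit"))
print(extract_between("no markers here", "(gdb)", "(gdb) quit"))
```

Like the one-liner, this does not check that the markers sit at the start of a line.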

Timing info:

text[text.index(s) : text.rindex(e) + len(e)]

636 ns ± 27.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

re.search(r'^\(gdb\).*?^\(gdb\) quit$', text, re.DOTALL | re.MULTILINE)

6.91 µs ± 360 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
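
These figures look like IPython `%timeit` output. Outside IPython, a comparable measurement can be sketched with the standard `timeit` module. The shortened `text`, the helper names `by_index` and `by_regex`, and the loop count are assumptions for illustration, and absolute numbers will differ from machine to machine.

```python
import re
import timeit

# Shortened copy of the gdb output, for illustration only.
text = """GNU gdb (GDB) 8.3.0.20190826-git

(gdb) #0  snp
#3 0x081fc9bc in main (argc=<optimized out>, argv=0xffffde44) at ../../../../../../.
(gdb) quit

jdebug version: 5.0.1"""

s, e = '(gdb)', '(gdb) quit'


def by_index():
    # Plain string search: first start marker to last end marker.
    return text[text.index(s):text.rindex(e) + len(e)]


def by_regex():
    # Line-anchored, non-greedy regex from the other answer.
    m = re.search(r'^\(gdb\).*?^\(gdb\) quit$', text, re.DOTALL | re.MULTILINE)
    return m.group(0)


# Both approaches should extract the same span before speed is compared.
assert by_index() == by_regex()

n = 10_000  # arbitrary loop count
for fn in (by_index, by_regex):
    per_call = timeit.timeit(fn, number=n) / n
    print(f"{fn.__name__}: {per_call * 1e9:.0f} ns per call")
```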

2 Comments

Regex will generally be slower than a naive string match. It is also significantly more powerful, and in this case it provides the anchors to the beginning and end of a line.
@kerasbaz Yes, I agree with you that regex is powerful, but here I don't see the need for it.

The answer below makes sure that the (gdb) strings appear at the beginning of a line and that the quit appears at the end of a line. The pattern is not greedy (that is, it will match the shortest matching string, not the longest).

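Your initial regex did not escape the parentheses around gdb, so each (gdb) was processed as a regex capture group matching the letters gdb and not as literal parentheses in the text. It also did not pass re.DOTALL, so . could not match across the newlines between the two markers.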

import re

in_str = """Using '/tmp' as temporary  location
GNU gdb (GDB) 8.3.0.20190826-git
Copyright (C) 2019 Free Software Foundation, Inc.
Type "show copying" and "show warranty" for details.

(gdb) #0  snp
#3 0x081fc9bc in main (argc=<optimized out>, argv=0xffffde44) at ../../../../../../.
(gdb) quit

jdebug version: 5.0.1
[File is compressed. This may take a moment...]"""

m = re.search(r'^\(gdb\).*?^\(gdb\) quit$', in_str, re.DOTALL | re.MULTILINE)
if m:
    print(m.group(0))
