best way to find substring using regex in python 3

Question

I am trying to find the best way to extract a specific substring as a key-value pair using re, for strings of the following form:

some_string-variable_length/some_no_variable_digit/some_no1_variable_digit/some_string1/some_string2
eg: aba/101/11111/cde/xyz or aaa/111/1119/cde/xzx or ada/21111/5/cxe/yyz

Everything here is variable, and what I am looking for is a key-value pair like the following:

cde: 2, as there are two entries for cde

cxe: 1, as there is only one cxe

Note: everything is variable here except the /, i.e. cde or cxe or some other string will appear right after the third / in each case.

input: aba/101/11111/cde/xyz/blabla
output: cde:xyz/blabla
input: aaa/111/1119/cde/xzx/blabla
output: cde:xzx/blabla
input: aahjdsga/11231/1119/gfts/sjhgdshg/blabla
output: gfts:sjhgdshg/blabla

Notice that my key is always the first string after the 3rd /, and the value is always the substring after the key.

It's not really clear exactly what you're trying to achieve. Could you please edit your post with specific input data and the expected output from that?
– Nick, Commented May 29, 2020 at 0:20

Nick · Accepted Answer · 2020-05-29 00:43:04Z

1

Here are a couple of solutions based on your description that "key is always the first string after 3rd / and value is always the substring after key". The first uses str.split with a maxsplit of 4 to collect everything after the fourth / into the value. The second uses regex to extract the two parts:

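inp = ['aba/101/11111/cde/xyz/blabla',
       'aaa/111/1119/cde/xzx/blabla',
       'aahjdsga/11231/1119/gfts/sjhgdshg/blabla',
       ]

# Solution 1: split on at most the first four '/' characters
for s in inp:
    parts = s.split('/', 4)
    key = parts[3]    # the string after the third '/'
    value = parts[4]  # everything after the fourth '/'
    print(f'{key}:{value}')

# Solution 2: regex
import re

for s in inp:
    # skip three "field/" groups, capture the key, then capture the rest
    m = re.match(r'^(?:[^/]*/){3}([^/]*)/(.*)$', s)
    if m is not None:
        key = m.group(1)
        value = m.group(2)
        print(f'{key}:{value}')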

For both pieces of code the output is

cde:xyz/blabla
cde:xzx/blabla
gfts:sjhgdshg/blabla
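
Both loops above assume every input has at least four / characters; the split version raises IndexError otherwise. As a rough sketch (the function names split_key_value and regex_key_value are made up for illustration and are not part of the original answer), the two approaches could be wrapped so that malformed input returns None instead:

```python
import re
from typing import Optional, Tuple

# Same idea as the regex in the answer, compiled once; used with fullmatch,
# so the ^ and $ anchors are not needed.
PATTERN = re.compile(r'(?:[^/]*/){3}([^/]*)/(.*)')


def split_key_value(s: str) -> Optional[Tuple[str, str]]:
    """Return (key, value) using str.split, or None if s has fewer than four '/'."""
    parts = s.split('/', 4)
    if len(parts) < 5:
        return None
    return parts[3], parts[4]


def regex_key_value(s: str) -> Optional[Tuple[str, str]]:
    """Return (key, value) using the regex, or None if s does not match."""
    m = PATTERN.fullmatch(s)
    if m is None:
        return None
    return m.group(1), m.group(2)


print(split_key_value('aba/101/11111/cde/xyz/blabla'))
print(regex_key_value('a/b/c'))
```

For single-line inputs the two helpers should agree; the split version is probably the easier one to read.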

answered May 29, 2020 at 0:43

Nick


4 Comments

Jacob Over a year ago

This is exactly what I was looking for using re. Now, suppose I want to tweak it a bit so that the key-value pairs are cde:2 (two entries of cde, i.e. a count) and gfts:1 (only one entry of gfts). Can we do that?

Nick Over a year ago

@Jacob you could push the key values to a list and use a Counter to count them e.g. ideone.com/7Ip5uo

Jacob Over a year ago

It looks like that gives a count based on the number of times cde or gfts appears. What I was looking for is how many xzx there are in cde (say count1) and how many xyz there are in cde (count2), with total = count1 + count2, so that I can see count1 coming from the first, count2 coming from the second, and so on. Is that even possible with the regex library?

Nick Over a year ago

@Jacob I think this has gone beyond the ability of comments to properly describe the problem. You should ask a new question and include more details as to exactly what output you're after.
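
The Counter suggestion from the comments can be sketched roughly as follows. This is not the code behind the linked snippet, just one possible reading of the request: key_counts tallies how often each key appears, and value_counts (a hypothetical name) tallies, per key, the first path segment of each value, on the assumption that this is what "how many xzx in cde" refers to.

```python
from collections import Counter, defaultdict

inp = ['aba/101/11111/cde/xyz/blabla',
       'aaa/111/1119/cde/xzx/blabla',
       'aahjdsga/11231/1119/gfts/sjhgdshg/blabla',
       ]

key_counts = Counter()                # key -> number of lines with that key
value_counts = defaultdict(Counter)   # key -> Counter of first value segments

for s in inp:
    parts = s.split('/', 4)
    if len(parts) < 5:
        continue  # skip lines with fewer than four '/'
    key, value = parts[3], parts[4]
    key_counts[key] += 1
    # Assumption: only the segment right after the key is of interest
    first_segment = value.split('/', 1)[0]
    value_counts[key][first_segment] += 1

for key, total in key_counts.items():
    print(f'{key}:{total}', dict(value_counts[key]))
```

With this layout the per-value counts for a key always add up to that key's total, which seems to be the relationship described in the follow-up comment.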

Jiří Baum · Accepted Answer · 2020-05-29 00:14:09Z

0

Others have already posted various regexes; a broader question is whether this problem is best solved using a regex at all. Depending on how the data is formatted overall, it may be better parsed using:

the .split('/') method on the string; or
csv.reader(..., delimiter='/') or csv.DictReader(..., delimiter='/') in the csv module.
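
As a small illustration of the csv option, here is a sketch that reads the question's sample lines from an in-memory io.StringIO standing in for a real file (that stand-in, and the pairs list, are assumptions for the example). Because csv.reader splits on every /, the value is rebuilt by joining the remaining fields.

```python
import csv
import io

# Stand-in for an open file containing one record per line
data = io.StringIO(
    'aba/101/11111/cde/xyz/blabla\n'
    'aaa/111/1119/cde/xzx/blabla\n'
    'aahjdsga/11231/1119/gfts/sjhgdshg/blabla\n'
)

pairs = []
for row in csv.reader(data, delimiter='/'):
    if len(row) < 5:
        continue  # not enough fields for a key and a value
    key = row[3]
    value = '/'.join(row[4:])  # re-join everything after the key
    pairs.append((key, value))

for key, value in pairs:
    print(f'{key}:{value}')
```

Whether this beats a plain .split('/') likely depends on whether the real data has quoting or other CSV-like features.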

answered May 29, 2020 at 0:14

Jiří Baum


user13469682 · Accepted Answer · 2020-05-29 18:06:21Z

0

Try (?<!\S)[^\s/]*(?:/[^\s/]*){2}/([^\s/]*)

Updated per the comment:

(?<!\S)[^\s/]*(?:/[^\s/]*){2}/([^\s/]*)(?:/(\S*))?
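
For reference, a sketch of how this second pattern might be used from Python. It assumes the records are whitespace-separated inside one larger string, as in the question's "eg:" line; the variable names are illustrative only.

```python
import re
from collections import Counter

# Group 1 is the key (4th field); group 2 is everything after it, if present
PATTERN = re.compile(r'(?<!\S)[^\s/]*(?:/[^\s/]*){2}/([^\s/]*)(?:/(\S*))?')

text = 'aba/101/11111/cde/xyz or aaa/111/1119/cde/xzx or ada/21111/5/cxe/yyz'

# Tokens without enough '/' (such as "or") are skipped by the pattern
matches = [(m.group(1), m.group(2)) for m in PATTERN.finditer(text)]
counts = Counter(key for key, _ in matches)

print(matches)
print(dict(counts))
```

The (?<!\S) lookbehind only lets a match start at the beginning of a whitespace-delimited token, which is why this pattern suits free-running text better than one string per record.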


edited May 29, 2020 at 18:06

answered May 28, 2020 at 23:57

user13469682

1 Comment

user13469682 Over a year ago

Added updated answer.
