Parse a substring in Python using regular expression

Question

I am trying to parse a substring using re.

From the string stored in variable s, I would like to take everything up to the first ! (the string in s contains two !) and store it as a substring. From this substring (stored in variable result), I wish to parse another substring.

Here is the code:

import re
s='ecNumber*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#!ecNumber*2.3.1.11#kmValue*0.081#kmValueMaximum*#!'


Data={}

result = re.search('%s(.*)%s' % ('ec', '!'), s).group(1)
print result
ecNumber = re.search('%s(.*)%s' % ('Number*', '#kmValue*'), result).group(1)
Data["ecNumber"]=ecNumber
print Data

The value corresponding to each tag in the substring (for example, ecNumber) is stored between * and # (for example, *2.4.1.11#). I attempted to parse the value stored for the tag ecNumber in the first substring. The output I obtain is

result='Number*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#!ecNumber*2.3.1.11#kmValue*0.081#kmValueMaximum*#'
{'ecNumber': '*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#!ecNumber*2.3.1.11#kmValue*0.081'}

The desired output is

result= 'ecNumber*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#'
{'ecNumber': '2.4.1.11'}

I would like to store each tag and its corresponding value. For example,

{'ecNumber': '2.4.1.11','kmValue':'0.57','kmValueMaximum':'1.25'}

Timothy Zhang · Accepted Answer · 2017-12-11 08:37:25Z

1

Although you are asking for a solution with regular expressions, I would say it's much easier to use direct string operations for this problem, since the source string is well formatted.

For information before the first !:

print dict([i.split('*') for i in s.split('!', 1)[0].split('#') if i])

For all information in s:

print [dict([i.split('*') for i in j.split('#') if i]) for j in s.split('!') if j]
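The same split-based idea can also be written as small Python 3 functions, which makes it easier to loop over every record. This is only a sketch: parse_record and parse_records are hypothetical helper names, and the code assumes every field has the form tag*value# and every record ends with !.

```python
def parse_record(record):
    """Turn 'tag*value#tag*value#' into {'tag': 'value', ...}."""
    fields = {}
    for field in record.split('#'):
        if not field:
            continue  # skip the empty piece left after the trailing '#'
        # partition splits on the first '*' only, so the value may be empty
        tag, _, value = field.partition('*')
        fields[tag] = value
    return fields


def parse_records(s):
    """Parse every '!'-terminated record in s into a list of dicts."""
    return [parse_record(record) for record in s.split('!') if record]


s = ('ecNumber*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#!'
     'ecNumber*2.3.1.11#kmValue*0.081#kmValueMaximum*#!')

# Only the part before the first '!'
first = parse_record(s.split('!', 1)[0])
print(first)

# Every record, one dict per substring
for index, record in enumerate(parse_records(s), start=1):
    print(index, record)
```

A tag with no value, such as kmValueMaximum in the second record, ends up mapped to an empty string rather than being dropped.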

edited Dec 11, 2017 at 8:37

answered Dec 8, 2017 at 3:56


4 Comments

Natasha Over a year ago

Thanks a lot for suggesting this wonderful alternative. Could you please give some advice on how to loop through each substring? I.e., substring1=ecNumber*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25# substring2=ecNumber*2.3.1.11#kmValue*0.081#kmValueMaximum*#

Natasha Over a year ago

I tried the following, but it didn't work: 'substring=[0]*2 for j,i in zip(range(0,2),range(0,2)): substring[j]= dict([i.split('*') for i in s.split('!', j)[0].split('#') if i]) print substring'

Timothy Zhang Over a year ago

s.split('!', j) should be s.split('!', 1)

Timothy Zhang Over a year ago

From your original question, you only need information before the first !. If you need all information, try print [dict([i.split('*') for i in j.split('#') if i]) for j in s.split('!') if j]

Ajax1234 · Accepted Answer · 2017-12-08 14:16:01Z

1

You can try this:

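import re
s = 'ecNumber*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#'
# Tags are letters at the start or after '#', followed by '*'; values sit between '*' and '#'.
# A raw string keeps the backslashes in the pattern from being read as string escapes.
new_data = re.findall(r'(?<=^)[a-zA-Z]+(?=\*)|(?<=#)[a-zA-Z]+(?=\*)|(?<=\*)[-\d\.]+(?=#)', s)
# Pair each tag with the value that follows it.
final_data = dict([new_data[i:i+2] for i in range(0, len(new_data)-1, 2)])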

Output:

{'kmValue': '0.57', 'kmValueMaximum': '1.25', 'ecNumber': '2.4.1.11'}
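As a hedged alternative in Python 3, the re.search approach from the question can be kept with two changes. In a regular expression an unescaped * is a quantifier, not a literal character, and .* is greedy, so it runs to the last possible delimiter. The sketch below escapes the literal delimiters with re.escape and uses non-greedy groups. PAIR is a hypothetical pattern name, and it assumes tags consist only of ASCII letters and values never contain #.

```python
import re

s = ('ecNumber*2.4.1.11#kmValue*0.57#kmValueMaximum*1.25#!'
     'ecNumber*2.3.1.11#kmValue*0.081#kmValueMaximum*#!')

# Non-greedy (.*?) stops at the first '!' instead of the last one.
result = re.match(r'(.*?)!', s).group(1)

# re.escape turns the '*' in the tag prefix into a literal character.
ec_number = re.search(re.escape('ecNumber*') + r'(.*?)' + re.escape('#'),
                      result).group(1)

# Two capture groups: the tag before '*', and the (possibly empty) value
# up to the next '#'. findall then returns (tag, value) tuples.
PAIR = r'([A-Za-z]+)\*([^#]*)#'
data = dict(re.findall(PAIR, result))

print(result)
print(ec_number)
print(data)
```

Because the value group is [^#]*, the same pattern also accepts negative values such as -999 and empty values such as kmValueMaximum*#.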

edited Dec 8, 2017 at 14:16

answered Dec 8, 2017 at 2:49


2 Comments

Natasha Over a year ago

Thank you so much. Could you please tell me how to use the above for parsing kmValue ({'kmValue ': '-999'})? I tried altering the index of new_data, but it didn't work.

Natasha Over a year ago

Thank you so much.
