Regex in Python splitting strings

Question

I have a string like this:

SELECT [Orders$].[Category] AS [Category],&#13,&#10,  [Orders$].[City] AS [City],&#13,&#10,  [Orders$].[Country] AS [Country],&#13,&#10,  [Orders$].[Customer ID] AS [Customer ID],&#13,&#10,  [Orders$].[Customer Name] AS [Customer Name],&#13,&#10,  [Orders$].[Discount] AS [Discount],&#13,&#10,  [Orders$].[Profit] AS [Profit],&#13,&#10,  [Orders$].[Quantity] AS [Quantity],&#13,&#10,  [Orders$].[Region] AS [Region],&#13,&#10,  [Orders$].[State] AS [State],&#13,&#10,  [People$].[Person] AS [Person],&#13,&#10,  [People$].[Region] AS [Region (People)]&#13,&#10,FROM [Orders$]&#13,&#10,  INNER JOIN [People$] ON [Orders$].[Region] = [People$].[Region]

I want to get only Category and City dynamically, without hardcoding the words. What kind of pattern should I use? I will store those two values in an array that is looped over in a downstream program.

I tried splitting the text:

colName = re.split(r"\W+", result)

['SELECT',
 'Orders',
 'Category',
 'AS',
 'Category',
 '13',
 '10',
 'Orders',
 'City',
 'AS',
 'City',
 '13',
 '10',

It gave me the whole string, and now I do not know how to proceed. Can someone help?

Thanks

So when using findall and you get category and city, how do you know which is which?
– user13843220, Commented Jul 16, 2020 at 21:25

Barmar · Accepted Answer · 2020-07-16 21:30:11Z

1

Don't use split, use re.findall().

matches = re.findall(r'\bAS\s+\[(.+?)\]', yourString)

Because the pattern has a single capture group, re.findall() returns a list of the captured strings, so the words you want are simply the elements of matches.
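As a minimal sketch of the call above, assuming a shortened version of the query from the question (the variable name `query` is made up for illustration), this shows what re.findall() hands back when the pattern has one capture group:

```python
import re

# Shortened, hypothetical version of the query from the question.
query = (
    "SELECT [Orders$].[Category] AS [Category],&#13,&#10,"
    "  [Orders$].[City] AS [City],&#13,&#10,"
    "  [Orders$].[Customer ID] AS [Customer ID]&#13,&#10,"
    "FROM [Orders$]"
)

# One capture group: findall returns a list of the captured strings,
# not match objects, so there is no .group(1) to call.
matches = re.findall(r'\bAS\s+\[(.+?)\]', query)

print(matches)  # ['Category', 'City', 'Customer ID']
```

The resulting list can be looped over directly in the downstream program.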

edited Jul 16, 2020 at 21:30

answered Jul 16, 2020 at 21:02

Barmar

4 Comments

Meera Over a year ago

Thanks so much for the quick response, it worked :)

Meera Over a year ago

I got only 9 columns, but I have 12. I have edited the question to show the actual select query. Can you please have a look? Sorry to bother you, and thanks. The ones I did not get were Customer ID, Customer Name and Region (People).

Barmar Over a year ago

Some of your column names have spaces; my regexp assumed they were just alphanumeric.

Meera Over a year ago

Thank you very much, it worked. You are a great genius for helping others :)
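The point about spaces in the comments above can be illustrated with a small comparison. The earlier version of the pattern is not shown in this thread, so the `\w+` pattern below is only an assumed stand-in for an "alphanumeric only" regex, and the `aliases` string is invented for the example:

```python
import re

# Hypothetical input containing one plain alias and two with spaces/parentheses.
aliases = "AS [City], AS [Customer ID], AS [Region (People)]"

# Assumed alphanumeric-only pattern: \w+ cannot cross a space or a parenthesis,
# so aliases containing them are skipped entirely.
strict = re.findall(r'\bAS\s+\[(\w+)\]', aliases)

# Pattern from the answer: .+? lazily takes everything up to the first ']'.
lenient = re.findall(r'\bAS\s+\[(.+?)\]', aliases)

print(strict)   # ['City']
print(lenient)  # ['City', 'Customer ID', 'Region (People)']
```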

Lucecpkn · Answer · 2020-07-16 21:02:28Z

0

Not sure if I understand your question correctly, but it seems you can simply continue with:

>>> category = colName[2]
>>> city = colName[8]

You can print to check:

>>> print(category, city)
Category City

answered Jul 16, 2020 at 21:02

Lucecpkn

2 Comments

Barmar Over a year ago

Hard-coding specific indexes doesn't seem very robust.

Lucecpkn Over a year ago

You're absolutely right @Barmar, I just thought it could be a cheap solution for the example given in the question, which seems to be a fixed-format query. The regex in your answer is surely more reliable.
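A rough sketch of why fixed indexes are fragile, using an invented one-line query (`row` is hypothetical) in which the first column name contains a space. Splitting on `\W+` breaks that name into two tokens, so position 2 no longer holds a whole column name, while the findall pattern from the other answer is unaffected:

```python
import re

# Hypothetical query fragment whose first column name contains a space.
row = "SELECT [Orders$].[Customer ID] AS [Customer ID],  [Orders$].[City] AS [City]"

# Splitting on non-word characters cuts "Customer ID" into two tokens.
tokens = re.split(r"\W+", row)
print(tokens[2], tokens[3])  # Customer ID (two separate list items)

# The capture-group approach keeps each alias intact.
names = re.findall(r'\bAS\s+\[(.+?)\]', row)
print(names)  # ['Customer ID', 'City']
```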
