1

I have a program that goes through SQL statements, identifying component parts with a space. eg: -

SELECT * FROM tblSales WHERE CustomerID=10 AND Year=2011

Would produce the following as separate components: -

"SELECT","*","FROM","tblSales","WHERE","CustomerID=10","AND" and "Year=2011"

The problem I have however is the use of spaces within values eg.:-

SELECT * FROM tblSales WHERE CustomerNameID='Test Company' AND Year=2011

Using the same space-separating logic, this would produce components of: -

"SELECT,"*","FROM","tblSales","WHERE","CustomerID='Test","Company'","AND" and "Year=2011".

ie the space between "Test" & "Company" causes it to be treated as two separate components.

Question is, Is it possible via RegEx.Replace to replace the spaces between the quotation marks with another character whilst leaving all other spaces in the string in tact?

Shaun.

1
  • Answer to teh question is YES. :).. See this (stackoverflow.com/questions/7796904/regex-for-matching-literal-strings/7797006) then give it a try and post some examples that you have tried out, i.e. show some effort. Commented Oct 24, 2011 at 14:36

2 Answers 2

1

Assuming you don't have any escaped quotes, you could simply check if the space you're looking at is followed by an even number of quotes (which means it's outside of a string).

So

splitArray = Regex.Split(subjectString, " (?=(?:[^']*'[^']*')*[^']*$)");

gives you the desired result.

As a verbose regex:

[ ]        # Match a space
(?=        # only if it's followed by...
 (?:       # This subregex:
  [^']*'   #  any number of non-quote characters, followed by a quote
  [^']*'   #  same again (to ensure even number of quotes)
 )*        # zero or more times.
 [^']*     # and any number of non-quote characters until...
 $         # the end of the string.
)          # End of lookahead.
Sign up to request clarification or add additional context in comments.

4 Comments

Seems to me that assuming there aren't escaped quotes is a recipe for disaster...
@e.dan: You're certainly right. Those can be handled, too, but that would make the regex more complicated.
I think it borders on impossible, and the OP would be advised to read the famous HTML parsing with regex. SQL is arguably at least as complex as HTML, if not more so.
Given the context by which the SQL commands were written then there won't be any escaped characters and so this will work fine. Thanks
1

Possible? Maybe. Maybe not. Certainly not trivial. Parsing SQL with regexes is almost as bad, or quite possibly worse, than parsing HTML with regexes...

You might want to poke around SO and look for other people who have tried to parse SQL...

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.