0

I am building a keyword-highlighting script, and I need a regular expression that parses the URLs below to find the searched-for keywords to highlight.

I am REALLY bad with regex, so I was hoping someone who rocks at it could help.

Our search string uses "skw" as the parameter and "%2c" (comma) to separate terms, with "+" for spaces.

Example URLs:

http://[url].com/Search.aspx?skw=term1
http://[url].com/Search.aspx?skw=term1%2c+term2

Is there a single RegExp that I can use that will give me a collection that looks like this?

var matches = "http://[url].com/Search.aspx?skw=term+one%2c+term2".match([Expression]);

matches[0] = "term one"
matches[1] = "term2"

Thanks! Any help is greatly appreciated.

3 Answers

2

You can't do this with a single match, but you can do it with a match, a replace, and a split:

url.match(/\?(?:.*&)?skw=([^&]*)/)[1].replace(/\+/g, " ").split('%2c')

You may want to do the match separately and bail out if it fails (which it will if your URL doesn't have an skw parameter, because match then returns null).

You probably really want to do an unescape too, to handle other escaped characters in the query:

unescape(url.match(/\?(?:.*&)?skw=([^&]*)/)[1].replace(/\+/g, " ")).split(',')
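
Putting those pieces together, here is a minimal sketch rather than a definitive implementation. The function name getSearchTerms is hypothetical, and it swaps in decodeURIComponent for unescape (which is deprecated); that is my substitution, not part of the one-liner above. It also assumes you want the whitespace after each comma dropped.

```javascript
// Hypothetical helper: returns the decoded search terms,
// or an empty array when the URL has no skw parameter.
function getSearchTerms(url) {
  // Same pattern as above: skw may be the first parameter or follow others.
  var match = url.match(/\?(?:.*&)?skw=([^&]*)/);
  if (match === null) {
    return []; // bail out: no skw parameter in this URL
  }
  // "+" means a space in a query string. Convert it before percent-decoding
  // so that an encoded plus sign (%2B) survives as a literal "+".
  var decoded = decodeURIComponent(match[1].replace(/\+/g, " "));
  // Split on commas and drop any whitespace that follows each one.
  return decoded.split(/,\s*/);
}

console.log(getSearchTerms("http://[url].com/Search.aspx?skw=term+one%2c+term2"));
```

Note that decodeURIComponent throws a URIError on a malformed escape sequence, so you may want a try/catch around it if the URL can come from anywhere.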

1 Comment

Great answer. It looks like he also has spaces following commas that I'm guessing he doesn't want to capture. (i.e. for "term1, term2" we get ["term1", " term2"]) You could split on /,\s*/ if you want to remove the extra space.

1

https?://[^\?]+\?skw=([^%]*)(?:%2c\+*(.*))?

In javascript this is

var myregexp = /https?:\/\/[^\?]+(?:\?|\?[^&]*&)skw=([^%]*)(?:%2c\+*(.*))?/;
var match = myregexp.exec(subject);
if (match != null && match.length > 1) {
    var term1 = match[1];
    var term2 = match[2];
}

EDIT:

Sorry, I re-read your question. To handle multiple terms, you need to combine this with a split:

var subject = "http://[url].com/Search.aspx?skw=term1+one%2c+term2";
var myregexp = /https?:\/\/[^\?]+(?:\?|\?[^&]*&)skw=([^&]*)/;
var match = myregexp.exec(subject);
if (match != null && match.length > 1) {
  // "+" encodes a space, so convert it before unescaping %2c to ","
  var terms = unescape(match[1].replace(/\+/g, " ")).split(",");
}
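
One thing to watch: the (?:\?|\?[^&]*&) part only allows for zero or one parameter before skw. If your URLs can carry more, a looser anchor may be easier to live with. The following is only a sketch under that assumption; the extra page and sort parameters and the names skwPattern and found are made up for illustration, and decodeURIComponent stands in for unescape.

```javascript
// Hypothetical URL with two made-up parameters in front of skw.
var subject = "http://[url].com/Search.aspx?page=2&sort=asc&skw=term1+one%2c+term2";

// [?&] lets skw follow either the "?" or any "&";
// [^&#]* stops at the next parameter or at a "#fragment".
var skwPattern = /[?&]skw=([^&#]*)/;
var found = skwPattern.exec(subject);

var terms = [];
if (found !== null) {
  // Turn "+" into spaces, decode %2c to ",", then split on the commas
  // together with any whitespace that follows them.
  terms = decodeURIComponent(found[1].replace(/\+/g, " ")).split(/,\s*/);
}
console.log(terms);
```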

1 Comment

Or https?://[^\?]+(?:\?|\?[^&]*&)skw=([^%]*)(?:%2c\+*(.*))? to ignore any parameters before your skw

1

This task doesn't really lend itself to a single regular expression. Check out Parsing Query Strings in JavaScript for a script to assist you.
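
If your environment provides the standard URLSearchParams API (current browsers and Node.js do), it can do the query-string parsing for you. This is a rough sketch, not the linked script; termsFromQuery is a hypothetical name. It slices the query string out by hand because the placeholder host in the question's example would likely be rejected by the URL constructor.

```javascript
// Hypothetical helper built on the standard URLSearchParams API.
function termsFromQuery(url) {
  var queryStart = url.indexOf("?");
  if (queryStart === -1) {
    return []; // no query string at all
  }
  // Drop any "#fragment" so it is not read as part of the last parameter.
  var query = url.slice(queryStart + 1).split("#")[0];
  // URLSearchParams decodes "+" to a space and %2c to a comma.
  var skw = new URLSearchParams(query).get("skw");
  // get() returns null when the parameter is absent.
  return skw === null ? [] : skw.split(/,\s*/);
}

console.log(termsFromQuery("http://[url].com/Search.aspx?skw=term+one%2c+term2"));
```

The only regular expression left is the /,\s*/ used to split the decoded value into terms.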

1 Comment

A more compact implementation: magnetiq.com/2009/07/08/…
