Let's say we have some text:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus cursus vestibulum quam, et tristique nisi tristique ac. Nam ac risus vehicula tortor facilisis tincidunt. Aliquam at nisi vel arcu aliquet dignissim nec et massa. Curabitur vel magna eros, accumsan rutrum augue. Lorem ipsum http://subdomain-1.example.com/dir1 dolor sit amet, consectetur adipiscing elit. Nunc ut vehicula purus. Phasellus nunc diam, hendrerit in ultrices vitae, adipiscing ut odio. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Cras molestie felis nec diam sollicitudin placerat pellentesque metus dapibus. Aliquam ipsum ante, lacinia porta http://subdomain-2.example.com/dir2 faucibus non, porttitor at nunc. Quisque suscipit, urna sit amet rhoncus bibendum, elit mi rhoncus lorem, ac luctus lectus nunc in velit.

I need a C# function that finds all URLs and replaces the domain name with a given one, for example example.com with stackoverflow.com, while everything else (the subdomain and the rest of the URL) remains the same.

For example the text should look like this after replacing:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus cursus vestibulum quam, et tristique nisi tristique ac. Nam ac risus vehicula tortor facilisis tincidunt. Aliquam at nisi vel arcu aliquet dignissim nec et massa. Curabitur vel magna eros, accumsan rutrum augue. Lorem ipsum http://subdomain-1.stackoverflow.com/dir1 dolor sit amet, consectetur adipiscing elit. Nunc ut vehicula purus. Phasellus nunc diam, hendrerit in ultrices vitae, adipiscing ut odio. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Cras molestie felis nec diam sollicitudin placerat pellentesque metus dapibus. Aliquam ipsum ante, lacinia porta http://subdomain-2.stackoverflow.com/dir2 faucibus non, porttitor at nunc. Quisque suscipit, urna sit amet rhoncus bibendum, elit mi rhoncus lorem, ac luctus lectus nunc in velit.

  • This initially seems like a pretty easy problem to solve - possibly even a homework assignment. What code do you already have, and what problems do you have with it? Commented Jan 6, 2010 at 18:59
  • Of course, in the real world it would not be quite as easy, since you would want subdomain-1.example.com replaced with subdomain-1.stackoverflow.com, and subdomain-1.example.co.uk replaced with subdomain-1.stackoverflow.co.uk, but not have example.google.com replaced with stackoverflow.google.com. Commented Jan 6, 2010 at 19:06
  • And you can't just check the third-level domains for anything that ends in .uk, because there are a handful of domains left around that were registered with just something.uk, from before the UK decided every domain had to be registered at the third level. Commented Jan 6, 2010 at 19:11
  • Well, does it even make sense to match all theoretical cases? Generally you will know which subdomains you have to deal with, and which URL you are replacing, before you design the regex (I am assuming this is needed for a specific replacement). Commented Jan 6, 2010 at 19:15
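
One way to handle the boundary cases raised in these comments is to match the whole host name first and then decide in code whether it really belongs to the old domain. The following is only a sketch under that assumption: the class name HostAwareReplacer and the method ReplaceDomain are made up for illustration, and the caller is assumed to pass the exact registered domain (example.com, example.co.uk, and so on), so no TLD guessing is attempted.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HostAwareReplacer
{
    // Hypothetical helper: matches only the scheme and host of each http(s) URL,
    // so the path, query, and everything else in the text stay untouched.
    private static readonly Regex UrlHost = new Regex(
        @"(?<SCHEME>https?://)(?<HOST>[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*)");

    public static string ReplaceDomain(string text, string oldDomain, string newDomain)
    {
        return UrlHost.Replace(text, match =>
        {
            string scheme = match.Groups["SCHEME"].Value;
            string host = match.Groups["HOST"].Value;

            // The host is exactly the old domain (no subdomain).
            if (host.Equals(oldDomain, StringComparison.OrdinalIgnoreCase))
                return scheme + newDomain;

            // The host is a subdomain of the old domain: keep the subdomain prefix.
            if (host.EndsWith("." + oldDomain, StringComparison.OrdinalIgnoreCase))
                return scheme + host.Substring(0, host.Length - oldDomain.Length) + newDomain;

            // Anything else (e.g. example.google.com) is left as it was.
            return match.Value;
        });
    }
}
```

Calling HostAwareReplacer.ReplaceDomain(text, "example.com", "stackoverflow.com") would rewrite subdomain-1.example.com but leave example.google.com alone; a second call with "example.co.uk" and "stackoverflow.co.uk" would cover the .co.uk case separately.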

2 Answers

I think this works:

Regex r = new Regex(@"(?<SCHEME>https?://)(?<SUBDOMAIN>([^./\s]+\.)*)example\.com(?<PATH>/\S*)?");
string newText = r.Replace(text, "${SCHEME}${SUBDOMAIN}stackoverflow.com${PATH}");

I use named groups because they're easier to keep track of and read. The first is the scheme (http:// or https://), the second grabs the subdomain, and the last one grabs an optional path (as you might have http://foo.example.com, http://foo.example.com/, or http://foo.example.com/bar).
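
For anyone who wants to try this out, here is a minimal self-contained sketch built on the same named groups. The class name DomainReplacer and the method ReplaceDomain are hypothetical, and the pattern is a variant that assumes a URL ends at the first whitespace character (the path is matched with \S*), so that several URLs on one line are each replaced.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DomainReplacer
{
    // Hypothetical helper: swaps oldDomain for newDomain in every http(s) URL
    // found in text, keeping the scheme, the subdomains, and the path.
    public static string ReplaceDomain(string text, string oldDomain, string newDomain)
    {
        string pattern =
            @"(?<SCHEME>https?://)" +           // http:// or https://
            @"(?<SUBDOMAIN>(?:[^./\s]+\.)*)" +  // zero or more "label." parts
            Regex.Escape(oldDomain) +           // the old domain, taken literally
            @"(?<PATH>/\S*)?";                  // optional path, up to the next whitespace

        // Double any '$' in the new domain so it is not read as a group reference.
        string replacement =
            "${SCHEME}${SUBDOMAIN}" + newDomain.Replace("$", "$$") + "${PATH}";

        return Regex.Replace(text, pattern, replacement);
    }
}
```

With the text from the question, DomainReplacer.ReplaceDomain(text, "example.com", "stackoverflow.com") should rewrite both URLs while leaving the subdomain-1/subdomain-2 prefixes and the /dir1 and /dir2 paths in place. Note that this sketch does not check what follows the domain, so a host such as example.com.evil.org would also be touched.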

2 Comments

This might be the fix: (?<SUBDOMAIN>[^.]+\.)*example\.com etc
@Hogan Needs to be in the group, but otherwise yeah, you're right. Fixed.

The regular expression you use should look something like:

s!(http[s]?://[\w\-]+)\.domain\.com([\w\d/]+)!$1.newdomain.org$2!gi

Note: this is written as a Perl-style substitution; you will have to rewrite it in C#'s notation.
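
A rough C# equivalent of that substitution might look like the sketch below. The class name PerlStyleReplace and the method Apply are made up for illustration; domain.com and newdomain.org are the placeholders from the expression above. Regex.Replace already replaces every match (the g flag), and RegexOptions.IgnoreCase stands in for the i flag.

```csharp
using System.Text.RegularExpressions;

public static class PerlStyleReplace
{
    // Hypothetical helper mirroring the substitution above.
    public static string Apply(string text)
    {
        // Group 1: scheme plus a single subdomain label.
        // Group 2: the path after the old domain (\d is already covered by \w).
        const string pattern = @"(https?://[\w\-]+)\.domain\.com([\w/]+)";

        // $1 and $2 refer to the numbered groups, as in the original expression.
        return Regex.Replace(text, pattern, "$1.newdomain.org$2", RegexOptions.IgnoreCase);
    }
}
```

As written, the pattern expects exactly one subdomain label and a non-empty path, so URLs such as http://domain.com/ or http://sub.domain.com with no path are not matched.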

2 Comments

This requires the old domain and new domain be on the same TLD.
I have changed it to address tghw and Hogan's points - note it is just a general example (you should never just use someone else's regex without checking/customizing anyway).
