1

I am having a problem writing a regular expression that extracts the main domain name from a URL. For example, given URLs such as the following:

http://domain.com/return/java.php?hello.asp
http://www.domain.com/return/java.php?hello.asp
http://blog.domain.net/return/java.php?hello.asp
http://us.blog.domain.co.us/return/java.php?hello.asp
http://domain.co.uk
http://domain.net
http://www.blog.domain.co.ca/return/java.php?hello.asp
http://us.domain.com/return/

From all of these I should get only domain as the output of the regular expression. How do I do that? I used:

var url = urls.match(/[^.]*.(com|net|org|info|coop|int|co\.uk|org\.uk|ac\.uk|uk)/g);

but it does not work for

  http://domain.net

Can someone help me out with this?

5
  • 1
    The domain may be followed by "/" or by the end of the line, so "match(/[^.]*.(com|net|org|info|coop|int|co\.uk|org\.uk|ac\.uk|uk)(\/|$)/g)" may work. Commented Jan 6, 2015 at 8:09
  • Seems to be working, regex101.com/r/dL6nN7/2 and jsbin.com/zanivonijo/1/edit?js,console but I might be missing the point? Commented Jan 6, 2015 at 8:10
  • @Fumu7: What you gave is not working. Commented Jan 6, 2015 at 8:22
  • Don't forget to escape the . after the [^.]* part, since it is a special character. :) Commented Jan 6, 2015 at 8:41
  • @SusanWilliams did you find a solution, or do you still need some help. :) Commented Jan 8, 2015 at 6:55

4 Answers 4

4

You can use the URL constructor rather than a regex:

var url = new URL("http://domain.com/return/java.php?hello.asp");
console.log(url.hostname);
// => domain.com

OR

If you want the protocol as well:

var url = new URL("http://domain.com/return/java.php?hello.asp");
console.log(url.protocol + "//" + url.hostname);
// => http://domain.com
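
As a minimal sketch of applying this to several of the URLs from the question: getHostname below is a hypothetical helper name, not part of any API. Note that hostname returns the full host (for example us.blog.domain.co.us), not just the domain label the question asks for, so a further step would still be needed.

```javascript
// Hypothetical helper: returns the hostname, or null when the input
// is not a valid absolute URL.
function getHostname(input) {
  try {
    return new URL(input).hostname;
  } catch (e) {
    // new URL() throws a TypeError for input it cannot parse
    return null;
  }
}

var hosts = [
  "http://domain.com/return/java.php?hello.asp",
  "http://us.blog.domain.co.us/return/java.php?hello.asp",
  "http://domain.net"
].map(getHostname);

console.log(hosts);
```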

1 Comment

Internet Explorer does not support the URL constructor, so IE support is an issue.
0

Would this help?

/(http|https|ftp):\/\/([a-zA-Z0-9.])+/g

It matches:

http://domain.com
http://www.domain.com
http://blog.domain.net
http://us.blog.domain.co.us
http://domain.co.uk
http://domain.net
http://www.blog.domain.co.ca
http://us.domain.com


0

Here is a solution changing the regex a bit:

url.match(/https?:\/\/[^/]+((?=\/)|$)/g);
//tested with Chrome 38+ on Win7

Basically, it checks for a slash / or the end of the string $.

Update: replaced the jsFiddle link with an inline Stack Overflow snippet:

var urls = ['http://domain.com/return/java.php?hello.asp',
  'http://www.domain.com/return/java.php?hello.asp',
  'http://blog.domain.net/return/java.php?hello.asp',
  'http://us.blog.domain.co.us/return/java.php?hello.asp',
  'http://domain.co.uk',
  'http://domain.net',
  'http://www.blog.domain.co.ca/return/java.php?hello.asp',
  'http://us.domain.com/return/'
];

var htmlConsole = document.getElementById("result");
var htmlTab = "    ";
var htmlNewLine = "<br />";

htmlConsole.innerHTML = "";
for (var id in urls) {

  htmlConsole.innerHTML += "URL: " + urls[id] + htmlNewLine;

  var matchResults = urls[id].match(/https?:\/\/[^/]+((?=\/)|$)/g);

  for (var innerIdx in matchResults) {
    htmlConsole.innerHTML += htmlTab + "MatchNumber: " + innerIdx + " MatchValue: " + matchResults[innerIdx] + htmlNewLine;
  }

  htmlConsole.innerHTML += htmlNewLine;

}
<div id="result">
</div>
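
For a version that runs without a page (for example in Node), here is a rough sketch that uses a capture group to return only the host without the protocol. The names hostPattern and extractHost are made up for illustration. Since [^/]+ already stops at the first slash or at the end of the string, this variant leaves out the trailing ((?=\/)|$) group.

```javascript
// Capture the host that follows the protocol. [^/]+ is greedy and stops
// at the first slash or at the end of the string.
var hostPattern = /^https?:\/\/([^/]+)/;

// Hypothetical helper: returns the captured host, or null if nothing matches.
function extractHost(url) {
  var m = hostPattern.exec(url);
  return m ? m[1] : null;
}

console.log(extractHost("http://domain.net"));            // domain.net
console.log(extractHost("http://us.domain.com/return/")); // us.domain.com
```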


-1
var url = urls.match(/[^./]*\.(com|net|org|info|coop|int|co\.uk|co\.us|co\.ca|org\.uk|ac\.uk|uk)/g);

I added a / to the character class, escaped the dot, and updated the list of top-level domains to match your examples.
However, I do not recommend keeping the list of top-level domains inside a regexp; there are simply too many of them. See http://en.wikipedia.org/wiki/List_of_Internet_top-level_domains
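
One way to keep the suffix list out of the regexp is to parse the hostname first and then look the suffix up in a separate list. The sketch below is only an illustration under stated assumptions: MULTI_LABEL_SUFFIXES and getMainLabel are hypothetical names, and the list covers only the two-label suffixes from the question. A real solution would need a complete source such as the Public Suffix List.

```javascript
// Assumption: a tiny, illustrative list of two-label suffixes, not exhaustive.
var MULTI_LABEL_SUFFIXES = ["co.uk", "org.uk", "ac.uk", "co.us", "co.ca"];

// Hypothetical helper: returns the label just before the suffix,
// or null when the hostname has no such label (e.g. "localhost").
function getMainLabel(input) {
  var labels = new URL(input).hostname.split(".");
  var lastTwo = labels.slice(-2).join(".");
  // The suffix spans two labels if it is in the list, otherwise one.
  var suffixLength = MULTI_LABEL_SUFFIXES.indexOf(lastTwo) !== -1 ? 2 : 1;
  var index = labels.length - suffixLength - 1;
  return index >= 0 ? labels[index] : null;
}

console.log(getMainLabel("http://us.blog.domain.co.us/return/java.php?hello.asp")); // domain
console.log(getMainLabel("http://domain.net")); // domain
```

Any suffix missing from the list is treated as a single label, so the result is only as good as the list behind it.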

