0

I am trying to get the domain extension (dk, com, org, eu, or any other) from a String.

For example:

http://www.example.com/siteone/sitetwo/currentpage

From this String I would like to get the .com

I could go the very messy way and use substring, but the problem comes when a URL looks like this:

dk.webpage.otherstuff.com/page

So how can I do this in a way that doesn't require me to check everything at every step?

5
  • 1
    Have you tried using regular expressions? docs.oracle.com/javase/tutorial/essential/regex Commented Nov 25, 2013 at 11:59
  • 1
    This may help you: stackoverflow.com/questions/3234090/… Commented Nov 25, 2013 at 12:01
  • If you have tried anything, then show the code. Commented Nov 25, 2013 at 12:01
  • Will the strings you are checking always contain URLs? Commented Nov 25, 2013 at 12:04
  • @MrMisterMan Yes always! Commented Nov 25, 2013 at 12:16

4 Answers

1

Use the getHost() method like this:

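import java.net.URI;
import java.net.URISyntaxException;

public static String getDomainName(String testUrl) throws URISyntaxException {
    URI fullUri = new URI(testUrl);
    // Note: getHost() returns null when the URL has no scheme (e.g. "example.com/page")
    String domainName = fullUri.getHost();
    return domainName.startsWith("www.") ? domainName.substring(4) : domainName;
}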

Once you have the host, just use substring to extract the .com part of the domain name.
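The two steps above (get the host, then cut off the last label) could be combined roughly as follows. This is only a sketch: the class name HostExtension and the helper extensionOf are made up for illustration, and prepending "http://" when no scheme is present is an assumption added here so that inputs like dk.webpage.otherstuff.com/page still yield a host.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class HostExtension {
    // Hypothetical helper: returns the last label of the host including its dot, e.g. ".com".
    static String extensionOf(String url) {
        // URI.getHost() returns null when the string has no scheme,
        // so prepend one if "://" is missing (an assumption of this sketch).
        String withScheme = url.contains("://") ? url : "http://" + url;
        try {
            String host = new URI(withScheme).getHost();
            if (host == null) {
                throw new IllegalArgumentException("No host found in: " + url);
            }
            int lastDot = host.lastIndexOf('.');
            // A host without any dot (e.g. "localhost") has no extension.
            return lastDot < 0 ? "" : host.substring(lastDot);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid URL: " + url, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(extensionOf("http://www.example.com/siteone/sitetwo/currentpage"));
        System.out.println(extensionOf("dk.webpage.otherstuff.com/page"));
    }
}
```

Both calls in main should print .com. Note that this only returns the final label, so a host such as example.co.uk would give .uk rather than .co.uk.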

3 Comments

If you hard-code "www." and substring(4), this won't always work: nowadays some URLs start with something like www3.xyz.com, and then your code will fail.
The OP can use a regex on the domainName then. It depends on how detailed you want to go; with the introduction of custom domains this whole exercise would be nearly impossible anyway.
@liquidsnake786 With sites like homesick.nu\ I get an error. Do you know why? (The error is "invalid character at index 7", meaning that the \ is wrong, but how come?)
1

Use Guava's InternetDomainName class. Specifically have a look at the publicSuffix method.

1

Try this:

String ext = url.replaceAll(".*//[^/]*(\\.\\w+)/.*", "$1");

Some test code:

String url = "http://www.example.com/siteone/sitetwo/currentpage";
String ext = url.replaceAll(".*//[^/]*(\\.\\w+)/.*", "$1");
System.out.println(ext);

Output:

.com

1 Comment

@RandomGuy $1 means "captured group 1", which is the bracketed pattern to capture the dot and word chars of the extension
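One thing to watch with the replaceAll version: when the pattern does not match (for instance a URL without "//" or without a path after the host), replaceAll returns the input unchanged. A possible variation, sketched below with Pattern and Matcher, makes the "no match" case explicit. The class name RegexExtension, the helper extensionOf, and the adjusted pattern are assumptions for illustration, not part of the answer above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexExtension {
    // Optional scheme, then the host up to the first '/', ':', '?' or '#'.
    // Group 1 captures the final "dot plus word characters" label of that host.
    private static final Pattern EXT =
            Pattern.compile("^(?:\\w+://)?[^/:?#]*(\\.\\w+)(?:[/:?#].*)?$");

    // Hypothetical helper: empty Optional when no extension can be found.
    static Optional<String> extensionOf(String url) {
        Matcher m = EXT.matcher(url);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extensionOf("http://www.example.com/siteone/sitetwo/currentpage"));
        System.out.println(extensionOf("dk.webpage.otherstuff.com/page"));
        System.out.println(extensionOf("localhost"));
    }
}
```

Returning an Optional is just one design choice here; it lets the caller tell "no extension found" apart from a real result instead of getting the whole input back.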
0

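Try this:

private String getExtensionFromDomain(String domainName) {
    // Index of the first character after the last dot
    int p = domainName.lastIndexOf(".") + 1;
    return domainName.substring(p);
}

In the case of example.co.ma this will return ma (without the leading dot, because of the + 1; drop the + 1 to get .ma).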
