
I need a regex to split a given URL into two parts.

part1 --> the domain (including the protocol [http or https] if present).

part2 --> the remainder

Thus something like this:

example 1

let url = "https://www.example.com/asdasd/123asdsd/sdasd?bar=1"

regex returns

domain = https://www.example.com

remaining path = /asdasd/123asdsd/sdasd?bar=1


example 2

let url = "www.example.com/asdasd/123asdsd/sdasd?bar=1"

regex returns

domain = www.example.com

remaining path = /asdasd/123asdsd/sdasd?bar=1


example 3

let url = "example.com/asdasd/123asdsd/sdasd?bar=1"

regex returns

domain = example.com

remaining path = /asdasd/123asdsd/sdasd?bar=1


example 4

let url = "http://example.com"

regex returns

domain = http://example.com

remaining path = null


3 Answers


I would recommend using the URL interface instead of a regex. It will not handle examples 2 and 3, because the URL constructor throws a TypeError when the string has no scheme, but for absolute URLs it exposes every component you need.

From MDN:

The URL interface is used to parse, construct, normalize, and encode URLs. It works by providing properties which allow you to easily read and modify the components of a URL. You normally create a new URL object by specifying the URL as a string when calling its constructor, or by providing a relative URL and a base URL. You can then easily read the parsed components of the URL or make changes to the URL.

Example for your requirements:

let url = new URL("https://www.example.com/asdasd/123asdsd/sdasd?bar=1");

console.log("domain - " + url.origin);
console.log("remaining path - " + url.pathname + url.search);
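The scheme-less inputs from examples 2 and 3 make `new URL(...)` throw, so they need some preprocessing. One possible workaround, sketched here, is to prepend a placeholder scheme before parsing and drop it again afterwards. The helper name `splitUrl` and the `http://` placeholder are assumptions made for illustration; they are not part of the URL API.

```javascript
// Hypothetical helper: splits a URL string into its domain and the remainder.
// Assumes the input is either an absolute http(s) URL or a scheme-less "host/path" string.
function splitUrl(input) {
  const hasScheme = /^https?:\/\//i.test(input);
  // The URL constructor throws on scheme-less input, so add a placeholder scheme.
  const parsed = new URL(hasScheme ? input : "http://" + input);
  // Leave the placeholder scheme out of the result if the input did not have one.
  const domain = hasScheme ? parsed.origin : parsed.host;
  // pathname is "/" when the input has no path; treat that as "no remainder".
  const rest = parsed.pathname + parsed.search + parsed.hash;
  return { domain, path: rest === "/" ? null : rest };
}

console.log(splitUrl("www.example.com/asdasd/123asdsd/sdasd?bar=1"));
console.log(splitUrl("http://example.com"));
```

Note that the URL interface normalizes what it parses (for example, it lowercases the host), so the returned domain may not be a character-for-character copy of the input.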


Use URL.

var url = new URL("https://www.example.com/asdasd/123asdsd/sdasd?bar=1");
var domain = `${url.protocol}//${url.host}`;
// url.search already includes the leading "?" and is empty when there is no query string
var path = `${url.pathname}${url.search}`;
console.log(`domain = ${domain}`)
console.log(`remaining path = ${path}`)

Someone beat me to the punch with URL so I'll post the regex as well.

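var url = "https://www.example.com/asdasd/123asdsd/sdasd?bar=1";
// Group 1: protocol plus host, up to the first "/", "?" or "#".
// Group 2: everything after that (empty when the URL has no path).
// Note: "$" inside a character class is a literal dollar sign, not an end-of-string anchor.
var matches = /^(https?:\/\/[^\/?#]+)(.*)$/.exec(url);
var domain = matches[1];
var path = matches[2] || null;
console.log(`domain = ${domain}`)
console.log(`remaining path = ${path}`)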
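To cover the question's scheme-less examples as well, the protocol part of the pattern can be made optional. The following is a minimal sketch under that assumption; `URL_PARTS` and `splitWithRegex` are hypothetical names, and the pattern is only meant for the four input shapes listed in the question, not for arbitrary URLs.

```javascript
// Hypothetical pattern with named capture groups.
// domain: optional http(s) scheme plus everything up to the first "/", "?" or "#".
// path:   whatever follows the domain (possibly empty).
const URL_PARTS = /^(?<domain>(?:https?:\/\/)?[^\/?#]+)(?<path>.*)$/i;

function splitWithRegex(input) {
  const match = URL_PARTS.exec(input);
  // No match for an empty string or a string that starts with "/", "?" or "#".
  if (match === null) return null;
  const { domain, path } = match.groups;
  // Report a missing remainder as null, as in example 4 of the question.
  return { domain, path: path === "" ? null : path };
}

const examples = [
  "https://www.example.com/asdasd/123asdsd/sdasd?bar=1",
  "www.example.com/asdasd/123asdsd/sdasd?bar=1",
  "example.com/asdasd/123asdsd/sdasd?bar=1",
  "http://example.com",
];
for (const example of examples) {
  console.log(example, splitWithRegex(example));
}
```

Returning `null` for a failed match keeps the caller from indexing into a missing match object, which is what raises a TypeError when `exec` finds nothing.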


Here is a step-by-step JavaScript version. I hope it helps you understand the approach.

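// Step 1: split off the protocol, if present
let url = "https://www.example.com/asdasd/123asdsd/sdasd?bar=1";
let protocolRegEx = /^https?:\/\//i;
let protocolMatch = url.match(protocolRegEx);
let protocol = protocolMatch ? protocolMatch[0] : "";
let rest = url.replace(protocolRegEx, "");
console.log("rest = " + rest);

// Step 2: everything up to the first "/" is the host
let hostRegEx = /^[^\/]*/;
let host = rest.match(hostRegEx)[0];

// Step 3: whatever follows the host is the route (null if there is none)
let route = rest.slice(host.length) || null;
console.log("route = " + route);

// Step 4: put the protocol back, since the question wants it kept on the domain
let domainUrl = protocol + host;
console.log("domainUrl = " + domainUrl);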
