83

I have run into an odd situation. I'm writing a JavaScript bookmarklet that lets users share external websites to our website quickly and easily. It simply gets the title and page URL, and if they've selected any text on the page, it grabs that too.

The problem is that it doesn't work with external domains for some reason. If we use it internally, we end up with a share window with the URL formatted like this:

http://internaldomain.com/sharetool.php?shareid=http://internaldomain.com/anotheroddpage.html&title=....

That works just fine, BUT if we try to use an external domain and end up with a URL formatted like this:

http://internaldomain.com/sharetool.php?shareid=http://externaldomain.com/coolpagetoshare.html&title=...

Then we get a Forbidden Error on our page and can't load it... If we manually remove the http:// from the externaldomain address, it loads just fine again.

So I'm thinking the best way around this problem is to modify the JavaScript bookmarklet to remove the http:// as it's loading the window. Here is how my current bookmarklet looks:

javascript:var d=document,w=window,e=w.getSelection,k=d.getSelection,x=d.selection,s=(e?e():(k)?k():(x?x.createRange().text:0)),f='http://internaldomain.com/sharetool.php',l=d.location,e=encodeURIComponent,u=f+'?u='+e(l.href)+

As you can see, e(l.href) is where the URL is passed.

How can I modify that so it removes the external domain's http://?


5 Answers

226

I think it would be better to take into account all possible protocols.

result = url.replace(/(^\w+:|^)\/\//, '');
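As a rough sketch of how this could slot into the bookmarklet from the question: the helper names `stripProtocol` and `buildShareUrl` below are made up for illustration, and the `?u=` parameter and share-tool address are copied from the asker's snippet. The regex is the one above, unchanged.

```javascript
// Hypothetical helper: removes a leading "scheme://" or a bare leading "//".
function stripProtocol(url) {
  return url.replace(/(^\w+:|^)\/\//, '');
}

// Hypothetical helper: builds the share link the way the question's
// bookmarklet does, but with the protocol stripped before encoding.
function buildShareUrl(pageUrl) {
  var base = 'http://internaldomain.com/sharetool.php';
  return base + '?u=' + encodeURIComponent(stripProtocol(pageUrl));
}

console.log(stripProtocol('http://externaldomain.com/coolpagetoshare.html'));
// externaldomain.com/coolpagetoshare.html
console.log(stripProtocol('//externaldomain.com/coolpagetoshare.html'));
// externaldomain.com/coolpagetoshare.html
console.log(buildShareUrl('http://externaldomain.com/coolpagetoshare.html'));
// http://internaldomain.com/sharetool.php?u=externaldomain.com%2Fcoolpagetoshare.html
```

In the bookmarklet itself this would amount to replacing e(l.href) with e(l.href.replace(/(^\w+:|^)\/\//, '')). Because both alternatives are anchored with ^, a double slash later in the path is left alone.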

8 Comments

That worked like a charm '+e(l.href.replace(/.*?:\/\//g, "")
This is a very poor regex. .*? - means ungreedy match, but /g modifier forces the expression to be applied many times (i.e. cut all found protocols?). Also expression has no ^ to match the start. Better one: /^.*?:\/\//
@disjunction Even ignoring your comments, that was precisely why this regex was written like this, as THIS is clearly stated in the answer.
Please note that in real web pages the protocol-relative // is common practice (paulirish.com/2010/the-protocol-relative-url). So I suggest the regexp /^\/\/|^.*?:\/\// (you can make it better, I'm sure)
@Dan, good call! So let's take it even further and make this work with 'mailto:' with this edit: .replace(/^\/\/|^.*?:(\/\/)?/, '');
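To make the trade-off in that last comment concrete, here is a small sketch that runs the suggested regex over a few inputs. The function name `stripAnyScheme` and the sample URLs are invented for illustration. Note the last case: because `^.*?:` matches up to the first colon, a scheme-less URL with a port loses its host name, so this variant is probably only safe when the input is known to start with a scheme or with //.

```javascript
// Hypothetical wrapper around the regex from the comment above.
function stripAnyScheme(url) {
  return url.replace(/^\/\/|^.*?:(\/\/)?/, '');
}

console.log(stripAnyScheme('http://example.com/x'));       // example.com/x
console.log(stripAnyScheme('//example.com/x'));            // example.com/x
console.log(stripAnyScheme('mailto:someone@example.com')); // someone@example.com

// Caveat: no scheme, but a port number. Everything up to the first ":" goes.
console.log(stripAnyScheme('example.com:8080/path'));      // 8080/path
```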
59
url = url.replace(/^https?:\/\//, '')

5 Comments

small improvement: /^(https?:|)\/\//
Works great thanks, for easier use: let removeHttp = function(link) { return link.replace(/^(https?:|)\/\//, ''); }; let string = removeHttp(link);
url.replace(/^(?:https?:\/\/)?(?:www\.)?/i, "").split('.')[0] This will take care of http, https and www
@SirPhemmiey Nah, that ended up removing subdomains for me
How about using the URL class instead of writing the regex yourself? new URL('https://test.com').host gives test.com.
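A minimal sketch of the URL-class approach from the last comment, assuming an environment where the global `URL` constructor exists (modern browsers and current Node.js). The helper name `withoutScheme` is hypothetical. It rebuilds the address from its parsed parts rather than pattern-matching the string.

```javascript
// Hypothetical helper: parse the address and reassemble it without the scheme.
// new URL() throws a TypeError for input it cannot parse as an absolute URL,
// so callers may want a try/catch around this.
function withoutScheme(href) {
  var u = new URL(href);
  return u.host + u.pathname + u.search + u.hash;
}

console.log(new URL('https://test.com').host);
// test.com
console.log(withoutScheme('http://externaldomain.com/coolpagetoshare.html?x=1#top'));
// externaldomain.com/coolpagetoshare.html?x=1#top
console.log(withoutScheme('https://test.com'));
// test.com/
```

One difference from the regex versions: the parser normalizes the URL, so a bare origin comes back with a trailing slash, as in the last line.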
7
l.href.replace(/^http:\/\//, '')

Comments

2

I think the regular expression you need is /(?:http:\/\/)(.*)/i. The first capture group of the match should be it.

Comments

-6

Try using the replace function:

var url = url.replace("http%3A%2F%2F", "");

1 Comment

This is not ideal because it doesn't use a regular expression. With simple text replacement like this, you would need to chain several .replace() calls to accommodate all the different variations needed (http, https, etc.).
