6

I know I've seen this done in a lot of places, but I need something a little different from the norm. Sadly, when I search for this, it gets buried in posts about simply turning the link into an HTML anchor tag. I want the PHP function to strip the "http://" or "https://" from the link text, as well as anything after the domain (the .* part), so basically I am looking to turn A into B.

A: http://www.youtube.com/watch?v=spsnQWtsUFM
B: <a href="http://www.youtube.com/watch?v=spsnQWtsUFM">www.youtube.com</a>

If it helps, here is my current PHP regex replace function.

ereg_replace("[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]", "<a href=\"\\0\" class=\"bwl\" target=\"_new\">\\0</a>", htmlspecialchars($body, ENT_QUOTES));

It would probably also be helpful to say that I have absolutely no understanding of regular expressions. Thanks!

EDIT: When I entered a comment like this: blahblah https://www.facebook.com/?sk=ff&ap=1 blah, I got HTML like this: <a class="bwl" href="blahblah https://www.facebook.com/?sk=ff&amp;ap=1 blah">www.facebook.com</a>, which doesn't work at all because it takes the text around the link with it. It works great if someone comments only a link, however. This was after I changed the function to this:

preg_replace("#^(.*)//(.*)/(.*)$#",'<a class="bwl" href="\0">\2</a>',  htmlspecialchars($body, ENT_QUOTES));
2
  • 2
    Always prefer preg* instead of ereg* functions, since the ereg* functions are slow and deprecated. Commented Jun 18, 2011 at 4:50
  • possible duplicate of How to add anchor tag to a URL from text input Commented Mar 11, 2012 at 0:37

7 Answers

5

This is the simplest and cleanest way:

$str = 'http://www.youtube.com/watch?v=spsnQWtsUFM';
preg_match("#//(.+?)/#", $str, $matches);

$site_url = $matches[1];

EDIT: I assume that the $str had been checked to be a URL in the first place, so I left that out. Also, I assume that all the URLs will contain either 'http://' or 'https://'. In case the url is formatted like this www.youtube.com/watch?v=spsnQWtsUFM or even youtube.com/watch?v=spsnQWtsUFM, the above regexp won't work!

EDIT2: I'm sorry, I didn't realize that you were trying to replace all URLs in a whole text. In that case, this should work the way you want it:

$str = preg_replace('#(\A|[^=\]\'"a-zA-Z0-9])(http[s]?://(.+?)/[^()<>\s]+)#i', '\\1<a href="\\2">\\3</a>', $str);
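As a quick check of the EDIT2 pattern, here is a minimal sketch that runs it on the sample comment from the question. The variable names ($pattern, $replacement, $comment, $linked, $bare) are made up for illustration. It also shows one limitation worth knowing: the pattern requires a "/" and at least one more character after the host, so a bare "http://example.com" is left untouched.

```php
<?php
// Pattern and replacement copied from EDIT2 above.
$pattern = '#(\A|[^=\]\'"a-zA-Z0-9])(http[s]?://(.+?)/[^()<>\s]+)#i';
$replacement = '\\1<a href="\\2">\\3</a>';

// Sample comment taken from the question's EDIT.
$comment = 'blahblah https://www.facebook.com/?sk=ff&ap=1 blah';
$linked = preg_replace($pattern, $replacement, $comment);
echo $linked, PHP_EOL;
// blahblah <a href="https://www.facebook.com/?sk=ff&ap=1">www.facebook.com</a> blah

// A bare host has no "/" after it, so the pattern does not match
// and the text comes back unchanged.
$bare = preg_replace($pattern, $replacement, 'see http://example.com ok');
echo $bare, PHP_EOL;
// see http://example.com ok
```

If the text is going to be printed as HTML, it probably still needs htmlspecialchars() before the replacement, as in the question.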

7 Comments

this will also match ftp://... :)
@Tudor Constantin: Yes, I just edited it to say that with this function I assume it had previously been checked to be a valid URL.
Yours didn't seem to work with what I'm trying to do either, as I'm trying to replace all of the links within a user's comment, so it just displayed a 1
I didn't realize that the $str was a text fragment, not just a confirmed URL. Nor did I realize that you wanted to replace all. I updated the code. Should work like a charm.
Yes, that one does seem to work completely even with multiple links! thanks so much for that :)
2

I am not a regex whizz either,

^(.*)//(.*)/(.*)$
<a href="\1//\2/\3">\2</a>

was what worked for me when I tried to use as find and replace in programmer's notepad.

^(.*)// should extract the protocol, referred to as \1 in the second line. (.*)/ should extract everything up to the next /, referred to as \2 in the second line. (.*)$ captures everything up to the end of the string, referred to as \3 in the second line.


Added later

^(.*)( )(.*)//(.*)/(.*)( )(.*)$
\1\2<a href="\3//\4/\5">\4</a> \7

This should be a bit better, but will only replace just 1 URL
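To see what the first pattern above actually captures, here is a small sketch using preg_match. The variable names ($pattern, $m, $n) and the example.com URL are assumptions for illustration. Because every (.*) is greedy and the pattern is anchored with ^ and $, any text around the URL ends up inside the groups, and a URL with more than one "/" in its path pushes part of the path into \2.

```php
<?php
// The first find/replace pattern from this answer, as a PCRE pattern.
$pattern = '#^(.*)//(.*)/(.*)$#';

// Comment text from the question: the surrounding words leak into \1 and \3.
preg_match($pattern, 'blahblah https://www.facebook.com/?sk=ff&ap=1 blah', $m);
echo $m[1], PHP_EOL; // blahblah https:
echo $m[2], PHP_EOL; // www.facebook.com
echo $m[3], PHP_EOL; // ?sk=ff&ap=1 blah

// Hypothetical URL with a deeper path: greedy (.*) runs to the LAST "/".
preg_match($pattern, 'http://example.com/a/b', $n);
echo $n[2], PHP_EOL; // example.com/a
```

A non-greedy group or a negated class such as [^/\s]+ for the host avoids the second problem.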

4 Comments

This will work just fine (if checked to be valid URL before calling this). As a proper PHP, this would be preg_replace("#^(.*)//(.*)/(.*)$#",'<a href="\0">\2</a>', $str), where \0 is the entire matched string.
@stumpx: I am not sure why you selected this answer to be the correct one, but after realizing that in your situation 1) the $str value has not been checked to be a valid URL and 2) you want to replace ALL URLs in the $str, this code won't work the way you want at all. It will, first of all, not work on just http(s) links, but also ftp(s) or irc (for example). Also, it will return ONLY the HTML formatted link of the last occurring link in $str, not the rest of the string (in any shape or form).
Actually this did not work. When I entered a comment like this blahblah https://www.facebook.com/?sk=ff&ap=1 blah I get html like this <a class="bwl" href="Links dont work https:">www.facebook.com</a> which doesn't work at all. It works great if someone only comments a link however
Okay, so you have the URL inside a comment... I posted the expression assuming that the string had just the URL. In that case try to find '^(.*)( )(.*)//(.*)/(.*)( )(.*)$' '\1\2<a href="\3//\4/\5">\4</a> \7' This should be a bit better, but will only replace just 1 URL.
0

The \0 is replaced by the entire matched string, whereas \x (where x is a number other than 0 starting at 1) will be replaced by each subpart of your matched string based on what you wrap in parentheses and the order those groups appear. Your solution is as follows:

ereg_replace("[[:alpha:]]+://([^<>[:space:]]+[:alnum:]*)[[:alnum:]/]", "<a href=\"\\0\" class=\"bwl\" target=\"_new\">\\1</a>

I haven't been able to test this though so let me know if it works.
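The difference between \0 and \1 described above can be illustrated with preg_replace, which uses the same numbering. This is only a sketch: the pattern below is a simplified assumption for http(s) links, not the expression from this answer, and the variable names are made up.

```php
<?php
// Group 1 wraps only the host part; \0 always means the whole match.
$pattern = '#https?://([^/\s]+)\S*#';
$text = 'see http://www.youtube.com/watch?v=spsnQWtsUFM now';

$whole = preg_replace($pattern, '[\0]', $text);
echo $whole, PHP_EOL; // see [http://www.youtube.com/watch?v=spsnQWtsUFM] now

$host = preg_replace($pattern, '[\1]', $text);
echo $host, PHP_EOL; // see [www.youtube.com] now

// Referencing a group that does not exist inserts an empty string.
$missing = preg_replace($pattern, '[\2]', $text);
echo $missing, PHP_EOL; // see [] now
```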

3 Comments

This function has been deprecated since PHP 5.3.0. It would not be smart to use this function anymore. On top of that, the expression is far more complicated than it needs to be.
That did not seem to work, it actually still just used the whole link and changing the number to 2 just gave me a 2 in the output
Ahh i didn't even see your first comment, thanks for explaining this though, I didn't even realize
0

I think this should do it (I haven't tested it):

preg_match('/^http[s]?:\/\/(.+?)\/.*/i', $main_url, $matches);
$final_url = '<a href="'.$main_url.'">'.$matches[1].'</a>';

1 Comment

This won't work on https links. Also, there might not be a need to check if it's a URL in the first place, so the last part of the regexp (/.*) isn't really needed. Lastly, since the forward slash is being used so intensively in the expression, it would be smarter to use a different expression delimiter, such as ; or #.
0

I'm surprised no one remembers PHP's parse_url function:

$url = 'http://www.youtube.com/watch?v=spsnQWtsUFM';
echo parse_url($url, PHP_URL_HOST); // displays "www.youtube.com"

I think you know what to do from there.
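For a single, already-isolated URL, the remaining step might look like the sketch below. The function name link_with_host_label is hypothetical, and the fallback of returning the escaped input when no host is found is an assumption about what you would want.

```php
<?php
// Hypothetical helper: wrap one URL in an anchor whose label is the host.
function link_with_host_label(string $url): string
{
    $host = parse_url($url, PHP_URL_HOST);

    // parse_url gives null when there is no host, false on malformed input.
    if (!is_string($host) || $host === '') {
        return htmlspecialchars($url, ENT_QUOTES);
    }

    return '<a href="' . htmlspecialchars($url, ENT_QUOTES) . '">'
        . htmlspecialchars($host, ENT_QUOTES) . '</a>';
}

echo link_with_host_label('http://www.youtube.com/watch?v=spsnQWtsUFM'), PHP_EOL;
// <a href="http://www.youtube.com/watch?v=spsnQWtsUFM">www.youtube.com</a>
```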

2 Comments

Of course... I kind of forgot about this. I suppose I have grown so used to preg_match / preg_replace x). Anyway, parse_url will require many more lines of code. I don't know how it will benchmark against preg_replace, but I imagine that, given that PHP needs to build arrays and you probably need a preg_match_all to fetch all of the URLs in the text in the first place, it's not going to outperform the preg_replace function.
Yeah, at first I didn't realize he was doing a search and replace in a document. I thought he was just processing a single URL....
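For the search-and-replace case discussed in these comments, one possible way to combine a regex with parse_url is to split the raw text on URLs and escape each piece separately. This is a sketch under assumptions, not a benchmarked or complete solution: the function name linkify_hosts and the URL pattern are made up, only http(s) links are handled, and trailing punctuation such as a full stop right after a URL would be swallowed into the link.

```php
<?php
// Hypothetical helper: link every http(s) URL in $text, labelled by host.
function linkify_hosts(string $text): string
{
    // With one capture group and PREG_SPLIT_DELIM_CAPTURE, the result
    // alternates: plain text at even indexes, URLs at odd indexes.
    $parts = preg_split(
        '#(\bhttps?://[^\s<>"\']+)#i',
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    $out = '';
    foreach ($parts as $i => $part) {
        $safe = htmlspecialchars($part, ENT_QUOTES);
        $host = ($i % 2 === 1) ? parse_url($part, PHP_URL_HOST) : null;

        if (is_string($host) && $host !== '') {
            $out .= '<a href="' . $safe . '">'
                . htmlspecialchars($host, ENT_QUOTES) . '</a>';
        } else {
            $out .= $safe; // plain text, or a URL without a usable host
        }
    }
    return $out;
}

echo linkify_hosts('blahblah https://www.facebook.com/?sk=ff&ap=1 blah'), PHP_EOL;
// blahblah <a href="https://www.facebook.com/?sk=ff&amp;ap=1">www.facebook.com</a> blah
```

Escaping after the split, rather than before it, keeps entities such as &quot; from being absorbed into a URL match.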
0

$result = preg_replace('%(http[s]?://)(\S+)%', '<a href="\1\2">\2</a>', $subject);

2 Comments

ereg_replace has been deprecated since PHP 5.3.0. It would be unwise to use this function now.
@Battle_707, you are correct, I was in autopilot mode and just used the same function the poster used w/o thinking about it. I updated my answer w/ preg instead.
0

The code with regex does not work completely.

I made this code. It is much more comprehensive, but it works:

See the result here: http://cht.dk/data/php-scripts/inc_functions_links.php

See the source code here: http://cht.dk/data/php-scripts/inc_functions_links.txt
