I'm having trouble understanding regex in PHP. I have this img src:

src="http://example.com/javascript:gallery('/info/2005/image.jpg',383,550)"

and need to build from it this:

src="http://example.com/info/2005/image.jpg"

How can I cut the first and last parts from the string to obtain a clean link without the JavaScript part?

Right now I'm using this regex:

$cont = 'src="http://example.com/javascript:gallery(\'/info/2005/image.jpg\',383,550)"';

$cont = preg_replace("/(src=\")(.*)(\/info)/","$1http://example.com$3", $cont);

and the output is:

src="http://example.com/info/2005/image.jpg',383,550)"
  • (src=\")(.*)(\/info[^']*)[^"]* will work with the string you have. Commented Nov 13, 2020 at 20:18
  • It works, thanks :) Commented Nov 13, 2020 at 21:34
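For reference, the pattern from the comment above can be dropped into the original call roughly like this. This is only a sketch: the `~` delimiter and the `$pattern` and `$result` variable names are my own choices, not part of the comment.

```php
<?php
// Input string from the question (single quotes inside are escaped for PHP).
$cont = 'src="http://example.com/javascript:gallery(\'/info/2005/image.jpg\',383,550)"';

// Pattern from the comment, written with ~ as the delimiter so that
// the slash in /info does not need escaping.
// Group 3 now captures the whole path up to the closing single quote,
// and [^"]* consumes the leftover ',383,550) so it is dropped.
$pattern = '~(src=")(.*)(/info[^\']*)[^"]*~';

$result = preg_replace($pattern, '$1http://example.com$3', $cont);

echo $result, PHP_EOL;
// prints: src="http://example.com/info/2005/image.jpg"
```

The difference from the original attempt is that the third group keeps matching after `/info`, and the trailing `[^"]*` swallows everything up to the closing double quote.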

2 Answers

As an alternative solution, you might also capture the src="http://example.com part by matching the protocol in group 1, so you can use it in the replacement.

(src="https?://[^/]+)/[^']*'(/info[^']*)'[^"]*

Explanation

  • (src="https?://[^/]+)/ Capture group 1, match src="http, an optional s, ://, then any chars except / up to the first /
  • [^']*' Match any char except ', then match '
  • (/info[^']*) Capture group 2, match /info followed by any char except '
  • '[^"]* Match the ' followed by matching any char except "

$cont = 'src="http://example.com/javascript:gallery(\'/info/2005/image.jpg\',383,550)"';
$cont = preg_replace("~(src=\"https?://[^/]+)/[^']*'(/info[^']*)'[^\"]*~", '$1$2', $cont);
echo $cont;

Output

src="http://example.com/info/2005/image.jpg"
1 Comment

Thanks for giving me an alternative solution :) Finally, I understand a little more about how regex works
Use

$cont = preg_replace("/src=\"\K.*(\/info[^']*)'[^\"]*/", 'http://example.com$1', $cont);

Explanation

--------------------------------------------------------------------------------
  src=                     'src='
--------------------------------------------------------------------------------
  \"                       '"'
--------------------------------------------------------------------------------
  \K                       match reset operator
--------------------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    \/                       '/'
--------------------------------------------------------------------------------
    info                     'info'
--------------------------------------------------------------------------------
    [^']*                    any character except: ''' (0 or more
                             times (matching the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  '                        '\''
--------------------------------------------------------------------------------
  [^\"]*                   any character except: '\"' (0 or more
                           times (matching the most amount possible))
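To see what the `\K` reset actually changes, here is a small sketch that runs the same pattern through `preg_match` first and then through `preg_replace`. The variable names `$pattern`, `$m` and `$result` are just for illustration, and the host `http://example.com` is hard-coded in the replacement exactly as in the answer.

```php
<?php
$cont = 'src="http://example.com/javascript:gallery(\'/info/2005/image.jpg\',383,550)"';

// Same pattern as above, in a single-quoted PHP string with ~ as delimiter.
$pattern = '~src="\K.*(/info[^\']*)\'[^"]*~';

// \K discards src=" from the reported match, so $m[0] starts after it.
preg_match($pattern, $cont, $m);
echo $m[0], PHP_EOL;
// prints: http://example.com/javascript:gallery('/info/2005/image.jpg',383,550)
echo $m[1], PHP_EOL;
// prints: /info/2005/image.jpg

// Only the reported match is replaced, so src=" and the closing " survive.
$result = preg_replace($pattern, 'http://example.com$1', $cont);
echo $result, PHP_EOL;
// prints: src="http://example.com/info/2005/image.jpg"
```

Because `src="` is matched but not part of the reported match, it does not need to be captured and re-inserted in the replacement. Note that `.*` is greedy, so on a line with several `src` attributes it would run to the last `/info`.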

2 Comments

It works also, thanks for this precise explanation :)
@PiciuU Glad to hear, please kindly accept the answer by clicking on the left.
