1

I have a string that contains spaces inside its HTML tags:

$mystr = "&lt; h3&gt; hello mom ?&lt; / h3&gt;";

So I wrote a regular expression to detect those spaces:

$pattern = '/(?<=&lt;)\s\w+|\s\/\s\w+|\s\/(?=&gt;)/mi';

Next, I want to remove the spaces from each match and put the result back into the string. Any idea how this can be done, so that my string ends up as "&lt;h3&gt; hello mom ?&lt;/h3&gt;"?

I know there is the PHP function preg_replace, but I am not sure how to modify the matches:

$result = preg_replace( $pattern, $replace , $mystr );

  • 1
    The question on everybody's lips is of course: How did you end up with a string like that? The idea is that it is always better to prevent these anomalies in the first place. Commented Mar 17, 2022 at 19:23
  • 1
    is not equal to &. Try preg_replace_callback('/&lt;(?:\s*\/)?\s*\w+\s*&gt;/ui', function($m) { return preg_replace('/\s+/u', '', $m[0]); }, $mystr). Commented Mar 17, 2022 at 19:51
  • 1
    To expand upon and analogize KIKO's response: Every time you paper over an issue like this with a kludge instead of fixing the underlying issue, you add another layer onto the house of cards that is your application, and the more trouble you're going to have if there's ever a breeze. Commented Mar 17, 2022 at 20:45
  • 1
    @Wiktor Stribiżew's answer is very good for removing ALL spaces between the brackets. But since these are HTML tags, and you need spaces between the tag name and each of the attributes, I wonder if the OP really only wants to remove LEADING spaces, in which case preg_replace('/&lt;\s+/ui','&lt;',$mystr) would do the job. Commented Mar 17, 2022 at 20:49
  • 1
    @ChrisMaurer My regex, '/&lt;(?:\s*\/)?\s*\w+\s*&gt;/ui', only deals with tags that have no attributes, like the example in the question. Commented Mar 17, 2022 at 20:51

2 Answers

1

For tags like the ones you showed, you can use

$result = preg_replace_callback('/&lt;(?:\s*\/)?\s*\w+\s*&gt;/ui', function($m) {
    // $m[0] is the whole matched tag; strip every whitespace run from it
    return preg_replace('/\s+/u', '', $m[0]);
}, $mystr);

The regex - note the u flag to deal with Unicode chars in the string - matches

  • &lt; - a literal string
  • (?:\s*\/)? - an optional sequence of zero or more whitespaces and a / char
  • \s* - zero or more whitespaces
  • \w+ - one or more word chars
  • \s* - zero or more whitespaces
  • &gt; - a literal string.

The preg_replace('/\s+/u', '', $m[0]) line in the anonymous callback function removes every run of whitespace from the matched tag (including non-breaking spaces).
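
Putting the pieces together, here is a minimal runnable sketch applied to the string from the question. The function name strip_tag_spaces is hypothetical and only used for illustration; the pattern is the one explained above, and the `??` fallbacks are an added assumption to cope with preg_* returning null on error.

```php
<?php
// Hypothetical helper: removes whitespace inside entity-encoded tags
// that have no attributes, such as "&lt; h3&gt;" or "&lt; / h3&gt;".
function strip_tag_spaces(string $html): string
{
    $result = preg_replace_callback(
        '/&lt;(?:\s*\/)?\s*\w+\s*&gt;/ui',
        // $m[0] holds the full matched tag text
        static function (array $m): string {
            // Fall back to the unmodified match if preg_replace fails
            return preg_replace('/\s+/u', '', $m[0]) ?? $m[0];
        },
        $html
    );

    // preg_replace_callback returns null on error; keep the input then
    return $result ?? $html;
}

$mystr = "&lt; h3&gt; hello mom ?&lt; / h3&gt;";
echo strip_tag_spaces($mystr), PHP_EOL;
// prints: &lt;h3&gt; hello mom ?&lt;/h3&gt;
```

Only the text inside the matched tags is touched, so the spaces in " hello mom ?" are preserved.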


1

You could keep it simple and do:

$output = str_replace(['&lt; / ', '&lt; ', '&gt; '],
                      ['&lt;/',   '&lt;',  '&gt;'], $input);
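
As a rough sketch of how this behaves: str_replace applies the search/replace pairs in order across the whole string, so a '&gt; ' pair (space after the entity) would also remove the space that follows an opening tag. The variation below is an assumption, not the answer's exact code: it uses ' &gt;' (space before the entity) so that the text after the tag is left alone, and the trailing space in the sample input was added to exercise that pair.

```php
<?php
// Sample input based on the question, with an extra space before the
// final "&gt;" (added here for illustration).
$input = "&lt; h3&gt; hello mom ?&lt; / h3 &gt;";

// Each pair is applied in turn to the result of the previous one.
$output = str_replace(
    ['&lt; / ', '&lt; ', ' &gt;'],
    ['&lt;/',   '&lt;',  '&gt;'],
    $input
);

echo $output, PHP_EOL;
// prints: &lt;h3&gt; hello mom ?&lt;/h3&gt;
```

This only handles single ordinary spaces in exactly these positions; runs of several spaces or non-breaking spaces would need the regex approach.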
