I have a place where I have to include an HTML template. The HTML is written by employees only, but I don't want to be an idiot and include it without escaping or checks :)
It should allow plain HTML tags only, without any attributes.
So no <a href... links, no <div style=... divs, or anything like that.
My test script:
$string = '
<p>
<strong>Foo</strong>
</p>
<p>Bar</p>
<p>
<strong>Baz</strong> Mmmpf
</p>
<ul>
<li>something</li>
<li>something more</li>
<li>even more</li>
</ul>
<p>
<strong>Foo</strong>
</p>
<i>Foo <u>Bar</u></i>Baz
<!-- xss -->
<script>alert(1)</script>
<p onmouseover="alert(1)"></p>
<!-- ... -->
';
$htmlWhitelist = [
'u',
'i',
'p',
'strong',
'ul',
'li',
];
// replace allowed tags with placeholders
// that are not changed by htmlspecialchars()
foreach ($htmlWhitelist as $tag) {
    $string = str_replace(
        ["<{$tag}>", "</{$tag}>"],
        ["{OPEN}{$tag}{OPEN}", "{CLOSE}{$tag}{CLOSE}"],
        $string
    );
}
// htmlspecialchars() on everything
$string = htmlspecialchars($string);
// put back the allowed tags
foreach ($htmlWhitelist as $tag) {
    $string = str_replace(
        ["{OPEN}{$tag}{OPEN}", "{CLOSE}{$tag}{CLOSE}"],
        ["<{$tag}>", "</{$tag}>"],
        $string
    );
}
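For reference, this is roughly how the script looks once wrapped into a function. It is only a sketch: the name cleanHtmlTagsOnly() is the one I call from the template, the default whitelist parameter is an assumption for illustration, and passing ENT_QUOTES and 'UTF-8' explicitly is my choice here (the script calls htmlspecialchars() with its defaults, which only include ENT_QUOTES since PHP 8.1).

```php
<?php

/**
 * Escape everything except bare, attribute-free tags from the whitelist.
 * Sketch of the placeholder approach; parameter defaults are assumptions.
 */
function cleanHtmlTagsOnly(
    string $string,
    array $htmlWhitelist = ['u', 'i', 'p', 'strong', 'ul', 'li']
): string {
    // Swap exact "<tag>" / "</tag>" matches for placeholders
    // that htmlspecialchars() leaves untouched.
    foreach ($htmlWhitelist as $tag) {
        $string = str_replace(
            ["<{$tag}>", "</{$tag}>"],
            ["{OPEN}{$tag}{OPEN}", "{CLOSE}{$tag}{CLOSE}"],
            $string
        );
    }

    // Escape everything that is left, including tags with attributes.
    $string = htmlspecialchars($string, ENT_QUOTES, 'UTF-8');

    // Put the allowed tags back.
    foreach ($htmlWhitelist as $tag) {
        $string = str_replace(
            ["{OPEN}{$tag}{OPEN}", "{CLOSE}{$tag}{CLOSE}"],
            ["<{$tag}>", "</{$tag}>"],
            $string
        );
    }

    return $string;
}

echo cleanHtmlTagsOnly('<p onmouseover="alert(1)"></p>'), PHP_EOL;
```

Two things I noticed while trying it: the opening tag with the attribute gets escaped but its closing </p> is restored, so the output can contain unbalanced tags, and any input that literally contains a placeholder such as {OPEN}p{OPEN} comes out as a real <p>.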
I cannot imagine anything that could go wrong with this, but I would like to ask you guys if I missed something.
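One variant I am also considering, to get rid of the placeholders: escape everything first, then turn back only the exact escaped form of a whitelisted tag with a regular expression. This is a different technique from the script, shown only as a sketch; the function name cleanHtmlTagsOnlyRegex() is hypothetical, and it assumes PHP 7.4+ for the arrow function.

```php
<?php

/**
 * Escape first, then restore only "&lt;tag&gt;" / "&lt;/tag&gt;" for whitelisted tags.
 * Hypothetical helper for illustration.
 */
function cleanHtmlTagsOnlyRegex(
    string $string,
    array $htmlWhitelist = ['u', 'i', 'p', 'strong', 'ul', 'li']
): string {
    // Escape the whole input, so nothing is trusted at this point.
    $escaped = htmlspecialchars($string, ENT_QUOTES, 'UTF-8');

    // Build an alternation such as "u|i|p|strong|ul|li".
    $names = implode('|', array_map(
        static fn (string $tag): string => preg_quote($tag, '/'),
        $htmlWhitelist
    ));

    // Only an escaped tag with nothing but the tag name inside is restored.
    $result = preg_replace('/&lt;(\/?)(' . $names . ')&gt;/', '<$1$2>', $escaped);

    // preg_replace() returns null on error; fall back to the fully escaped text.
    return $result ?? $escaped;
}

echo cleanHtmlTagsOnlyRegex('<ul><li>one</li></ul><script>alert(1)</script>'), PHP_EOL;
```

With this variant a literal {OPEN}p{OPEN} in the input stays plain text, and text that was already escaped (such as &lt;p&gt;) is escaped once more instead of being turned into a tag.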
$string? I'd like to see how the output is used
$string is really retrieved? I'd like to see how you get the user input
<?php echo cleanHtmlTagsOnly($string); ?> to write it into the output, where cleanHtmlTagsOnly() is the code in the question.