<?php
$search = array("cencor", "cencors");
$change = array("prohibited", "***");
$text = ''; // default so the textarea renders before the first submit
if (isset($_POST['submit']) && !empty($_POST['text'])) {
    $text = $_POST['text'];
    $text = str_replace($search, $change, $text);
}
?>
<form action="index.php" method="post">
    <textarea name="text"><?php echo htmlspecialchars($text); ?></textarea>
    <input type="submit" name="submit">
</form>

Hello, I am using this code. The problem is that when I submit

Don't use cencor and cencors

It returns

Don't use prohibited and prohibiteds

I want to get a result like

Don't use prohibited and ***

How can I do that? Thank you!

1 Answer

There's something to be wary of here: some words are substrings of others.

I remember reading an article on the subject a few years ago, titled something like "The Clbuttic Mistake". It went into detail about how a poorly implemented profanity filter can create as many problems as it solves, and it showed the pitfalls of the same method you are using in your question.

The replacement they focused on was naturally $content = str_replace('ass', 'butt', $content);. This yielded some amusing transformations, such as "assassinate" becoming "buttbuttinate".

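This is exactly the issue you are seeing, because "cencor" is a substring of "cencors". str_replace applies the search/replace pairs in order, so the first pair has already turned "cencors" into "prohibiteds" by the time the second pair is tried, and there is no "cencors" left to match. Listing both words doesn't help.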
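
To make that ordering visible, here is a minimal sketch that runs the two replacements one at a time and then together. The variable names ($step1, $step2, $combined, $reordered) are only for illustration. The last line shows that listing the longer word first happens to fix this particular input, though it still does nothing about banned words embedded in unrelated words.

```php
<?php
$text = "Don't use cencor and cencors";

// First pair alone: "cencor" is also found inside "cencors".
$step1 = str_replace("cencor", "prohibited", $text);

// Second pair on the result: no "cencors" is left, so nothing changes.
$step2 = str_replace("cencors", "***", $step1);

// The array form behaves like the two calls above, applied in order.
$combined = str_replace(array("cencor", "cencors"), array("prohibited", "***"), $text);

// Longer word first: works for this input, but is still substring matching.
$reordered = str_replace(array("cencors", "cencor"), array("***", "prohibited"), $text);

echo $step1, "\n", $step2, "\n", $combined, "\n", $reordered, "\n";
```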

The solution is to be much more targeted in the way you do your replacements, by making sure that you only match whole words. You can do this with a regex-based approach:

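<?php

// Map each banned word to its replacement.
$replacements = array(
    "cencor" => "prohibited",
    "cencors" => "***"
);

$text = "Don't use cencor, and cencors";

foreach ($replacements as $search => $change) {
    // (?<!\w) and (?!\w) only match when the word is not glued to another
    // word character. As lookarounds they do not consume the neighbours,
    // so two banned words separated by a single space are both replaced.
    $pattern = "~(?<!\w)" . preg_quote($search, '~') . "(?!\w)~ism";
    $text = preg_replace($pattern, $change, $text);
}

echo $text; // Don't use prohibited, and ***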
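
A loop like the one above rescans the text once per word, so a replacement string that itself contains a banned word could be rewritten by a later pass. One way around that, sketched here under my own assumptions, is to build a single pattern and look up each match in a callback. The function name censor_words is hypothetical, and \b only behaves as a word boundary if every banned word starts and ends with a letter, digit, or underscore.

```php
<?php
// Hypothetical helper, not part of the original answer.
function censor_words(string $text, array $replacements): string
{
    // Lowercase the keys so the callback lookup is case-insensitive.
    $map = array_change_key_case($replacements, CASE_LOWER);

    // Put longer words first so they are tried before their prefixes.
    $words = array_keys($map);
    usort($words, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    // Escape each word and join them into one alternation.
    $quoted = array_map(function ($word) {
        return preg_quote($word, '~');
    }, $words);
    $pattern = '~\b(?:' . implode('|', $quoted) . ')\b~i';

    // Each match is replaced exactly once; the output is never rescanned.
    return preg_replace_callback($pattern, function ($match) use ($map) {
        return $map[strtolower($match[0])];
    }, $text);
}

$rules = array("cencor" => "prohibited", "cencors" => "***");
echo censor_words("Don't use cencor, and cencors", $rules), "\n";
```

Words that merely contain a banned word, such as "cencorship", are left alone because neither alternative ends at a word boundary there.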

2 Comments

Is it possible to achieve that with str_ireplace?
str_ireplace suffers from the same pitfall as str_replace: you can't target whole words only. Seeing your comment on one of the other answers, I made sure my answer was case insensitive though. See the ~ism at the end of the first parameter of preg_replace? The ~ is my delimiter of choice, and each of i, s, and m is a flag. The i means "case insensitive".
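
For reference, a small sketch of what each of the three PCRE flags changes, using throwaway patterns and variable names of my own rather than the pattern from the answer:

```php
<?php
// i: letters match regardless of case.
$withoutI = preg_match('~cencor~', 'CENCOR');
$withI    = preg_match('~cencor~i', 'CENCOR');

// m: ^ and $ also match at the start and end of every line.
$lines    = "cencor\ncencor";
$withoutM = preg_match_all('~^cencor$~', $lines);
$withM    = preg_match_all('~^cencor$~m', $lines);

// s: the dot also matches a newline character.
$withoutS = preg_match('~a.b~', "a\nb");
$withS    = preg_match('~a.b~s', "a\nb");

var_dump($withoutI, $withI, $withoutM, $withM, $withoutS, $withS);
// int(0) int(1) int(0) int(2) int(0) int(1)
```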
