7

First, note that what I need is the reverse of PHP's in_array function.

I need to search a string for every item of an array: if any of them is found, the function should return true; otherwise it should return false.

I need the fastest solution to this problem. Of course, this can be done by iterating over the array and calling strpos for each item.

Any suggestions are welcome.

Example Data:

$string = 'Alice goes to school every day';

$searchWords = array('basket','school','tree');

returns true

$string = 'Alice goes to school every day';

$searchWords = array('basket','cat','tree');

returns false
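
For reference, the strpos baseline mentioned above might look like the following minimal sketch. The function name containsAny is an assumption made for illustration; it is not part of the original question. The explicit comparison with false matters because strpos returns 0 for a match at the very start of the string.

```php
<?php
// Hypothetical helper: returns true as soon as any search word occurs in $string.
function containsAny($string, array $searchWords)
{
    foreach ($searchWords as $word) {
        // Skip empty needles; their handling differs between PHP versions.
        // Compare with false explicitly, since a match at offset 0 is falsy.
        if ($word !== '' && strpos($string, $word) !== false) {
            return true;
        }
    }
    return false;
}

var_dump(containsAny('Alice goes to school every day', array('basket', 'school', 'tree'))); // bool(true)
var_dump(containsAny('Alice goes to school every day', array('basket', 'cat', 'tree')));    // bool(false)
```

This matches substrings, not whole words, so 'cat' would also be found inside 'category'.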

4
  • Well, I don't think you are getting any faster than strpos(). Commented Jun 3, 2011 at 14:51
  • I disagree with you, @Erisco. Regular expressions will do it and are faster; I just don't know much about them. Commented Jun 3, 2011 at 14:55
  • Didn't check the answer by malko before I posted the initial comment. Commented Jun 3, 2011 at 14:56
  • @afaolek, I believe it is largely going to depend on the number of search words. For small numbers, I doubt regexp is going to win, unless the string being searched becomes very large and the number of search words is greater than one. Commented Jun 3, 2011 at 15:18

8 Answers

11

You should try with a preg_match:

if (preg_match('/' . implode('|', $searchWords) . '/', $string)) return true;

After some comments, here is a properly escaped solution:

function contains($string, array $search, $caseInsensitive = false) {
    // Quote each word separately, escaping the '/' delimiter as well.
    $quoted = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $search);
    $exp = '/' . implode('|', $quoted) . ($caseInsensitive ? '/i' : '/');
    return preg_match($exp, $string) === 1;
}

9 Comments

$searchWords should be properly escaped
I'm not going to downvote you, as it is a valid answer, but I do think that preg_match is pointless when you have functions like strstr and stristr.
@binaryLV: yes, this is just a quick idea to point in the right direction, and it should work with the sample code in the question, but a more reliable solution must properly escape the words. @RobertPitt: strstr or stristr wouldn't test against multiple strings in a single pass, or did I miss something? If we end up using it in a loop, I think preg_match would be more efficient, no?
Looping through values (which may mean looping through a whole lot of values) can be very inefficient. Using a regular expression may well be faster than that. But, as always, only profiling will reveal the faster method.
@RobertPitt So you're saying using foreach and if is always faster than matching with a regular expression? I'd love to see some benchmarks on that really, because I don't see that happening.
3
function searchWords($string,$words)
{
    foreach($words as $word)
    {
        if(stristr($string," " . $word . " ")) //spaces either side to force a word
        {
            return true;
        }
    }
    return false;
}

Usage:

$string = 'Alice goes to school every day';
$searchWords = array('basket','cat','tree');

if(searchWords($string,$searchWords))
{
     //matches
}

Also note that stristr is used so that the search is case-insensitive.

5 Comments

You might also specify that $words must be an array, i.e., function searchWords($string, array $words) { /* ... */ }
Nope, that is only supported in very new releases of PHP; that would just confuse people, in my opinion.
Array type hinting was introduced in PHP 5.1.0, which was released on 24-Nov-2005. I would not call it a "very new release" in 2011.
Hmm, that's a good point. I must have been confused with something else along those lines. My apologies.
This does not work if the match is the word on either the beginning or end of the "sentence" or string. Any ideas how to resolve that?
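
One possible way to address the comment above is to pad the string itself with a space on each side before searching, so that the first and last words are also surrounded by spaces. This is only a sketch: the name searchWordsPadded is an assumption, and it still treats "day." (with trailing punctuation) as a different word from "day".

```php
<?php
// Hypothetical variant of searchWords(): pads the haystack so that words at
// the very beginning or end of the string are matched too.
function searchWordsPadded($string, array $words)
{
    $padded = ' ' . $string . ' ';
    foreach ($words as $word) {
        // stripos() is case-insensitive; compare with false explicitly
        // because a match at offset 0 would otherwise be treated as falsy.
        if (stripos($padded, ' ' . $word . ' ') !== false) {
            return true;
        }
    }
    return false;
}

var_dump(searchWordsPadded('Alice goes to school every day', array('alice'))); // bool(true)
var_dump(searchWordsPadded('Alice goes to school every day', array('day')));   // bool(true)
var_dump(searchWordsPadded('Alice goes to school every day', array('go')));    // bool(false)
```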
3

As per malko's example, but with the values properly escaped.

function contains( $string, array $search ) {
    return 0 !== preg_match( 
        '/' . implode( '|', preg_quote( $search, '/' ) ) . '/', 
        $string 
    );
}

5 Comments

isn't that what @malko just posted :/
@Robert yes, but this one properly escapes the searched strings and is more reliable. If we want to make a "perfect" solution, we can add a third parameter $caseInsensitive = false that appends an 'i' to the end of the regexp to allow case-insensitive searching, i.e.: '/' . implode( '|', preg_quote( $search, '/' ) ) . '/' . ($caseInsensitive ? 'i' : '')
In fact this won't work, because preg_quote doesn't accept an array as a parameter. See my post; I've edited your solution a little bit.
Ah yes, it should be implode( '|', array_map(function($e){return preg_quote( $e, '/' );}, $search))
Fatal error: Uncaught TypeError: preg_quote(): Argument #1 ($str) must be of type string, array given
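
Putting the comments above together, a corrected version could look like the sketch below. It quotes each word individually through array_map (closures need PHP 5.3 or later) and adds the optional case-insensitive flag that was suggested. The name containsQuoted and the empty-array guard are assumptions added for illustration.

```php
<?php
// Hypothetical corrected variant: preg_quote() is applied per word, with '/'
// passed as the delimiter so that slashes inside words are escaped as well.
function containsQuoted($string, array $search, $caseInsensitive = false)
{
    if (count($search) === 0) {
        return false; // an empty alternation would otherwise match every string
    }
    $quoted = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $search);
    $pattern = '/' . implode('|', $quoted) . '/' . ($caseInsensitive ? 'i' : '');
    return preg_match($pattern, $string) === 1;
}

var_dump(containsQuoted('Alice goes to school every day', array('basket', 'school', 'tree'))); // bool(true)
var_dump(containsQuoted('the price is 3x50', array('3.50')));                                  // bool(false)
```

The second call shows why quoting matters: without it, the dot in '3.50' would act as a wildcard and match '3x50'.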
3

If the string can be split on spaces, the following will work:

var_dump(array_intersect(explode(' ', $str), $searchWords) != null);

Output for the two examples you've provided:

bool(true)
bool(false)

Update:

If the string cannot be split on the space character, use code like this to split it at every word boundary instead:

var_dump(array_intersect(preg_split('~\b~', $str), $searchWords) != null);
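
As a rough sketch, the same idea can be wrapped in a function. Note that this variant swaps in a different split pattern: it splits on runs of non-word characters (\W+) with PREG_SPLIT_NO_EMPTY instead of \b, so the pieces contain only words and no whitespace or empty strings. The name containsAnyWord is an assumption, and without the u modifier \W only treats ASCII letters, digits and underscore as word characters.

```php
<?php
// Hypothetical wrapper around the array_intersect idea.
function containsAnyWord($str, array $searchWords)
{
    // Split on runs of non-word characters and drop empty pieces.
    $words = preg_split('~\W+~', $str, -1, PREG_SPLIT_NO_EMPTY);
    return count(array_intersect($words, $searchWords)) > 0;
}

var_dump(containsAnyWord('Alice goes to school, every day.', array('basket', 'school', 'tree'))); // bool(true)
var_dump(containsAnyWord('Alice goes to school, every day.', array('basket', 'cat', 'tree')));    // bool(false)
```

Unlike the strpos and preg_match approaches, this matches whole words only and is case-sensitive.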

2 Comments

Sometimes it's not possible to explode using space in my case, but thanks, this is a cool method.
@WebolizeR In that case you can use var_dump(array_intersect(preg_split('~\b~', $str), $searchWords) != null); that will split original string by any end of word character not just space.
1

There is always debate over what is faster so I thought I'd run some tests using different methods.

Tests Run:

  1. strpos
  2. preg_match with foreach loop
  3. preg_match with regex or
  4. indexed search with string to explode
  5. indexed search as array (string already exploded)

Two sets of tests were run: one on a large text document (114,350 words) and one on a small text document (120 words). Within each set, every test was run 100 times and the average was taken. Tests did not ignore case; doing so would have made them all faster. The tests that search an index used a pre-built index. I wrote the indexing code myself, and I'm sure it is not the most efficient; indexing took 17.92 seconds for the large file and 0.001 seconds for the small file.

Terms searched for included: gazerbeam (NOT found in the document), legally (found in the document), and target (NOT found in the document).

Results in seconds to complete a single test, sorted by speed:

Large File:

  1. 0.0000455808639526 (index without explode)
  2. 0.0009979915618897 (preg_match using regex or)
  3. 0.0011657214164734 (strpos)
  4. 0.0023632574081421 (preg_match using foreach loop)
  5. 0.0051533532142639 (index with explode)

Small File

  1. 0.000003724098205566 (strpos)
  2. 0.000005958080291748 (preg_match using regex or)
  3. 0.000012607574462891 (preg_match using foreach loop)
  4. 0.000021204948425293 (index without explode)
  5. 0.000060625076293945 (index with explode)

Notice that strpos is faster than preg_match (using regex or) for small files, but slower for large files. Other factors, such as the number of search terms will of course affect this.

Algorithms Used:

//strpos
$str = file_get_contents('text.txt');
$t = microtime(true);
foreach ($search as $word) if (strpos($str, $word) !== false) break; // a match at offset 0 is falsy, so compare with false
$strpos += microtime(true) - $t;

//preg_match
$str = file_get_contents('text.txt');
$t = microtime(true);
foreach ($search as $word) if (preg_match('/' . preg_quote($word) . '/', $str)) break;
$pregmatch += microtime(true) - $t;

//preg_match (regex or)
$str = file_get_contents('text.txt');
$orstr = preg_quote(implode('|', $search));
$t = microtime(true);
if (preg_match('/' . $orstr . '/', $str)) {}
$pregmatchor += microtime(true) - $t;

//index with explode
$str = file_get_contents('textindex.txt');
$t = microtime(true);
$ar = explode(" ", $str);
foreach ($search as $word) {
    $start = 0; 
    $end = count($ar);
    do {
        $diff = $end - $start;
        $pos = floor($diff / 2) + $start;
        $temp = $ar[$pos];
        if ($word < $temp) {
            $end = $pos;
        } elseif ($word > $temp) {
            $start = $pos + 1;
        } elseif ($temp == $word) {
            $found = 'true';
            break;
        }
    } while ($diff > 0);
}
$indexwith += microtime(true) - $t;

//index without explode (already in array)
$str = file_get_contents('textindex.txt');
$found = 'false';
$ar = explode(" ", $str);
$t = microtime(true);
foreach ($search as $word) {
    $start = 0; 
    $end = count($ar);
    do {
        $diff = $end - $start;
        $pos = floor($diff / 2) + $start;
        $temp = $ar[$pos];
        if ($word < $temp) {
            $end = $pos;
        } elseif ($word > $temp) {
            $start = $pos + 1;
        } elseif ($temp == $word) {
            $found = 'true';
            break;
        }
    } while ($diff > 0);
}
$indexwithout += microtime(true) - $t;
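
The indexing code itself is not shown above. As a minimal sketch, assuming the index is simply a sorted list of the unique words in the text, it could be built and searched as follows. The helper names buildIndex and indexContains are hypothetical. The sketch compares with strcmp rather than the < and > operators, because PHP compares numeric-looking strings as numbers with those operators.

```php
<?php
// Hypothetical helper: builds a sorted, de-duplicated word index from a text.
function buildIndex($text)
{
    $words = array_unique(preg_split('~\s+~', $text, -1, PREG_SPLIT_NO_EMPTY));
    sort($words, SORT_STRING); // string ordering, consistent with strcmp() below
    return $words;             // sort() also re-indexes the keys from 0
}

// Hypothetical helper: binary search for $word in a sorted index.
function indexContains(array $index, $word)
{
    $low = 0;
    $high = count($index) - 1;
    while ($low <= $high) {
        $mid = (int) (($low + $high) / 2);
        $cmp = strcmp($word, $index[$mid]);
        if ($cmp === 0) {
            return true;
        }
        if ($cmp < 0) {
            $high = $mid - 1; // continue in the lower half
        } else {
            $low = $mid + 1;  // continue in the upper half
        }
    }
    return false;
}

$index = buildIndex('Alice goes to school every day');
var_dump(indexContains($index, 'school')); // bool(true)
var_dump(indexContains($index, 'cat'));    // bool(false)
```

Each lookup takes a number of comparisons logarithmic in the size of the index, which is consistent with the pre-indexed search being fastest on the large file once the cost of building the index is excluded.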

3 Comments

You forgot to pass $word through preg_quote(). You also did not test preg_match() without a foreach loop (by passing the list of quoted "words" as word1|word2|word3).
Great points. I've taken your suggestions and updated the results above.
$orstr = preg_quote(implode('|', $search)); does not look right. Each word should be quoted rather than whole pattern. Replace it with $orstr = implode('|', array_map('preg_quote', $search)); $orstr = str_replace('/', '\\/', $orstr);. Second operation is needed for escaping delimiters. It would also be worth doing some tests with different sets of words (e.g., with 2 words and with 5 words).
0

Try this:

$string = 'Alice goes to school every day';
$words = explode(" ", $string); // split() was deprecated in PHP 5.3 and removed in PHP 7
$searchWords = array('basket','school','tree');

for($x = 0,$l = count($words); $x < $l;) {
        if(in_array($words[$x++], $searchWords)) {
                //....
        }
}

0

The function below returns the number of words in the string that appear in the array. It stops at the first match unless $matches is true:

function inString($str, $arr, $matches = false)
{
    $words = explode(" ", $str);
    $c = 0;
    foreach ($words as $word) {
        if (in_array($word, $arr)) {
            $c++;
            if ($matches == false) {
                break; // only the first match is needed
            }
        }
    }
    return $c;
}

-1

The answer linked below should help; you only need to customize it as required.

Check if array element exists in string

customized:

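function result_arrayInString($prdterms, $value = 208) {
    // arrayInString() is the helper from the linked answer; it is not defined here.
    // $value is a placeholder name for the literal 208 used in the original snippet.
    return (bool) arrayInString($prdterms, $value);
}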

This may be helpful to you.
