28

Possible Duplicate:
What is the cost / complexity of a String.indexof() function call

What is the complexity of java indexof(String str) method. I mean there are String matching algorithms like KMP which runs in linear time. I am implementing a system that needs to search for large substring in a really large string,so can I use java indexof(String str) method or I should to implement KMP.

2
  • 1
    You should just use the convenient way unless you have hard data indicating that it won't cut it. No use in asking hypothetical questions with vague definitions (how long is "really large"? 15,000 Chars? 5 GB?) on the internet. Commented Oct 5, 2012 at 18:49
  • related: stackoverflow.com/questions/19543547/… Commented Nov 27, 2021 at 5:09

2 Answers 2

45

The complexity of Java's implementation of indexOf is O(m*n) where n and m are the length of the search string and pattern respectively.

What you can do to improve complexity is to use e.g., the Boyer-More algorithm to intelligently skip comparing logical parts of the string which cannot match the pattern.

Sign up to request clarification or add additional context in comments.

2 Comments

-1, your answer is wrong/misleading. There are two lengths to worry about -- the string you're searching in and the string you're searching for -- and the algorithm is O(mn).
O(m*n) is worst case, not typical. Typical is O(n). When you say O(m*n) you're totally ignoring the short-circuiting of matching pattern to current position, which will stop as soon a pattern doesn't match. That would rarely progress past the 3rd character of the pattern. indexOf() even has specific optimization to look for first character before even entering the pattern matching logic. Amortized, it's probably something like O(1.1 * n), so O(n).
26

java indexOf function complexity is O(n*m) where n is length of the text and m is a length of pattern
here is indexOf original code

   /**
     * Returns the index within this string of the first occurrence of the
     * specified substring. The integer returned is the smallest value
     * <i>k</i> such that:
     * <blockquote><pre>
     * this.startsWith(str, <i>k</i>)
     * </pre></blockquote>
     * is <code>true</code>.
     *
     * @param   str   any string.
     * @return  if the string argument occurs as a substring within this
     *          object, then the index of the first character of the first
     *          such substring is returned; if it does not occur as a
     *          substring, <code>-1</code> is returned.
     */
    public int indexOf(String str) {
    return indexOf(str, 0);
    }

    /**
     * Returns the index within this string of the first occurrence of the
     * specified substring, starting at the specified index.  The integer
     * returned is the smallest value <tt>k</tt> for which:
     * <blockquote><pre>
     *     k &gt;= Math.min(fromIndex, this.length()) && this.startsWith(str, k)
     * </pre></blockquote>
     * If no such value of <i>k</i> exists, then -1 is returned.
     *
     * @param   str         the substring for which to search.
     * @param   fromIndex   the index from which to start the search.
     * @return  the index within this string of the first occurrence of the
     *          specified substring, starting at the specified index.
     */
    public int indexOf(String str, int fromIndex) {
        return indexOf(value, offset, count,
                       str.value, str.offset, str.count, fromIndex);
    }

    /**
     * Code shared by String and StringBuffer to do searches. The
     * source is the character array being searched, and the target
     * is the string being searched for.
     *
     * @param   source       the characters being searched.
     * @param   sourceOffset offset of the source string.
     * @param   sourceCount  count of the source string.
     * @param   target       the characters being searched for.
     * @param   targetOffset offset of the target string.
     * @param   targetCount  count of the target string.
     * @param   fromIndex    the index to begin searching from.
     */
    static int indexOf(char[] source, int sourceOffset, int sourceCount,
                       char[] target, int targetOffset, int targetCount,
                       int fromIndex) {
    if (fromIndex >= sourceCount) {
            return (targetCount == 0 ? sourceCount : -1);
    }
        if (fromIndex < 0) {
            fromIndex = 0;
        }
    if (targetCount == 0) {
        return fromIndex;
    }

        char first  = target[targetOffset];
        int max = sourceOffset + (sourceCount - targetCount);

        for (int i = sourceOffset + fromIndex; i <= max; i++) {
            /* Look for first character. */
            if (source[i] != first) {
                while (++i <= max && source[i] != first);
            }

            /* Found first character, now look at the rest of v2 */
            if (i <= max) {
                int j = i + 1;
                int end = j + targetCount - 1;
                for (int k = targetOffset + 1; j < end && source[j] ==
                         target[k]; j++, k++);

                if (j == end) {
                    /* Found whole string. */
                    return i - sourceOffset;
                }
            }
        }
        return -1;
    }

you can simply implements the KMP algorithm without using of indexOf Like this

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.util.Scanner;


public class Main{
    int failure[];
    int i,j;
    BufferedReader in=new BufferedReader(new InputStreamReader(System.in));
    PrintWriter out=new PrintWriter(System.out);
    String pat="",str="";
    public Main(){
            try{

            int patLength=Integer.parseInt(in.readLine());
            pat=in.readLine();
            str=in.readLine();
            fillFailure(pat,patLength);
            match(str,pat,str.length(),patLength);
            out.println();
            failure=null;}catch(Exception e){}

        out.flush();
    }
    public void fillFailure(String pat,int patLen){
        failure=new int[patLen];
        failure[0]=-1;
        for(i=1;i<patLen;i++){
            j=failure[i-1];
            while(j>=0&&pat.charAt(j+1)!=pat.charAt(i))
                j=failure[j];
            if(pat.charAt(j+1)==pat.charAt(i))
                failure[i]=j+1;
            else
                failure[i]=-1;
        }
    }
    public void match(String str,String pat,int strLen,int patLen){
        i=0;
        j=0;
        while(i<strLen){
            if(str.charAt(i)==pat.charAt(j)){
                i++;
                j++;
                if(j==patLen){
                    out.println(i-j);
                    j=failure[j-1]+1;
                }
            } else if (j==0){
                    i++;
            }else{
                j=failure[j-1]+1;
            }

        }
    }
    public static void main(String[] args) {
        new Main();
    }
}

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.