0

I am writing some java code i wrote a method and for the test input it is taking morethan 5sec's to execute I really really wanna keep it less than 5sec can anyone suggest me how i can optimize my method

private static String getShortestSub(ArrayList<String> paraWordsList,
        ArrayList<Integer> paraWordsIndexes,
        ArrayList<Integer> lowFreqIndexes) {

    long d = System.currentTimeMillis();
    // Finding the substring
    int startTxtIndex = 0, endTxtIndex = 0;
    int tempLength = paraWordsList.size();
    for (int i = 0; i < lowFreqIndexes.size(); i++) 
    {
        int point = lowFreqIndexes.get(i), startIndex = 0;
        HashSet<String> frame = new HashSet<String>();


        // index is the indexes of paraWordsIndexes
        startIndex =paraWordsIndexes.indexOf(point);
        for (int index = paraWordsIndexes.indexOf(point); index >= 0; index--) 
        {
            if (frame.add(paraWordsList.get(paraWordsIndexes.get(index))))
            {
                startIndex = index;
                if (frame.size() == K
                        || (paraWordsIndexes.get(startIndex) - point) >= tempLength) 
                    index = -1;                 
            }
        }
        frame.clear();

        for (int start = startIndex, index = startIndex; start <= paraWordsIndexes
                .indexOf(point) && index < paraWordsIndexes.size(); index++) 
        {
            int tempStart = paraWordsIndexes.get(start), tempEnd = paraWordsIndexes.get(start);
            int currIndex = paraWordsIndexes.get(index);
            String word = paraWordsList.get(currIndex);
            if ((tempStart - point) >= tempLength)          break;
            if ((tempStart - currIndex) >= tempLength)      break;
                    frame.add(word);
            if (frame.size() == K) 
            {
                tempEnd = currIndex;
                int newLength;
                if ((newLength = tempEnd - tempStart) > 0)
                    if (tempLength > newLength) 
                    {
                        tempLength = newLength;
                        startTxtIndex = tempStart;
                        endTxtIndex = tempEnd;
                        if (K == (tempLength+1)) {
                            i = lowFreqIndexes.size();
                            break;
                        }
                    }
                frame.clear();
                tempStart = paraWordsList.size();
                start++;
                index = start - 1;
            }
        }
        frame.clear();
        System.out.println(System.currentTimeMillis() - d);
    }

    String[] result = paraText.split(" ");
    ArrayList<String> actualParaWordsList = new ArrayList<String>(
            Arrays.asList(result));

    return textCleanup(actualParaWordsList.subList(startTxtIndex,
            endTxtIndex + 1).toString());
}
9
  • Why do you want to keep it under 5 seconds? Commented Jul 25, 2013 at 5:23
  • 1
    have you tried a profiler? visualvm.java.net Commented Jul 25, 2013 at 5:23
  • Thats the requiremebt (< 5sec) Commented Jul 25, 2013 at 5:54
  • Where can i get profiler and whats it ? Commented Jul 25, 2013 at 5:55
  • Please describe what this method does (what arguments it takes and what is the expected output). Additionally please decribe (or post the code) for the textCleanup() call at the end. Commented Jul 25, 2013 at 6:14

3 Answers 3

2

As a first optimization you could remove redundant calls to indexOf()

During the outer loop point variable does not change so the first call of indexOf() is the only one that is actually required.

// index is the indexes of paraWordsIndexes
startIndex =paraWordsIndexes.indexOf(point);

Introduce instead a new variable that would store the result of indexOf() and would not change inside the loop

int pointLFIndex = paraWordsIndexes.indexOf(point); // new variable. should not change
startIndex = pointLFIndex;

Then change all occurences of indexOf(point) to the above variable.

// you don't need this. change to for (int index = pointLFIndex; ...);
for (int index = paraWordsIndexes.indexOf(point); index >= 0; index--)  

// use for (int start = ...; start <= pointLFIndex ...; index++) {
for (int start = ...; start <= paraWordsIndexes.indexOf(point) ...; index++) {

indexOf() searches your array list linearly. Especially the second occurence is executed on every loop iteration, so it would be a killer for large lists

If the above doesn't help, I don't understand why you don't edit your question to add a simple test case since so many people have asked you too (including myself).

A simple scenario like this:

Input text: "Some words are larger while some other words are smaller"

paraWordsList: contains the string split of the above text e.g. {"Some", "words", ...}

paraWordsIndexes: contains the indexes of blah blah e.g. {0, 3}

lowFreqIndexes: contains blah blah e.g. {0, 1}

Expected output: it should return {value} but not {other_value}

Sign up to request clarification or add additional context in comments.

Comments

1

Your code appears to be complex ( for - if - for) in this case the best way to optimize it is using a profiler for check where is the code that is taking more time in the execution process.

Since you don't specify your IDE, y will try to suggest some interesting tools:

https://profiler.netbeans.org/ http://www.eclipse.org/tptp/

Best regards

Comments

0

Well, it would help if you didn't have nested loops, and, it would also help if you could minimize the number of if statements within each loop (especially if you have nested loops).

Can you explain what you are trying to do? Your code isn't totally obvious, and maybe there is a way to do it completely different from your approach.

5 Comments

I cant explain you in text its really complex. All i can think of is a java optimization as algo cant be changed
Give it a try? If you can't explain it in text then I'm not sure you will be able to think of it outside of it's current structure - which only you fully understand. I don't need anything specific, just basics. Often optimizations come from changing the process of what you are doing, not little things like removing an if statement. I want to try to help you get rid of those nested loops, but I can't do that if I don't know what they do.
I have a Big words of a Paragraph in a list lets call it as paraWordsList. and then i have some unique words list lets say it as wordsList. paraWordsList contains words which are in wordsList and ALSO NOT in wordsList. My target is to find the start and end index of paraWordsList which contains all the words in wordsList and it should be short[means between the two indexes the number of words should be less than any other such occurances].
So what i did was first added all the indexes of the paraWordsList in a list which is paraWordsIndexes and then found the lowest freqency word so that i can try finding the substring around it . The method i gave you goes to each lowest freqency word and then start from its left to find that sub string. refer
I cant explain you in text its really complex. That is a bad sign. If you don't have a clear and simple idea of what the code should do, you are going to have trouble find the optimal code for it.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.