Consider the following Java code (complete, compiles and runs fine).
The code creates an array containing 5,000,000 integers (1 to 5 million), loops over it, and creates an ArrayList of the perfect squares it finds. Perfect squares are detected using a naive technique, and not bit manipulation, but that is not the focus of the issue at hand.
Mathematically, between 1 and 5,000,000 there are exactly 2236 perfect squares (the floor of √5,000,000 is 2236). So the ArrayList that the perfect squares are being put into will have a final size of 2236.
import java.util.ArrayList;

public class PerfSquares {
    public static ArrayList<Integer> perfectSquares(int[] arr) {
        ArrayList<Integer> al = new ArrayList<Integer>();
        // ArrayList<Integer> al = new ArrayList<Integer>(arr.length);
        for (int i = 0; i < arr.length; i++) {
            double root = Math.sqrt(arr[i]);
            int irt = (int) Math.floor(root);
            if (irt * irt == arr[i]) {
                al.add(arr[i]);
            }
        }
        return al;
    }

    public static void main(String[] args) {
        int[] arr = new int[5000000];
        for (int i = 0; i < arr.length; i++) {
            arr[i] = i + 1;
        }
        long s = System.currentTimeMillis();
        perfectSquares(arr);
        long e = System.currentTimeMillis();
        System.out.println(e - s);
    }
}
I'd like to focus on the declaration of the ArrayList. These two lines, one of which is commented out:
ArrayList<Integer> al = new ArrayList<Integer>();
//ArrayList<Integer> al = new ArrayList<Integer>(arr.length);
When I run with the first declaration (no initial capacity supplied), the time difference I see is:
~96 milliseconds
When I run with the second declaration (capacity of 5,000,000 supplied), the time difference increases to:
~105 milliseconds
Question:
Why is this behavior so? Shouldn't the second case (size supplied) be faster?
As per my understanding, in the first case, when we omit the capacity argument to the ArrayList constructor, behind the scenes a backing array of capacity 10 will be allocated (lazily, on the first add, in modern JDKs). And when this capacity is exceeded, a new array with a larger capacity (roughly 1.5x the old one in OpenJDK) will be allocated, and the previous elements will be copied over.
For 2236 elements and no initial size being specified, this "cap exceeded - allocate new - copy over - append more till cap" cycle should repeat many times, slowing it down.
Consequently, I was expecting the size supplied declaration to be faster - Since the allocation will happen once, and there will be no capacity exceeding / new array creation and copying over happening.
Or is this basically so because 2236 appends to an ArrayList, even with all the cap-exceeds-copy-over cycles, are still faster than allocating an ArrayList with a backing array of capacity 5,000,000?
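As a quick sanity check on the "repeat many times" intuition: assuming OpenJDK's growth rule (new capacity = old + old/2, starting from a default capacity of 10), the number of grow-and-copy cycles for 2236 appends can be counted directly. The class and method names below are mine, purely for illustration:

```java
// Sketch (assumption): OpenJDK's ArrayList grows its backing array to
// oldCapacity + (oldCapacity >> 1), i.e. roughly 1.5x, when it fills up,
// starting from a default capacity of 10. Count the resize cycles
// needed to accommodate a given number of appends.
public class GrowthCount {
    public static int resizesFor(int elements) {
        int capacity = 10;  // default initial capacity
        int resizes = 0;
        while (capacity < elements) {
            capacity = capacity + (capacity >> 1);  // ~1.5x growth
            resizes++;
        }
        return resizes;
    }

    public static void main(String[] args) {
        // Resize cycles needed for 2236 appends
        System.out.println(GrowthCount.resizesFor(2236));  // prints 14
    }
}
```

So under that assumption, 2236 appends trigger only about 14 reallocations, copying a few thousand references in total, rather than "many" cycles in any expensive sense.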
Instead of initializing the ArrayList with the full array size, try to use (int) Math.sqrt(arr.length), which is approximately the number of perfect squares. Add a few if you want to be sure.
As for LinkedList: even appending and iterating are much slower (by a factor of 2, plus the GC overhead). There's only one operation where it's good: removal at the Iterator's position. That's close to nothing. While there are good use cases for LinkedList, they're much rarer than most people expect.
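That sizing suggestion could look like the sketch below (the class name is mine; the only change from the original code is the constructor argument). The number of perfect squares in 1..n is exactly ⌊√n⌋, so this gives a tight initial capacity with no resizes:

```java
import java.util.ArrayList;

// Sketch: size the result list to ~sqrt(n), the exact count of
// perfect squares in 1..n, instead of the full input length.
public class SqrtSized {
    public static ArrayList<Integer> perfectSquares(int[] arr) {
        // floor(sqrt(n)) squares exist in 1..n; +1 as a safety margin
        ArrayList<Integer> al =
                new ArrayList<Integer>((int) Math.sqrt(arr.length) + 1);
        for (int v : arr) {
            int irt = (int) Math.floor(Math.sqrt(v));
            if (irt * irt == v) {
                al.add(v);
            }
        }
        return al;
    }

    public static void main(String[] args) {
        int[] arr = new int[5000000];
        for (int i = 0; i < arr.length; i++) {
            arr[i] = i + 1;
        }
        System.out.println(SqrtSized.perfectSquares(arr).size());  // prints 2236
    }
}
```

This avoids both the resize cycles of the default constructor and the 5,000,000-slot allocation of the full-size constructor.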