Monday, June 20, 2011

Array iteration optimization in Java

The problem
I am working on a project right now which involves lots of loops that look like:

for( int x = 0; x < size; x++ ) {
    for( int y = 0; y < size; y++ ) {
        for( int z = 0; z < height; z++ ) {
            if( somearray[x][y][z] == somevalue) {
                // do something
            }
        }
    }
}

Writing out these loops by hand got tedious, error-prone, high-maintenance, and makes the code longer.
If I was writing in C++, I could macro it out.
In Java, for reasons I tend to agree with, there are no macros, not an option. Same deal in C#, as far as I know.
Idea One: anonymous classes
My first idea was to write a class which would use an anonymous class to call back into my code. I figured one could use it like this:
new Iterator(startVector3i, endVector3i).iterate( new Callback(){
   public void callback(Vector3i next ){
      // do stuff here
   }
));

Technically this is possible, except the code inside has no access to our local variables in our calling method, which makes it not terribly useful I feel.

Discovery: using an iterator class surprisingly slow

Next, I made an iterator class that worked like this:
ArrayIterator iterator = new ArrayIterator(startVector3i,endVector3i);
while( iterator.next() ){
    if( somearray[iterator.x][iterator.y][iterator.z] == somevalue ) {
        // do something
    }
}

This took a long time to execute. Compared to the bog-standard nested for-loops we started with, it took 20 times longer to execute!

Why?

Deduction: java optimizes processor cache requests in nested for loops

It baffled me why this approach was so slow. Surely a method call is not so expensive?

I tried all sorts of different performance tests, and in the end found the following very specific code sample:

for( int x = 0; x < size; x++ ) {
    for( int y = 0; y < size; y++ ) {
        for( int z = 0; z < height; z++ ) {
            if( somearray[x][y][z] == somevalue) {
                // do something

This runs quickly.

for( int x = 0; x < size; x++ ) {
    for( int y = 0; y < size; y++ ) {
        for( int z = 0; z < height; z++ ) {
            if( wrapperObject.getArrayValueAt(x,y,z) == somevalue) {
                // do something
This runs 20 times slower, even though normal simple method calls were not particularly expensive I found.
My conclusion is that when we iterate over an array with an obvious in-lined for loop, java/jvm/processor has enough information available to realize it can optimize by fetching a batch of values from the array all at once.
When the array access is not directly inside the nested for loop, this doesn’t work.
So, my conclusion is that it seems any iterative access to large arrays needs to be explicitly inlined with the iterating loop in the code.
Edit: found a relevant sun wiki page that I think explains this effect
http://wikis.sun.com/display/HotSpotInternals/RangeCheckElimination
Basically, a bunch of range checks are carried out on array access, and when the access is done in a for loop, many of these checks are eliminated, where the loop body is inlined.

No comments:

Post a Comment

Chitika