Saturday, June 25, 2011

A catalyst method in Java

Imagine that you have some code that invokes a method; let’s call it ‘A’. Is it possible that by adding another method (B) to the code I can redirect that invocation to execute B instead of A? Sure! Just override or hide method A with B and you’re done. Easy stuff…

Let’s do something more difficult… Is it possible that by adding method B the previous invocation of A will execute a totally different method ‘C’ and NEVER execute B? Can I add a catalyst method to the code that is never executed, but whose mere existence changes how the code executes? Can I?? Well, with a little bit of magic anything can be done, and the following code snippet proves it:

// Each public class lives in its own file; every file needs: import java.util.List;
public class BaseCounter {
    static <E> Object count(List<E> input) {
        System.out.println("Base");
        return input.size();
    }
}

public class CatalystCounter extends BaseCounter {
//    static <E> Integer count(List<E> input) {
//        System.out.println("CatalystCounter");
//        return input.size();
//    }
}

public class ComparableCounter extends CatalystCounter {
    static <E extends Comparable<? super E>> Integer count(List<E> input) {
        System.out.println("ComparableCounter");
        return input.size();
    }
}

Now in the main method:

public static void main(String[] args) {
    List<Integer> integers = new ArrayList<Integer>();
    ComparableCounter.count(integers);

    List<Number> numbers = new ArrayList<Number>();
    ComparableCounter.count(numbers);
}

In summary, the code above contains three implementations of a Counter class that counts the number of items in a given list. There is also a main method that uses the most derived implementation of the Counter on two empty lists. Here we do not really care how many objects are in a list, but which method gets executed, and since every method prints its name, the question is what ends up on the console. So… what will be printed out?

One might expect to see “ComparableCounter” printed twice. This makes sense, as we are invoking the method of this class, right? Well, not really… It is true for the first invocation, count(integers), but for the second one, count(numbers), the BaseCounter method is invoked instead. This is because (who would guess) the Number class does not implement Comparable! The bound E extends Comparable<? super E> cannot be satisfied with E = Number, so the compiler cannot choose ComparableCounter.count() and links to BaseCounter.count() instead. Everything still makes sense, right?
Now the magic begins! Cry havoc and uncomment the count() method of CatalystCounter!! After running the code again you will see that the output has changed: now ComparableCounter is used in both cases… To put your mind at ease, I assure you that Number still does not implement Comparable.
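
The claim about Integer and Number is easy to check for yourself. The snippet below is only a rough runtime check (the class name ComparableCheck and its helper are made up for this illustration); the compiler performs the real test against the generic bound, but the outcome for these two classes is the same.

```java
class ComparableCheck {
    /** Returns true if the given class implements (or extends) Comparable. */
    static boolean isComparable(Class<?> type) {
        return Comparable.class.isAssignableFrom(type);
    }

    public static void main(String[] args) {
        System.out.println("Integer: " + isComparable(Integer.class)); // Integer: true
        System.out.println("Number: " + isComparable(Number.class));   // Number: false
    }
}
```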

Before I explain what has happened, let’s make some comments about the code itself. You have probably noticed that even though we have three classes and make use of inheritance, the methods we invoke are static. Therefore there is no overriding of those methods; we are only hiding and overloading them. This should be the first sign of a problem with the code, as we are clearly hacking our way around something that should be done in a plain, simple OO way. The second thing to notice is the difference in the signatures of the three methods: count in BaseCounter and count in CatalystCounter differ only in the return type, while the difference between the methods in CatalystCounter and ComparableCounter is that the latter adds a requirement that the list’s element type be comparable.
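
To see why static methods do not take part in ordinary OO dispatch, here is a minimal sketch. The classes Base and Derived and their methods are hypothetical and not part of the counter code. A static call is bound to the declared type at compile time, while an instance call is dispatched on the runtime type.

```java
class HidingSketch {
    static class Base {
        static String staticName() { return "Base"; }    // static: can only be hidden
        String instanceName() { return "Base"; }          // instance: can be overridden
    }

    static class Derived extends Base {
        static String staticName() { return "Derived"; }  // hides Base.staticName()
        @Override
        String instanceName() { return "Derived"; }       // overrides Base.instanceName()
    }

    static String staticCallThroughBaseReference() {
        Base ref = new Derived();
        // Legal but discouraged: the call is bound to the declared type Base.
        return ref.staticName();
    }

    static String instanceCallThroughBaseReference() {
        Base ref = new Derived();
        // Dispatched on the runtime type Derived.
        return ref.instanceName();
    }

    public static void main(String[] args) {
        System.out.println(staticCallThroughBaseReference());   // prints Base
        System.out.println(instanceCallThroughBaseReference()); // prints Derived
    }
}
```
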
How does all of this add up to an explanation? The problem is caused by the way indirect references to static methods (made through a subclass name) are compiled. First the compiler finds the static method that best fits a given invocation. Then it emits bytecode containing a symbolic reference to that method, and this reference is resolved later by the JVM. The issue is in how that reference is built. Only the following information is used to identify a method: class name, method name, and the return and argument types. What is more, for static methods the class recorded is the class named in the invocation, not necessarily the one that declares the method. In essence that means the first invocation, count(integers), will be linked by the compiler to ComparableCounter.count(), and the bytecode will say: run a method called ‘count’ in class ‘ComparableCounter’ with argument ‘List’ and return type ‘Integer’. Notice that after compilation the information about generics is erased.
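
The erased description the JVM works with can be printed with a few lines of reflection. This is only a sketch: DescriptorSketch is a made-up class whose count() copies the shape of ComparableCounter.count(), and the descriptor string is rebuilt here with java.lang.invoke.MethodType rather than read from a real class file.

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;
import java.util.List;

class DescriptorSketch {
    // Illustrative copy of the generic signature used by ComparableCounter.count().
    static <E extends Comparable<? super E>> Integer count(List<E> input) {
        return input.size();
    }

    /** Builds a JVM method descriptor from the erased parameter and return types. */
    static String descriptorOf(Method method) {
        return MethodType.methodType(method.getReturnType(), method.getParameterTypes())
                .toMethodDescriptorString();
    }

    static String countDescriptor() {
        try {
            // Even the reflective lookup uses the erased parameter type List, not List<E>.
            return descriptorOf(DescriptorSketch.class.getDeclaredMethod("count", List.class));
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countDescriptor()); // (Ljava/util/List;)Ljava/lang/Integer;
    }
}
```

Neither the type variable E nor its Comparable bound appears in the descriptor. A method returning Object would differ only in the part after the closing parenthesis.
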
What will the bytecode for the second invocation, count(numbers), look like? It depends on whether the code in CatalystCounter is commented out or not. Let’s first assume that the code is not there. The compiler will choose to link to BaseCounter.count(), as Number does not implement Comparable, and the bytecode will say: run a method called ‘count’ in class ‘ComparableCounter’ (not BaseCounter, since in the source we clearly wrote ComparableCounter) with argument ‘List’ and return type ‘Object’. When the JVM executes that, it will search ComparableCounter and then its superclasses, and find only one method fitting this description: the one implemented in BaseCounter.
How does that go with the code uncommented? In that case the compiler will prefer to link to CatalystCounter.count() instead of BaseCounter.count(), and the generated bytecode will say: run a method called ‘count’ in class ‘ComparableCounter’ with argument ‘List’ and return type ‘Integer’. But wait… this is the same bytecode as for the first invocation! When the JVM sees it, it will start looking for a method fitting that description, beginning in ComparableCounter, and because the generic information is long gone at that point, it will find ComparableCounter.count() before it ever reaches CatalystCounter! Mystery solved!
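
The JVM side of this story can be imitated at run time. The sketch below assumes that MethodHandles.Lookup.findStatic resolves a (class, name, descriptor) triple the same way the invokestatic instruction does. The classes Base and Leaf are hypothetical stand-ins and deliberately non-generic, so only the lookup step is shown, not the compiler’s choice.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

class ResolutionSketch {
    static String lastCalled; // records which count() body actually ran

    static class Base {
        static Object count(List<?> input) { lastCalled = "Base"; return input.size(); }
    }

    static class Leaf extends Base {
        static Integer count(List<?> input) { lastCalled = "Leaf"; return input.size(); }
    }

    /** Resolves "count(List) returning returnType" starting at Leaf, calls it, and reports who ran. */
    static String resolveAndCall(Class<?> returnType) {
        try {
            MethodHandle handle = MethodHandles.lookup()
                    .findStatic(Leaf.class, "count", MethodType.methodType(returnType, List.class));
            List<String> input = new ArrayList<>();
            Object ignored = handle.invoke(input);
            return lastCalled;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        // Descriptor with return type Object: nothing matches in Leaf, so the search moves up to Base.
        System.out.println(resolveAndCall(Object.class));  // prints Base
        // Descriptor with return type Integer: Leaf itself matches and the search stops there.
        System.out.println(resolveAndCall(Integer.class)); // prints Leaf
    }
}
```

Both lookups name the same class and the same method name; only the return type in the descriptor decides where the search ends.
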
In summary: a combination of factors contributed to the magic of the catalyst method. Because of type erasure and the way static method invocations are compiled into bytecode, we ended up in a situation where the compiler wanted to link to one method and the JVM ‘wrongly’ resolved the reference to another. If you got lost in this explanation, do not worry; it takes a while to understand.

The lesson from this is never to try to solve a problem with ‘static inheritance’. Apart from a few really rare cases, it will cause you problems.
