Tuesday, September 14, 2010

Getting started with ASM

Byte-code manipulation is something I’ve been aware of but never seriously considered doing myself. I never felt the need to write my own aspect framework or even to stray far from the Java mainstream. So why am I looking at ASM? Blame Geek Night Dallas!



ThoughtWorks sponsors a Geek Night in their Dallas office on Wednesday nights. We work on open source code (well-established projects or personal pet projects, it doesn’t matter) and try to commit (or submit a patch) each week. One of my colleagues, Paul Hammant, introduced me to LambdaJ, and we paired for the evening trying to extend the framework to support additional ‘group’ functions. For example, LambdaJ supports summing a particular property of the elements of a collection:


int totalAge = sum(meAndMyFriends, on(Person.class).getAge());

But LambdaJ does not (currently) have an arithmetic mean function, so we tried to create an avg function that could be applied in the same way as sum.

Why? Well, it is kind of cool to be able to reference an arbitrary method on a class and use it in a closure-like manner. We could do the same thing in normal Java:

int totalAge = 0;
for (Person person : meAndMyFriends)
 totalAge += person.getAge();

Or even in Google Collections, given the appropriate infrastructure of Functions. We’d need one to reduce a Collection to a single value (e.g., Function<Collection<T>, Integer>), and if I were sufficiently clever I might even be able to pass in another function to perform the reduction. The LambdaJ solution is more concise (and the Google team has indicated they don’t want to add more functional programming to their Collections framework).
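To make the comparison concrete, here is a minimal sketch of a hand-rolled avg that takes a property-extracting function. Everything in it (AverageSketch, IntGetter, the tiny Person class, and returning 0.0 for an empty collection) is an assumption made for illustration; it is neither LambdaJ's nor Google Collections' API.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical names throughout; this is not LambdaJ's or Google Collections' API.
final class AverageSketch {

    /** Extracts an int property from an element (a stand-in for on(Person.class).getAge()). */
    interface IntGetter<T> {
        int get(T item);
    }

    /** Minimal Person with only the property used in the example. */
    static final class Person {
        private final int age;
        Person(int age) { this.age = age; }
        int getAge() { return age; }
    }

    /** Arithmetic mean of the extracted property; returns 0.0 for an empty collection (an assumption). */
    static <T> double avg(Collection<T> items, IntGetter<T> getter) {
        if (items.isEmpty()) {
            return 0.0;
        }
        long total = 0;
        for (T item : items) {
            total += getter.get(item);
        }
        return (double) total / items.size();
    }

    public static void main(String[] args) {
        List<Person> meAndMyFriends =
                Arrays.asList(new Person(30), new Person(40), new Person(35));
        // The anonymous class is the verbosity that LambdaJ's on(...) syntax avoids.
        double averageAge = avg(meAndMyFriends, new IntGetter<Person>() {
            @Override
            public int get(Person person) {
                return person.getAge();
            }
        });
        System.out.println(averageAge); // prints 35.0
    }
}
```

The anonymous inner class at the call site is where the extra ceremony comes from, which is why the one-line LambdaJ form is attractive.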


Upon further investigation Paul determined that there are some incompatibilities between LambdaJ and Google’s AppEngine (as well as some known performance issues with LambdaJ). Paul speculated that we might be able to do some post-compile processing to address these issues by creating custom classes for our desired functionality. At the next Geek Night another ThoughtWorker (Srini Raguraman) and I paired to determine if we could leverage ASM to do things similar to LambdaJ but with better performance.  Thus was born JProxyGen.

Vision

Ultimately we would like to leverage Annotations to cause specific classes to be generated after the compile step. These generated classes would be packaged in the jar files and would have well-known names to allow us to use them in our framework. Two possible uses we’d like to explore are NullObject and ProxyObject implementations.
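As a rough illustration of the "well-known names" idea, the sketch below looks up a companion class purely by naming convention. The "ProxyObject" suffix, the proxyFor helper, and the Counter classes are all hypothetical, and the proxy is written by hand here as a stand-in for something a post-compile step might generate.

```java
// Sketch only: the "ProxyObject" suffix and the lookup helper are hypothetical conventions.
final class WellKnownNameSketch {

    /** An ordinary class that a build step might generate a proxy for. */
    public static class Counter {
        private int count;
        public int increment() { return ++count; }
    }

    /** Hand-written stand-in for a generated proxy: delegates and counts the calls. */
    public static class CounterProxyObject extends Counter {
        private final Counter delegate;
        int calls;
        public CounterProxyObject(Counter delegate) { this.delegate = delegate; }
        @Override
        public int increment() {
            calls++;
            return delegate.increment();
        }
    }

    /** Finds the proxy class by naming convention and wraps the given delegate. */
    static <T> T proxyFor(Class<T> type, T delegate) {
        try {
            Class<?> proxyClass = Class.forName(type.getName() + "ProxyObject");
            Object proxy = proxyClass.getConstructor(type).newInstance(delegate);
            return type.cast(proxy);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No proxy found for " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        Counter counter = proxyFor(Counter.class, new Counter());
        counter.increment();
        System.out.println(counter.getClass().getSimpleName()); // prints CounterProxyObject
    }
}
```

Because the lookup depends only on the class name, the framework code would not need to know at compile time which proxies exist.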

First Steps

Before we can create new class files we need to understand how to use the ASM framework. Srini and I decided we would create a simple class and attempt to use the framework to give us a list of the method signatures of the class. Here is the class:
public class AClassWithOnlyOnePrimitive {
    private int intValue;
    public int getIntValue() {
        return intValue;
    }
    public void setIntValue(int intValue) {
        this.intValue = intValue;
    }
}
And here is our first test based on the output we got from using the existing TraceClassVisitor:
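import java.io.IOException;

import org.junit.Test;

import static com.google.common.collect.Lists.newArrayList; // assumed source of newArrayList
import static org.junit.Assert.assertEquals;

public class SourceReaderTest {
    @Test
    public void shouldGetMethodNamesUsingObjectImpl() throws IOException {
        SourceReader reader = new SourceReaderObjectImpl(
                SourceReaderTest.class.getResourceAsStream(
                        "AClassWithOnlyOnePrimitive.class"));
        // The implicit default constructor appears in byte code as <init>.
        assertEquals(newArrayList("<init>()V",
                        "getIntValue()I", "setIntValue(I)V"),
                reader.getMethodNames());
    }
}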
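The strings in the assertion are a method name followed by a JVM method descriptor: parameter types in parentheses, then the return type, with I for int and V for void. The sketch below derives the same strings using only reflection and java.lang.invoke.MethodType. This is a different technique from ASM, which reads the .class bytes without loading the class, and the DescriptorSketch name is made up for illustration. Constructors are not listed because reflection exposes them separately from methods.

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Reflection-based illustration only; ASM works on the class file bytes instead.
final class DescriptorSketch {

    /** Same shape as the class under test, nested here to keep the sketch self-contained. */
    static class AClassWithOnlyOnePrimitive {
        private int intValue;
        public int getIntValue() { return intValue; }
        public void setIntValue(int intValue) { this.intValue = intValue; }
    }

    /** Returns name + JVM descriptor for each declared method, sorted for a stable order. */
    static List<String> methodSignatures(Class<?> type) {
        List<String> signatures = new ArrayList<String>();
        for (Method method : type.getDeclaredMethods()) {
            if (method.isSynthetic()) {
                continue; // skip compiler- or tool-generated methods
            }
            String descriptor = MethodType
                    .methodType(method.getReturnType(), method.getParameterTypes())
                    .toMethodDescriptorString();
            signatures.add(method.getName() + descriptor);
        }
        Collections.sort(signatures);
        return signatures;
    }

    public static void main(String[] args) {
        System.out.println(methodSignatures(AClassWithOnlyOnePrimitive.class));
        // prints [getIntValue()I, setIntValue(I)V]
    }
}
```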

One constraint of a Geek Night is trying to commit working code at least once in the few hours we have. We didn’t have a complete working Visitor of our own by the end of the evening, whether because of the time constraint or because we couldn’t find examples that would help us. Srini figured out how to get this working between Geek Night sessions, so we can make faster progress in the future.

Srini: ASM provides two convenient APIs for accessing the class meta structure. The Event API is like SAX and the Object API is like the DOM. It looks like the library provides great flexibility in mixing and matching the Event and Object APIs to fit the task at hand.

What Next?

The ASM website gives this as the pattern for determining what byte code to generate:
The best way to learn to use ASM is to write a Java source file that is equivalent to what you want to generate and then use the ASMifier mode of the Bytecode Outline plugin for Eclipse (or the ASMifier tool) to see the equivalent ASM code. If you want to implement a class transformer, write two Java source files (before and after transformation) and use the compare view of the plugin in ASMifier mode to compare the equivalent ASM code.
We have our first class (AClassWithOnlyOnePrimitive) and we have the byte code emitted by the TraceClassVisitor. Using Behavior Driven Development we will:
  1. Implement the class we would like to generate via ASM.
  2. Extract functionality snippets by comparing the original and desired byte codes.
  3. Implement features of our code generator to create those snippets.
  4. Add an annotation processor to trigger our code generation.
At some point we will have to integrate the code generation into the build process (probably as a Maven plugin).
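Step 1 of that plan could start with something like the sketch below: a NullObject for AClassWithOnlyOnePrimitive written by hand, which could then be run through ASMifier to see the ASM calls that would emit it. The NullObject suffix and the chosen behaviour (getters return the type's default value, setters ignore writes) are assumptions, not something JProxyGen defines yet.

```java
// Hypothetical target class for step 1; names and behaviour are assumptions.
final class NullObjectSketch {

    /** The class from the post, nested here so the sketch is self-contained. */
    static class AClassWithOnlyOnePrimitive {
        private int intValue;
        public int getIntValue() { return intValue; }
        public void setIntValue(int intValue) { this.intValue = intValue; }
    }

    /** What a generated NullObject might look like if written by hand. */
    static class AClassWithOnlyOnePrimitiveNullObject extends AClassWithOnlyOnePrimitive {
        @Override
        public int getIntValue() { return 0; } // default value for int
        @Override
        public void setIntValue(int intValue) { /* ignore writes */ }
    }

    public static void main(String[] args) {
        AClassWithOnlyOnePrimitive target = new AClassWithOnlyOnePrimitiveNullObject();
        target.setIntValue(42);
        System.out.println(target.getIntValue()); // prints 0
    }
}
```

Comparing the byte code of the original class with that of the hand-written NullObject would give the snippets mentioned in step 2.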
