Jakub Korab
Tech, Opinion, and Doing Stuff

Get Functional

November 21st, 2009

That was the message coming through the Devoxx conference presentations this year. The idea that functional programming will help your code run in the brave new world of multi-everything (multi-core, multi-thread, etc.) is widely touted, but it’s rarely the primary driver for its use. Instead, it’s about writing less code that’s more easily understood. When you do get to scaling it, it won’t do any harm either.

As Guillaume Laforge tweeted, of 800 Java developers in his session, only 10 knew/used Scala, 3 Clojure, 20 Ruby, and 50 were on Groovy – which gives a nice gentle introduction to some of the constructs for those looking to wade in. Good stats to cut through the hype. So what of the roughly 90% slogging on without closures? Do they have to miss out on the fun?

Quite simply, no. There’s a heap of drop-in libraries that you can add to a Java project for all manner of functional goodness, and which don’t change the syntax of the language. LambdaJ, for example, gives a nice functional way of dealing with collections. To steal an example directly from the website, the following typical Java code:

List<Person> sortedByAgePersons = new ArrayList<Person>(persons);
Collections.sort(sortedByAgePersons, new Comparator<Person>() {
        public int compare(Person p1, Person p2) {
           return Integer.valueOf(p1.getAge()).compareTo(p2.getAge());
        }
});

is replaced with:

List<Person> sortedByAgePersons = sort(persons, on(Person.class).getAge());

Fancy a bit of map-reduce without a grid? Well, it comes stock-standard with the Fork Join (JSR166y) framework that will be added to the concurrency utilities in JDK 7. If you don’t fancy waiting until September 2010 (the latest expected date for the GA release), it’s downloadable here. As an aside, Doug Lea has written a really good paper on the FJ framework.
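To give a feel for the style, here is a minimal sketch of a divide-and-conquer sum, which is about the simplest "reduce" you can write with the framework. It uses the class names as they ended up in java.util.concurrent; the standalone JSR166y download ships the same classes under the jsr166y package. SumTask and the THRESHOLD value are made up for illustration, not taken from the framework or the paper.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example task: sums a slice of an array by splitting it in
// half until the slice is small enough to add up sequentially.
class SumTask extends RecursiveTask<Long> {
    // Assumed cut-off below which forking is not worth the overhead.
    private static final int THRESHOLD = 1000;

    private final long[] values;
    private final int from; // inclusive
    private final int to;   // exclusive

    SumTask(long[] values, int from, int to) {
        this.values = values;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Small slice: plain sequential loop.
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += values[i];
            }
            return sum;
        }
        int mid = from + (to - from) / 2;
        SumTask left = new SumTask(values, from, mid);
        SumTask right = new SumTask(values, mid, to);
        left.fork();                     // schedule the left half asynchronously
        long rightSum = right.compute(); // work on the right half in this thread
        return left.join() + rightSum;   // wait for the left half and combine
    }

    // Convenience entry point: runs the task on a fresh pool and shuts it down.
    static long sum(long[] values) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new SumTask(values, 0, values.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] data = new long[10000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(SumTask.sum(data));
    }
}
```

The fork-one-half, compute-the-other pattern keeps the current thread busy instead of parking it while both halves run elsewhere. A real application would normally share one pool rather than create one per call.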

Don’t fancy loops in loops in loops to filter, aggregate, do set operations with all the null checking that Java programming typically entails? Well, the Google Collections library (soon to be integrated into Guava, a set of Google’s core libs), contains predicates and transform functions that make all of this a lot easier to write and reason about. Dick Wall had a great presentation about this showing just how much code can be reduced (heaps).
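The shape of the idea fits in a few lines of plain Java. The sketch below is not the Google Collections API: Predicate, Function, filter and transform here are hand-rolled stand-ins, loosely modelled on the library's concepts, so that the example runs without any dependency.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hand-rolled stand-ins for the predicate/function idea; all names here are
// illustrative and not the library's own classes.
class FunctionalSketch {
    interface Predicate<T> { boolean apply(T input); }
    interface Function<F, T> { T apply(F input); }

    // Keeps only the elements the predicate accepts.
    static <T> List<T> filter(Iterable<T> source, Predicate<? super T> predicate) {
        List<T> result = new ArrayList<T>();
        for (T item : source) {
            if (predicate.apply(item)) {
                result.add(item);
            }
        }
        return result;
    }

    // Maps every element through the function.
    static <F, T> List<T> transform(Iterable<F> source, Function<? super F, ? extends T> function) {
        List<T> result = new ArrayList<T>();
        for (F item : source) {
            result.add(function.apply(item));
        }
        return result;
    }

    // The null check lives in one reusable place instead of in every loop.
    static final Predicate<String> NOT_BLANK = new Predicate<String>() {
        public boolean apply(String input) {
            return input != null && input.trim().length() > 0;
        }
    };

    static final Function<String, Integer> LENGTH = new Function<String, Integer>() {
        public Integer apply(String input) {
            return input.length();
        }
    };

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Scala", null, "", "Groovy");
        System.out.println(transform(filter(names, NOT_BLANK), LENGTH));
    }
}
```

The anonymous classes are noisy, but once a predicate has a name it can be reused and combined, and the calling code reads as "filter, then transform" rather than as nested loops.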

A thing I heard a number of times outside the sessions was, “I don’t know about all this stuff, surely as we get further from the metal, performance suffers”. Sure, it gets harder to reason about timings as the abstractions get weirder, but the environment gets better all the time, and the productivity gains more than outweigh the performance cost in all but the most perf-intensive environments. Brian Goetz spoke about how the JVM supports this new multi-language world. Not something that I had ever really given much thought to, but the primary optimizations aren’t at the language compiler level (javac, scalac, groovyc etc.) – they’re all done at runtime, when the JVM compiles the bytecode. The number of optimizations in HotSpot is massive (there was a striking slide showing 4 columns of individual techniques in a tiny font). Multiple man-centuries of effort have gone into it, and each new release tightens it up. If you’re not sure, then profile it and make up your own mind. JDK 7 will also see the VM with some goodness that will make dynamic languages really fly.

One thing that still sticks out like a sore thumb is closures support in Java. It’s not a candidate for inclusion in JDK 7, and the proposed syntax shown at the conf by Mark Reinhold looks pretty ugly when compared to other langs (see the proposal by Neal Gafter). Either way, not a sniff of actual implementation. I understand there’s some serious work needed on the VM to make any of this possible regardless of the syntax. Not holding my breath. [Closures will actually be in JDK7 - thanks Neal.]

All up, I’m pretty excited by all this, and can’t wait to get my hot little hands on some of these tools. The functional style yields code that’s much easier to read and reason about, and the fact that it’s essentially all Java syntax means that there’s no reason not to apply it. If you’re already comfortable with using EasyMock on your team, you won’t find it a huge mind shift.


Filed under: conference, java, software engineering, tools | No Tag
November 21st, 2009 16:08:35

Deep Diff Pizza

November 17th, 2009

There is nothing I love more than a proprietary, undocumented API. Call it an unfortunate fact of life, but weird object models that hang together by the skin of their teeth are out there. Most of the time there’s no validation logic to check that they’re semantically or syntactically correct until you send this tangle of objects to a system you’re integrating with. Having been burned badly in the past on this sort of work, I’ve been looking for a way to work effectively in these types of scenarios.

In my most recent case, I had some example code that built up these object graphs for various use cases. The code wasn’t transportable, but it did generate correct inputs. After reverse engineering the graph construction logic into a builder by working out of XStream dumps to console, I was then able to build up exploratory integration tests that triggered the various use case scenarios. So far, so good. Now to just remove the need to have the back-end system up.

What I needed was a way to compare a test object graph with a constructed one. Thanks to XStream, I already had an XML representation of the expected output, so I could reload it into memory as needed – so there’s the test data. The equals() and hashCode() methods on the model were unreliable, so that’s no good. I toyed with the idea of writing a general purpose deep diff mechanism for object graphs, but came to the conclusion that it wasn’t a good idea. Aside from not wanting to get into writing gnarly reflection code, I realised that the problem was more difficult than it seemed. You get into questions like “What is equality?”, “How much change is acceptable?”, “Where do I stop? Primitives, java.util.*?”, and “How can I mark things as being acceptably different?”. It’s probably a good indication that no one had written this sort of thing already.

Then I realised that the answer was staring me in the face. Just diff the XML representation programmatically! By loading up an expected input from a file next to the unit test, I could then dump the model under test out to another String and… Rock and Roll.

XMLUnit has quite a good diffing utility, which was about 80% of the way there. Some of the data in my model changed over time, so I needed a way to say “ignore these bits of the tree”. There’s a neat little interface in XMLUnit called DifferenceListener that gets notified whenever the mechanism finds a diff. You can implement a method that decides whether to report the difference as different, the same, or similar (why you’d want to do that is beyond me – “this is different-ish”). I hacked up my own implementation that took some XPath expressions, mapped them to the nodes in the control graph that ought to be ignored, and voilà.
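For a rough idea of the "ignore these bits of the tree" approach, here is a sketch that uses only the JDK's DOM and XPath classes rather than XMLUnit, so it is a stand-in for the technique and not my DifferenceListener implementation. The class name IgnoringXmlDiff, the sample XML and the ignored expression are all made up for illustration. It compares element names and leaf text only, ignores attributes, and reports differences as simple slash-separated element paths.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: diffs two XML strings, skipping any control nodes
// selected by the given XPath expressions.
class IgnoringXmlDiff {

    // Returns the element paths at which the test document differs from the control.
    static List<String> diff(String controlXml, String testXml, String... ignoredXPaths) {
        try {
            Document control = parse(controlXml);
            Document test = parse(testXml);

            // Resolve each XPath against the control document and remember the hits by identity.
            Set<Node> ignored = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
            XPath xpath = XPathFactory.newInstance().newXPath();
            for (String expression : ignoredXPaths) {
                NodeList hits = (NodeList) xpath.evaluate(expression, control, XPathConstants.NODESET);
                for (int i = 0; i < hits.getLength(); i++) {
                    ignored.add(hits.item(i));
                }
            }

            List<String> differences = new ArrayList<String>();
            compare(control.getDocumentElement(), test.getDocumentElement(), "", ignored, differences);
            return differences;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not diff the supplied XML", e);
        }
    }

    private static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    private static void compare(Element control, Element test, String parentPath,
                                Set<Node> ignored, List<String> differences) {
        String path = parentPath + "/" + control.getNodeName();
        if (ignored.contains(control)) {
            return; // this subtree was marked as acceptably different
        }
        if (!control.getNodeName().equals(test.getNodeName())) {
            differences.add(path);
            return;
        }
        List<Element> controlChildren = childElements(control);
        List<Element> testChildren = childElements(test);
        if (controlChildren.isEmpty() && testChildren.isEmpty()) {
            // Leaf elements: compare trimmed text content.
            if (!control.getTextContent().trim().equals(test.getTextContent().trim())) {
                differences.add(path);
            }
            return;
        }
        if (controlChildren.size() != testChildren.size()) {
            differences.add(path);
            return;
        }
        // Children are matched by position.
        for (int i = 0; i < controlChildren.size(); i++) {
            compare(controlChildren.get(i), testChildren.get(i), path, ignored, differences);
        }
    }

    private static List<Element> childElements(Element parent) {
        List<Element> children = new ArrayList<Element>();
        NodeList nodes = parent.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            if (nodes.item(i) instanceof Element) {
                children.add((Element) nodes.item(i));
            }
        }
        return children;
    }

    public static void main(String[] args) {
        String control = "<order><id>1</id><created>2009-11-17</created><item>pizza</item></order>";
        String test = "<order><id>1</id><created>2009-11-18</created><item>pasta</item></order>";
        System.out.println(diff(control, test, "/order/created"));
    }
}
```

In a unit test, the control string would be the XStream dump loaded from the file next to the test, and the test string the dump of the model under test. An empty list means the graphs match, apart from the bits you chose to ignore.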

The more I code the more uses I find for XStream. In combination with XMLUnit, it’s like hazelnuts to chocolate. It’s an excellent option next time you need to diff object graphs.


Filed under: java | No Tag
November 17th, 2009 18:15:29