Archive for the ‘technology’ Category


Running ActiveMQ Replicated LevelDB on VirtualBox

I have wanted to spend some more time recently playing with Replicated LevelDB in ActiveMQ. Not wanting to hit a cloud environment such as EC2, I set up 3 VirtualBox nodes on my workstation running Fedora. The host system needs to be fairly beefy – you need at least a dedicated CPU core per image, plus RAM for each. The setup itself was relatively straightforward except for a couple of gotchas. I didn’t use Docker or similar simply because I haven’t played with it; as a rule I try not to introduce too many new pieces into a setup, otherwise it turns into a yak-shaving exercise. What follows is an outline of how I got it running, and some of the things I bumped my head against.

My approach was to get the software set up on a single disk image, and then replicate that over to another couple of VMs, tweaking to get the clustering working correctly. Turns out, that works pretty well. When creating the VirtualBox images, set Network Adapter 1 as a Bridged Adapter to your ethernet port, with Promiscuous Mode set to “Allow All”.

VirtualBox network adapter settings

As a starting point, I downloaded Zookeeper 3.4.6 and ActiveMQ 5.10, unzipped them into /opt, and created symlinks as /opt/zk and /opt/activemq respectively.

If you follow the setup instructions from the ActiveMQ site, the first thing you’ll notice is that the Zookeeper (ZK) setup isn’t covered. Thankfully, the ZK website itself has a good breakdown of how to do this in the Clustered (Multi-Server) Setup section of the Administrator’s Guide. The standard config file I used was /opt/zk/conf/zk.conf, configured with the following:

tickTime=2000
dataDir=/var/zk/data
clientPort=2181
initLimit=5
syncLimit=2

You have to create the /var/zk/data directory yourself. This is where ZK keeps its logs, as well as a file called myid, which contains a number that defines which node in the cluster this particular machine is.

Later, once you know the IP addresses of the machines, you add the following to the end of the zk.conf file (here I’m using a 3 node cluster):

server.1=192.168.0.8:2888:3888
server.2=192.168.0.13:2888:3888
server.3=192.168.0.12:2888:3888

The number that follows server. corresponds to the contents of the myid file on that particular server.

For Zookeeper to work correctly, 3 ports have to be opened on the server’s firewall:

  • 2181 – the port that clients will use to connect to the ZK ensemble
  • 2888 – port used by ZK peers to connect to the leader (quorum communication)
  • 3888 – port used by ZK for leader election

The firewall changes that you’ll need for ActiveMQ are:

  • 61616 – default Openwire port
  • 8161 – Jetty port for web console
  • 61619 – LevelDB peer-to-peer replication port

On Fedora, you access the firewall through the Firewall program (oddly enough). I set up ZK and ActiveMQ as Services in the Permanent configuration (not Runtime), and turned those two services on in the Public zone (default used).

Firewall Config

Fedora firewall settings

The ActiveMQ configuration changes are pretty straightforward. This is the minimum config change that was needed in conf/activemq.xml:

<persistenceAdapter>
  <replicatedLevelDB zkAddress="192.168.0.8:2181,192.168.0.13:2181,192.168.0.12:2181"
      directory="${activemq.data}/LevelDB"
      hostname="192.168.0.8"/>
</persistenceAdapter>

The zkAddress is pretty clear – these are the ZK nodes and client ports of the ZK ensemble. The data directory is defined by directory. I’ll get onto the hostname attribute in a little while.

Once I had everything set up, it was time to start playing with the cluster. Starting the first ActiveMQ node was problem free – it connected to ZK, and waited patiently until one other ActiveMQ node connected to it (quorum-based replication requires numNodesInCluster/2 + 1 instances). When a second ActiveMQ node was started, the master exploded:

 INFO | Using the pure java LevelDB implementation.
 INFO | No IOExceptionHandler registered, ignoring IO exception
java.io.IOException: com.google.common.base.Objects.firstNonNull(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
	at org.apache.activemq.util.IOExceptionSupport.create(IOExceptionSupport.java:39)[activemq-client-5.10.0.jar:5.10.0]
	at org.apache.activemq.leveldb.LevelDBClient.might_fail(LevelDBClient.scala:552)[activemq-leveldb-store-5.10.0.jar:5.10.0]
	at org.apache.activemq.leveldb.LevelDBClient.replay_init(LevelDBClient.scala:657)[activemq-leveldb-store-5.10.0.jar:5.10.0]
	at org.apache.activemq.leveldb.LevelDBClient.start(LevelDBClient.scala:558)[activemq-leveldb-store-5.10.0.jar:5.10.0]
	at org.apache.activemq.leveldb.DBManager.start(DBManager.scala:648)[activemq-leveldb-store-5.10.0.jar:5.10.0]
	at org.apache.activemq.leveldb.LevelDBStore.doStart(LevelDBStore.scala:235)[activemq-leveldb-store-5.10.0.jar:5.10.0]
	at org.apache.activemq.leveldb.replicated.MasterLevelDBStore.doStart(MasterLevelDBStore.scala:110)[activemq-leveldb-store-5.10.0.jar:5.10.0]
	at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)[activemq-client-5.10.0.jar:5.10.0]
	at org.apache.activemq.leveldb.replicated.ElectingLevelDBStore$$anonfun$start_master$1.apply$mcV$sp(ElectingLevelDBStore.scala:226)[activemq-leveldb-store-5.10.0.jar:5.10.0]
	at org.fusesource.hawtdispatch.package$$anon$4.run(hawtdispatch.scala:330)[hawtdispatch-scala-2.11-1.21.jar:1.21]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)[:1.7.0_60]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)[:1.7.0_60]
	at java.lang.Thread.run(Thread.java:745)[:1.7.0_60]
 INFO | Stopped LevelDB[/opt/activemq/data/LevelDB]

After a bit of head scratching, I happened upon AMQ-5225 in the ActiveMQ JIRA. It seems that the problem is a classpath conflict in 5.10 (readers from the future – if you’re using a newer version, you won’t see this problem :) ). To get around it, follow these instructions:

1. remove pax-url-aether-1.5.2.jar from the lib directory
2. comment out the logQuery bean in conf/activemq.xml:

<bean id="logQuery" class="org.fusesource.insight.log.log4j.Log4jLogQuery"
      lazy-init="false" scope="singleton"
      init-method="start" destroy-method="stop">
</bean>

Once I made these changes, the exception went away. When the second ActiveMQ instance was started, the master sprang to life and started logging the following to the console:

WARN | Store update waiting on 1 replica(s) to catch up to log position 0.
WARN | Store update waiting on 1 replica(s) to catch up to log position 0.
WARN | Store update waiting on 1 replica(s) to catch up to log position 0.
...

The slave on the other hand had the following text output:

INFO | Using the pure java LevelDB implementation.
INFO | Attaching to master: tcp://localhost.localdomain:61619
WARN | Unexpected session error: java.net.ConnectException: Connection refused
INFO | Using the pure java LevelDB implementation.
INFO | Attaching to master: tcp://localhost.localdomain:61619
WARN | Unexpected session error: java.net.ConnectException: Connection refused
...

Ad infinitum.

After a bunch of fruitless searching, I realised that the answer was right in front of me in the slave output: Attaching to master: tcp://localhost.localdomain:61619. The master was registering itself in ZK under its hostname (localhost.localdomain).

Once that clicked, the documentation led me to the hostname attribute on the replicatedLevelDB tag in the persistenceAdapter. This value is used by the broker to advertise its location through ZK so that slaves can connect to its LevelDB replication port. The default behaviour sees ActiveMQ trying to automatically work out the hostname.

The value being used was coming from /etc/hosts – here are the default contents of that file:

127.0.0.1            localhost.localdomain localhost
::1            localhost6.localdomain6 localhost6

I would imagine that in a proper network setup this wouldn’t typically happen, as the box name would be worked out through DHCP + internal DNS. Simply overriding the default behaviour via the hostname attribute, setting it to the IP address of the virtual machine, worked a treat.

With a test client on the host machine running the 3 VirtualBox instances I was able to connect to all 3 nodes and fail them over without problems.
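
For reference, a minimal test client along those lines can use the failover transport to walk the three brokers; the queue name here is just an example:

import javax.jms.Connection;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;

import org.apache.activemq.ActiveMQConnectionFactory;

public class FailoverTestClient {
    public static void main(String[] args) throws Exception {
        // the failover transport walks the list until it finds the current master;
        // slaves refuse connections, so the client always ends up on the right node
        ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory(
                "failover:(tcp://192.168.0.8:61616,tcp://192.168.0.13:61616,tcp://192.168.0.12:61616)");
        Connection connection = factory.createConnection();
        connection.start();

        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Queue queue = session.createQueue("replication.test"); // example queue name
        MessageProducer producer = session.createProducer(queue);
        producer.send(session.createTextMessage("hello from the host machine"));

        connection.close();
    }
}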

ZK uses a list to work out which ActiveMQ slave node is going to take control when the master broker dies; the ordering is defined by who connected to the ensemble first. If fewer than a quorum of ActiveMQ nodes are running, no ActiveMQ nodes in the cluster will accept connections (this is managed by the replicated LevelDB store).

Since ZK also maintains a quorum, when fewer than the required number of ZK servers are available, the remaining nodes will shut down their client port (2181). This disconnects all the brokers from the ensemble, and causes the master broker to shut down its transportConnectors.

Happy replicated LevelDB-ing!

ActiveMQ Network Connectors Demystified

When trying to get your head around how ActiveMQ’s networks of brokers work, it helps to have an understanding of the underlying mechanisms involved. I’ve always thought of it as the “Lego brick model” – once you understand the smaller pieces, you can reason about how they can be combined together.

When a regular client connects to a broker and sets up a subscription on a destination, ActiveMQ sees it as a consumer and dispatches messages to it. On the consumer, the thread that receives messages is generally not the same one that triggers the event listener. In between sits an intermediate holding area for messages that have been dispatched from the broker but not yet picked up for processing – the prefetch buffer – and it is here that received messages are placed for consumption. The thread responsible for triggering the main business logic consumes messages from this buffer one by one, either through MessageConsumer.receive() or through a MessageListener.onMessage(Message).
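
As a rough illustration, here is a plain JMS consumer against a local broker. The queue name is made up, and the consumer.prefetchSize destination option is one way of sizing that prefetch buffer:

import javax.jms.Connection;
import javax.jms.MessageConsumer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;

import org.apache.activemq.ActiveMQConnectionFactory;

public class PrefetchConsumerExample {
    public static void main(String[] args) throws Exception {
        ActiveMQConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);

        // the consumer.prefetchSize destination option caps how many messages the broker
        // will dispatch into this consumer's prefetch buffer ahead of processing
        Queue queue = session.createQueue("example.orders?consumer.prefetchSize=100");
        MessageConsumer consumer = session.createConsumer(queue);

        // the business logic thread pulls messages out of the prefetch buffer one by one;
        // a MessageListener registered via consumer.setMessageListener(...) is fed the same way
        TextMessage message = (TextMessage) consumer.receive(5000); // assumes a text message
        if (message != null) {
            System.out.println("Processing " + message.getText());
        }

        connection.close();
    }
}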

ActiveMQ dispatches messages out to interested consumers uniformly. In the context of queues, this means in a round-robin fashion. It does not make a distinction between whether that consumer is actually a piece of client code that triggers some business logic, or whether it is another broker.

Now, when you think about it, this implies that when two brokers (let’s call them A1 and A2) are connected in a network A1 -> A2, somewhere there must be a consumer with a prefetch buffer. A network is established by connecting A1‘s networkConnector to a transportConnector on A2.

I have always found it useful to think of a transportConnector as simply an inbound socket that understands a particular wire protocol. When you send a message into it as a client, the broker will take it, write it to a store, and send you a reply that all’s good, or send you a NACK if there’s a problem. So really it’s a straight-through mechanism; the message gets reacted to instantly – there are no prefetch buffers here.

So if there’s no prefetch buffer on a transportConnector, for a network to work it must actually be a part of the networkConnector on the sending broker (A1). In fact, this is exactly the case.

When a network is established from A1 to A2, a proxy consumer (aka local consumer; I’m using Gary Tully’s terminology here) is created as part of the networkConnector for each destination on which there is demand in A2, or which is configured inside the networkConnector‘s staticallyIncludedDestinations block. By default, each destination being pushed has one consumer, with its own prefetch buffer.

The thread that consumes from this buffer takes each message one by one and (in the case of persistent messages) synchronously sends it to the remote broker A2. A2 then takes that message, persists it and replies that the message has been consumed. The proxy consumer then marks the message as having been processed, at which point it is deemed to have been consumed from A1, is marked for deletion, and will not be redelivered to any other consumers. That’s the general principle behind store and forward.
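
To make this concrete, here is roughly what establishing such a network looks like using the broker's Java API instead of activemq.xml. This is a sketch only, with a2-host standing in for the address of A2:

import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.network.NetworkConnector;

public class NetworkedBrokerExample {
    public static void main(String[] args) throws Exception {
        BrokerService a1 = new BrokerService();
        a1.setBrokerName("A1");

        // transportConnector: the plain inbound socket that clients (and other brokers) connect to
        a1.addConnector("tcp://0.0.0.0:61616");

        // networkConnector: A1 connects out to A2 and creates a proxy consumer
        // (each with its own prefetch buffer) for every destination on which there is demand
        NetworkConnector toA2 = a1.addNetworkConnector("static:(tcp://a2-host:61616)");
        toA2.setName("a1-to-a2");

        a1.start();
    }
}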

Now, because the send from A1 to A2 is being performed by a single thread, which is sending the messages one-by-one for each destination, you will see that the throughput across the networkConnector can be quite slow relative to the rate at which messages are being placed onto A1 in the first place. This is the cost of two guarantees that ActiveMQ gives you when it accepts messages:

  1. messages will not be lost when they are marked as persistent, and
  2. messages will be delivered in the order in which they are sent

If you have messages that need to be sent over the network at a higher rate than that which the networkConnector can send given these constraints, you can relax one of these two guarantees:

  1. By marking the messages as non-persistent when you send them, the send will be asynchronous (and so, super fast), but if you lose a broker any messages that are in-flight are gone forever. This is usually not appropriate, as in most use cases reliability trumps performance.
  2. If you can handle messages out-of-order, you can define N networkConnectors from A1 to A2 (see the sketch after this list). Each connector will have a proxy consumer created for the destinations being pushed over the network – N connectors means N consumers each pushing messages. Keep in mind that message distribution in a broker is performed in a round-robin fashion, so you lose all guarantee of ordering before the messages arrive on A2. Of course, you can opt to group related messages together using message groups, which will be honored both on A1 (between proxy consumers) and by the actual consumers on A2.
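
As a rough sketch of option 2, using the same broker-side Java API as above (again, the names and URI are placeholders):

import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.network.NetworkConnector;

public class ParallelNetworkConnectors {
    // adds N connectors from A1 to A2; each gets its own proxy consumer per destination,
    // so messages fan out round-robin across them and per-destination ordering is lost
    public static void addConnectors(BrokerService a1, String a2Uri, int count) throws Exception {
        for (int i = 0; i < count; i++) {
            NetworkConnector connector = a1.addNetworkConnector(a2Uri);
            connector.setName("a1-to-a2-" + i); // each connector needs a unique name
        }
    }
}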

Hopefully, this helps you in developing a mental model of how the underlying mechanics of ActiveMQ’s broker networks work. Of course there’s loads more functionality available, with many more Lego bricks of different shapes, to help you build up your broker infrastructure.

Camel Cookbook Launch at LJC, March 5th

After all the hard work comes the fun. On Wednesday 5th of March, I will be giving a presentation entitled “Effective System Integrations with Apache Camel” at Skillsmatter in London, hosted by the good folks from the London Java Community. The evening will kick off at 6:15, and I’ll be headlining ;) at 6:45.

The premise for the talk is going to be a touch different from most of the “Intro to X” talks you may have watched – I will be discussing why most integrations you encounter tend to be half-baked and brittle, and what Apache Camel brings to the table that allows you to address the reasons behind this. The technical ones at least (there are other, non-technical ones too).

The evening is going to be the official London launch of the Apache Camel Developer’s Cookbook. I will be giving away a few copies, and there will also be a chance to buy a copy at the end of the talk for yourself, your workmates, family, friends, pets, etc. All proceeds on the night will be going to Shelter, the housing and homelessness charity. So bring your hard earned cash, grab yourself over 100 recipes for rock-solid integration, and do some good at the same time. Donate an extra £5, and I won’t deface your book with my signature ;)

Apache Camel Developer’s Cookbook

I am pleased to finally announce the reason that I have fallen off the face of the Earth over the past year – Scott Cranton and I have just finished writing the Apache Camel Developer’s Cookbook. The book will be published by Packt Publishing later this month. It is available for pre-order on the Packt website, and also on Amazon (where it’s currently listed by its working title – Camel Enterprise Integration Cookbook). After 12 months of hard work and late nights (as well as early mornings) at home, in planes, trains, automobiles, hotel rooms, and airport lounges, we’re very excited. And relieved – it has been a huge job writing about the ins-and-outs of this incredibly rich integration toolset.

Apache Camel Developer’s Cookbook

Our intention when writing the book was to develop something that complemented Claus Ibsen’s and Jon Anstey’s awesome Camel in Action. After 3 years Camel in Action continues to be an outstanding reference and a great text for learning Apache Camel inside-out, and has aged incredibly well – a testament to Camel’s solid underlying design. One of the comments that we heard frequently when out on site with clients was that they had bought it, and were going to get around to reading it… eventually.

Our main driver, therefore, was to write a book for the busy system integration developer who just needed to Get Stuff Done. All without requiring them to read a whole book and learn the entire framework, while at the same time not glossing over any of the key issues that they needed to consider in order to build a complete production-ready integration with Apache Camel – you know, with good performance, error handling, transactions etc. The result is a book with over 100 how-to recipes, each written in both of Camel’s Java and XML DSL variants, put together in such a way that people can jump around and get just the info that they need in order to get the job done.

There is a lot of good content there across all difficulty levels. Developers new to Apache Camel will find useful information on how to set up Camel in both regular Java and Spring-based applications, through to the ins-and-outs of the various Enterprise Integration Patterns (EIPs) (how they are affected by multithreading, transactions etc.), payload transformations and testing. There is plenty of good stuff for experienced developers too as we work through parallel and asynchronous processing, error handling and compensation, transactions and idempotency, monitoring and debugging, as well as detailing Camel’s support for security – your company encrypts all traffic, right? No? Well, no more excuses.

The chapters cover:

  • Structuring routes – everything from how to integrate the framework through to route templating
  • Message routing – a coverage of the main routing patterns
  • Routing to your code – how Camel interacts with your Java Code (bean binding, processors etc.)
  • Transformation – moving between XML, JSON, CSVs etc.
  • Splitting and Aggregating – a deep dive into the related Splitter and Aggregator EIPs
  • Parallel Processing – outlines Camel’s support for scaling out processing
  • Error Handling and Compensation – dealing with failure, including capabilities for triggering compensating logic
  • Transactions and Idempotency – how to handle failure of transactional (JDBC, JMS) and non-transactional (web services) resources
  • Testing – how to verify your routes’ behavior without the need for backend systems
  • Monitoring and Debugging – describes Camel’s support for logging, tracing, and debugging
  • Security – encrypting communication between systems, hiding sensitive configuration information, non-repudiation using certificates, and applying authentication and authorization to your routes
  • Web Services – a deep dive into working with one of Camel’s main use cases: SOAP web services

Some of the juicier recipes include:

  • Working with asynchronous APIs
  • Defining completion actions dynamically
  • Testing routes with fixed endpoints using conditional events
  • Digitally signing and verifying messages
  • Enabling step-by-step tracing in code
  • Monitoring other systems using the Camel JMX Component
  • Idempotency inside transactions
  • Setting up XA transactions over multiple transactional resources (many thanks to the guys at Atomikos for their help on this one)

The book is right up to date, with coverage of Camel 2.12.2 and a bit of eye candy thrown in for the monitoring chapter thanks to the hawtio project.

One of the things that we agreed bugged us about most technical books was that the code frequently did not work. So right at the start we decided that every single snippet of code had to be taken straight from a fully functional unit test. Whenever we made a statement in the book about how something behaved, we wrote a test for it (and found lots of interesting undocumented corner cases and gotchas, which we subsequently wrote up). The result is a massive set of examples, with 450 supporting tests – all readily usable (I have used them myself with clients on site). If you are interested in taking a look, all of the code is available on GitHub at CamelCookbook/camel-cookbook-examples.

The feedback that we have received from our reviewers – both formal and informal – is that we hit the mark; something we are really thrilled about. We were really privileged to have a number of Camel committers reviewing the book, as well as our own clients who had varying levels of expertise with the framework. Many thanks go out to Claus Ibsen, Christian Posta, Bilgin Ibryam and Phil Wilkins for going through it all with a fine-toothed comb.

We hope that the book proves useful to the entire Apache Camel community, and further popularizes this amazingly productive integration framework.

Monitoring ActiveMQ via HTTP

Update 29/4/2014: the following information is a little bit stale. Since version 5.9.1 of ActiveMQ, Hawt.io is no longer part of the ActiveMQ distribution from Apache, but Jolokia is. Jolokia runs on http://localhost:8161/api/jolokia by default, so if you keep this in mind while reading this post, the remaining instructions are still correct. Dejan Bosanac has written about how to install Hawt.io on ActiveMQ post 5.9.1 if you want to set it up (which of course you do – because it’s awesome).

It’s been a while since I blogged, since my year has been consumed by a not-so-secret-if-you-follow-me-on-twitter project, but I thought I’d take a break to write about something pretty significant in the ActiveMQ landscape. Unusually enough it’s not the addition of a new feature, but rather an enabling technology.

Unless you have been hiding under a rock this year, you will probably have heard about the “One Console to Rule Them All” – Hawt.io. In short, it’s a console for building consoles, with built-in plugins for all the shiny things in the Apache integration toolchain that I work with – ActiveMQ, Camel, Karaf etc., as well as a bunch of other stuff including Fabric, JBoss, Tomcat, Infinispan, OpenEJB etc. And you can also build your own plugins. Hawt.io is pretty damn cool – you can deploy it into a Karaf or web container, it automatically detects what tech pieces it knows about, and exposes a super-shiny console.

Hawt.io login page

Ooo shiny!

The killer feature for me, since I use it on site with clients all the time, is that it gets all this info through JMX – in a JavaScript console written in AngularJS, running in your browser! For me that means no installing jvisualvm, and no opening ports in firewalls to access the JMX port (1099) on the host machine.

Hawt.io on ActiveMQ

Look mum, no JMX client!

As neat as Hawt.io is, it’s what’s inside that’s really exciting. The secret sauce here is the framework that exposes all the JMX stats to the front-end – a tiny tool called Jolokia. Jolokia is a web app that exposes the JMX stats of the JVM that it’s running inside of over HTTP via REST/JSON. It’s also been included in ActiveMQ since version 5.8.

I’ll let this sink in for a bit.

Ready? What this means is that in ActiveMQ 5.9 you can do this:

$ curl -u admin http://localhost:8161/hawtio/jolokia/ && echo ""

In ActiveMQ 5.8 you replace hawtio/jolokia in the above URI with api/jolokia.

Which, after prompting you for the default credentials (admin/admin), gives you this:

{
    "timestamp":1384809685,
    "status":200,
    "request":{
        "type":"version"
    },
    "value":{
        "protocol":"7.0",
        "agent":"1.1.4",
        "info":{
            "product":"jetty",
            "vendor":"Eclipse",
            "version":"7.6.9.v20130131"
        }
    }
}

This means that you can now access any JMX stats exposed by ActiveMQ from any environment that can make a web request – shell scripts, admin GUIs, anything!

You can access specific parts of the JMX tree by passing in a JSON document that acts as the query. For example, to access the heap memory usage of the JVM you would pass in:

{
    "type":"read",
    "mbean":"java.lang:type=Memory",
    "attribute":"HeapMemoryUsage",
    "path":"used"
}

Here’s the escaped curl request:

$ curl -u admin -d "{\"type\":\"read\",\"mbean\":\"java.lang:type=Memory\",\"attribute\":\"HeapMemoryUsage\",\"path\":\"used\"}" http://localhost:8161/hawtio/jolokia/ && echo ""

And here’s the response:

{
    "timestamp":1384811291,
    "status":200,
    "request":{
        "mbean":"java.lang:type=Memory",
        "path":"used",
        "attribute":"HeapMemoryUsage",
        "type":"read"
    },
    "value":224135568
}
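
Since it’s all just HTTP, the same query can be issued from plain Java using nothing but the JDK. A sketch, assuming the default admin/admin credentials and the 5.9-style URL used above:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

import javax.xml.bind.DatatypeConverter;

public class JolokiaHeapUsage {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://localhost:8161/hawtio/jolokia/");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        connection.setDoOutput(true);

        // default web console credentials (admin/admin)
        String auth = DatatypeConverter.printBase64Binary("admin:admin".getBytes("UTF-8"));
        connection.setRequestProperty("Authorization", "Basic " + auth);

        String request = "{\"type\":\"read\",\"mbean\":\"java.lang:type=Memory\","
                + "\"attribute\":\"HeapMemoryUsage\",\"path\":\"used\"}";
        OutputStream out = connection.getOutputStream();
        out.write(request.getBytes("UTF-8"));
        out.close();

        // dump the raw JSON response, e.g. {"timestamp":...,"value":224135568}
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), "UTF-8"));
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line);
        }
        reader.close();
    }
}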

We can also use this to access the really juicy, interesting ActiveMQ stats. Here we’ll grab the broker’s MemoryPercentUsage. You can locate the path you are interested in via jvisualvm/jconsole:

ActiveMQ JMX MemoryPercentUsage

Hunting through the JMX tree…


Here’s our formatted JSON payload for the query:

{
    "type":"read",
    "mbean":"org.apache.activemq:type=Broker,brokerName=localhost",
    "attribute":"MemoryPercentUsage"
}

Here’s the escaped curl request:

$ curl -u admin -d "{\"type\":\"read\",\"mbean\":\"org.apache.activemq:type=Broker,brokerName=localhost\",\"attribute\":\"MemoryPercentUsage\"}" http://localhost:8161/hawtio/jolokia/ && echo ""

And here’s the response:

{
    "timestamp":1384811228,
    "status":200,
    "request":{
        "mbean":"org.apache.activemq:brokerName=localhost,type=Broker",
        "attribute":"MemoryPercentUsage",
        "type":"read"
    },
    "value":0
}

Plenty of room for non-persisted queue messages! ;)

It’s yet another tool that adds to an already rich set of features for monitoring ActiveMQ.

Combine it with Camel, and you could even periodically send SNMP traps. Open source FTW!
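
As a sketch of that idea (with a log statement standing in for the SNMP trap, and assuming camel-http is on the classpath), a Camel route could poll the broker’s MemoryPercentUsage through Jolokia on a timer, reusing the URL, credentials and MBean name from the examples above:

import org.apache.camel.builder.RouteBuilder;

public class BrokerMemoryPollRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        // every minute, POST a Jolokia read request for the broker's MemoryPercentUsage
        // and log the JSON that comes back
        from("timer:memoryCheck?period=60000")
            .setBody(constant("{\"type\":\"read\","
                    + "\"mbean\":\"org.apache.activemq:type=Broker,brokerName=localhost\","
                    + "\"attribute\":\"MemoryPercentUsage\"}"))
            .to("http://localhost:8161/hawtio/jolokia/"
                    + "?authMethod=Basic&authUsername=admin&authPassword=admin")
            .convertBodyTo(String.class)
            .log("Jolokia says: ${body}");
    }
}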

Deep testing of integrations with Camel

One of the things that often comes up in client conversations about developing integration code with Camel is what test support is available – and, more to the point, appropriate – for testing integrations. There is a spectrum of test types that can be performed, ranging from fully automated unit tests to full-blown multi-system, user-based “click and see the flow-through effects” tests. Camel came with comprehensive test support baked in from its very inception, but the mechanisms that are available can be used to go way beyond the standard unit test.

Unit tests

Without wanting to get academic about it, let’s define a unit test as being one that tests the logic encapsulated within a block of code without external side effects. Unit testing straightforward classes is trivial. If you want to make use of external service classes, these can be mocked using your favourite mocking library and injected into the class under test. Camel routes are a little different, in that what they define isn’t executed directly, but rather builds up a set of instructions that are handed to the Camel runtime for execution.

Camel has extensive support for testing routes defined using both the Java DSL as well as the Spring/Blueprints XML DSL. In general the pattern is:

  1. instantiate a RouteBuilder or Spring context containing the routes with a CamelContext, and start the context (this is handled for you by CamelTestSupport or CamelSpringTestSupport – see Camel testing). These should contain direct: endpoints as the inputs to the routes (consumers) and mock: endpoints as the outputs (producers).
  2. get a hold of the mock endpoints, and outline the expectations. A MockEndpoint itself uses a directed builder DSL to allow you to define a comprehensive suite of expectations, ranging from checking the number of messages received to the details of an individual message. You can make full use of Camel expressions in these tests as well.
  3. create messages that you want to feed in to the route and send them to the direct: endpoint at the top of the route under test using a ProducerTemplate.
  4. assert that the mock endpoints received the expected messages.

An example of this approach can be seen in the RssConsumerRouteBuilderTest in the horo-app I blogged about yesterday.
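
In outline, a test following this pattern looks something like the sketch below. This is not the actual RssConsumerRouteBuilderTest, just the bare shape of the pattern, with a trivial inline route standing in for the real RouteBuilder:

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.component.mock.MockEndpoint;
import org.apache.camel.test.junit4.CamelTestSupport;
import org.junit.Test;

public class FileCopyRouteTest extends CamelTestSupport {

    @Override
    protected RouteBuilder createRouteBuilder() throws Exception {
        // 1. a trivial inline route standing in for the real RouteBuilder under test;
        //    note the direct: input and mock: output
        return new RouteBuilder() {
            @Override
            public void configure() throws Exception {
                from("direct:fileCopyRoute.in")
                    .to("mock:fileCopyRoute.out");
            }
        };
    }

    @Test
    public void messageIsPassedThrough() throws Exception {
        // 2. outline expectations on the mock endpoint
        MockEndpoint out = getMockEndpoint("mock:fileCopyRoute.out");
        out.expectedMessageCount(1);
        out.expectedBodiesReceived("test message");

        // 3. feed a message into the route via the ProducerTemplate provided by CamelTestSupport
        template.sendBody("direct:fileCopyRoute.in", "test message");

        // 4. assert that the mock endpoint received what we expected
        out.assertIsSatisfied();
    }
}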

There are a couple of things that you need in order to employ this approach successfully. If using Java, the RouteBuilder class that defines your routes should allow the route endpoint URIs, and any beans that touch external resources, to be injected into it – see RssConsumerRouteBuilder. The external beans can easily be mocked as in a standard unit test.
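
A RouteBuilder written that way might look something like this sketch; the class and collaborator names are invented, but the shape follows the description above:

import org.apache.camel.builder.RouteBuilder;

// hypothetical collaborator that touches an external resource; mock it in unit tests
interface FeedEnricher {
    Object enrich(Object body);
}

public class FeedRouteBuilder extends RouteBuilder {

    private final String sourceUri;
    private final String targetUri;
    private final FeedEnricher enricher;

    // endpoint URIs and external-facing beans are injected, so tests can substitute
    // direct:/mock: endpoints and mocked collaborators
    public FeedRouteBuilder(String sourceUri, String targetUri, FeedEnricher enricher) {
        this.sourceUri = sourceUri;
        this.targetUri = targetUri;
        this.enricher = enricher;
    }

    @Override
    public void configure() throws Exception {
        from(sourceUri)
            .bean(enricher)
            .to(targetUri);
    }
}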

Using the Spring DSL, we can still employ the same general approach, but we need to jump through a couple of hoops to do it. Consider what you would need to do to achieve the equivalent. A simple route might be defined via:

    <route id="fileCopyRoute">
        <from uri="file:///some/directory"/>
        <to uri="file:///some/other/directory"/>
    </route>

You can externalise any URIs using Spring’s property support:

    <route id="fileCopyRoute">
        <from uri="${fileCopyRoute.input}"/>
        <to uri="${fileCopyRoute.output}"/>
    </route>

You could then define a PropertyPlaceholderConfigurer with a properties file that defines these properties as:

fileCopyRoute.input=file:///some/directory
fileCopyRoute.output=file:///some/other/directory

The definition of this class should be in a Spring context file separate from that of your route definitions. For testing, you would run the routes with another test XML file that defines a PropertyPlaceholderConfigurer pointing at a properties file with the test URIs:

fileCopyRoute.input=direct:fileCopyRoute.in
fileCopyRoute.output=mock:fileCopyRoute.out

This is usually why Spring DM/Blueprints based bundle projects split the config across (a minimum of) two context files. One (META-INF/spring/spring-context-osgi.xml) contains all of the beans that touch the OSGi runtime including the properties mechanism, and the other (META-INF/spring/spring-context.xml) contains your physical routes. When testing you can easily switch out the OSGi bits via another config file. This allows you to inject in other bits during a unit test of the XML-based routes, or when using the camel-maven-plugin in order to run those routes off the command line without an OSGi container like ServiceMix.

Embedded integration tests

Sometimes, testing just the route logic isn’t enough. When I was building out the horo-app, I happily coded up my routes, tested them and deployed, only to have them blow up immediately. What happened? The objects that I was expecting to receive from the RSS component didn’t match those the component actually sent out. So I changed tack. To engage the component as part of the route, I needed a web server to serve the file that fed the test.

Integration testing is usually pretty problematic in that you need an external system servicing your tests – and when you are in an environment where the service changes, you can break the code of the other people working against the same system. But there is a solution! Sun’s Java 6 comes with an embeddable web server that you can start up as part of your integration tests.

The approach that I used was to spin up this server at the start of my test, and configure it programmatically to serve up a response suitable for my test when a certain resource was consumed. The server was started on port 0, which means that it’s up to the runtime to assign an available port on the machine when the test runs. This is very important, as it enables multiple instances of the same test to run at the same time, as is often the case on CI servers. Without it, tests would trip over each other. Similar approaches are possible using other embeddable server types, such as LDAP via ApacheDS, messaging via ActiveMQ, or databases via H2 or Derby.
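
A minimal sketch of that embedded server, using the JDK’s built-in com.sun.net.httpserver package and a made-up canned feed body:

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;

import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;

public class EmbeddedRssServer {
    public static void main(String[] args) throws Exception {
        // a canned body standing in for the real RSS fixture
        final byte[] feed = "<rss version=\"2.0\"><channel><title>test</title></channel></rss>"
                .getBytes("UTF-8");

        // port 0 asks the OS for any free port, so parallel test runs don't collide
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/feed.rss", new HttpHandler() {
            @Override
            public void handle(HttpExchange exchange) throws IOException {
                exchange.getResponseHeaders().add("Content-Type", "application/rss+xml");
                exchange.sendResponseHeaders(200, feed.length);
                OutputStream body = exchange.getResponseBody();
                body.write(feed);
                body.close();
            }
        });
        server.start();

        // inject this port into the endpoint URI of the route under test
        int port = server.getAddress().getPort();
        System.out.println("Serving test feed on http://localhost:" + port + "/feed.rss");

        // ... run the test, then tear down
        server.stop(0);
    }
}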

Tests that require an external resource often start failing on large projects without any changes on the programmer’s side due to this exact reason – the underlying system dependencies changing. By embedding the server to test your integration against, you decouple yourself from that dependency.

The routes in your test are then injected with the URI of the embedded resource. In my case, I whipped up an integration test version of the original unit test (RssConsumerRouteBuilderITCase) to do exactly this. Integration tests can be wired into a separate part of the Maven build lifecycle using the maven-failsafe-plugin, and use a different naming convention (*ITCase.java as opposed to *Test.java).

Usually, the way that you structure your tests to avoid duplicating the lifecycle of these embedded backends ends up relying on a test class hierarchy, which may look like:

  • CamelTestSupport
    • CamelTestSupportWithDatabase
    • CamelTestSupportWithWebserver

which I don’t really like, as you inevitably end up requiring two kinds of resource in a test. A much better option is to manage these extended resources using JUnit’s @Rule annotation. This treats any object that extends the org.junit.rules.ExternalResource base class as an aspect of the test, starting and stopping it as part of the test’s lifecycle. As such, you can compose your test of as many of these dependencies as you like – all without a rigid class hierarchy.
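
Here’s a sketch of what such a rule might look like, wrapping the embedded web server from earlier; the class names are hypothetical:

import java.net.InetSocketAddress;

import com.sun.net.httpserver.HttpServer;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.ExternalResource;

public class EmbeddedServerRuleTest {

    // wraps the embedded web server lifecycle; composable with other rules,
    // no test class hierarchy required
    public static class EmbeddedHttpServerRule extends ExternalResource {
        private HttpServer server;

        @Override
        protected void before() throws Throwable {
            server = HttpServer.create(new InetSocketAddress(0), 0); // ephemeral port
            server.start();
        }

        @Override
        protected void after() {
            server.stop(0);
        }

        public int getPort() {
            return server.getAddress().getPort();
        }
    }

    @Rule
    public EmbeddedHttpServerRule httpServer = new EmbeddedHttpServerRule();

    @Test
    public void serverIsAvailableToTheTest() {
        // feed this URI into the route under test
        String feedUri = "http://localhost:" + httpServer.getPort() + "/feed.rss";
        System.out.println("Test feed available at " + feedUri);
    }
}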

This approach allows you to test your integration code against a physical backend, without requiring that backend to be shared between developers. This decouples your development from the rest of the team and allows your integration tests to be run in a CI server. A huge win, as only tests which are deterministic end up being run and maintained in the long term.

#winning!

Transactional persistence with MyBatis in ServiceMix

This post has been a long time coming. A while back I cooked up a sample application, kind of a pet store for integration, that demonstrates a bunch of things that you might want to do beyond the standard bootstrap examples. This app takes the form of a horoscope aggregator, which allows you to view the last x days worth of horoscopes for a starsign (perhaps for the purposes of checking how accurate, or not, they were). The project is available as usual on GitHub at FuseByExample/horo-app.

The app demonstrates a number of useful things that you would typically want to do in an integration:

  • MyBatis via Spring for persistence directly from Camel routes, as well as for use directly by a web app (separate bundles)
  • database transactions via a Spring PlatformTransactionManager
  • testing of your MyBatis templates against an embedded database; the one I have used is H2, the one used by the real app is Postgres
  • templatize your Camel routes; the app consumes two separate RSS feeds in exactly the same way, though with different endpoints
  • unit test your routes by dependency injecting your endpoints, and using the CamelTestSupport mechanisms
  • perform “semi-integration” tests via the @Rule annotation against an embedded server within your JUnit tests, in such a way that multiple integration tests can run at the same time on the same server. This plugs in to the integration test part of the Maven build lifecycle via the maven-failsafe-plugin. Incredibly useful for CI! In this case, the purpose was to test the behaviour of the camel-rss component as part of the route.
  • deploy CXF JAX-RS services alongside your Camel bundles to provide access to your data via XML and JSON using just the one mechanism
  • share expensive resources such as DataSources between bundles using metadata, such as the database name that they provide access to
  • idempotent consumption, so the same stuff doesn’t keep getting processed (saved to the database) over and over. This is saved to a JDBC IdempotentRepository in a live environment, and an in-memory one in tests.

Full documentation available in the README. Enjoy!

System Integrations as Plugins using Camel and ServiceMix

I recently had a client with a use case that I thought would be interesting to share (and that they were happy for me to talk about – no names, industry changed). Sample code and full instructions for the solution are available as always at FuseByExample on GitHub.

Imagine a system integration where the core logic is static, but the systems that participate in a particular process change over time. Take as an example, a travel booking system that accepts orders for buying flights. How do you go about adding new integration logic for a particular capability such as the booking of a flight with a new airline via its system? On top of this:

  • without changing any of the core logic around the main business process (the booking of a flight also includes taking payment)
  • with no application downtime (hot deployment)
  • enabling other development teams to define integrations for new airlines

Conceptually, these requirements could be satisfied via an application-level “plugin” solution. Your core flight booking application forms a platform alongside which you deploy capabilities specific to individual airlines. This may seem like a complicated set of requirements, but there is a set of tools that you can use to enable exactly this type of application.

Using Camel gives us a way to easily partition integration logic into a core process (flight bookings) and sub-processes (booking a ticket with a particular airline), and to dynamically route to the right sub-process depending on the booking request. Routes can be connected within the one container by using the ServiceMix NMR to pass messages between bundles.

Deploying logic that is specific to an individual airline as an OSGi component enables the separation of code away from the core process, and provides the required dynamicity (and allows others to write that logic). The trick is in putting it all together.

Conceptual overview

OSGi bundles can be thought of as mini-applications. Basing a bundle on SpringDM/Blueprints (essentially defining a Spring-like config in a known location), we can embed a Camel context inside it and have routing logic start up when that bundle is deployed. This is a good candidate for our airline-specific booking code.

We then need to somehow advise the main application bundle that the new process is on-line and ready to do business by having messages routed to it. To achieve this, we make use of the OSGi service registry. For those unfamiliar with it, the registry acts as a conceptual whiteboard inside an OSGi container. Our plugin bundles can register beans in the registry as implementors of an interface. The core application that wishes to use these services looks them up in the registry to get a handle on the implementations.

To advertise the availability of a sub-process and provide the name of the Camel endpoint that accepts bookings for a particular airline, we use an interface that indicates to the main application that a plugin bundle is available to take bookings. This is placed in its own bundle so that it can be implemented by the airline-specific bundles and used in the main booking process.

public interface BookingProcessor {
        public String getAirlineCode();
        public String getBookingRouteUri();
}

Each airline bundle defines its own implementation that returns the airline code that it accepts booking messages for, and the URI for the endpoint of the Camel route that it would be listening for messages on.
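
A hypothetical airline-specific implementation might look like the following; the airline code and route URI are made up for illustration:

import com.fusesource.examples.booking.spi.BookingProcessor;

public class GermanAirlineBookingProcessor implements BookingProcessor {

    @Override
    public String getAirlineCode() {
        return "DE"; // bookings for this airline code get routed to this plugin
    }

    @Override
    public String getBookingRouteUri() {
        return "direct:bookGermanAirlineFlight"; // entry point of this bundle's Camel route
    }
}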

Bundle dependencies

This object is then registered with the OSGi service registry:

<osgi:service ref="germanAirlinePlugin" 
        interface="com.fusesource.examples.booking.spi.BookingProcessor" />

The main process bundle is then able to get a dynamic proxy to a set of BookingProcessors that may come and go:

<osgi:set id="bookingProcessors" 
        interface="com.fusesource.examples.booking.spi.BookingProcessor" 
        cardinality="0..N"/>

This set can then be injected into a bean (BookingProcessorRegistry, one possible shape of which is sketched after the list) that makes decisions such as:

  • Is this airline currently supported by the system?
  • What is the route that can be invoked to process this booking?
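
One possible shape for that registry bean, assuming (purely for illustration) that the airline code can be derived from the flight number carried in the message body; the real implementation may well differ:

import java.util.Set;

import com.fusesource.examples.booking.spi.BookingProcessor;

public class BookingProcessorRegistry {

    // the OSGi-managed proxy set; membership changes as plugin bundles come and go
    private Set<BookingProcessor> bookingProcessors;

    public void setBookingProcessors(Set<BookingProcessor> bookingProcessors) {
        this.bookingProcessors = bookingProcessors;
    }

    public boolean isAirlineSupported(String flightNumber) {
        return findProcessor(flightNumber) != null;
    }

    public String getBookingProcessorUri(String flightNumber) {
        BookingProcessor processor = findProcessor(flightNumber);
        if (processor == null) {
            throw new IllegalArgumentException("No plugin registered for " + flightNumber);
        }
        return processor.getBookingRouteUri();
    }

    private BookingProcessor findProcessor(String flightNumber) {
        for (BookingProcessor processor : bookingProcessors) {
            // assumes the flight number starts with the airline code, e.g. "DE1234"
            if (flightNumber.startsWith(processor.getAirlineCode())) {
                return processor;
            }
        }
        return null;
    }
}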

The Camel routing logic can then be really simple:

<route id="placeBooking">
	<from uri="jetty:http://0.0.0.0:9191/booking" />
	<transform>
		<simple>${headers.flightNumber}</simple>
	</transform>
	<choice>
		<when>
			<method bean="bookingProcessorRegistry" method="isAirlineSupported"/>
			<recipientList stopOnException="true" strategyRef="bookingResponseAggregationStrategy">
				<constant>direct:takePayment, direct:placeBookingWithAirline</constant>
			</recipientList>
		</when>
		<otherwise>
			<transform>
				<simple>Unable to book flight for ${body} - unsupported airline</simple>
			</transform>
		</otherwise>
	</choice>
</route>

<route id="placeBookingWithAirline">
	<from uri="direct:placeBookingWithAirline" />
	<!-- work out who to send the message to -->
	<setHeader headerName="bookingProcessor">
		<method bean="bookingProcessorRegistry" method="getBookingProcessorUri"/>
	</setHeader>
	<log message="Calling out to ${headers.bookingProcessor}"/>
	<recipientList>
		<header>bookingProcessor</header>
	</recipientList>
</route>

So to make a new airline available for bookings, you:

  1. create a new OSGi bundle
  2. register a BookingProcessor in the service registry that indicates the route that processes these bookings
  3. write the integration logic in a route that listens on that endpoint
  4. build the bundle and drop it into Servicemix alongside the main process.

Voila! An application-specific plugin system. You can then use the Karaf web console as a mechanism to make the bundle logic available via REST to a single container, or if you want to distribute it across a cluster – Fuse Fabric via Fuse ESB Enterprise.

IT side-effects at the NHS

My mother has a phrase – professional illness. It’s the moment that she (an environmental engineer) walks into a random building and promptly looks at the air ducts. I suffer from the same thing – only around tech. Before and after the birth of my daughter, I have had more chances than ever to deal with the NHS. In that time, I witnessed a couple of events that made me step back and think about the way that IT in general conducts itself.

I don’t work with end-user business apps these days, but having spent years doing just that, still feel the pain of those that do. While agility and user input are all the rage, the reality is that we as developers are often so disconnected from end users that we just don’t feel that pain, and some things don’t fit in neatly into bug reports. Add to the normal IT project multiple layers of go-betweens, project managers, business analysts, ivory tower architects, and things of concern fall through the cracks.

At a late-stage appointment with a midwife, we had the pleasure of arriving on the day of the rollout of a new patient record system. We have all heard about these things, mostly because they’re delivered way over time and budget. It was interesting to see it from an end user perspective. Having spent an hour or so doing what midwives do, we sat down while, for the first time, a mid-50s lady sat down in front of a system she’d presumably only ever seen in an “induction”. Like code handovers, these involve a dreary talk to a group of people with some vague handwaving, all moving too fast for anyone to get a sense of what’s really going on. Then a pat on the back and “off you go”.

It all seemed so straightforward: a menu on the top looking very Office 2010, a list of appointments in a side pane, and forms to fill out. Everything looked uniform, and as a result was fairly difficult to navigate without reading it all. To the developers it must have all seemed so obvious – it’s just a forms application, and of course Checkup B follows Investigation A. It’s easy to think this way when you have been looking non-stop at the same app for weeks or months. A quick usability test with a fresh pair of eyes would have made life so much easier for a new user.

Coming in towards the end of the pregnancy we arrived, as you do, with a folder of paperwork from previous scans and checkups. Presumably this was exactly what this record system was to replace. You carry these collections of paper around the whole time, just in case something happens, so that the relevant health professionals can at a glance get your background details. It seems there was a bit of a hitch. In order to record details of a final scan, you obviously need the details of all the previous appointments, and the system wouldn’t let you submit just that one form. Cue 45 minutes of a midwife copying paperwork into the system, all while our eyes glazed over and other patients filled up the waiting room. Major own goal, and yet so simple to deal with. Presumably, since getting the data imported from paper would be impractical (mums aren’t going to hand over their potentially needed baby notes to get sent to an offshore data entry shop), either use the system for new pregnancies only, or loosen the constraints so that the workflow can be entered into in the middle.

Our next IT speedbump got me thinking about open standards and data, when a pediatric doctor checked Alex before discharge. The doctor had come from another hospital, and had no idea how to use the software at the one we were in. Presumably, both systems had access to the same data, though managed it differently. Open standards are often touted as a “Good Thing”: providers can develop systems that operate against those standards, and consumers (hospitals, GP surgeries and the like) buy the “best” solutions (from an indistinguishable selection). On the face of it, the idea is actually quite good – increased competition yields better prices, and innovation (though I’m skeptical as to how much of that you can have filling in forms). I get the impression that side-effects such as this one – staff moving between hospitals and struggling with unfamiliar systems – are the tip of the iceberg. Centralised procurement often yields sub-optimal results, and massive cost overruns from kitchen-sink requirements. Building 95%-similar systems over and over seems like it has its own problems. I don’t know what the answer to that one is, or whether one in fact exists, but I have no doubt it’s worth thinking about.

Developing web services in ServiceMix

I have added a number of projects to FuseByExample/smx-ws-examples on GitHub that demonstrate how to go about developing some common web service use cases. The samples are designed to get you up and running quickly with SOAP-based web services in an OSGi world.

The examples include:

  • a Maven project that generates all relevant Java code from a WSDL using the cxf-codegen-plugin, and wraps it in an OSGi bundle
  • a plain Java implementation of a web service using CXF
  • an implementation of the web service using a Camel CXF route
  • a web service proxy using a Camel route
  • a client based on a Camel route that uses the CXF component to invoke those web services; no Java code required

As usual, full documentation in the README. Enjoy!