Testing Asynchronous Code

January 16, 2011

A while back I worked on a project that involved giving certain processes their own lifecycle: being able to instruct a process to stop, have it finish off whatever tasks it was running, and then set its status to stopped so that the machine could be shut down. As well as the thread for the process, a task coordinator runs in a separate thread. This task coordinator listens for instructions and creates the tasks. Each task also runs in its own separate thread.

If the stop instruction is given to the process then:

  1. Process lifecycle is set to stopping status
  2. No new tasks are allowed to start
  3. Tasks already running are allowed to run to completion
  4. Last task finishes
  5. Process lifecycle is set to stopped status
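
To make the sequence concrete, here is a minimal sketch of a lifecycle that enforces these five steps. The class and method names (ProcessLifecycle, tryStartTask, taskFinished) are hypothetical and are not the project's actual code; the sketch also assumes that a stop issued while no tasks are running can go straight to stopped.

```java
// Hypothetical names throughout: a sketch of the stop sequence, not the project's real classes.
class ProcessLifecycle {
    enum Status { RUNNING, STOPPING, STOPPED }

    private Status status = Status.RUNNING;
    private int runningTasks = 0;

    // Called by the task coordinator before it starts a task.
    // Returns false once a stop has been requested (step 2).
    synchronized boolean tryStartTask() {
        if (status != Status.RUNNING) {
            return false;
        }
        runningTasks++;
        return true;
    }

    // Called by a task thread when its work is done (steps 3 to 5).
    synchronized void taskFinished() {
        runningTasks--;
        if (status == Status.STOPPING && runningTasks == 0) {
            status = Status.STOPPED;
        }
    }

    // The stop instruction (step 1). Assumption: with no tasks running, stop completes at once.
    synchronized void stop() {
        if (status == Status.RUNNING) {
            status = (runningTasks == 0) ? Status.STOPPED : Status.STOPPING;
        }
    }

    synchronized Status status() {
        return status;
    }
}
```

Guarding every transition with the same lock is what stops a task from slipping in between the status change and the last task finishing; the tests described below are about proving that kind of guarantee from the outside.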

The threads for an API test would be:

  1. Test thread which gives the Process lifecycle the instruction to stop
  2. The Process itself which has the lifecycle state (running, stopping, stopped)
  3. The task coordinator which listens for instructions to initiate tasks and runs them
  4. The task threads

How can we test that the correct sequence of events occurs when a process is given the instruction to “stop”? For example, we want to ensure that a “stop” command doesn’t set the process lifecycle status to “stopped” while tasks are still running, or set the lifecycle status to “stopping” yet still pick up and initiate tasks. The test also needs to do things such as wait for tasks to start and only then give the “stop” instruction.

The test needs some mechanism to wait for all relevant events to take place. When it has finished waiting, it needs some way of checking that the events happened in the right order, i.e., that the lifecycle was set to stopped only after the tasks had run to completion.

When I talked it over with someone on the team, he suggested I take inspiration from UI frameworks such as Java Swing and use an event listener/notifier pattern, whereby listeners in separate threads register for events of interest and other (notifier) threads notify their registered listeners when those events occur.
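
As a rough sketch of what such a pattern can look like: the Listener and Event shapes below are inferred from how the later examples in this post use them (doNotify, getOccurrence, getTime), while EventNotifier, register and fire are hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Shape inferred from the post's examples: a listener is told about each event.
interface Listener {
    void doNotify(Event event);
}

// Shape inferred from the post's examples: what happened, and when.
final class Event {
    private final String occurrence;
    private final long time;

    Event(String occurrence, long time) {
        this.occurrence = occurrence;
        this.time = time;
    }

    String getOccurrence() { return occurrence; }
    long getTime() { return time; }
}

// Hypothetical notifier: production code calls fire(), tests register listeners.
class EventNotifier {
    // CopyOnWriteArrayList lets a listener be registered while another thread is notifying.
    private final List<Listener> listeners = new CopyOnWriteArrayList<>();

    void register(Listener listener) {
        listeners.add(listener);
    }

    // Runs on whichever thread the event happens on.
    void fire(String occurrence) {
        Event event = new Event(occurrence, System.nanoTime());
        for (Listener listener : listeners) {
            listener.doNotify(event);
        }
    }
}
```

Because listeners are called on the notifier's thread, anything a test listener records has to be safe to read from the test thread.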

I fitted in a listener/notifier pattern, which meant that I could create a test helper class like this (all code examples are in Java):


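import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class LatchMonitor implements Listener {
	// Shared by all monitors: event name -> time the notification was received
	public static final Map<String, Long> NOTIFICATIONS_LOGGED = new ConcurrentHashMap<String, Long>();
	private final CountDownLatch myLatch;
	private final String myNotification;
	private final Long myLatchTimeout;

	public LatchMonitor(String notification, CountDownLatch latch, Long latchTimeout) {
		myLatch = latch;
		myNotification = notification;
		myLatchTimeout = latchTimeout;
	}

	// Blocks until the latch reaches zero or the timeout (in milliseconds) expires
	public boolean setToWaitOnNotifier() {
		boolean countReachedZero = false;
		try {
			countReachedZero = myLatch.await(myLatchTimeout, TimeUnit.MILLISECONDS);
		} catch (InterruptedException e) {
			LOG.error("Interrupted", e); // LOG is the project's logger
			Thread.currentThread().interrupt(); // preserve the interrupt for callers
		}
		return countReachedZero;
	}

	// Called on the notifier's thread; only the event this monitor was created for is counted
	public void doNotify(Event event) {
		if (myNotification.equals(event.getOccurrence())) {
			// ConcurrentHashMap makes this put thread-safe without extra locking
			NOTIFICATIONS_LOGGED.put(event.getOccurrence(), event.getTime());
			myLatch.countDown();
		}
	}
}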

The LatchMonitor uses a CountDownLatch, giving it the capacity to be notified of multiple events; for example, if 3 tasks are running, 3 notifications of “task finished” should be received. It uses a static map so that all instances of LatchMonitor can log their events of interest and the time at which they were notified (a ConcurrentHashMap, because multiple separate threads call doNotify). Since the map is keyed by event name, repeated notifications of the same event keep only the latest time. Thus, if the event sequence matters, NOTIFICATIONS_LOGGED can be used to make assertions about the order of events.

Within the test you can use LatchMonitors to wait and listen for events of interest: in our case, when the lifecycle status has been set to “stopping”, when tasks have finished, and when the lifecycle status has been set to “stopped”. Once these events have occurred, the test can continue and make assertions about their timings, i.e., lifecycle stopping time < tasks finished time < lifecycle stopped time.
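
The ordering assertion can be wrapped in a small helper. The following is a sketch only: SequenceCheck and inOrder are hypothetical names, and the helper accepts equal timestamps, on the assumption that a coarse clock may give two consecutive events the same time.

```java
import java.util.Map;

// Hypothetical helper: checks that logged events happened in the expected order.
class SequenceCheck {
    // Returns true if every named event was logged and the timestamps never decrease.
    static boolean inOrder(Map<String, Long> log, String... occurrences) {
        long previous = Long.MIN_VALUE;
        for (String occurrence : occurrences) {
            Long time = log.get(occurrence);
            if (time == null || time < previous) {
                return false; // event missing, or it happened before its predecessor
            }
            previous = time;
        }
        return true;
    }
}
```

In a test this might read assertTrue(SequenceCheck.inOrder(LatchMonitor.NOTIFICATIONS_LOGGED, "stopping", "task finished", "stopped")), where the three event names are made up for the example.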

The LatchMonitor class proved useful; however, it did sometimes have the snag of introducing race conditions in the test. For example:

  1. Test instructs the process to execute some tasks
  2. Test sets a LatchMonitor waiting for task start events
  3. Test sends the stop command to the process
  4. Test sets a LatchMonitor waiting for the lifecycle status to be stopping
  5. Test sets a LatchMonitor to wait for the tasks to finish and the lifecycle status to be set to stopped

Between 1. and 2. the tasks could already have started before the LatchMonitor is set to wait, so the LatchMonitor is waiting for an event that has already occurred. The same thing can happen between 3. and 4., and between 3. and 5.
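
It may help to pin down where this race comes from. A CountDownLatch on its own does not lose a countDown() that happens before await() is called, so the scenario above presumably arises when the monitor is registered with the notifier only at the point it is set waiting (an assumption about the setup, not something the code above shows). The sketch below, with hypothetical names, demonstrates just the latch behaviour:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: shows how await() behaves relative to countDown().
class LatchOrderingDemo {
    // The "event" happens first, then the test waits: await() returns true immediately.
    static boolean awaitAfterCountDown() throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(1);
        latch.countDown();
        return latch.await(100, TimeUnit.MILLISECONDS);
    }

    // Nobody ever counts down (e.g. the listener missed the event): await() times out.
    static boolean awaitWithoutCountDown() throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(1);
        return latch.await(100, TimeUnit.MILLISECONDS);
    }
}
```

The first method returns true and the second returns false, which is the difference between a monitor that heard the event early and one that never heard it at all.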

What I then did was to have the LatchMonitor implement Runnable and set the LatchMonitors running and waiting in separate threads (FutureTasks) right at the beginning of the test. So now the LatchMonitor is given a run method:

public void run() {
	setToWaitOnNotifier();
}

The LatchMonitor is set running and waiting at the beginning of the test before step 1.:

FutureTask<?> futureStart = new FutureTask<Object>((Runnable) taskStartLatchMonitor, null);
Executor executor = Executors.newFixedThreadPool(1);
executor.execute(futureStart);

At step 2. the Test could check:

boolean tasksNotStarted = true;
while (tasksNotStarted) {
	tasksNotStarted = !futureStart.isDone();
}

The test will wait at step 2. for the event or, if the event has already occurred, continue straight away.
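
The loop above spins on isDone() and has no upper bound. One alternative is to let the FutureTask do the blocking through get with a timeout. This is a sketch rather than what the original tests did: BlockingWaitDemo and waitFor are hypothetical names, and a plain Runnable waiting on a latch stands in for the LatchMonitor.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BlockingWaitDemo {
    // Returns true if the monitor's run() completed within the timeout.
    static boolean waitFor(FutureTask<?> future, long timeoutMillis) {
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS); // blocks without a busy loop
            return true;
        } catch (TimeoutException e) {
            return false; // the event never arrived
        } catch (ExecutionException e) {
            throw new IllegalStateException("monitor failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    static boolean demo() {
        CountDownLatch taskStarted = new CountDownLatch(1);
        // Stand-in for a LatchMonitor: run() returns once the latch reaches zero.
        Runnable monitor = () -> {
            try {
                taskStarted.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        FutureTask<?> futureStart = new FutureTask<Object>(monitor, null);
        ExecutorService executor = Executors.newFixedThreadPool(1);
        try {
            executor.execute(futureStart);
            taskStarted.countDown(); // simulate the "task started" notification
            return waitFor(futureStart, 5000);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

The timeout means a missing event fails the test after a bounded wait instead of hanging it.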

This of course makes the test complicated.

Meanwhile, someone on the team suggested that I take a look at the Awaitility package, a DSL for testing asynchronous code. Rather than blocking, it polls for events of interest. I played around with rewriting the tests that I’d written with the LatchMonitor approach to use Awaitility. I think Awaitility is great: I found it easy to use and thought it made the code less complex. However, it uses polling, which could add time to the tests, whereas with the LatchMonitor approach the tests block and continue as soon as there has been a notification of the event.

Having looked at the Awaitility code, I used it as inspiration for a pattern I’ve started to use in both multithreaded unit tests and API tests. For example, here is a Listener:

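class TestListener implements Listener {
	// volatile: written by the notifier thread, read by the polling test thread
	private volatile boolean beenNotified = false;
	private final String myNotification;

	public TestListener(String notification) {
		myNotification = notification;
	}

	// Called on the notifier's thread; records that the event of interest has happened
	public void doNotify(Event event) {
		if (myNotification.equals(event.getOccurrence())) {
			beenNotified = true;
		}
	}

	public boolean haveBeenNotified() {
		return beenNotified;
	}
}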

A test then uses the listener as follows:

  1. Test sets the Listener to listen out for the event of interest
  2. Test sets off something which will end up with the event of interest taking place
  3. Test waits:
    long waitTime = 0; long waitTimeout = 5000; long pollingTime = 100L;
    while (!listener.haveBeenNotified() && waitTime <= waitTimeout) {
    	Thread.sleep(pollingTime);
    	waitTime += pollingTime;
    }
    assertTrue(listener.haveBeenNotified());
  4. Test continues with assertions

This method uses polling rather than blocking, but has the advantage of running in the same thread as the test (thus making the test less complex than the LatchMonitor approach). It also has the advantage of not introducing race conditions: if the event of interest has already taken place, the test falls through the while loop straight to the assertion. The waitTimeout, of course, stops the test polling indefinitely if the event never takes place. If you have Thread.sleeps in your tests which are causing intermittent failures, I’ve found this pattern helpful in making such tests deterministic (though of course it makes the tests more complicated and verbose).
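
To cut down on the verbosity, the polling loop can be pulled into a reusable helper. The following is a minimal sketch with hypothetical names (Poller, waitUntil); it measures elapsed time against a deadline rather than summing the sleep intervals, on the assumption that Thread.sleep may overrun.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: polls a condition on the calling (test) thread.
class Poller {
    // Returns true as soon as the condition holds, or false once the timeout has elapsed.
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: the event never took place
            }
            Thread.sleep(pollMillis);
        }
        return true; // falls straight through if the event has already happened
    }
}
```

Step 3 of the test then shrinks to a single line such as assertTrue(Poller.waitUntil(listener::haveBeenNotified, 5000, 100)).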

One book I’ve recently acquired, and which I wish I’d had at the beginning of the project, is Growing Object-Oriented Software, Guided by Tests by Steve Freeman and Nat Pryce. I can recommend chapters 26 and 27 for anyone who has to write tests for asynchronous code.
