/**
 * Executes the task the given number of times in each of the given number of threads.
 *
 * @return All the unchecked exceptions thrown by the task during execution, or
 *         an empty collection; never null.
 */
public Collection<Throwable> run(final Runnable task, int nThreads, final int iterationsPerThread) throws BrokenBarrierException, InterruptedException {
    // each thread will await upon the start barrier, then run the task, then await at the finish barrier (where the main thread will be waiting)
    final CyclicBarrier startBarrier = new CyclicBarrier(nThreads);
    final CyclicBarrier finishBarrier = new CyclicBarrier(nThreads + 1); // +1 for the main thread
    // exceptions raised by threads will be logged and returned
    final Collection<Throwable> exceptions = new ConcurrentLinkedQueue<Throwable>();
    for (int i = 0; i < nThreads; i++) {
        new Thread("Thread " + i) {
            @Override
            public void run() {
                awaitOnBarrier(startBarrier, 5);
                try {
                    for (int j = 0; j < iterationsPerThread; j++) {
                        task.run();
                    }
                }
                catch (Throwable e) {
                    e.printStackTrace();
                    exceptions.add(e);
                }
                finally {
                    awaitOnBarrier(finishBarrier, 60);
                }
            }
        }.start();
    }
    finishBarrier.await();
    return exceptions;
}

/** Calls barrier.await and rethrows its checked exceptions wrapped in a RuntimeException */
private void awaitOnBarrier(CyclicBarrier barrier, int timeoutSeconds) {
    try {
        barrier.await(timeoutSeconds, TimeUnit.SECONDS);
    }
    catch (InterruptedException e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    }
    catch (BrokenBarrierException e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    }
    catch (TimeoutException e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    }
}

The general answer for actor-based systems is: wait on a blocking message receive from the actor task.
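As a rough illustration of the blocking-receive idea, here is a minimal sketch in plain Java. It assumes a worker thread standing in for the actor and a `BlockingQueue` standing in for its reply mailbox; the class and method names (`MailboxExample`, `startSquareActor`, `awaitReply`) are made up for this example and do not come from any actor library.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: a thread plays the actor, a queue plays the mailbox. */
public class MailboxExample {

    /** Starts the "actor", which posts its result as a message when it is done. */
    public static BlockingQueue<Integer> startSquareActor(final int input) {
        final BlockingQueue<Integer> replies = new LinkedBlockingQueue<Integer>();
        new Thread("square-actor") {
            @Override
            public void run() {
                replies.add(input * input); // the reply message
            }
        }.start();
        return replies;
    }

    /** Blocks on the mailbox until a reply arrives; fails if the timeout elapses first. */
    public static int awaitReply(BlockingQueue<Integer> replies, long timeoutSeconds) {
        try {
            Integer reply = replies.poll(timeoutSeconds, TimeUnit.SECONDS);
            if (reply == null) {
                throw new IllegalStateException("no reply within " + timeoutSeconds + "s");
            }
            return reply;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            throw new IllegalStateException("interrupted while waiting for reply", e);
        }
    }

    public static void main(String[] args) {
        int result = awaitReply(startSquareActor(7), 5);
        System.out.println(result); // prints 49
    }
}
```

In a test, the assertion would simply follow the `awaitReply` call; the timeout keeps a lost message from hanging the test run forever.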
The general answer for simple shared-mutable-state threaded systems is: write a complex combination of condition, lock, and thread (which will probably be subtly wrong the first time) to block your thread and wait on the result triggering the condition.
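For what the condition-and-lock approach can look like, here is a small sketch using `ReentrantLock` and `Condition` from `java.util.concurrent.locks`. The `ResultHolder` class and its `publish` and `awaitResult` methods are hypothetical names invented for this example. The `while` loop is one of the parts that is easy to get wrong: a condition wait may return spuriously, so the state has to be re-checked each time.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical one-shot result holder guarded by a lock and a condition. */
public class ResultHolder {
    private final Lock lock = new ReentrantLock();
    private final Condition ready = lock.newCondition();
    private Integer result; // null until published; only accessed while holding lock

    /** Called by the worker thread to publish the result and wake any waiters. */
    public void publish(int value) {
        lock.lock();
        try {
            result = value;
            ready.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Blocks until a result is published; throws if the timeout elapses first. */
    public int awaitResult(long timeoutMillis) {
        lock.lock();
        try {
            long remainingNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            // Re-check the state in a loop, because the wait can wake up spuriously.
            while (result == null) {
                if (remainingNanos <= 0) {
                    throw new IllegalStateException("timed out waiting for result");
                }
                remainingNanos = ready.awaitNanos(remainingNanos);
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        final ResultHolder holder = new ResultHolder();
        new Thread("worker") {
            @Override
            public void run() {
                holder.publish(6 * 7);
            }
        }.start();
        System.out.println(holder.awaitResult(5000)); // prints 42
    }
}
```

For a one-shot result like this, a `CountDownLatch` or a `Future` would usually be less error-prone; the hand-rolled version is shown only because it matches the approach described above.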
It was about 6 AM when I finally tracked down the problem, after a night of repeatedly building debug ROMs with different instrumentation, trying to get one that both gave me meaningful debug information and still suffered from the defect. It turned out that a circular pseudo-dependency involving three different subsystems and five different components was deadlocking, and all I'd done was change the timing by a fraction of a second to trigger the deadlock.
Asynchronicity is hard to test, and you'll always, always get it wrong. Don't sweat it too much. Unless you're building flight control software, just make sure the common use cases work 100% of the time, and if broken edge cases turn up, fix them as and when they appear.