Thanks for posting this. I'm starting to get a feel for when Spark is usable: you need an underlying indexed data store that lets you fetch small subsets of your data into RDDs (or your data has to be tiny to begin with). We've been trying to use Spark on inputs which, while smaller than our cluster's available memory, are probably too big for Spark to handle (> 1TB).
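To illustrate the "fetch small subsets" pattern: with the DataStax spark-cassandra-connector you can push the selection down to Cassandra so only a slice of the table ever lands in the RDD. Rough sketch (keyspace/table/host names here are made up, not from any real deployment):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._ // adds cassandraTable() to SparkContext

val conf = new SparkConf()
  .setAppName("subset-fetch")
  .set("spark.cassandra.connection.host", "127.0.0.1") // hypothetical host
val sc = new SparkContext(conf)

// The .where clause is pushed down to Cassandra, so only the matching
// partition(s) get materialized into the RDD -- not the whole table.
val daySlice = sc.cassandraTable("events_ks", "events") // hypothetical keyspace/table
  .where("day = ?", "2014-09-01")

println(daySlice.count())
```

That only works because Cassandra's partition key acts as the index; without something like that on the storage side you end up scanning everything into memory, which is where we hit the wall.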
These guys look to be doing some nice work integrating Cassandra and Spark: http://blog.tuplejump.com/
They've piggybacked on Cassandra's clustering, using a Java agent to run the Spark masters. Doesn't seem to be a release available yet, though.