The flexible storage backends, clustering, and the open source license are all very enticing. I've been looking for a graph database for an upcoming project and have yet to find something that really matches what we're looking for.
Matthias Broecheler (https://twitter.com/MBroecheler) is the original creator of Titan, and he is incredibly bright. When he finished his PhD, he linked up with Marko Rodriguez (https://twitter.com/twarko), the creator of Gremlin (https://github.com/tinkerpop/gremlin/wiki), and they formed Aurelius to focus on building the big-data graph ecosystem (like Cloudera for graphs -- in fact, the Aurelius Cluster integrates with Hadoop and Cloudera).
There are other distributed graph databases, but most of these are batch processing engines like Pregel. However, Titan is a real-time, transactional graph database backed by either Cassandra or HBase, and it provides fast, horizontally scalable write performance (10,000+ tps) that hasn't been available in an open-source graph database.
See http://thinkaurelius.com/2012/08/06/titan-provides-real-time...
Combining this with Faunus for batch processing and the Aurelius Graph Cluster's integration with the Hadoop ecosystem makes for an incredibly powerful platform for building applications such as social startups.
See Matthias' C* 2012 presentation: Titan - Big Graph Data With Cassandra: http://www.youtube.com/watch?v=ZkAYA4Kd8JE
The Titan user group is here: https://groups.google.com/forum/?fromgroups#!forum/aureliusg...
The Gremlin user group is here: https://groups.google.com/forum/?fromgroups#!forum/gremlin-u...
https://github.com/StartTheShift/thunderdome
There are a few caveats that come with working with distributed databases, so it's important to know what you're getting into. Neo4j might be easier out of the box (since more people are using it), but if you want a robust solution that'll work for 50 or 50,000 users, Titan feels like the way to go.
Right now I'm mostly messing with TinkerGraph (an in-memory graph database that's part of the TinkerPop utilities that the Titan guys make).
http://thinkaurelius.com/2012/10/25/a-solution-to-the-supernode-problem/
Next, if you decide to scale horizontally, you can simply change the configuration to storage.backend=cassandra and that's that (of course, you need to do a bulk data transfer from BerkeleyDB to Cassandra).
I'm thinking of this kind of pattern:
If there are nodes n and n' such that
- there's an edge from n to n' and
- n has a label XY
then add label Y to n'
So what I'd want to do is match basic patterns and then add nodes, edges, and labels.
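To make the rule above concrete, here's a rough sketch in plain Python. This is not Titan or Gremlin code: the edge list, the dict of label sets, and the name propagate_label are all made up for illustration, and a real graph database would express the match as a traversal instead.

```python
def propagate_label(edges, labels, source_label="XY", new_label="Y"):
    """For every edge (n, n2) where n carries source_label, add new_label to n2.

    edges  -- iterable of (source, target) pairs (hypothetical representation)
    labels -- dict mapping node -> set of labels; modified in place
    """
    # Match first, then mutate, so labels added in this pass
    # cannot trigger further matches within the same pass.
    targets = [dst for src, dst in edges if source_label in labels.get(src, set())]
    for dst in targets:
        labels.setdefault(dst, set()).add(new_label)
    return labels


# Hypothetical toy graph: a -> b, b -> c, d -> c
edges = [("a", "b"), ("b", "c"), ("d", "c")]
labels = {"a": {"XY"}, "b": set(), "d": {"Z"}}
propagate_label(edges, labels)
```

Only b picks up the new label here, since a is the only node carrying XY. Splitting the match from the update is a deliberate choice: it keeps one application of the rule from feeding on its own output, so you can decide separately whether to rerun it until nothing changes.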
https://github.com/clojurewerkz/titanium/blob/master/src/clo...