- SQL/PGQ, a property graph query extension to SQL, is planned to be released next year as part of SQL:2021.
- GQL, a standalone graph query language, will follow later.
While it is a lot of work to design these languages, both graph database vendors (e.g. Neo4j, TigerGraph) and traditional RDBMS companies (e.g. Oracle [2], PostgreSQL/2ndQuadrant [3]) seem serious about them. And with a well-defined query language, it should be possible to build a SQL/PGQ engine in (or on top of) SQLite as well.
[1] https://www.linkedin.com/pulse/sql-now-gql-alastair-green/
[2] http://wiki.ldbcouncil.org/pages/viewpage.action?pageId=1062...
[3] https://www.linkedin.com/pulse/postgresql-oracle-graph-query...
Gremlin's main focus is defining traversal operations on property graphs. While it supports pattern matching [1], IMHO its syntax is not as clean as Cypher's. Gremlin queries are also difficult to optimize: while it is possible to define traversal rewrite rules, they are more involved than relational optimization rules. Most open-source Gremlin implementations focus on distributed setups (e.g. a typical deployment of Titan/JanusGraph runs on top of Cassandra), which also has implications for single-machine performance. That certainly did not help the adoption of Gremlin, though it is not necessarily a problem of the query language itself. Overall, Gremlin is great for workloads where complex single-source traversal operations do the bulk of the work, but it's less well-suited to global pattern matching queries such as the ones in the LDBC Social Network Benchmark's BI workload [2].
SPARQL focuses on the graph problems of the "semantic web" domain, which include not only pattern matching but also semantic reasoning/inferencing. One can use it for pattern matching queries, but with the following caveats:
- Its data model is based on triples, so if one wants to return a node and its attributes (properties), one has to specify each of these attributes explicitly.
- On the execution side, returning these attributes might necessitate executing a number of self-join operations.
- Many SPARQL implementations also have performance limitations due to the extra complexity introduced by self-joins, lack of intra-query parallelism, etc.
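To make the self-join caveat concrete, here is a minimal sketch, assuming a naive single-table triple store in SQLite. The `triples` table, its `s`/`p`/`o` columns, and the sample data are made up for illustration; real SPARQL engines use more sophisticated storage and indexing. Returning three attributes of a node takes one alias of the table per attribute:

```python
import sqlite3

def fetch_person_rows(conn):
    # One alias of the triples table per requested attribute:
    # three attributes means a three-way self-join on the subject column.
    query = """
        SELECT n.s, n.o AS name, a.o AS age, c.o AS city
        FROM triples AS n
        JOIN triples AS a ON a.s = n.s AND a.p = 'age'
        JOIN triples AS c ON c.s = n.s AND c.p = 'city'
        WHERE n.p = 'name'
        ORDER BY n.s
    """
    return conn.execute(query).fetchall()

# Hypothetical toy data: each attribute of a person is its own triple.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("p1", "name", "Alice"), ("p1", "age", "30"), ("p1", "city", "Budapest"),
    ("p2", "name", "Bob"), ("p2", "age", "25"), ("p2", "city", "Delft"),
])
rows = fetch_person_rows(conn)
print(rows)
```

Every additional attribute adds another join on the subject column, which is roughly where the extra execution cost comes from.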
The "RDF* and SPARQL* approach" is an initiative to mitigate the self-join problem by introducing nested triples into the data model. It's currently being worked on by a W3C community group [3]. SPARQL also has "property paths", which allow regular path queries, i.e. traversals where the node/edge labels conform to some regular expression (the "property" in "property paths" has nothing to do with "property graphs").
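As a rough illustration of what a regular path query computes, here is a minimal sketch that searches the product of the graph and a hand-built automaton. The function name, the edge list, and the automaton encoding are all assumptions for this example, not how any particular SPARQL engine implements property paths. The automaton below encodes the path expression `knows+/worksAt` (one or more `knows` edges followed by one `worksAt` edge):

```python
from collections import deque

def regular_path_targets(edges, start, nfa, nfa_start, nfa_accept):
    """Return nodes reachable from `start` along a path whose sequence of
    edge labels is accepted by the automaton.

    edges: iterable of (src, label, dst) tuples
    nfa:   dict mapping (state, label) -> set of next states
    """
    adjacency = {}
    for src, label, dst in edges:
        adjacency.setdefault(src, []).append((label, dst))

    # Breadth-first search over (graph node, automaton state) pairs.
    seen = {(start, nfa_start)}
    queue = deque(seen)
    results = set()
    while queue:
        node, state = queue.popleft()
        if state in nfa_accept:
            results.add(node)
        for label, dst in adjacency.get(node, []):
            for next_state in nfa.get((state, label), ()):
                pair = (dst, next_state)
                if pair not in seen:
                    seen.add(pair)
                    queue.append(pair)
    return results

# Hypothetical toy graph.
edges = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("a", "worksAt", "initech"),
    ("b", "worksAt", "globex"),
    ("c", "worksAt", "acme"),
]
# Automaton for knows+/worksAt: state 0 is the start, state 2 is accepting.
nfa = {
    (0, "knows"): {1},
    (1, "knows"): {1},
    (1, "worksAt"): {2},
}
targets = regular_path_targets(edges, "a", nfa, 0, {2})
print(sorted(targets))
```

Tracking visited (node, state) pairs rather than visited nodes keeps the search finite on cyclic graphs while still allowing a node to be revisited in a different automaton state.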
SQL/PGQ and GQL target the property graph data model and support an ASCII-art-like syntax for pattern matching queries (inspired by Cypher). They also offer some graph traversal operations (including shortest path and regular path queries). Additionally, GQL supports returning graphs, so its queries can be composed.
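For a feel of how an ASCII-art pattern relates to plain SQL, here is a minimal sketch, assuming a toy property graph stored in two SQLite tables. The `person` and `knows` tables and their contents are invented for illustration. The pattern in the comment is written in Cypher syntax, not in the final SQL/PGQ or GQL syntax, and the join is only one possible relational translation of it:

```python
import sqlite3

graph_db = sqlite3.connect(":memory:")
graph_db.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE knows (src INTEGER, dst INTEGER);
    INSERT INTO person VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO knows VALUES (1, 2), (2, 3);
""")

# Cypher-style pattern that the join below emulates:
#   MATCH (a:Person)-[:KNOWS]->(b:Person)-[:KNOWS]->(c:Person)
#   RETURN a.name, c.name
two_hop = graph_db.execute("""
    SELECT a.name, c.name
    FROM person AS a
    JOIN knows AS k1 ON k1.src = a.id
    JOIN person AS b ON b.id = k1.dst
    JOIN knows AS k2 ON k2.src = b.id
    JOIN person AS c ON c.id = k2.dst
""").fetchall()
print(two_hop)
```

Each node and edge in the pattern becomes one table alias, which hints at why a graph pattern syntax on top of a relational engine is plausible.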
[1] https://en.wikipedia.org/wiki/Gremlin_(query_language)#Decla...
[2] https://ldbc.github.io/ldbc_snb_docs/workload-bi-reads.pdf
[3] https://blog.liu.se/olafhartig/2019/01/10/position-statement...