When we started looking at Bitcoin we thought that we would have to use such sophisticated algorithms to uncover interesting structure, but it turned out to be much easier than we expected to find structure and meaning, so we never got too sophisticated.
There's a very active field of research on these cluster finding algorithms - the term to search with is 'community finding algorithms' - http://en.wikipedia.org/wiki/Community_structure has a reasonable introduction.
>And wouldn't finding such a cluster be NP-complete?
That isn't a problem, in practice.
Finding the maximum clique in a graph is an NP complete problem, which you might be thinking of - but 'clique' is a stricter definition than most people would use for 'community', (in that a clique requires all nodes to be connected to each other), and even then very good heuristic clique finding algorithms exist in practice (E.g. the Bron Kerbosch algorithm).
Some community finding algorithms have objective functions which are NP complete to maximise, but again, often fast heuristics are available.
Consequently, there are many good community finding algorithms out there that will quickly find clusters on networks the size of the Bitcoin graph - we ran some, but we didn't do much with their output.
Its difficult to dig into such problems without a ground truth, and again, we could uncover a lot of meaning using simpler techniques.
I think the benefit of this cycling, though, is in the size, not the obscurity. That is, if 60% of the users (and 99% of the addresses) are cycling money to obscure connection to a person, then either:
- You have to accept that "Joe spent a bitcoin that was once in a crime" is insufficient evidence Joe had any connection whatsoever to it, since "most users have touched that bitcoin too"; or
- You have to make it a crime to be a part of such a cycler altogether, which would effectively require an outright ban on Bitcoin.
These conclusions follow no matter how much structure to the trades you can detect.
I don't know if it's NP complete. But my guess is no. I can immediately think of algorithms (maybe crude ones, I am no statistician) that could be used to attack it, and they require a lot of iteration, not an explosion of recursion.