I wonder what other problems we can solve with the same toolset.
It is correct to call such analyses tools. They cannot give you answers that you can rely on; at best, they give you hints about other places to look. One problem such an analysis can run into is that when the signal (the patterns and templates it was looking for) disappears, the tool becomes nearly worthless, at which point you may want to reconsider how well suited to the job it really was.
In the case described in the article, it seemed like an appropriate use of textual analysis.
I'm of the opinion that we have only scratched the surface of what can be predicted by analyzing real-time data from social networks, user groups, and message board communities.
[1] https://blog.twitter.com/2015/usgs-twitter-data-earthquake-d...
It sounds like she's busting low-class pimps and hoping that a few of them are human traffickers.
""I would literally just spend hours on these websites, looking at ads, getting a sense for what was the norm," she said. She began to pick up the nuances of every post, understand how a template was made, and get a feel for the different voices behind these ads."
I don't know how this fits with her implementation, but I was reminded of an old article by Paul Graham, "A Plan for Spam" (http://www.paulgraham.com/spam.html), where he talks about automating spam detection with Bayesian filtering.
"I think it's possible to stop spam, and that content-based filters are the way to do it. The Achilles heel of the spammers is their message. They can circumvent any other barrier you set up. They have so far, at least. But they have to deliver their message, whatever it is. If we can write software that recognizes their messages, there is no way they can get around that."
Substitute the sex ad for the spam message, and we're talking about the same thing. It would be an interesting exercise to try Bayesian filtering on sex ads, or any other kind of message, to see where it leads.
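To make the idea concrete, here is a minimal sketch of a content-based Bayesian filter. It is a plain naive Bayes classifier with add-one smoothing, which is a simplification and not the exact scoring formula from Graham's essay, and there is no indication that it matches the approach in the article. The class name `NaiveBayesFilter`, the labels `flagged` and `ok`, and the toy training messages are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesFilter:
    """Hypothetical two-class naive Bayes text filter (illustration only)."""

    LABELS = ("flagged", "ok")

    def __init__(self):
        self.token_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        """Record one hand-labeled message."""
        self.doc_counts[label] += 1
        self.token_counts[label].update(tokenize(text))

    def flagged_probability(self, text):
        """Return the estimated probability that the text is 'flagged'."""
        vocab = set()
        for counts in self.token_counts.values():
            vocab.update(counts)
        total_docs = sum(self.doc_counts.values())

        log_scores = {}
        for label in self.LABELS:
            counts = self.token_counts[label]
            total_tokens = sum(counts.values())
            # Smoothed class prior, so an untrained class never yields log(0).
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for token in tokenize(text):
                # Add-one (Laplace) smoothing for tokens unseen in this class.
                score += math.log((counts[token] + 1) / (total_tokens + len(vocab) + 1))
            log_scores[label] = score

        # Convert log scores back to a normalized probability.
        top = max(log_scores.values())
        weights = {label: math.exp(s - top) for label, s in log_scores.items()}
        return weights["flagged"] / sum(weights.values())


# Toy, made-up training data; a real filter needs many labeled examples.
demo = NaiveBayesFilter()
demo.train("new in town call now limited time", "flagged")
demo.train("call now new arrival limited time only", "flagged")
demo.train("selling used bike good condition", "ok")
demo.train("apartment for rent near downtown good condition", "ok")

print(demo.flagged_probability("new in town call now"))
print(demo.flagged_probability("used bike for rent"))
```

The first message shares its vocabulary with the flagged examples and scores above 0.5, while the second scores below it. As with any such tool, the score is only a hint about where to look, and it stops being useful once the people writing the messages change their templates.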
There's software. Out there. Doing something. (and it's always watching)
It is possible that the "Research Grade" version of the program does use that data, but there is no evidence of that here.