And here's the not-too-hidden secret: the ML part is the fun part. It's a big reason we spend months creating banking.csv. Josh Willis did a very funny presentation at MLconf partly about this. It's like waiting in line at a theme park for an hour, and then paying someone to cut in line at the last minute and record the ride for you. https://www.youtube.com/watch?v=4Gwf5zsg4vI&feature=youtu.be...
All these scenarios are difficult to debug because it's "statistical debugging". There are no breakpoints to set or watch windows to look at. There is no stack trace and there are no exceptions. Any Joe can train a model given training data; it takes a fair bit of genius to debug these issues and push model performance to the next level. Unfortunately, all these new and old "frameworks" almost completely ignore this debugging part. I think the first framework that has great debugging tools will revolutionize ML like Borland revolutionized programming with its visual IDEs.
I mean, it's time consuming and frustrating, but it's also the essence of ML work and the place where I get to apply creativity and gain insight.
I guess this could be useful for some people, but it seems rudimentary to me. If I'm reading their FAQ right, they're just fitting a logistic regression to everything. I'm hoping this is just a starting point. Also, not being able to export the actual model seems like a huge dealbreaker to me.
Azure ML also supports R and Python custom code, which can be dropped directly into your workspace.
And this was even before Microsoft acquired Revolution Analytics. Amazon ML seems to be less flexible with regard to importing your own models:
Q: Can I export my models out of Amazon Machine Learning?
No.
Q: Can I import existing models into Amazon Machine Learning?
No.
http://blogs.microsoft.com/blog/2014/06/16/microsoft-azure-m...
(in all fairness, Amazon are better than many when it comes to not unexpectedly withdrawing products)
However, that's 5x cheaper than what BigML is offering (https://bigml.com/pricing/credits) for its ad hoc service, so I might be wrong.
"Q: What algorithm does Amazon Machine Learning use to generate models?
Amazon Machine Learning currently uses an industry-standard logistic regression algorithm to generate models."
But disappointingly:
"Q: Can I export my models out of Amazon Machine Learning?
No.
Q: Can I import existing models into Amazon Machine Learning?
No."
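For anyone wondering what "logistic regression" amounts to in practice, here's a minimal sketch in plain Python. To be clear, this is not Amazon's implementation (the FAQ gives no details beyond the name); the toy data, learning rate, and function names are all made up for illustration.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(X, y, lr=0.1, epochs=1000):
    """Fit weights and bias with plain stochastic gradient descent.

    lr and epochs are illustrative defaults, not tuned values.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss w.r.t. the score
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict_proba(w, b, x):
    """Probability that x belongs to the positive class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical toy data: one feature, label is 1 when the feature is large.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 0, 1, 1, 1]

w, b = train_logistic_regression(X, y)
print(predict_proba(w, b, [0.0]), predict_proba(w, b, [5.0]))
```

In this sketch the whole trained model is just the list `w` and the number `b`, which is part of why the no-export answer is disappointing.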
Note that they are doing classification and regression on iid feature vectors. Of course, ML is much larger than this setting, but this setting is generic enough that it has some applicability to lots of problems.
Also, vw (Vowpal Wabbit) is what I'd consider "industry standard."
Honestly, I'd see it the other way around. Small companies without a DS team might be drawn to this. I don't see how any company with a lick of sense would lock their prediction model into AWS. They very clearly won't let you export your model once the training is done.
This would be really nice to use at my startup, but it's cost-prohibitive even on a very large budget.
I am setting up Spark Streaming to handle model creation and updates for recommendations based on what a user interacts with. If I were to even attempt something similar with this AWS service, it's $10 for every 1 million predictions, which isn't sustainable (not including the costs to create and update the model).
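To put a rough number on that, here's a back-of-the-envelope sketch. The $10 per 1 million predictions rate is the figure quoted above, taken at face value; the 50 million predictions per day workload is purely hypothetical, and model creation and update costs are left out.

```python
# Price quoted in the comment above; treated as an assumption here.
PRICE_PER_MILLION_USD = 10.0

def monthly_prediction_cost(predictions_per_day, days=30):
    """Estimate the monthly prediction bill in USD (training costs excluded)."""
    total_predictions = predictions_per_day * days
    return total_predictions / 1_000_000 * PRICE_PER_MILLION_USD

# Hypothetical workload: 50 million predictions per day.
cost = monthly_prediction_cost(50_000_000)
print(f"${cost:,.2f} per month")  # prints "$15,000.00 per month"
```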
> but the audience is really limited beyond that in my opinion.
Definitely, largely as a result of cost. I would love to not have to worry about Spark in my infrastructure (it's another piece...) but at this price the AWS service is just too expensive.
It's nice to see tools for analyzing your data, as well as multi-class classification and some tunable parameters, but this doesn't seem to bring anything 'new' to the game.
All the hard parts (feature selection, noise, unlabeled data, etc.) are still up to the end user, which makes me wonder how many people will try this out and get poor results.
It would be nice to get an idea of what sort of model they are using on the backend or even having a choice of models.
It would be good if an open source tool like LibreOffice offered machine learning in its spreadsheet app. It would be a good feature to add, and then the competitors would have to add it to their software as well.
(if anyone has the direct link for the console, please share :)