RCN is much more data efficient than traditional Deep Neural Networks (opens in new tab)

[1] https://arxiv.org/pdf/1710.09829.pdf

70 page supplementary material: http://science.sciencemag.org/content/sci/suppl/2017/10/25/s...

Reference code: https://github.com/vicariousinc/science_rcn

mannigfaltig8y ago

I still find it incredibly hard to tell whether this is overblown hype or legit scientific progress. There is no indication whatsoever that this approach scales to deep feature hierarchies and that is likely what you need to compete on hard tasks like classification on ImageNet. Given the amount of money at play (several hundred millions of dollars), writing 70 pages, making code publishable is certainly an obvious way to get the most out of the hype.

plaidfuji8y ago

Haha yeah Science papers are about providing a high level explanation of what you did, in real words. Then you hit em with the 100 page supplemental that's got more detail than 3 papers' worth of research in other journals

bufo8y ago

Again: no one cares about CAPTCHA in the deep learning world compared to other more challenging benchmarks. I wouldn’t be surprised that many optimizations could be made with ANY kind of effort put into it. Still waiting for Vicarious to go beyond MNIST and text CPATCHA.

nl8y ago

This is trueish, but there is more to it than that.

It is true for sure that absolute performance on MNIST isn't the most interesting thing in the world.

But when introducing a new tool or technique being able to show competitive performance on MNIST is a good way to show that it isn't an entirely useless thing.

I'd note that recent Sabour, Frosst and Hinton paper[1] (where they finally got Hinton's capsules to work) spends most of the paper analyzing how it performs on MNIST, and only a short section on other datasets.

I assume I don't need to point out that Geoff Hinton does know a little about deep learning, and if he thinks submitting a NIPS paper on MNIST is acceptable in 2017 then I'm not going to argue too hard against it.

real-hacker8y ago

+1. I was thinking about the Hinton Capsule paper using MNIST.

chronice708y ago

And what about the other boys who know a thing or two about deep learning? I don't see any of these people submitting MNIST to NIPS in 2017: Yousha Bengio, Yann LeCun, Ian Goodfellow, Andrew Ng, Ross Girshick, Andrej Karpathy, Pedro Domingos, and the whole DeepMind crew.

So yes, submitting experiments on MNIST in 2017 should not be taken seriously.

nl8y ago

"boys"

Not sure what this was supposed to mean? Yes, I think Fei Fei Li's datasets are much better tests than MNIST if that is what you were getting at?

I don't see any of these people submitting MNIST to NIPS in 2017

None of them submitted things as entirely new and different as this, either.

Having said that, I think my point holds.

The completely awesome 2017 "Generalization in Deep Learning" paper[1] was co-authored by Bengio and uses MNIST - because everyone can follow it.

Yann LeCun was co-author on the 2017 "Adversarially Regularized Autoencoders for Generating Discrete Structures"[1.5], using MNIST

Ian Goodfellow Autoencoder NIPS paper[1] used MNIST as one of its 4 datasets. Yes, it was 2014, but when introducing a new technique using familiar datasets isn't a bad thing.

DeepMind's "Bayes by Backprop" (ICML15) used MNIST[2]

Another example: the (June 2017) John Langford (Vowpal Wabbit) et. al paper[3] on using Boosting to learn ResNet blocks used MNIST.

So yes, I agree there are much better datasets to compare performance on. But to prove something new works, MNIST is a useful dataset.

[0] https://arxiv.org/pdf/1710.05468.pdf

[1] http://papers.nips.cc/paper/5423-generative-adversarial-nets

[1.5] https://arxiv.org/pdf/1706.04223.pdf

[2] https://deepmind.com/research/publications/weight-uncertaint...

[3] https://arxiv.org/pdf/1706.04964.pdf

one of the central points of the blog post was that the problem of CAPTCHAs / general artificial intelligence is NOT solved until letters like these are recognised: https://www.vicarious.com/wp-content/uploads/2017/10/image20...

does your network solve/recognise those?

Filligree8y ago

Those are letters?

dx0348y ago

I had no issue recognising any of those immediately.

2 more replies

flor1s8y ago

I only skimmed over the article, but I think the title on HN does not reflect the claims the authors are making.

The title of the paper is: A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs

The title of the article is: Common Sense, Cortex, and CAPTCHA

That's nowhere near the sensationalist title at HN: RCN is much more data efficient than traditional Deep Neural Networks

Paper abstract highlights the model's data efficiency several times:

Learning from few examples and generalizing to dramatically different situations are capabilities of human visual intelligence that are yet to be matched by leading machine learning models. By drawing inspiration from systems neuroscience, we introduce a probabilistic generative model for vision in which message-passing based inference handles recognition, segmentation and reasoning in a unified way. The model demonstrates excellent generalization and occlusion-reasoning capabilities, and outperforms deep neural networks on a challenging scene text recognition benchmark while being 300-fold more data efficient. In addition, the model fundamentally breaks the defense of modern text-based CAPTCHAs by generatively segmenting characters without CAPTCHA-specific heuristics. Our model emphasizes aspects like data efficiency and compositionality that may be important in the path toward general artificial intelligence.

sherbondy8y ago

As far as I can tell, the code on GitHub (https://github.com/vicariousinc/science_rcn) only works for the MNIST dataset.

Unclear how to run on the CAPTCHA examples referenced in the paper, even though they did make the datasets for those examples available.

Bummer, a big part of what the paper mentions about being so great with this RCN model is being able to segment sequences of characters (of indeterminate length even!). Sad that I cannot easily verify this for myself!

fuelfive8y ago

We talked about releasing more comprehensive proof of concept code, but ultimately decided against it. While helpful for other researchers, offering anyone on the internet a ready-to-use arbitrary captcha breaker seemed like a net-negative for society.

visarga8y ago

Like people can't find captcha breakers already. Your research might push for the replacement of text captchas with some other test.

BucketSort8y ago

I'd love to read this, but the faint text on white background... good god. I went through the code looking to change the background so I could read it and found this:

body { text-rendering: optimizeLegibility; }

Groxx8y ago

Huh. Did they change it? I see a very thin font in the header and in bulleted lists, but the rest of the text on the page is black (literally #000000) and relatively bold compared to what I'm used to seeing online (could just be that it's slightly larger, which is also good! it's by no means big, just nice to see something not pointlessly tiny).

The header has the awful "ObjektivMk1-Thin" font mentioned elsewhere, but for me the body is a normal "Roboto","Helvetica Neue",Helvetica,Arial,sans-serif font-family.

spott8y ago

They did change it.

tertius8y ago

Came here to say this. Sent them a contact message hopefully they'll do some A/B testing and notice all the new attention they're getting.

cwilkes8y ago

If you are on an iPhone (maybe any Safari browser?) clicking on the top left reader button makes it legible.

alphydan8y ago

if you untick [font-family: "ObjektivMk1-Thin"; */] it does the job.

singhrac8y ago

Right? I just '* {font-family: Arial}'ed it and it made it a lot better.

sdenton48y ago

You're a datapoint in the upcoming Science paper, "RCN is Much Better at Reading Our Announcements Concerning RCN Than Actual Humans."

BucketSort8y ago

lol

creo8y ago

I wasn't sure why i cant focus on article until i've read this.

nightcracker8y ago

Featuring some of the worst typography I've seen on the internet. There clearly was an attempt, but just leaving font-face as default would've been more readable.

warent8y ago

I'll typically be the last to comment on a webpage typography (usually to a fault) but this site here actually gives me a headache to try and read.

cs7028y ago

This paper looks really interesting to me, although after quickly reading the introduction it's evident that I'm going to have to invest quite a bit of time and effort on the paper to grasp its key ideas. I come from more an encoding-decoding, deep/machine-learning background, as opposed to a probabilistic graphical modeling or PGM background, and my knowledge of neuroscience is minimal.

To date, my experience with "deep PGM models" (for lack of a better term) is limited to some tinkering with (a) variational autoencoders using ELBO maximization as the training objective, and to a much lesser extent (b) "bi-directional" GANs using a Jensen-Shannon divergence between two joint distributions as the training loss.

Has anyone here with a similar background to mine had a chance to read this paper? Any thoughts?

barboloOP8y ago

I’m reading over and over again since the last weekend. And I’m checking the code. And I’m still not understanding it.

real-hacker8y ago

It looks RCN sits between traditional machine learning (with manual feature selection) and 'modern' neural networks (CNN). The traditional methods are too rigid to capture the essential information, while the CNNs sometimes are too flexible to avoid overfitting. Different from CNNS, RCNs have a predetermined structure. Humans are not born a blank slate, we have a neural structure encoded in our genes, so we don't need millions of training samples to recognize objects. So maybe RCN is onto something.

I am curious how RCN performs on real-life images like ImageNet, and how do they perform against adversarial examples. If they can easily recognize adversarial examples, that would be very interesting...

dx0348y ago

> In 2013, we announced an early success of RCN: its ability to break text-based CAPTCHAs like those illustrated below (left column). With one model, we achieve an accuracy rate of 66.6% on reCAPTCHAs, 64.4% on BotDetect, 57.4% on Yahoo, and 57.1% on PayPal, all significantly above the 1% rate at which CAPTCHAs are considered ineffective (see [4] for more details). When we optimize a single model for a specific style, we can achieve up to 90% accuracy.

66% with reCaptcha and up to 90% when optimised is much higher than what I can achieve with my actual brain. Maybe I should consider using a neural network to answer those, it happens quite frequently that I need 2-3 rounds to get through reCaptcha.

nnx8y ago

Is RCN more of a CNN alternative most useful to image-related tasks or could also work well to other types of neural networks?

ps: thanks god for Reader mode on Safari

visarga8y ago

This is a paper that departs from the 'normal' AI routine and takes a very different approach. Is there another paper formally describing the RCN network? What goes inside the RCN cell? I find it more like a teaser than a revelation at this point.

dpandya8y ago

The details are provided in the supplementary material: http://science.sciencemag.org/content/sci/suppl/2017/10/25/s...

(mentioned by boltzmannbrain in one of the other comments)

_0w8t8y ago

I do not see a discussion in the paper regarding computational efficiency of RCN detection. The only hint about performance that I found is at the end of supplementary material where the authors state:

> Use of appearance during the forward pass: Surface appearance is now only used after the backward pass. This means that appearance information (including textures) is not being used during the forward pass to improve detection (whereas CNNs do). Propagating appearance bottom-up is a requisite for high performance on appearance-rich images.

I presume from this that in the current form RCN requires much more computations than CNN per detection, but I could be wrong.

stochastic_monk8y ago

If I'm not mistaken, a Deep Belief Net or Deep Belief Machine would also be a generative model with enormously greater data efficiency. Comparing against CNNs is a red herring: the advantage of requiring less data to develop a model is more a generative/discriminative issue than it is an "RCN vs everyone else" issue.

What I don't quite understand is why Deep Belief Nets seem to not be getting press these days. For example, see this paper from 2010: http://proceedings.mlr.press/v9/salakhutdinov10a.html.

https://gizmodo.com/a-new-ai-system-passed-a-visual-turing-t... / http://web.mit.edu/cocosci/Papers/Science-2015-Lake-1332-8.p...

gugagore8y ago

Here's another example of a generative model that improves data efficiency, in a similar-ish domain.

taneq8y ago

Recent discussion on Vicarious' CAPTCHA cracking: https://news.ycombinator.com/item?id=15564922

http://science.sciencemag.org/content/sci/early/2017/10/26/s...

The git 'reference implementation' is only for MNIST not for real captchas.

jostmey8y ago

I'll need to see this approach work well across many datasets before I am convinced, not just captchas and the MNIST

m3kw98y ago

How hard is it to get it to run using CoreML?

j / k navigate · click thread line to collapse

46 comments

dpandya8y ago

Obviously, the question then becomes: what happens when you have visual situations that violate or come close to violating the assumptions made?

I'm not familiar enough with the specifics of RCNs to be able to answer this; maybe someone else can. Very interesting paper and approach regardless.

joe_the_user8y ago

After six or seven click-throughs, I downloaded the PDF.

Also, the paper is published under the heading of an AI firm Fremont, ca rather than folks in a university, with the many authors listed by initial and last name...

PDF for the curious:

Edit: tracked down that apparently has some "real" math. Whether is even what the OP is doing remains to be seen.

https://staff.fnwi.uva.nl/t.e.j.mensink/zsl2016/zslpubs/lake...

[1] https://arxiv.org/pdf/1710.09829.pdf

70 page supplementary material: http://science.sciencemag.org/content/sci/suppl/2017/10/25/s...

Reference code: https://github.com/vicariousinc/science_rcn

mannigfaltig8y ago

plaidfuji8y ago

bufo8y ago

nl8y ago

This is trueish, but there is more to it than that.

It is true for sure that absolute performance on MNIST isn't the most interesting thing in the world.

But when introducing a new tool or technique being able to show competitive performance on MNIST is a good way to show that it isn't an entirely useless thing.

real-hacker8y ago

+1. I was thinking about the Hinton Capsule paper using MNIST.

chronice708y ago

So yes, submitting experiments on MNIST in 2017 should not be taken seriously.

nl8y ago

"boys"

Not sure what this was supposed to mean? Yes, I think Fei Fei Li's datasets are much better tests than MNIST if that is what you were getting at?

I don't see any of these people submitting MNIST to NIPS in 2017

None of them submitted things as entirely new and different as this, either.

Having said that, I think my point holds.

The completely awesome 2017 "Generalization in Deep Learning" paper[1] was co-authored by Bengio and uses MNIST - because everyone can follow it.

Yann LeCun was co-author on the 2017 "Adversarially Regularized Autoencoders for Generating Discrete Structures"[1.5], using MNIST

Ian Goodfellow Autoencoder NIPS paper[1] used MNIST as one of its 4 datasets. Yes, it was 2014, but when introducing a new technique using familiar datasets isn't a bad thing.

DeepMind's "Bayes by Backprop" (ICML15) used MNIST[2]

Another example: the (June 2017) John Langford (Vowpal Wabbit) et. al paper[3] on using Boosting to learn ResNet blocks used MNIST.

So yes, I agree there are much better datasets to compare performance on. But to prove something new works, MNIST is a useful dataset.

[0] https://arxiv.org/pdf/1710.05468.pdf

[1] http://papers.nips.cc/paper/5423-generative-adversarial-nets

[1.5] https://arxiv.org/pdf/1706.04223.pdf

[2] https://deepmind.com/research/publications/weight-uncertaint...

[3] https://arxiv.org/pdf/1706.04964.pdf

does your network solve/recognise those?

Filligree8y ago

Those are letters?

dx0348y ago

I had no issue recognising any of those immediately.

2 more replies

flor1s8y ago

I only skimmed over the article, but I think the title on HN does not reflect the claims the authors are making.

The title of the paper is: A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs

The title of the article is: Common Sense, Cortex, and CAPTCHA

That's nowhere near the sensationalist title at HN: RCN is much more data efficient than traditional Deep Neural Networks

Paper abstract highlights the model's data efficiency several times:

sherbondy8y ago

As far as I can tell, the code on GitHub (https://github.com/vicariousinc/science_rcn) only works for the MNIST dataset.

Unclear how to run on the CAPTCHA examples referenced in the paper, even though they did make the datasets for those examples available.

fuelfive8y ago

visarga8y ago

Like people can't find captcha breakers already. Your research might push for the replacement of text captchas with some other test.

BucketSort8y ago

I'd love to read this, but the faint text on white background... good god. I went through the code looking to change the background so I could read it and found this:

body { text-rendering: optimizeLegibility; }

Groxx8y ago

The header has the awful "ObjektivMk1-Thin" font mentioned elsewhere, but for me the body is a normal "Roboto","Helvetica Neue",Helvetica,Arial,sans-serif font-family.

spott8y ago

They did change it.

tertius8y ago

Came here to say this. Sent them a contact message hopefully they'll do some A/B testing and notice all the new attention they're getting.

cwilkes8y ago

If you are on an iPhone (maybe any Safari browser?) clicking on the top left reader button makes it legible.

alphydan8y ago

if you untick [font-family: "ObjektivMk1-Thin"; */] it does the job.

singhrac8y ago

Right? I just '* {font-family: Arial}'ed it and it made it a lot better.

sdenton48y ago

You're a datapoint in the upcoming Science paper, "RCN is Much Better at Reading Our Announcements Concerning RCN Than Actual Humans."

BucketSort8y ago

lol

creo8y ago

I wasn't sure why i cant focus on article until i've read this.

nightcracker8y ago

Featuring some of the worst typography I've seen on the internet. There clearly was an attempt, but just leaving font-face as default would've been more readable.

warent8y ago

I'll typically be the last to comment on a webpage typography (usually to a fault) but this site here actually gives me a headache to try and read.

cs7028y ago

Has anyone here with a similar background to mine had a chance to read this paper? Any thoughts?

barboloOP8y ago

I’m reading over and over again since the last weekend. And I’m checking the code. And I’m still not understanding it.

real-hacker8y ago

dx0348y ago

nnx8y ago

Is RCN more of a CNN alternative most useful to image-related tasks or could also work well to other types of neural networks?

ps: thanks god for Reader mode on Safari

visarga8y ago

dpandya8y ago

The details are provided in the supplementary material: http://science.sciencemag.org/content/sci/suppl/2017/10/25/s...

(mentioned by boltzmannbrain in one of the other comments)

_0w8t8y ago

I presume from this that in the current form RCN requires much more computations than CNN per detection, but I could be wrong.

stochastic_monk8y ago

What I don't quite understand is why Deep Belief Nets seem to not be getting press these days. For example, see this paper from 2010: http://proceedings.mlr.press/v9/salakhutdinov10a.html.

https://gizmodo.com/a-new-ai-system-passed-a-visual-turing-t... / http://web.mit.edu/cocosci/Papers/Science-2015-Lake-1332-8.p...

gugagore8y ago

Here's another example of a generative model that improves data efficiency, in a similar-ish domain.

taneq8y ago

Recent discussion on Vicarious' CAPTCHA cracking: https://news.ycombinator.com/item?id=15564922