
nsthorat 8y ago

We're using SqueezeNet (https://github.com/DeepScale/SqueezeNet), which is similar to Inception (both are trained on the same ImageNet dataset) but much smaller, about 5MB instead of Inception's 100MB, and much quicker at inference.

The application runs each webcam frame through SqueezeNet, producing a 1000-dimensional logits vector per frame. These can be thought of as unnormalized scores (log-probabilities before the softmax) for each of ImageNet's 1000 classes.
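To make "unnormalized" concrete, here is a minimal sketch of how logits relate to probabilities through a softmax. The 4-element vector is a made-up stand-in for the 1000-dimensional SqueezeNet output, and `softmax` is an illustrative helper, not code from the demo.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    # Subtracting the max keeps exp() from overflowing; the result is unchanged.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy logits (4 classes instead of ImageNet's 1000).
logits = [2.0, 1.0, 0.1, -1.0]
probs = softmax(logits)
```

The ordering of the classes is the same before and after the softmax, so the raw logits vector still carries the information needed to compare frames.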

During the collection phase, we store these vectors for each class in browser memory. During inference, we pass the frame through SqueezeNet and run k-nearest neighbors to find the class whose stored logits vectors are most similar. KNN is quick because we vectorize it as one large matrix multiplication.
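A rough NumPy sketch of the vectorized KNN idea follows. It assumes cosine similarity as the "most similar" measure and a majority vote over the top k; the comment above does not say which similarity or voting rule the demo uses, and `knn_predict` plus the synthetic data are purely illustrative.

```python
import numpy as np

def knn_predict(train_logits, train_labels, query_logits, k=3):
    """Return the majority label among the k stored vectors most similar to the query.

    train_logits: (N, D) array, one stored logits vector per collected frame.
    train_labels: (N,) array of non-negative integer class ids.
    query_logits: (D,) logits vector for the current frame.
    """
    # Assumption: cosine similarity. Normalizing rows to unit length makes
    # a plain dot product equal to the cosine of the angle between vectors.
    train = train_logits / np.linalg.norm(train_logits, axis=1, keepdims=True)
    query = query_logits / np.linalg.norm(query_logits)
    # One matrix-vector product scores the query against every stored example.
    sims = train @ query  # shape (N,)
    top_k = np.argsort(-sims)[:k]  # indices of the k highest similarities
    votes = np.bincount(train_labels[top_k])
    return int(np.argmax(votes))

# Synthetic stand-in for collected frames: two classes clustered around
# two random 1000-dimensional centers.
rng = np.random.default_rng(0)
D = 1000
center_a = rng.normal(size=D)
center_b = rng.normal(size=D)
train_logits = np.vstack([
    center_a + 0.1 * rng.normal(size=(5, D)),
    center_b + 0.1 * rng.normal(size=(5, D)),
])
train_labels = np.array([0] * 5 + [1] * 5)

query = center_b + 0.1 * rng.normal(size=D)
predicted = knn_predict(train_logits, train_labels, query, k=3)
```

With a batch of query frames stacked into a matrix, the same `train @ query` step becomes a single matrix-matrix product.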

I'll go deeper in a blog post soon :)


eggie5 8y ago

So you're doing nearest neighbour search on the image features from the CNN. This is alluded to in Figure 4 of the DeCaf paper: https://twitter.com/eggie5/status/907120374575505408

eggie5 8y ago

The AlexNet paper, not the DeCaf paper!

amelius 8y ago

Interesting!

I'm curious why you've used a different classification algorithm on top of a neural network. I would expect that a neural network on top of a pretrained network could give similar results, with the benefit of simpler code. Is performance the reason?

Anyway, I'm looking forward to your blog post.

nsthorat OP 8y ago

Training a neural network on top would require a "proper" training phase, and finding hyperparameters that work everywhere turned out to be tricky. This is actually what we did originally; in the blog post we'll try to show demos of each of the approaches and explain why they don't work.

KNN also makes training "instant", and the code much much simpler.
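As a sketch of what "instant" training can look like: adding an example is just an append, with no gradient steps or hyperparameters beyond k. The `KNNClassifier` class, its method names, and the cosine-similarity choice are assumptions for illustration, not the demo's actual code.

```python
import math
from collections import Counter

class KNNClassifier:
    """Hypothetical in-memory store of (unit vector, label) examples."""

    def __init__(self, k=3):
        self.k = k
        self.examples = []

    @staticmethod
    def _normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    def add_example(self, logits, label):
        # "Training" is only storing the vector; nothing is optimized.
        self.examples.append((self._normalize(logits), label))

    def predict(self, logits):
        q = self._normalize(logits)
        # Rank stored examples by cosine similarity to the query.
        ranked = sorted(
            self.examples,
            key=lambda ex: sum(a * b for a, b in zip(ex[0], q)),
            reverse=True,
        )
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]

clf = KNNClassifier(k=1)
clf.add_example([1.0, 0.1, 0.0], "left")
clf.add_example([0.0, 0.2, 1.0], "right")
first = clf.predict([0.9, 0.0, 0.1])

# A new class is usable immediately after a single example is added.
clf.add_example([0.0, 1.0, 0.0], "up")
second = clf.predict([0.1, 0.9, 0.0])
```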

amelius 8y ago

That makes sense.

By the way, I think your software could become very popular on the Raspberry Pi, because it would be very cheap and fun to use it for all sorts of applications (e.g. home automation).


make3 8y ago

Basically, read this paper: https://www.cs.cmu.edu/~rsalakhu/papers/oneshot1.pdf
