Teachable Machine: Teach a machine using your camera, live in the browser (opens in new tab)

(blog.google)

438 pointsjozydapozy8y ago90 comments

90 comments

deeplearn.js author here...

We do not send any webcam / audio data back to a server, all of the computation is totally client side. The storage API requests are just downloading weights of a pretrained model.

We're thinking about releasing a blog post explaining the technical details of this project, would people be interested?

amelius8y ago

Yes please! :)

And some quick questions:

What network topology do you use, and on what model is it based (e.g. "inception")?

What kind of data have you used to pretrain the model?

nsthorat8y ago

We're using SqueezeNet (https://github.com/DeepScale/SqueezeNet), which is similar to Inception (trained on the same ImageNet dataset) but is much smaller - 5MB instead of inception's 100MB - and inference is much much quicker.

The application takes webcam frames and infers through SqueezeNet, producing a 1000D logits vector for each frame. These can be thought of as unnormalized probabilities for each of ImageNet's 1000 classes.

During the collection phase, we collect these vectors for each class in browser memory, and during inference we pass the frame through SqueezeNet and do k-nearest neighbors to find the class with the most similar logits vector. KNN is quick because we vectorize it as one large matrix multiplication.

I'll go deeper in a blog post soon :)

3 more replies

Splines8y ago

There's something fantastically entertaining about this. It's stupidly simple (from the outside) but interacting with the computer in such a different way is weirdly fun.

It's like when you turn on a camera and people can see themselves on a TV. A lot of people can't help but make faces at it.

sydd8y ago

Why does it not work in Edge? Please keep the web open, do not make stuff that does not work in a modern browser. Also always give an option to try it anyway.

haser_au8y ago

A blog post on the technical details would be great, please. Thanks in advance, since I know it'll take a bit of your time to write.

godelmachine8y ago

To answer the question at last, yes, I am interested.

celim3078y ago

Pretty neat! Good overview without overwhelming right off the bat. Would be cool if they showed off common pitfalls like over fitting, or even segued into general statistics!

melling8y ago

How long before I can teach my computer gestures that are mapped to real computer functions? For example, scroll up/down, switch apps, save document, cut/copy/paste, etc.

One could probably map each gesture to a regular USB device that acts as a second keyboard and mouse? The hard part is identifying enough unique gestures?

amelius8y ago

I want to teach my computer to recognize when I'm slouching, so I can correct my posture.

wildebaard8y ago

You mean a device such as the Leap Motion Controller? http://store-eur.leapmotion.com/products/leap-motion-control... They seem to offer a VR headset add-on these days, but I've only ever seen the 'basic' controller in action, which worked okay-ish.

melling8y ago

I think that’s infrared but the same idea. That never quite worked. Also, Leap didn’t continue to refine their hardware for consumers. They have next generation hardware that’s going directly into VR headsets, but you can’t buy it.

https://www.engadget.com/2014/08/28/leap-motion-s-next-senso...

1 more reply

BatFastard8y ago

Using the Leap Motion on an Oculus is magic. Much better than hand held controllers. When I use it I don't get sick from VR.

1 more reply

amelius8y ago

I don't have a camera here. Did anyone try it? How does it work?

IanCal8y ago

Surprisingly well!

It's a really well put together demo & tutorial.

I held a pen up next to me and held the green button.

Then did the same with a mouse.

It would flick between the two if I was holding nothing, so I held the orange button for a bit while holding nothing.

Worked pretty much every time.

Training is fast enough with a few hundred images per class that I didn't notice any delay.

amelius8y ago

What do you mean exactly by "held the green button"?

I can't run the demo here (browser not capable enough, and no camera) and I'm getting really curious what this is about.

2 more replies

makmanalp8y ago

It's working great because they're using a state of the art model (SqueezeNet https://github.com/DeepScale/SqueezeNet) and also the samples / experiments you do are often only on yourself, in the same lighting, same clothes, etc. So it gives a nice idealized playground environment that mostly eliminates annoying details like this.

gfredtech8y ago

there are 3 default classes, so you train according to each class(e.g. hand waving, sitting still, etc) you take examples of each(using your camera). you map the input data from your camera to some output data(e.g. if i used the green button to take photos of me waving), display a GIF of a cat that's waving. instead of a GIF you can use sound too

crypticlizard8y ago

The value-add for this demo is amazing, it's going to be many people's first approachable experience of ml, or things just like this will be. I expect a lot more of this stuff to appear in UI/UX. It's fun, intuitive, and a game changer away from dumb screens to fully interactive machines with their own knowledge graph.

lelima8y ago

You can solve problems using Machine learning without coding from a while ago.. azure machine learning have this features from more than a year ago.

I've solve regression, classification and recommendation problems with it and the best part is it deploys an web service with a few clicks.

thanksgiving8y ago

But you would need to have:

1 a working phone

2 a valid credit card

To use azure which places a too high bar on students. I mean I've tried to argue for graduated restrictions so basically students with .edu emails should be able to do some things without entering a credit card number but the fact that it is not possible suggests this isn't a priority for azure.

Google says this finds on your browser so there's little infrastructure cost for this demo, right?

jlian8y ago

There is a student offer that doesn’t require credit card https://azure.microsoft.com/en-us/offers/ms-azr-0144p/

1 more reply

shostack8y ago

Can you clarify on what you did with it? I'd love to start dabbling in solving problems with ML, but am a bit intimidated by getting started. Is it fairly easy for a novice to do the things you did?

StavrosK8y ago

Does anyone know what this uses under the hood? I loved the demo, but I would like a similarly easy way to get started locally with Python, for example.

Is there an ML library that can easily start capturing images from the webcam so you can play around with training a model?

make38y ago

I'm pretty sure this is actually pretrained as a few shot learning model under the hood, so that's a bit hard. For the camera part at least: http://opencv-python-tutroals.readthedocs.io/en/latest/py_tu...

and you could do worse then this https://github.com/fchollet/keras/blob/master/examples/mnist...

icc978y ago

It says it in the video: https://deeplearnjs.org

make38y ago

he is not asking for the learning library

2 more replies

hnarayanan8y ago

Yes, there is! http://www.wekinator.org

(It is not a Python application but a Java application, but still as fun!)

StavrosK8y ago

> a Java application, but still as fun!

Your unbridled optimism even in the face of reality has inspired me to give this a shot, thank you! It seems to be based on Weka, so it should be good.

On a first run, I don't really see how to record images from the webcam, it just says "waiting for samples". I'll play around some more and hopefully figure it out, thanks again.

EDIT: Ah, there's a detailed walkthrough which seems to work well!

greggman8y ago

be aware, at least in Chrome, once you give teachablemachine.withgoogle.com permission to use you camera, unless you revoke that permission is has permission to use your camera without further permission including from iframes. In other words every ad from and analytics from Google could start injecting camera access.

I wish chrome would give the option to only give permission "this time" and I wish it didn't allow camera access from cross domain iframes.

ma2rten8y ago

Are you serious? Do you realize that Chrome is also written by Google and they could theoretically already run arbitrary code on your computer? The potential reputation damage and legal risk for Google would be way too high pull off something like that.

jamesmishra8y ago

If this happened, the Google Chrome tab would show a camera. Many webcams have adjacent LEDs that identify that they are activated.

Google could theoretically release compromised versions of Google Chrome and only use the permission on devices where webcam LEDs are unlikely (e.g. smartphones), but this is going deep into tin-foil-hat territory.

greggman8y ago

that's not helpful. the pictures would already be taken and uploaded to servers without my permission reguardless of whether or not I wanted my picture taken or what's visible (contracts, trade secrets, people in various states of undress) .

also this isn't about Google spying. it's about Chrome's bad camera permission model. any company can abuse it

azinman28y ago

But won’t it be on just that FQDN alone? Google analytics and ads are served from a totally different domain. What’s the actual concern here?

greggman8y ago

Google ads and analytics inject JavaScript which means they can insert iframes for any domain they want. If they injected <iframe src="https:// teachablemachine.withgoogle.com/spyonuserwithcamera" /> they'd be able to use your camera from the ad or analytics without asking for permission again.

Of course I'm not suggesting Google would actually do that but some other company might make seeamazingcamerameme.com to get users to turn on there camera for that domain and then after that make iframes for seeamazingcamerameme.com/spy

amigoingtodie8y ago

So you are contending we are secure via DNS?

1 more reply

haser_au8y ago

Chrome does give you this option. It's called "incognito mode"

addedlovely8y ago

Good to know, but thankfully easy to remove permissions from the settings.

netcraft8y ago

What makes it non-mobile? Is it something about the expected performance of the JS? or are there apis being used im not thinking about?

nsthorat8y ago

It works on mobile, it's just slow. Every time we read and write from memory we have to pack and unpack 32 bit floats as 4 bytes without bit shifting operators >.>

white-flame8y ago

Isn't that what ArrayBuffers can do for you at nearly the same amortized speed as C unions?

f00_8y ago

this is really cool, openframeworks-esque in browser javascript

if you like this I would highly recommend looking at openframeworks.

the interactive browser part excites me want to try to make something with deeplearn.js

mschuster918y ago

Hmm. I wonder if one could train this with dick pics and embed into popular messenger apps client-side... "this picture was classified as a penis", to counter morons sending their dick as first message.

peepopeep8y ago

Am I the only paranoid one who thinks this is just Google's way of capturing millions of faces in their database? Or did Apple beat them to it?

moduspol8y ago

Claims like these make privacy-focused efforts less valuable, and I wish people wouldn't make them.

What value is there in taking care to store biometric data only locally, in a separate chip inaccessible even to the OS, if people will simply claim it's equivalent to keeping a remote database of millions of faces?

prophesi8y ago

People will be much less likely to make those claims if you clearly state where the data is being stored. This article + their project page doesn't mention anything about privacy.

I don't know anything about Squeezenet, but it makes a lot of calls to storage.googleapis.com. I wouldn't be surprised if it's making some PUT requests. https://github.com/googlecreativelab/teachable-machine/blob/...

2 more replies

xyrnoble8y ago

Facebook beat them to it... that's the whole reason for tagged images imo. Then they can relate identities with each other and with exif gps data to track their movements over time.

colmvp8y ago

Yeah it's a little hilarious how people just keep giving Facebook more and more data to experiment with.

2 more replies

ma2rten8y ago

I am pretty sure that Apple does not save your image data in any database. Apple is really trying to differentiate itself on privacy.

Also, I don't think that this sends any data to Google, since it trains the neural net in the browser. You could even verify this yourself by looking at the source code.

glass_of_water8y ago

The machine learning is done in the browser with deeplearn.js, so the images aren't being sent to Google's servers.

jamesmishra8y ago

The faces are unlabeled, and I'm not sure what that data would be good for. If Google really wanted face data, they could look at:

- Gmail / Google Plus / Google Apps profile pictures

- Google Street View

- Google Hangouts

- implementing a primitive Face ID or Snapchat-style camera on Google Android

- the large mass of face pictures that they index with Google Images

runj__8y ago

Google Photos seems like the absolute best bet there, they're "organizing" them there by default.

1 more reply

tokenizerrr8y ago

Android has had face unlock for ages

danso8y ago

What did Apple beat them to? FaceID is said to not upload data off-device.

icc978y ago

It's good to be paranoid about it but at the same time it's quite a cool thing to offer people.

Also I think a lot of the processing is done in the browser using deeplearning.js, so I don't know how much is sent back to Google.

46844998y ago

Don't need to, they've got Youtube or so. People has been providing free data set to Google for years anyway.

fancyfacebook8y ago

Don't worry some comment on a forum said they'd never do this, so I think we're all good!

eggie58y ago

i bet it's fine tuning an ImageNet CNN

nsthorat8y ago

kinda, ya

eggie58y ago

What do u think?

j / k navigate · click thread line to collapse

90 comments

nsthorat8y ago

deeplearn.js author here...

We do not send any webcam / audio data back to a server, all of the computation is totally client side. The storage API requests are just downloading weights of a pretrained model.

We're thinking about releasing a blog post explaining the technical details of this project, would people be interested?

amelius8y ago

Yes please! :)

And some quick questions:

What network topology do you use, and on what model is it based (e.g. "inception")?

What kind of data have you used to pretrain the model?

nsthorat8y ago

I'll go deeper in a blog post soon :)

3 more replies

Splines8y ago

There's something fantastically entertaining about this. It's stupidly simple (from the outside) but interacting with the computer in such a different way is weirdly fun.

It's like when you turn on a camera and people can see themselves on a TV. A lot of people can't help but make faces at it.

sydd8y ago

Why does it not work in Edge? Please keep the web open, do not make stuff that does not work in a modern browser. Also always give an option to try it anyway.

haser_au8y ago

A blog post on the technical details would be great, please. Thanks in advance, since I know it'll take a bit of your time to write.

godelmachine8y ago

To answer the question at last, yes, I am interested.

celim3078y ago

Pretty neat! Good overview without overwhelming right off the bat. Would be cool if they showed off common pitfalls like over fitting, or even segued into general statistics!

melling8y ago

How long before I can teach my computer gestures that are mapped to real computer functions? For example, scroll up/down, switch apps, save document, cut/copy/paste, etc.

One could probably map each gesture to a regular USB device that acts as a second keyboard and mouse? The hard part is identifying enough unique gestures?

amelius8y ago

I want to teach my computer to recognize when I'm slouching, so I can correct my posture.

wildebaard8y ago

melling8y ago

https://www.engadget.com/2014/08/28/leap-motion-s-next-senso...

1 more reply

BatFastard8y ago

Using the Leap Motion on an Oculus is magic. Much better than hand held controllers. When I use it I don't get sick from VR.

1 more reply

amelius8y ago

I don't have a camera here. Did anyone try it? How does it work?

IanCal8y ago

Surprisingly well!

It's a really well put together demo & tutorial.

I held a pen up next to me and held the green button.

Then did the same with a mouse.

It would flick between the two if I was holding nothing, so I held the orange button for a bit while holding nothing.

Worked pretty much every time.

Training is fast enough with a few hundred images per class that I didn't notice any delay.

amelius8y ago

What do you mean exactly by "held the green button"?

I can't run the demo here (browser not capable enough, and no camera) and I'm getting really curious what this is about.

2 more replies

makmanalp8y ago

gfredtech8y ago

crypticlizard8y ago

lelima8y ago

You can solve problems using Machine learning without coding from a while ago.. azure machine learning have this features from more than a year ago.

I've solve regression, classification and recommendation problems with it and the best part is it deploys an web service with a few clicks.

thanksgiving8y ago

But you would need to have:

1 a working phone

2 a valid credit card

Google says this finds on your browser so there's little infrastructure cost for this demo, right?

jlian8y ago

There is a student offer that doesn’t require credit card https://azure.microsoft.com/en-us/offers/ms-azr-0144p/

1 more reply

shostack8y ago

Can you clarify on what you did with it? I'd love to start dabbling in solving problems with ML, but am a bit intimidated by getting started. Is it fairly easy for a novice to do the things you did?

StavrosK8y ago

Does anyone know what this uses under the hood? I loved the demo, but I would like a similarly easy way to get started locally with Python, for example.

Is there an ML library that can easily start capturing images from the webcam so you can play around with training a model?

make38y ago

and you could do worse then this https://github.com/fchollet/keras/blob/master/examples/mnist...

icc978y ago

It says it in the video: https://deeplearnjs.org

make38y ago

he is not asking for the learning library

2 more replies

hnarayanan8y ago

Yes, there is! http://www.wekinator.org

(It is not a Python application but a Java application, but still as fun!)

StavrosK8y ago

> a Java application, but still as fun!

Your unbridled optimism even in the face of reality has inspired me to give this a shot, thank you! It seems to be based on Weka, so it should be good.

On a first run, I don't really see how to record images from the webcam, it just says "waiting for samples". I'll play around some more and hopefully figure it out, thanks again.

EDIT: Ah, there's a detailed walkthrough which seems to work well!

greggman8y ago

I wish chrome would give the option to only give permission "this time" and I wish it didn't allow camera access from cross domain iframes.

ma2rten8y ago

jamesmishra8y ago

If this happened, the Google Chrome tab would show a camera. Many webcams have adjacent LEDs that identify that they are activated.

greggman8y ago

also this isn't about Google spying. it's about Chrome's bad camera permission model. any company can abuse it

azinman28y ago

But won’t it be on just that FQDN alone? Google analytics and ads are served from a totally different domain. What’s the actual concern here?

greggman8y ago

amigoingtodie8y ago

So you are contending we are secure via DNS?

1 more reply

haser_au8y ago

Chrome does give you this option. It's called "incognito mode"

addedlovely8y ago

Good to know, but thankfully easy to remove permissions from the settings.

netcraft8y ago

What makes it non-mobile? Is it something about the expected performance of the JS? or are there apis being used im not thinking about?

nsthorat8y ago

It works on mobile, it's just slow. Every time we read and write from memory we have to pack and unpack 32 bit floats as 4 bytes without bit shifting operators >.>

white-flame8y ago

Isn't that what ArrayBuffers can do for you at nearly the same amortized speed as C unions?

f00_8y ago

this is really cool, openframeworks-esque in browser javascript

if you like this I would highly recommend looking at openframeworks.

the interactive browser part excites me want to try to make something with deeplearn.js

mschuster918y ago

peepopeep8y ago

Am I the only paranoid one who thinks this is just Google's way of capturing millions of faces in their database? Or did Apple beat them to it?

moduspol8y ago

Claims like these make privacy-focused efforts less valuable, and I wish people wouldn't make them.

prophesi8y ago

People will be much less likely to make those claims if you clearly state where the data is being stored. This article + their project page doesn't mention anything about privacy.

2 more replies

xyrnoble8y ago

Facebook beat them to it... that's the whole reason for tagged images imo. Then they can relate identities with each other and with exif gps data to track their movements over time.

colmvp8y ago

Yeah it's a little hilarious how people just keep giving Facebook more and more data to experiment with.

2 more replies

ma2rten8y ago

I am pretty sure that Apple does not save your image data in any database. Apple is really trying to differentiate itself on privacy.

Also, I don't think that this sends any data to Google, since it trains the neural net in the browser. You could even verify this yourself by looking at the source code.

glass_of_water8y ago

The machine learning is done in the browser with deeplearn.js, so the images aren't being sent to Google's servers.

jamesmishra8y ago

The faces are unlabeled, and I'm not sure what that data would be good for. If Google really wanted face data, they could look at:

- Gmail / Google Plus / Google Apps profile pictures

- Google Street View

- Google Hangouts

- implementing a primitive Face ID or Snapchat-style camera on Google Android

- the large mass of face pictures that they index with Google Images

runj__8y ago

Google Photos seems like the absolute best bet there, they're "organizing" them there by default.

1 more reply

tokenizerrr8y ago

Android has had face unlock for ages

danso8y ago

What did Apple beat them to? FaceID is said to not upload data off-device.

icc978y ago

It's good to be paranoid about it but at the same time it's quite a cool thing to offer people.

Also I think a lot of the processing is done in the browser using deeplearning.js, so I don't know how much is sent back to Google.

46844998y ago

Don't need to, they've got Youtube or so. People has been providing free data set to Google for years anyway.

fancyfacebook8y ago

Don't worry some comment on a forum said they'd never do this, so I think we're all good!

eggie58y ago

i bet it's fine tuning an ImageNet CNN

nsthorat8y ago

kinda, ya

eggie58y ago

What do u think?

j / k navigate · click thread line to collapse