Basic Neural Network on Python

(danielfrg.github.io)

172 points by dfrodriguez143 12y ago | 28 comments

gamegoblin 12y ago

Very good write up. If you are willing to trade some accuracy and memory for speed, you can build a large lookup table for your sigmoid function, which should just about double its speed.

As an aside, and not to be too critical, because the post was great: as (presumably) a non-native English speaker, you might run a spell-checker on your post. There are also some missing pronouns, which make some sentences very Spanishy.

gallamine 12y ago

You could also try the Elliott sigmoid activation function. I found it executed about 2x faster than the exponential sigmoid (in Matlab). Brief writeup: http://www.gallamine.com/blog/2013/01/21/a-sigmoid-function-...
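For anyone following along, here is a minimal sketch of the two activation functions being compared, assuming the usual definition of the Elliott sigmoid as x / (1 + |x|). The function names are made up for illustration, and the 2x figure above is for Matlab; nothing here measures speed.

```python
import math

def logistic(x):
    """Standard exponential (logistic) sigmoid, range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def elliott(x):
    """Elliott sigmoid x / (1 + |x|), range (-1, 1); needs no exp()."""
    return x / (1.0 + abs(x))

def elliott_unit(x):
    """Elliott sigmoid rescaled to (0, 1), the same range as logistic()."""
    return 0.5 * elliott(x) + 0.5

# Both are S-shaped and cross the midpoint at x = 0, but the Elliott
# curve approaches its asymptotes more slowly than the logistic one.
for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(x, round(logistic(x), 4), round(elliott_unit(x), 4))
```

Because the two curves differ away from zero, swapping one for the other changes the network's outputs, so a net would normally be retrained rather than having the function replaced after training.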

lightcatcher 12y ago

A relatively small lookup table for the sigmoid function can also work well. Here are the various sigmoid approximations that Theano (a library used for deep learning research among other things) offers: http://deeplearning.net/software/theano/library/tensor/nnet/...

gamegoblin 12y ago

I usually use an array with a few thousand entries. In C this gives me a 2.5x speedup over the exact function with no important decrease in accuracy.
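A rough sketch of the lookup-table idea, written in Python to match the article rather than C. The clipping range of [-8, 8], the table size of 4096, and all names are assumptions for illustration; the 2.5x speedup quoted above is for C and is not something this sketch demonstrates.

```python
import math

X_MIN, X_MAX = -8.0, 8.0   # assumed range; the sigmoid is nearly 0 or 1 outside it
TABLE_SIZE = 4096          # "a few thousand entries"
STEP = (X_MAX - X_MIN) / (TABLE_SIZE - 1)

# Precompute the exact sigmoid once at evenly spaced points.
TABLE = [1.0 / (1.0 + math.exp(-(X_MIN + i * STEP))) for i in range(TABLE_SIZE)]

def sigmoid_lut(x):
    """Approximate the sigmoid by looking up the nearest table entry."""
    if x <= X_MIN:
        return TABLE[0]
    if x >= X_MAX:
        return TABLE[-1]
    # Adding 0.5 before truncating rounds to the nearest index.
    return TABLE[int((x - X_MIN) / STEP + 0.5)]
```

With these settings the table spacing is about 0.004, and since the sigmoid's slope never exceeds 0.25, the lookup error stays below 1e-3. A larger table or linear interpolation between neighbouring entries would shrink it further.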

dfrodriguez143 (OP) 12y ago

Definitely a native Spanish speaker here :P. Because I wrote this in an IPython notebook, it takes a little bit longer to spell-check. I will try not to be so lazy next time.

Thanks for the tips.

MaxGabriel 12y ago

The only one I caught was "state of the are" in the last sentence.

benhamner 12y ago

Both datasets you used (iris and digits) are way too simple for neural networks to shine.

Neural networks / deep neural networks work best in domains where the underlying data has a very rich, complex, and hierarchical structure (such as computer vision and speech recognition). Currently, training these models is both computationally expensive and fickle. Most state-of-the-art research in this area is performed on GPUs, and there are many tunable parameters.

For most typical applied machine learning problems, especially on simpler datasets that fit in RAM, variants of ensembled decision trees (such as Random Forests) tend to perform at least as well as neural networks, with less parameter tuning and far shorter training times.

sine_dicendo 12y ago

Not for nothing, but Ben, did you read the article? He's not even discussing most of what you mention. He is simply taking what he has learned and applying it. You seem to be going off on a tangent about advanced applications, when he is obviously just learning how these things work and is not trying to teach a method or suggesting that he has discovered anything significant.

To the author: I liked the article. A simple, concise read.

dbecker 12y ago

In Ben's defense: The original article declares random forest a "winner" over neural networks. Ben's comment is a cautionary note that this result only applies to a specific class of problems.

This was a nice post, but it's reasonable to warn users not to overgeneralize the algorithm comparison.

stiff 12y ago

He just shared some insights; he didn't critique anything, and he's a Kaggler and published researcher, so I don't get why he is getting downvoted.

jph00 12y ago

Handwritten digit recognition is actually a pretty good domain for deep nets, and the poor performance achieved in the article's case is due to the implementation (it needs a deeper net, convolutional layers, etc.). With those, much better results (99%+) have been achieved by deep nets for digit recognition. In fact, Hinton (in his Coursera course) recommends this domain for studying deep nets, since it is so well understood.

(Ben, I know you're aware of all this already, but I just wanted to clarify for those who aren't as on top of the research as you.)

theschreon 12y ago

You could try the following improvements to speed up neural network training:

- Resilient Propagation (RPROP): significantly speeds up training for full-batch learning: http://davinci.fmph.uniba.sk/~uhliarik4/recognition/resource...

- RMSProp, introduced by Geoffrey Hinton: also speeds up training and, unlike RPROP, can be used for mini-batch learning: https://class.coursera.org/neuralnets-2012-001/lecture/67 (sign up to view the video)

Please consider more datasets when benchmarking methods:

- MNIST (70k 28x28-pixel images of handwritten digits): http://yann.lecun.com/exdb/mnist/ . There are several wrappers for Python on GitHub.

- UCI Machine Learning Repository: http://archive.ics.uci.edu/ml/datasets.html
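To make the two update rules above concrete, here is a small sketch of both on a toy quadratic loss instead of a real network. It assumes one common Rprop variant (often called iRprop-, which skips the update after a sign change), and the hyperparameter values and function names are illustrative choices, not taken from the linked material.

```python
import math

def sign(v):
    return (v > 0) - (v < 0)

def rprop_minimize(grad, w, step0=0.1, inc=1.2, dec=0.5,
                   step_min=1e-6, step_max=1.0, iters=200):
    """Full-batch Rprop: only the sign of each partial derivative is used."""
    steps = [step0] * len(w)
    prev = [0.0] * len(w)
    for _ in range(iters):
        g = grad(w)
        for i in range(len(w)):
            if g[i] * prev[i] > 0:      # same sign as last time: grow the step
                steps[i] = min(steps[i] * inc, step_max)
            elif g[i] * prev[i] < 0:    # sign flipped: we overshot, shrink the step
                steps[i] = max(steps[i] * dec, step_min)
                g[i] = 0.0              # and skip this weight's update once
            w[i] -= sign(g[i]) * steps[i]
            prev[i] = g[i]
    return w

def rmsprop_minimize(grad, w, lr=0.01, decay=0.9, eps=1e-8, iters=2000):
    """RMSProp: divide each gradient by a running RMS of its recent values."""
    mean_sq = [0.0] * len(w)
    for _ in range(iters):
        g = grad(w)  # with mini-batches, this would be a noisy batch gradient
        for i in range(len(w)):
            mean_sq[i] = decay * mean_sq[i] + (1.0 - decay) * g[i] ** 2
            w[i] -= lr * g[i] / (math.sqrt(mean_sq[i]) + eps)
    return w

def toy_grad(w):
    """Gradient of the toy loss (w0 - 3)^2 + 10 * (w1 + 1)^2."""
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]

w_rprop = rprop_minimize(toy_grad, [0.0, 0.0])
w_rmsprop = rmsprop_minimize(toy_grad, [0.0, 0.0])
print(w_rprop, w_rmsprop)  # both should end up near the minimum at [3, -1]
```

Rprop's reliance on sign changes between consecutive gradients is why it suits full-batch training: with noisy mini-batch gradients the signs flip for reasons unrelated to overshooting. RMSProp's running average smooths over that noise.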

dfrodriguez143 (OP) 12y ago

Definitely a lot to read and improvements to make. I will probably do a more complete benchmark with more datasets on a later post.

Thanks for the suggestions.

benhamner 12y ago

You may be interested in this ICML 2006 paper, which empirically compared many standard algorithms across a combination of metrics and UCI datasets - http://www.cs.cornell.edu/~caruana/ctp/ct.papers/caruana.icm...

mbq 12y ago

You are just doing a simple validation on a test set rather than cross-validation; the point of CV is to make many iterations of validation on different train-test splits and average the results.
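As a rough illustration of that point, here is a minimal k-fold cross-validation sketch in plain Python. The nearest-class-mean "model" and the toy data are hypothetical stand-ins so the example stays self-contained; they are not the article's neural network or datasets.

```python
import random
import statistics

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is in the test set exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

def cross_validate(fit, predict, X, y, k=5):
    """Return one accuracy per fold for a model given as fit/predict callables."""
    scores = []
    for train, test in k_fold_splits(len(X), k):
        model = fit([X[j] for j in train], [y[j] for j in train])
        hits = sum(predict(model, X[j]) == y[j] for j in test)
        scores.append(hits / len(test))
    return scores

# Hypothetical stand-in model: classify a 1-D point by the nearest class mean.
def fit(X, y):
    return {c: statistics.mean(x for x, t in zip(X, y) if t == c) for c in set(y)}

def predict(model, x):
    return min(model, key=lambda c: abs(x - model[c]))

# Toy, well-separated data: class 0 near 0..5, class 1 near 10..15.
X = [0.1 * i for i in range(50)] + [10 + 0.1 * i for i in range(50)]
y = [0] * 50 + [1] * 50

scores = cross_validate(fit, predict, X, y, k=5)
print(statistics.mean(scores), statistics.stdev(scores))
```

Reporting the spread across folds alongside the mean is what lets you judge whether a gap between two algorithms is larger than the split-to-split noise.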

dfrodriguez143 (OP) 12y ago

I agree completely: a more thorough benchmark should use full cross-validation.

Just for future reference, I did run the fitting a few times and found very similar results (±2%). Also, Random Forests already average over many trees, so there is probably not much to improve for that particular algorithm.

mbq 12y ago

To be honest, I don't expect the results to change, but this is the only way to attach significance to the observed differences and to ensure this wasn't a lucky shot.

lelandbatey 12y ago

Hmmmm... The layout of the page seems very messed up. Is anyone else having it show up like this?:

http://puu.sh/3vTL8.png

rullgrus 12y ago

Same here with Firefox 22.0. With 2560x1440 resolution you get three columns for the code blocks. It looks fine in IE 10. IE renders the page with only one column independent of window width.

dfrodriguez143 (OP) 12y ago

Should work with most newer versions of any browser.

Which browser are you using?

lelandbatey 12y ago

Firefox 22 on Windows 8

scotty79 12y ago

What learning method do scientists think the brain actually uses? Back-propagation and the like seem like methods a god would use to architect a static brain for a given task.

wfn 12y ago

For starters - see Hebbian theory. [1]

Backprop falls within the class of 'supervised learning' which can indeed be said not to be very biologically realistic. However, reinforcement learning is observed, so the overall picture is probably much more complex: e.g. associative/recurrent/etc networks with Hebb-like unsupervised learning developing neuronal group testing and selection systems that involve reinforcement learning. (see first lecture/talk in [3].)

Perhaps worth a watch is a very nice talk by Geoffrey Hinton [2], which is oft referred to on HN. (Hinton does refer to the notion of biological plausibility etc. in this talk as far as I recall, but the focus is elsewhere (developing next generation state-of-the-art (mostly unsupervised) machine learning techniques/systems.))

[1]: https://en.wikipedia.org/wiki/Hebbian_theory

[2]: https://www.youtube.com/watch?v=AyzOUbkUf3M

[3]: http://kostas.mkj.lt/almaden2006/agenda.shtml (The original summary HTML file is gone from the original source, so this is a mirror; the links to videos and slides do work, though.) The first and the second talks are somewhat relevant (particularly the first one, re: bio plausibility etc ("Nobelist Gerald Edelman, The Neurosciences Institute: From Brain Dynamics to Consciousness: A Prelude to the Future of Brain-Based Devices")), but all are great. Rather heavy, though. (Also, skip the intros.)

Edit: that first talk/lecture from Almaden (Edelman's) is actually a very nice exposition of the whole paradigm on which {cognitive,computational,etc} neuroscience rests; it does get hairy later on, but overall it's a great talk for the truly curious.
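For contrast with backprop, here is a tiny sketch of the plain Hebbian rule from [1], assuming a single linear neuron with output y = w . x and the update dw_i = lr * x_i * y. The learning rate and inputs are arbitrary illustrative values.

```python
def hebbian_step(w, x, lr=0.1):
    """One plain Hebbian update: each weight grows with (its input) * (the output)."""
    y = sum(wi * xi for wi, xi in zip(w, x))   # linear neuron output
    return [wi + lr * xi * y for wi, xi in zip(w, x)]

w = [0.5, 0.5, 0.0]
for _ in range(3):
    # Inputs 0 and 1 are active together; input 2 is silent.
    w = hebbian_step(w, [1.0, 1.0, 0.0])
print(w)  # the first two weights have grown; the third is still 0.0
```

There is no target or error signal anywhere in the update, which is the sense in which it is unsupervised. In this plain form the weights also grow without bound, so practical variants add some kind of normalization.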

primelens 12y ago

Good writeup. Is there a feed for that blog? I only found one for the comments.

dfrodriguez143 (OP) 12y ago

Yes: http://danielfrg.github.io/feeds/all.atom.xml

Gonna add a direct link from the site soon.

skatenerd 12y ago

"def function(...)"
