Definitely excited to check this out; thanks Jeremy and Sylvain!
Personally, I think these work great because I can easily add cells to inspect data or try experiments as I'm reading, which helps me understand what's going on.
I'm not really interested in ML, but I take the most popular available courses to keep up, and I really liked how fastai doesn't just teach you ready-made, well-known models, but also how to compose differentiable building blocks to design NNs yourself.
But fast.ai has TWO co-founders, and somehow Rachel doesn't seem to get any credit in these discussions (not for the book specifically; I'm talking about the overall enterprise). Not quite sure why; a lot of the content on the website is written by her, and it's clear she adds a lot of value to the endeavor as a whole.
She created and taught the NLP and Computational Linear Algebra courses, and has written most of the material on the fast.ai blog, and of course (as noted) co-founded fast.ai. Overall, I'd agree that she doesn't get as much credit as she deserves. That's perhaps partly due to her increasing focus on ethics issues, which aren't generally discussed much on HN (sadly).
I also would say that Sylvain Gugger doesn't get as much credit as he should -- he has been an equal partner with me in creating the book and fastai library.
(I discussed this response with Rachel prior to posting it.)
Polish does not work this way. Source: I am Polish. Perhaps jph00 meant Turkish. Issue filed.
[1] https://github.com/fastai/fastbook/blob/master/10_nlp.ipynb
But for the book I mentioned Polish because of this paper: https://arxiv.org/abs/1810.10222 . As you say, though, the word "agglutinative" isn't technically correct here. I'm actually not sure what the right word is for languages that have lots of big compounds with no spaces. (Which is the key issue here, and the reason we need subword tokenization techniques.)
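To illustrate why compounds without spaces push you toward subword tokenization, here's a toy greedy longest-match tokenizer. This is just a sketch, not fastai's actual implementation; real systems learn their vocabulary from data (e.g. BPE or SentencePiece), and the German compound and vocabulary entries here are invented for the example.

```python
def subword_tokenize(word, vocab):
    """Split `word` into the longest vocabulary pieces, left to right."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest remaining prefix first.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # Fall back to a single character for out-of-vocabulary spans.
            pieces.append(word[i])
            i += 1
    return pieces

# Hypothetical subword vocabulary; a word-level tokenizer would treat the
# whole compound as one out-of-vocabulary token.
vocab = {"haus", "boot", "fahrt"}
print(subword_tokenize("hausbootfahrt", vocab))  # → ['haus', 'boot', 'fahrt']
```

A word-level tokenizer would need every possible compound in its vocabulary, which is hopeless for languages that form compounds freely; subword pieces sidestep that.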
https://en.wikipedia.org/wiki/Synthetic_language
There's a spectrum between synthetic and analytic languages ( https://en.wikipedia.org/wiki/Synthetic_language#Synthetic_a... ) and those closer to the synthetic end are the ones giving you trouble.
Polish would be a subtype of synthetic called fusional/inflected, which means morphemes need to be adjusted to fit together; agglutinative languages are those that mainly use agglutination, where morphemes are stuck together as-is:
https://en.wikipedia.org/wiki/Agglutinative_language
Since it's a spectrum / categorization based on features, all languages will show these features to various degrees. E.g. the famous "anti|dis|establish|ment|ari|an|ism" in English and "anty|samo|u|bez|przedmiot|owia|nie" as a similar example in Polish (both from https://pl.wikipedia.org/wiki/Aglutynacyjno%C5%9B%C4%87 ), or the more humble "houseboat" or "bitwise".
There are also polysynthetic languages, which is the name for the extreme end of this spectrum, but there are no familiar examples of these (Mayan languages, Ainu, Inuit, and Aleut are the only ones I recognize from those mentioned on Wikipedia).
Side note: IMHO, you are exaggerating the ability of Polish to form long compounds. Dissecting the "Bezbarwne zielone idee wściekle śpią" example from https://arxiv.org/pdf/1810.10222.pdf#page=3 reveals no words longer than 4 morphemes:
bez-BARW-n-e ZIEL-on-e IDE-e WŚCIEK-l-e ŚP-ią, where I put word roots in uppercase and bound morphemes in lowercase.
The longest sequences of morphemes (for a loose definition of morpheme) I can think of are conditional mood of verbs with double prefixes like po-wy-CHODZI-ł-y-by-ście. However, the sequences of bound morphemes in those forms, which may look complex to you, form a finite-state language that admits just a few sequences.
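To make the finite-state claim concrete, the admissible sequences for this particular verb can be sketched as a regular expression. This is a toy illustration built only from the morphemes in the example above; real Polish morphotactics is of course richer than this pattern.

```python
import re

# Toy finite-state pattern for the bound-morpheme sequences discussed above,
# using only the morphemes from "po-wy-CHODZI-ł-y-by-ście" (hyphens kept as
# morpheme separators for readability). Optional groups mirror the fact that
# only a few orderings are admissible.
pattern = re.compile(r"(po-)?(wy-)?CHODZI-ł(-y)?(-by)?(-ście)?")

print(bool(pattern.fullmatch("po-wy-CHODZI-ł-y-by-ście")))  # the example form
print(bool(pattern.fullmatch("by-CHODZI-po-ł")))            # reordering rejected
```

The point is that the pattern accepts only a handful of strings, which is what makes these forms look more complex than they really are.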
Megszentségteleníthetetlenségeskedéseitekért for example.
There are quite a lot of languages that do though: https://en.wikipedia.org/wiki/Agglutinative_language
Minor point: a requirements.txt file or something similar would be convenient for getting started quickly.
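For example, such a file might look like the following (the package names are illustrative guesses, since the repo's actual dependencies aren't listed here):

```
fastai
jupyter
```

Then a single `pip install -r requirements.txt` would set a reader up to run the notebooks.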
We didn't expect the draft to get this much attention, frankly!
That's not how the GPL works.
What is Fastai? Why do I need it?
Something as basic as an elevator speech that introduces your product in your github page and book intro can mean 10x or 100x more sales.
If you force people to go search for it themselves, you have already lost most of them.
This author writes as if you already knew everything about fastai, but if you did, you wouldn't need this book in the first place.
It happens a lot to technical writers: they have spent years thinking about a topic, so they can't put themselves in the shoes of someone who hasn't.
By contrast, Jeremy and team have proven that "build it and they will come" is not dead. They built high quality courses and quickly became authoritative with no marketing and with full transparency and openness in everything they do.
This book draft looks great. Everyone else is talking about "democratising AI" - this is actually doing it.
They don't have access to compute power (GPUs) or the bandwidth to download datasets of hundreds of gigabytes, which they'll find right there, so this should help them: they won't need powerful machines or have to worry about experiment tracking.
We also have a Publish option that turns a notebook into an application in one click, with a form for the training parameters generated automatically behind the scenes, so they can write scripts and instrument model training.
The fast.ai course will also help current or future members, and other students. It's important for us to make it even easier for people to enter the field.
Most DL researchers I know also have a pretty good knowledge of available libraries and make it a habit to check them pretty often.
Few people here need to look it up; the basic promotion has already been done very effectively for the target audience. Besides, the intro chapter explains very clearly what the book is about and who it's for.
If you want to use code in the book under a non-GPL license, then you could just buy the book when it comes out. That doesn't seem like an unreasonable burden.
PS: none of this has anything to do with O'Reilly or their lawyers.