Normally, Deep RL courses teach a lot of mathematically involved theory. You get the practical applications near the end (if at all).
I have tried to turn that on its head. In the top-down approach, you learn practical skills first, then go deeper later. This is much more fun.
This course (the first in a planned multi-part series) shows how to use the Deep Reinforcement Learning framework RLlib to solve OpenAI Gym environments. I provide a big-picture overview of RL and show how to use the tools to get the job done. This approach is similar to learning Deep Learning by building and training various deep networks with a high-level framework such as Keras.
In the next course in the series (open for pre-enrollment), we move on to solving real-world Deep RL problems using custom environments and various tricks that make the algorithms work better [1].
The main advantage of this sequence is that these practical skills can be picked up fast and used in real life immediately. The involved mathematical bits can be picked up later. RLlib is the industry standard, so you won't need to change tools as you progress.
This is the first time that I have made a course on my own. I learned flip-chart drawing to illustrate the slides and notebooks. That was fun, considering how much I suck at drawing. I am using Teachable as the LMS, LaTeX (Beamer) for the slides, Sketchbook for illustrations, Blue Yeti for audio recording, OBS Studio for screencasting, and Filmora for video editing. The captions are first auto-generated on YouTube and then hand-edited to fix errors and improve formatting. I do the majority of the production on Linux and then switch to Windows for video editing.
I released the course last month and the makers of RLlib got in touch to show their approval. That's the best thing to happen so far.
Please feel free to try it and ask any questions. I am around and will do my best to answer them.
[0] https://www.datacamp.com/courses/unit-testing-for-data-scien...
[1] https://courses.dibya.online/p/realdeeprl
By the way, RLlib is good if you want to try out simple experiments with well-established RL algorithms, but it's really awful to use when you want to modify the algorithm even just a little bit. So it's not bad for beginner-level tutorials, but once you have the basics it might become very frustrating. I would recommend simpler frameworks like Stable Baselines 3 (https://stable-baselines3.readthedocs.io/en/master/ ) for a much more stable experience, if you have a fair bit of Python/ML programming skill and don't have trouble reading well-maintained library code.
- The framework seems to be built mainly on the assumption that it will run on cloud machines such as AWS/Azure. However, many researchers use HPC-type clusters, which are quite different from these cloud setups, and I found RLlib's support for them lackluster. (In our case we had 4 16-core Xeon CPUs and 1 V100 GPU per node, with multiple nodes connected via Infiniband, CentOS 7 / OpenHPC installed, and job control done via SLURM.) It was quite disappointing to find out that the framework didn't support Infiniband communication at all, since these interconnects are really costly to have (for good reason!). I also found allocating workers based on lower-level details like affinity/NUMA very cumbersome, since the API assumes you want to "auto-assign" your workers instead of "pinning" them manually for the highest performance. (The last time I used RLlib I looked at placement groups to do this but found them too confusing.) Running your environments NUMA-aware can be crucial for performance when you're running heavy custom-made environments in C++. I did some experiments and found that parallelizing the environment on the C++ side (via threading) on each NUMA node was much faster than blindly running one process per physical CPU core, which is what RLlib defaults to. (You can hack a bit and write your VecEnv on the C++ side, but this breaks lots of assumptions RLlib makes and creates a whole lot of other issues in the code.) With promising solutions like Envpool (https://github.com/sail-sg/envpool) coming up, I think these issues with parallelizing environments can be improved.
- As I've said before, the framework makes simple and established things very easy, but becomes very hard to use when you try to do anything custom, like modifying RL algorithms to fit your research. All I needed to do was modify the PPO algorithm to run a custom learning step inside each epoch, and I still found it surprisingly hard. Using the whole declarative "Observable-like" API approach to write RL code in Python was incredibly painful, since you have no way to debug your code and no idea whether it is correct until you run the whole RL pipeline and, 30 minutes into training, get a strange TypeError. (I got some horror flashbacks from when I was using modern JS and Angular, but in a much worse form.) I get the feeling that the overall codebase is incredibly complex, uses too many weird dark Python metaprogramming tricks, and is a pain to navigate and extend compared to other, much cleaner solutions like Stable Baselines 3 (they aren't as "general" a solution as RLlib, but can be modified more easily towards one's needs). Maybe my needs were a bit special, so it might have been much better if I had hand-rolled my PPO implementation with torch.distributed... (if I just had more time...)
But still, your framework did help tremendously in our research; we wouldn't have finished the paper without it. These are just some lamentations from a former grad student who was struggling with these issues some years ago. (I'm not doing any reinforcement learning nowadays, but many people would certainly benefit from these improvements.)
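To make the "pinning" idea in the comment above concrete, here is a minimal sketch in plain Python. It is not RLlib's API and not what the commenter ran. The helper names `cores_per_node` and `pin_current_process` are hypothetical, and the sketch assumes core IDs are numbered contiguously within each NUMA node, which is common but should be checked with `lscpu` or `numactl --hardware`. The pinning call itself, `os.sched_setaffinity`, is Linux-only.

```python
import os
from typing import List


def cores_per_node(num_cores: int, num_nodes: int) -> List[List[int]]:
    """Split core IDs 0..num_cores-1 into contiguous per-node groups.

    Assumption: cores are numbered contiguously within each NUMA node.
    """
    if num_cores <= 0 or num_nodes <= 0 or num_cores % num_nodes != 0:
        raise ValueError("num_cores must be a positive multiple of num_nodes")
    size = num_cores // num_nodes
    return [list(range(n * size, (n + 1) * size)) for n in range(num_nodes)]


def pin_current_process(cores: List[int]) -> bool:
    """Pin the calling process to `cores`; return False if not possible."""
    if not hasattr(os, "sched_setaffinity"):  # API exists on Linux only
        return False
    # Only keep cores this process is actually allowed to run on.
    target = set(cores) & os.sched_getaffinity(0)
    if not target:
        return False
    os.sched_setaffinity(0, target)
    return True


# The machine described above: 4 sockets x 16 cores = 64 cores per host.
groups = cores_per_node(num_cores=64, num_nodes=4)
```

Each environment worker process would call `pin_current_process(groups[i])` once at startup, so that its threads stay on a single socket.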
Anyhow, the experimentation stage requires a certain discipline and feels tedious at times. But the moment when learning takes off, it feels great, and for me personally, compensates for the tedious phase before.
It's certainly not fun for everyone, but I guess it could be fun for the target audience of the course (ML engineers/Data Scientists).
Regarding frameworks, my experience has been different. I find RLlib to be more modular and adaptable than SB3. But the learning curve is certainly steeper. The biggest differentiating factor for me is production readiness. Assuming that we are learning something in order to actually use it, I would recommend RLlib over SB3. The equation for researchers may be different though.
When all these factors are taken into account, I have encountered situations where Deep RL performed better.
There are also very public examples of this e.g. Google's data center cooling [0] and competitive sailing [1].
[0] https://www.technologyreview.com/2018/08/17/140987/google-ju... [1] https://www.mckinsey.com/business-functions/mckinsey-digital...
These courses teach you how to call a library and use an API. You get nearly the same thing from just looking at the docs. Please don't say you "know RL" after this.
I personally learned DRL from David Silver's course and Sutton & Barto back in the day. They were the only good resources around and I liked them very much. But I think that with the advent of high-level frameworks in DRL, there are better learning paths.
I do intend to teach the theory/math in a later installment of this series, but I wanted to do it by showing students how to implement the various classes of algorithms e.g. Q-learning (DQN/Rainbow), policy gradients (PPO) and model-based (AlphaZero) using RLlib. This would kill two birds with one stone: you can simultaneously pick up the theory/math and the lower level API of the tool that you will be using in the future anyway.
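For readers wondering what "implementing Q-learning" looks like at its smallest, here is a tabular sketch in plain Python. It is not taken from the course and does not use RLlib. The five-state chain task, the function names, and the hyperparameter values are all made up for illustration. DQN replaces the table with a neural network but keeps the same update target.

```python
import random

N_STATES = 5      # states 0..4; state 4 is the goal (hypothetical toy task)
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic chain dynamics: reward 1.0 only on reaching the goal."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def greedy_action(q_row, rng):
    """Pick a highest-valued action, breaking ties at random."""
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            # Epsilon-greedy exploration.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy_action(q[state], rng)
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action,
            # with no bootstrapping past a terminal state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
policy = [greedy_action(row, random.Random(0)) for row in q_table[:-1]]
print(policy)
```

After training, the greedy policy should pick "right" (action 1) in every non-terminal state, since that is the only way to reach the reward.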
One suggestion: Instead of naming all the Jupyter notebooks "coding_exercise.ipynb", maybe name them differently? That way, they won't overwrite the previous download.
I hope you enjoy the course over the weekend.
I haven't looked deeply enough, but does this course use a higher-level 'package' such as OpenAI Gym or teach at a lower level? (Is lower-level stuff even possible...)
The situation looks different for Deep RL algorithms. You can implement them from scratch yourself using TensorFlow or any similar library. Otherwise, you could just use a higher-level library like RLlib, which implements the algorithms using modular components and exposes hyperparameters as configuration parameters.
In many real world use cases, all one needs to do is to use RLlib's implementation and then tune the hyperparameters. In that way RLlib is to Deep RL what Keras is to Deep Learning.
This course uses RLlib. Does that answer your question?
This is apparently happening after Teachable updated their video player. Earlier, they used Wistia. Now they use Hotmart.
I have informed Teachable about this issue. They said they will look into it.
The current workaround would be to use Chrome or Firefox (with tracking protection set to a level below "strict").
I’m a software engineer (non-ML) currently working at a big tech company that does ML and has a fair number of open roles in ML, and I’ve wondered: is ML the sort of thing you could jump into a team and learn on the job? Or do you really need to take some courses, read some books, or even get a degree?
I got a CS/Math bachelors but it’s been nigh on a decade and my higher level math is rusty. Curious on people’s thoughts here.
You'd need to (self) study & learn how to train a network, eg course or book or articles?
Hmm, isn't that what this HN post is about :-) the course: https://courses.dibya.online/p/fastdeeprl, 4 hours it says, self study
I think often the most challenging part isn't the ML, but gathering training data and cleaning and preparing it so the ML has something to learn from
\begin{figure}
\includegraphics<2>[width=0.35\textwidth]{images/shop/1.png}%
\includegraphics<3>[width=0.35\textwidth]{images/shop/2.png}%
\end{figure}
The % sign is important: it maintains the correct positioning of the images.
[0] https://www.udemy.com/course/drawing-for-trainers-leaders-an...
https://www.manning.com/books/grokking-deep-reinforcement-le...
Also, any tips for finding a study group for learning about large language models? I can’t seem to self-motivate.
Transformers have been used in Deep RL for months at least.
Try these: https://scholar.google.com/scholar?q=transformer+deep+reinfo...
[1] https://www.deepmind.com/learning-resources/introduction-to-...
No code, coding assignments, or math problems.
Very little RoI.
I watched them all from start to finish. I had a superficial, shallow "understanding" but no real knowledge.
The best (very short) book for learning Deep RL is the one by Zai and Brown from Manning.
And keep the classic Sutton, Barto near. That's it.
If you want a video course that closely follows the book with quizzes and assignments, check out UofAlberta's MOOC on Coursera.
(Hugging Face also has a new Deep RL course taught by Simonini. You could check that out, but I haven’t seen it.)
Sutton and Barto is the best start for foundations. Start there.
I am currently working on a project where I need to use RLlib for a capacity planning problem. Looks like I will learn a thing or two over the weekend.
I will eventually need to use a custom environment, so it's great to see it's included in your roadmap. Most courses I have seen totally ignored that. Fancy Atari envs are great for practice and have wow factor, but you need a custom environment to do anything resembling real work.
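As a rough illustration of what a custom environment involves, here is a toy sketch in plain Python. Everything in it is hypothetical: the class name `ToyCapacityEnv`, the cost weights, and the uniform demand model are invented for the example and are not taken from the course. It mirrors the `reset()`/`step()` shape of a Gymnasium-style environment (five-value return from `step`) but deliberately does not import gym, so it runs on its own. For use with RLlib, you would typically subclass the Gym environment base class and declare observation and action spaces as well.

```python
import random


class ToyCapacityEnv:
    """Hypothetical capacity-planning toy: choose how many servers to run.

    Observation: demand seen in the previous hour.
    Action: number of servers to provision for the coming hour.
    """

    def __init__(self, max_servers=10, horizon=24, seed=0):
        self.max_servers = max_servers
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0

    def _sample_demand(self):
        # Assumption: demand is uniform on 0..max_servers.
        return self.rng.randint(0, self.max_servers)

    def reset(self):
        self.t = 0
        return self._sample_demand(), {}

    def step(self, action):
        if not 0 <= action <= self.max_servers:
            raise ValueError("action out of range")
        demand = self._sample_demand()
        unmet = max(demand - action, 0)  # requests we failed to serve
        idle = max(action - demand, 0)   # servers we paid for but wasted
        reward = -(5.0 * unmet + 1.0 * idle)
        self.t += 1
        truncated = self.t >= self.horizon  # fixed-length episode
        return demand, reward, False, truncated, {}


env = ToyCapacityEnv()
obs, info = env.reset()
total = 0.0
done = False
while not done:
    # Naive baseline policy: provision exactly the last observed demand.
    obs, reward, terminated, truncated, info = env.step(obs)
    total += reward
    done = terminated or truncated
print(total)
```

The asymmetric cost (unmet demand is penalized five times as much as idle capacity) is the kind of domain detail that off-the-shelf Atari environments never force you to think about.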
Would I need a beefy GPU for the coding challenges?