Advanced Reinforcement Learning in Python: cutting-edge DQNs

English | MP4 | AVC 1280×720 | AAC 44KHz 2ch | 102 lectures (8h 26m) | 2.47 GB

Build Artificial Intelligence (AI) agents using Deep Reinforcement Learning and PyTorch: From basic DQN to Rainbow DQN

This is the most complete Advanced Reinforcement Learning course on Udemy. In it, you will learn to implement some of the most powerful Deep Reinforcement Learning algorithms in Python using PyTorch and PyTorch Lightning. You will implement, from scratch, adaptive algorithms that solve control tasks based on experience. You will learn to combine these techniques with Neural Networks and Deep Learning methods to create adaptive Artificial Intelligence agents capable of solving decision-making tasks.

This course will introduce you to the state of the art in Reinforcement Learning techniques. It will also prepare you for the next courses in this series, where we will explore other advanced methods that excel in other types of task.

The course is focused on developing practical skills. Therefore, after learning the most important concepts of each family of methods, we will implement one or more of their algorithms in Jupyter notebooks, from scratch.

Leveling modules:

  • Refresher: The Markov decision process (MDP).
  • Refresher: Q-Learning.
  • Refresher: Brief introduction to Neural Networks.
  • Refresher: Deep Q-Learning.

Advanced Reinforcement Learning:

  • PyTorch Lightning.
  • Hyperparameter tuning with Optuna.
  • Reinforcement Learning with image inputs.
  • Double Deep Q-Learning.
  • Dueling Deep Q-Networks.
  • Prioritized Experience Replay (PER).
  • Distributional Deep Q-Networks.
  • Noisy Deep Q-Networks.
  • N-step Deep Q-Learning.
  • Rainbow Deep Q-Learning.

What you’ll learn

  • Master some of the most advanced Reinforcement Learning algorithms.
  • Learn how to create AIs that can act in a complex environment to achieve their goals.
  • Create advanced Reinforcement Learning agents from scratch using Python’s most popular tools (PyTorch Lightning, OpenAI Gym, Optuna).
  • Learn how to perform hyperparameter tuning (choosing the best experimental conditions for our AI to learn).
  • Fundamentally understand the learning process for each algorithm.
  • Debug and extend the algorithms presented.
  • Understand and implement new algorithms from research papers.

Table of Contents

1 Introduction
2 Reinforcement Learning series
3 Google Colab
4 Where to begin
5 Complete code

Refresher: The Markov Decision Process (MDP)
6 Module overview
7 Elements common to all control tasks
8 The Markov decision process (MDP)
9 Types of Markov decision process
10 Trajectory vs episode
11 Reward vs Return
12 Discount factor
13 Policy
14 State values v(s) and action values q(s,a)
15 Bellman equations
16 Solving a Markov decision process

Refresher: Q-Learning
17 Module overview
18 Temporal difference methods
19 Solving control tasks with temporal difference methods
20 Q-Learning
21 Advantages of temporal difference methods

Refresher: Brief introduction to Neural Networks
22 Module overview
23 Function approximators
24 Artificial Neural Networks
25 Artificial Neurons
26 How to represent a Neural Network
27 Stochastic Gradient Descent
28 Neural Network optimization

Refresher: Deep Q-Learning
29 Module overview
30 Deep Q-Learning
31 Experience replay
32 Target Network

PyTorch Lightning
33 PyTorch Lightning
34 Link to the code notebook
35 Introduction to PyTorch Lightning
36 Create the Deep Q-Network
37 Create the policy
38 Create the replay buffer
39 Create the environment
40 Define the class for the Deep Q-Learning algorithm
41 Define the play_episode() function
42 Prepare the data loader and the optimizer
43 Define the train_step() method
44 Define the train_epoch_end() method
45 [Important] Lecture correction
46 Train the Deep Q-Learning algorithm
47 Explore the resulting agent

Hyperparameter tuning with Optuna
48 Hyperparameter tuning with Optuna
49 Link to the code notebook
50 Log average return
51 Define the objective function
52 Create and launch the hyperparameter tuning job
53 Explore the best trial

Double Deep Q-Learning
54 Maximization bias and Double Deep Q-Learning
55 Link to the code notebook
56 Create the Double Deep Q-Learning algorithm
57 Check the resulting agent

Dueling Deep Q-Networks
58 Dueling Deep Q-Networks
59 Link to the code notebook
60 Create the dueling DQN
61 Observation and reward normalization
62 Create the environment – Part 1
63 Create the environment – Part 2
64 Implement Deep Q-Learning
65 Check the resulting agent

Prioritized Experience Replay
66 Prioritized Experience Replay
67 Link to the code notebook
68 DQN for visual inputs
69 Prioritized Experience Replay Buffer
70 Create the environment
71 Implement the Deep Q-Learning algorithm with Prioritized Experience Replay
72 Errata Lecture 70
73 Launch the training process
74 Check the resulting agent

Noisy Deep Q-Networks
75 Noisy Deep Q-Networks
76 Link to the code notebook
77 Create the noisy linear layer class
78 Create the Deep Q-Network
79 Create the policy
80 Create the environment
81 Train the algorithm
82 Check the results

N-step Deep Q-Learning
83 N-step Deep Q-Learning
84 Link to the code notebook
85 N-step Deep Q-Learning – Part 1
86 N-step Deep Q-Learning – Part 2
87 N-step Deep Q-Learning – Part 3
88 Check results

Distributional Deep Q-Networks
89 Distributional Deep Q-Networks
90 Link to the code notebook
91 Create the distributional DQN – Part 1
92 Create the distributional DQN – Part 2
93 Create the policy
94 Create the environment
95 Adapt the algorithm: Constructor and sample function
96 Adapt the algorithm: Training step – Part 1
97 Adapt the algorithm: Training step – Part 2
98 Adapt the algorithm: Training step – Part 3
99 Adapt the algorithm: Training step – Part 4
100 Launch the training process

Final steps
101 Next steps
102 Next steps