Foundations of Deep Reinforcement Learning


Deep Reinforcement Learning equips AI agents with the ability to learn from their own trial and error. Success stories include learning to play Atari games, Go, and Dota 2, as well as robots learning to run, jump, and manipulate objects. This tutorial will cover the foundations of Deep Reinforcement Learning, including MDPs, DQN, Policy Gradients, TRPO, PPO, DDPG, SAC, TD3, and model-based RL, as well as current research frontiers.

Session Outline:
Module 1: Introduction to Markov Decision Processes (MDPs) and Exact Solution Methods (which only apply to small problems)
Module 2: Deep Q Networks and Application to Atari
Module 3: Policy Gradients, Trust Region Policy Optimization (TRPO), Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Twin Delayed Deep Deterministic Policy Gradient (TD3), Soft Actor-Critic (SAC), and Application to Robot Learning
Module 4: Model-based Reinforcement Learning
Module 5: Current Research Frontier

Background Knowledge:
Familiarity with Deep Supervised Learning is a major plus; general familiarity with calculus is also a plus.


Professor Pieter Abbeel is Director of the Berkeley Robot Learning Lab and Co-Director of the Berkeley Artificial Intelligence Research (BAIR) Lab. Abbeel's research strives to build ever more intelligent systems, and his lab pushes the frontiers of deep reinforcement learning and deep unsupervised learning, especially as they pertain to robotics. Abbeel's Intro to AI class has been taken by over 100K students through edX, and his Deep Unsupervised Learning materials are standard references for AI researchers. Abbeel has founded several companies, including Gradescope (AI to help instructors with grading homework, projects, and exams) and Covariant (AI for robotic automation of warehouses and factories). He advises many AI and robotics start-ups and is a frequently sought-after speaker worldwide for C-suite sessions on AI future and strategy. Abbeel has received many awards and honors, including the ACM Prize, IEEE Fellow, PECASE, NSF-CAREER, ONR-YIP, AFOSR-YIP, DARPA-YFA, TR35, and 10+ best paper awards/finalists. His work is frequently featured in the press, including the New York Times, Wall Street Journal, BBC, Rolling Stone, Wired, and Tech Review.

Open Data Science



