#86 – David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning
David Silver leads the reinforcement learning research group at
DeepMind and was lead researcher on AlphaGo and AlphaZero and co-lead
on AlphaStar and MuZero, along with a lot of other important work in
reinforcement learning.
1 hour 48 minutes
Podcast
Conversations about AI, science, technology, history, philosophy and the nature of intelligence, consciousness, love, and power.
Description
5 years ago
David Silver leads the reinforcement learning research group at
DeepMind and was lead researcher on AlphaGo and AlphaZero and co-lead
on AlphaStar and MuZero, along with a lot of other important work in
reinforcement learning.

Support this podcast by signing up with these sponsors:
- MasterClass: https://masterclass.com/lex
- Cash App - use code "LexPodcast" and download:
- Cash App (App Store): https://apple.co/2sPrUHe
- Cash App (Google Play): https://bit.ly/2MlvP5w

EPISODE LINKS:
Reinforcement learning (book): https://amzn.to/2Jwp5zG

This conversation is part of the Artificial Intelligence podcast. If
you would like more information about this podcast, go to
https://lexfridman.com/ai or connect with @lexfridman on Twitter,
LinkedIn, Facebook, Medium, or YouTube, where you can watch the video
versions of these conversations. If you enjoy the podcast, please rate
it 5 stars on Apple Podcasts, follow on Spotify, or support it on
Patreon.

Here's the outline of the episode. On some podcast players you should
be able to click the timestamp to jump to that time.

OUTLINE:
00:00 - Introduction
04:09 - First program
11:11 - AlphaGo
21:42 - Rule of the game of Go
25:37 - Reinforcement learning: personal journey
30:15 - What is reinforcement learning?
43:51 - AlphaGo (continued)
53:40 - Supervised learning and self play in AlphaGo
1:06:12 - Lee Sedol retirement from Go play
1:08:57 - Garry Kasparov
1:14:10 - Alpha Zero and self play
1:31:29 - Creativity in AlphaZero
1:35:21 - AlphaZero applications
1:37:59 - Reward functions
1:40:51 - Meaning of life