John Schulman
TalkRL: The Reinforcement Learning Podcast - A podcast by Robin Ranjit Singh Chauhan
John Schulman, OpenAI cofounder, researcher, and inventor of PPO/TRPO, talks about RL from human feedback, tuning GPT-3 to follow instructions (InstructGPT) and answer long-form questions using the internet (WebGPT), AI alignment, AGI timelines, and more!
