The Obligatory GPT-3 Post
Astral Codex Ten Podcast - Podcast by Jeremiah
https://slatestarcodex.com/2020/06/10/the-obligatory-gpt-3-post/

I.

I would be failing my brand if I didn't write something about GPT-3, but I'm not an expert and discussion is still in its early stages. Consider this a summary of some of the interesting questions I've heard posed elsewhere, especially comments by gwern and nostalgebraist.

Both of them are smart people who I broadly trust on AI issues, and both have done great work with GPT-2. Gwern has gotten it to write poetry, compose music, and even sort of play some chess; nostalgebraist has created nostalgebraist-autoresponder (a Tumblr written by GPT-2 trained on nostalgebraist's own Tumblr output). The two of them disagree pretty strongly with each other on the implications of GPT-3. I don't know enough to resolve that disagreement, so this will be a kind of incoherent post, one that I hope will stimulate some more productive comments.

So: OpenAI has released a new paper, Language Models Are Few-Shot Learners, introducing GPT-3, the successor to the wildly successful language-processing AI GPT-2.

GPT-3 doesn't have any revolutionary new advances over its predecessor. It's just much bigger. GPT-2 had 1.5 billion parameters; GPT-3 has 175 billion. The researchers involved are very open about it being the same thing, only bigger. Their research goal was to test how GPT-like neural networks scale.

Before we get into the weeds, let's get a quick gestalt impression of how GPT-3 does compared to GPT-2. Here's a sample of GPT-2 trying to write an article: