Training language models to follow instructions with human feedback

Created on 2023-04-16T01:04:16-05:00

This card pertains to a resource available on the internet.

Authors: Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, and Ryan Lowe

Year: 2022.

The resulting network is referred to as "InstructGPT."