InstructGPT: Training language models to follow instructions with human feedback

Training language models with RLHF to follow instructions: being helpful, avoiding harm, and not hallucinating.
Alignment
Author: Imad Dabbura

Published: June 23, 2022

#nlp #llm #fine-tuning
