nextGPT

📢 An open-source implementation of a ChatGPT replica that builds the end-to-end pipeline from SFT to RLHF.

  • 🔥 Step 1) SFT: Supervised Fine-tuning
  • 🔥 Step 2) RM: Reward Model
  • 🔥 Step 3) PPO: Proximal Policy Optimization

ChatGPT Diagram
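As a rough illustration of what each of the three steps optimizes, here is a minimal sketch of the standard textbook objectives in plain Python. It is not the nextgpt API: the function names (`sft_loss`, `reward_pair_loss`, `ppo_clipped_objective`) and the default `clip_eps=0.2` are assumptions made for this example, and the actual training code may differ.

```python
import math

def sft_loss(token_log_probs):
    """Step 1 (SFT): mean negative log-likelihood of the reference tokens."""
    return -sum(token_log_probs) / len(token_log_probs)

def reward_pair_loss(reward_chosen, reward_rejected):
    """Step 2 (RM): pairwise ranking loss, -log(sigmoid(r_chosen - r_rejected))."""
    diff = reward_chosen - reward_rejected
    # softplus(-diff) equals -log(sigmoid(diff)); this form avoids overflow.
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))

def ppo_clipped_objective(log_prob_new, log_prob_old, advantage, clip_eps=0.2):
    """Step 3 (PPO): clipped surrogate objective for one action (to be maximized)."""
    ratio = math.exp(log_prob_new - log_prob_old)
    clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
    # Taking the minimum removes any incentive to move the ratio outside the clip range.
    return min(ratio * advantage, clipped * advantage)
```

The reward loss shrinks as the reward model scores the human-preferred response above the rejected one, and the PPO objective caps how much a single update can gain from changing the policy's probabilities.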

Installation

$ pip install nextgpt

Or install from the Git repository to always get the latest version.

$ git clone https://github.com/louiezzang/next-gpt.git
$ cd next-gpt/
$ pip install .
$ cd ../

Examples

See the ChatGPT example.

RLHF

What is RLHF?

The RLHF (Reinforcement Learning from Human Feedback) implementation is powered by Colossal-AI. More details can be found in the blog.
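In many RLHF setups, the reward passed to PPO is the reward model's score minus a penalty for drifting away from the reference (SFT) model. The sketch below shows that common formulation in plain Python; whether this repo uses exactly this form is an assumption, and the name `kl_shaped_reward` and the default `kl_coef=0.1` are hypothetical.

```python
def kl_shaped_reward(rm_score, log_prob_policy, log_prob_ref, kl_coef=0.1):
    """Reward-model score minus a penalty for diverging from the reference model.

    log_prob_policy and log_prob_ref are the log-probabilities that the current
    policy and the frozen reference model assign to the same sampled response.
    """
    # Single-sample estimate of the KL term: the log-ratio of the two models.
    approx_kl = log_prob_policy - log_prob_ref
    return rm_score - kl_coef * approx_kl
```

For example, `kl_shaped_reward(1.0, -0.5, -1.5)` lowers the raw score of 1.0 because the policy assigns the response a higher log-probability than the reference model does.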

The RLHF code was forked and modified from these Git repos.

References