AutoLLM

AutoLLM is a tool for automatically updating model parameters by merging the parameters obtained from different data and different training methods. We believe the key to intelligent models lies in continuous learning and updating. By generating new data and combining various training methods (e.g., pretraining, supervised fine-tuning (SFT), and Reinforcement Learning from Human Feedback (RLHF)), AutoLLM automatically adjusts the mixing ratios of the resulting model parameters, continually improving model performance in practical applications.

Main Features and Steps

Step 1: Data Generation and Segregation

Generate a batch of data from the source database (data can also be drawn from multiple sources such as user queries, evaluation bad cases, etc.).
Divide the generated data into multiple small chunks and reserve a portion as the evaluation dataset.
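
The chunking and hold-out step above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function name split_chunks, the eval_fraction default, and the random round-robin assignment are all assumptions made for the example. In practice, chunks may instead be grouped by data type.

```python
import random


def split_chunks(records, num_chunks, eval_fraction=0.1, seed=0):
    """Hold out an evaluation set, then split the rest into num_chunks chunks.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if num_chunks < 1:
        raise ValueError("num_chunks must be at least 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle for a given seed

    # Reserve the first slice of the shuffled data for evaluation.
    eval_size = max(1, int(len(shuffled) * eval_fraction))
    eval_set = shuffled[:eval_size]
    train = shuffled[eval_size:]

    # Round-robin assignment keeps chunk sizes within one record of each other.
    chunks = [train[i::num_chunks] for i in range(num_chunks)]
    return chunks, eval_set


if __name__ == "__main__":
    chunks, eval_set = split_chunks(range(100), num_chunks=4)
    print([len(c) for c in chunks], len(eval_set))
```

The evaluation set is split off before chunking so that no training chunk overlaps with the data used to score the merged model.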

Step 2: Data Chunk Training

Use SFT (or another training method such as DPO) to train on each data chunk separately.
We believe different datasets suit different training methods; the training method or loss function should therefore be chosen based on the characteristics of the data. For instance, SFT is usually used to optimize expressive style, while DPO is used to penalize negative examples.
When the best training method cannot be determined in advance, we recommend running multiple training methods in parallel. In the subsequent stage, the optimization algorithm will automatically select the necessary delta parameters.
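
To make the term "delta parameters" concrete, here is a minimal sketch. It assumes a delta is the element-wise difference between a fine-tuned model's parameters and the base model's parameters. This is the common meaning of the term, but this document does not define it, so treat it as an assumption. The function compute_delta and the toy parameter dictionaries are hypothetical, and real models would use tensors rather than Python lists.

```python
def compute_delta(base_params, tuned_params):
    """Return tuned - base for every named parameter.

    Parameters are represented as dicts mapping a name to a flat list of floats.
    Hypothetical helper; the delta definition is an assumption.
    """
    if base_params.keys() != tuned_params.keys():
        raise ValueError("base and tuned models must have the same parameter names")
    delta = {}
    for name, base_values in base_params.items():
        tuned_values = tuned_params[name]
        if len(base_values) != len(tuned_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        delta[name] = [t - b for b, t in zip(base_values, tuned_values)]
    return delta


# Toy base model and two models trained on the same chunk with different methods.
base = {"layer.weight": [0.5, -1.0, 2.0]}
sft_model = {"layer.weight": [0.75, -1.0, 1.5]}
dpo_model = {"layer.weight": [0.5, -0.5, 2.0]}

# One delta per (chunk, training method) pair; the later stage chooses among them.
deltas = {
    "chunk_1/sft": compute_delta(base, sft_model),
    "chunk_1/dpo": compute_delta(base, dpo_model),
}
```

Keeping one delta per training method is what allows the later optimization stage to assign a low weight to a method that does not help.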

Step 3: Data Mixing

Mix the data at the parameter level by merging the delta parameters obtained in Step 2, each with its own weight. The mixing ratios and the choice of training strategy are determined automatically by the optimization algorithm, which aims to maximize the score on the evaluation dataset.
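
The weighted merge and the weight search can be sketched as below. This is an illustration under stated assumptions, not the project's code: it assumes the merged parameters are the base parameters plus a weighted sum of deltas, and it uses plain random search as a simple stand-in for the Bayesian optimization that the project installs as a dependency. The names merge, search_weights, and score_fn are hypothetical, and score_fn stands in for evaluating the merged model on the held-out evaluation set.

```python
import random


def merge(base, deltas, weights):
    """Return base + sum_i weights[i] * deltas[i], element-wise over flat lists."""
    merged = list(base)
    for delta, weight in zip(deltas, weights):
        for j, value in enumerate(delta):
            merged[j] += weight * value
    return merged


def search_weights(base, deltas, score_fn, trials=200, seed=0):
    """Random search over weights in [0, 1]; a stand-in for Bayesian optimization."""
    rng = random.Random(seed)
    # Start from all-zero weights, i.e. the unmodified base model.
    best_weights = [0.0] * len(deltas)
    best_score = score_fn(merge(base, deltas, best_weights))
    for _ in range(trials):
        weights = [rng.uniform(0.0, 1.0) for _ in deltas]
        score = score_fn(merge(base, deltas, weights))
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score


if __name__ == "__main__":
    base = [1.0, 2.0]
    deltas = [[1.0, 0.0], [0.0, 2.0]]
    target = [1.5, 2.5]  # toy "ideal" parameters, used only to define a score

    def score_fn(params):
        # Higher is better: negative squared distance to the toy target.
        return -sum((p - t) ** 2 for p, t in zip(params, target))

    print(search_weights(base, deltas, score_fn))
```

Because the search starts from all-zero weights, the returned score is never worse than that of the base model under the same score_fn.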

Usage Instructions

This project relies on LLaMA-Factory for model fine-tuning. Please download the code into the current folder and install the necessary dependencies:

cd AutoLLM
rm -rf LLaMA-Factory
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e .
pip install bayesian-optimization
cd ..

The sample script builds a toy example from mock data; modify it to suit your needs. To run the script with one command:

cd AutoLLM
sh autollm.sh

Experimental Results

We report the experimental results of using llama3-8b on four predefined delta_data sets as follows:

model	eval_score
raw	0.41
train@all	0.46
train@delta_1	0.26
train@delta_2	0.31
train@delta_3	0.39
train@delta_4	0.12
average_merge	0.16
human_merge	0.51
auto_merge	0.54

The types of the four data sets are: Mathematics, Code, Instruction Following, Instruction Generation.

The (optimal) parameters for various merging methods are as follows:

model	delta_1_weight	delta_2_weight	delta_3_weight	delta_4_weight
average_merge	1	1	1	1
human_merge	0	0	1	0.2
auto_merge	0.11787106318307217	0.5505310957766784	0.4914651404592535	0.14596056082881093

For more results and analysis, readers are encouraged to conduct their own experiments. Discussions are welcome in the issues section.

License

The code in this repository is open-sourced under the Apache-2.0 License.
