When testing a QLoRA-finetuned Gemma2 2B on Linux, it just generates a repeated character #146
Comments
The outputs from the gemma-q4_k and gemma-q4_0 models provided by the mllm team are correct. You can download the Gemma model params from the repository at https://huggingface.co/mllmTeam/gemma-2b-mllm/tree/main to test whether your mllm build has been compiled correctly. Additionally, did you modify the Gemma vocabulary when finetuning? If so, you will need to provide the correct vocabulary file to demo_gemma.
Have you combined the LoRA weights with the base model weights? IIRC, some LoRA fine-tuning frameworks offer utilities to facilitate this process. For instance, the alpaca-lora framework might have relevant functions. It's also possible that BitsAndBytes provides similar functionality. You can find more information in the alpaca-lora repository, specifically in the file export_hf_checkpoint.py, around line 39 (link to GitHub).
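For illustration, merging a LoRA adapter usually means folding the low-rank update into the base weight, W' = W + (alpha / r) * B A, so that the exported checkpoint contains plain dense weights. Below is a minimal pure-Python sketch of that arithmetic on nested lists. The function names `matmul` and `merge_lora` are hypothetical and are not taken from alpaca-lora, BitsAndBytes, or mllm; in practice a library helper (for example `merge_and_unload()` on a PEFT model) would do this on real tensors.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists (rows of a by columns of b)."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def merge_lora(w, lora_a, lora_b, alpha, r):
    """Return W + (alpha / r) * (B @ A).

    Shapes (illustrative assumption): w is out x in, lora_b is out x r,
    lora_a is r x in.
    """
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


# Tiny example with rank r = 1 on a 2 x 2 weight.
base = [[1.0, 0.0], [0.0, 1.0]]
a = [[1.0, 2.0]]      # r x in
b = [[1.0], [0.0]]    # out x r
merged = merge_lora(base, a, b, alpha=2.0, r=1)
print(merged)
```

If the merge step is skipped, the quantizer only sees the base weights (or fails to find the tensors it expects), so it is worth confirming the merged checkpoint still works in transformers before converting it.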
Thanks for your help.
The Gemma implementation in mllm is v1.1. Gemma 2 shares a similar architectural foundation with the original Gemma models, but it introduces features such as logit soft-capping, so you need to modify the modeling_gemma.hpp file (link to the file). cc @yirongjie, please add Gemma2 to our todo list.
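As a rough illustration of what logit soft-capping does: each logit x is replaced by cap * tanh(x / cap), which keeps values inside (-cap, cap) while leaving small logits almost unchanged. The sketch below is plain Python and is not mllm code; the function name `soft_cap` is hypothetical. The cap values 50.0 (attention logits) and 30.0 (final logits) are what I recall from the Hugging Face Gemma 2 config, so treat them as assumptions and check the config of the model you are converting.

```python
import math


def soft_cap(logits, cap):
    """Apply soft-capping: cap * tanh(x / cap) for every logit."""
    return [cap * math.tanh(x / cap) for x in logits]


# Assumed cap values; verify against the model's config file.
ATTN_CAP = 50.0
FINAL_CAP = 30.0

raw = [-1000.0, -1.0, 0.0, 1.0, 1000.0]
capped = soft_cap(raw, FINAL_CAP)
print(capped)
```

A runtime that implements only the Gemma v1.1 graph skips this step (among other Gemma 2 changes), which is one plausible reason a Gemma 2 checkpoint would produce degenerate output there.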
I fine-tuned Gemma2 2B Instruct with BitsAndBytes (int4). It works when tested with transformers.
Then I followed the guide to build mllm and quantize the model for Linux.
But when I test the finetuned model with the demo_gemma example, it always outputs a single repeated character (a Korean character).
Has anyone tried this?
Or am I doing something wrong?