Models: Add Yi-1.5-9B-Chat-16K #2750
base: main

Conversation
Resolves #176

Adds model support for [Yi-1.5-9B-Chat-16K](https://huggingface.co/GPT4All-Community/Yi-1.5-9B-Chat-16K-GGUF).

## Description

It is a bilingual model with, as of the date of writing, strong results in benchmarks (for its parameter size). It supports a context of up to 16K.

- Minimum required version: GPT4All 3.1
- The model was trained on English and Chinese.
- License: Apache 2.0
- Quantization: Q4_0

## Personal Impression

The model strikes me as very task focused, which is why I chose `Below is an instruction that describes a task. Write a response that appropriately completes the request.` as the system prompt. I have seen refusals when it was tasked with certain things; it has the typical "knows better than the user" vibe and seems to be finetuned to act as a professional assistant. For instance, roleplay caused refusals, but writing a cover letter was no problem. Its long context and the quality of its responses make it a good model, if you can bear its alignment or if your use case falls within the model's originally intended uses. It will mainly appeal to English- and Chinese-speaking users.

Signed-off-by: ThiloteE <[email protected]>
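For context, the new entry would look roughly like the sketch below. This is illustrative only: the field names mirror existing GPT4All models.json entries, the hash, file size, and order values are placeholders rather than the actual values in this PR, and the ChatML-style prompt template is assumed from Yi-1.5's chat format.

```json
{
  "order": "zzz-placeholder",
  "md5sum": "placeholder",
  "name": "Yi-1.5-9B-Chat-16K",
  "filename": "Yi-1.5-9B-Chat-16K-Q4_0.gguf",
  "filesize": "placeholder",
  "requires": "3.1",
  "ramrequired": "8",
  "parameters": "9 billion",
  "quant": "q4_0",
  "type": "Yi",
  "description": "Bilingual (English/Chinese) instruction-following model with a context of up to 16K. Apache 2.0 license.",
  "url": "https://huggingface.co/GPT4All-Community/Yi-1.5-9B-Chat-16K-GGUF/resolve/main/Yi-1.5-9B-Chat-16K-Q4_0.gguf",
  "promptTemplate": "<|im_start|>user\n%1<|im_end|>\n<|im_start|>assistant\n%2<|im_end|>\n",
  "systemPrompt": "Below is an instruction that describes a task. Write a response that appropriately completes the request."
}
```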
Tested file hash, size, prompts, and download link. Seems like it's all good.
Is this model for Mainland Chinese or Taiwanese? I'd like our maintainers of the translations for these languages to have a look.
Also, we really need a sections key in our models.json so we don't just have one huge list of models; then we could overhaul the GUI to provide sections for models that are more specialized, right?
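Purely as an illustration of that idea (no such key exists in models.json today; the key name and value are hypothetical), each entry could carry something like:

```json
{
  "name": "Yi-1.5-9B-Chat-16K",
  "section": "Bilingual (English/Chinese)"
}
```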
Unfortunately, I am not fluent in Chinese. The original model card does not specify whether it is Mainland or Taiwanese.
@supersonictw can you comment on this model's Chinese abilities? Is it Traditional Chinese or Simplified? Wondering if we should advertise its purported bilingual abilities.
Yi is a Simplified Chinese-based model.
The model is very friendly for people in Mainland China. People in Taiwan prefer to use LLaMa (or ChatGPT-4, lol 🤪); it is more general and more widely accepted. The best Traditional Chinese model might be "TaiwanLLM", but it is not really needed; the LLaMa model family is useful enough for us.
This one is larger than the Qwen models, so I think it should probably be an addition, right?
If this model is not good enough, I can also try to find a finetune of it, but it is hard to find good finetunes nowadays, since the Hugging Face Open LLM Leaderboard 2 has been quite inactive for weeks/months now. My motivation for supporting this model specifically: its long 16K context and strong benchmark results for its parameter size.
I will add a PR for Qwen2 as well, and maybe one of its finetunes too; I think there are more finetunes for Qwen2.