Models: Add Yi-1.5-9B-Chat-16K #2750

ThiloteE · 2024-07-26T23:41:57Z

Resolves #176

Adds model support for Yi-1.5-9B-Chat-16K

Description of Model

It is a bilingual model that, at the time of writing, achieves strong benchmark results for its parameter size. It supports a context of up to 16K tokens.

The model was trained/finetuned on English and Chinese.
License: Apache 2.0

Personal Impression:

I got the impression that the model is very task-focused, which is why I chose `Below is an instruction that describes a task. Write a response that appropriately completes the request.` as the system prompt. I have seen refusals when it was tasked with certain things; it has the typical "knows better than the user" vibe and seems to be finetuned with a particular alignment. For instance, roleplay caused refusals, but tasking it to write a cover letter was no problem. Its long context and the quality of its responses make it a good model, if you can bear its alignment or your use case happens to fall within the model's originally intended use cases. It will mainly appeal to English- and Chinese-speaking users.

Checklist before requesting a review

I have performed a self-review of my code.
If it is a core feature, I have added thorough tests.
I have added thorough documentation for my code.
I have tagged PR with relevant project labels. I acknowledge that a PR without labels may be dismissed.
If this PR addresses a bug, I have provided both a screenshot/video of the original bug and the working solution.

Adds model support for [Yi-1.5-9B-Chat-16K](https://huggingface.co/GPT4All-Community/Yi-1.5-9B-Chat-16K-GGUF)

## Description:

It is a bilingual model that, at the time of writing, achieves strong benchmark results for its parameter size. It supports a context of up to 16K tokens.

- Minimum required version: GPT4All 3.1.
- The model was trained on English and Chinese.
- License: Apache 2.0
- Q4_0

## Personal Impression:

I got the impression that the model is very task-focused, which is why I chose `Below is an instruction that describes a task. Write a response that appropriately completes the request.` as the system prompt. I have seen refusals when it was tasked with certain things; it has the typical "knows better than the user" vibe and seems to be finetuned for being a professional assistant. For instance, roleplay caused refusals, but writing a cover letter was no problem. Its long context and the quality of its responses make it a good model, if you can bear its alignment. It will mainly appeal to English- and Chinese-speaking users.

Signed-off-by: ThiloteE <[email protected]>

3Simplex

Tested the file hash, size, prompts, and download link. Seems like it's all good.
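For anyone who wants to repeat the hash and size checks on a downloaded model file, a minimal sketch could look like the following. It assumes the expected values are an MD5 hex digest and a size in bytes; the helper name `verify_download` and its parameters are hypothetical and are not taken from the GPT4All code base.

```python
import hashlib
import os


def verify_download(path, expected_md5, expected_size):
    """Return True if the file at `path` has the expected byte size and MD5 digest.

    Hypothetical helper for illustration; not part of GPT4All.
    """
    # Cheap check first: a wrong size means the download is incomplete or wrong.
    if os.path.getsize(path) != expected_size:
        return False

    # Hash in 1 MiB chunks so multi-gigabyte model files are not loaded into memory.
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)

    return digest.hexdigest() == expected_md5.lower()
```

Checking the size before hashing keeps the common failure case (a truncated download) fast.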


manyoso · 2024-07-28T14:07:44Z

Is this model for Mainland Chinese or Taiwanese? I'd like the maintainers of our translations for these to have a look.

manyoso · 2024-07-28T14:08:34Z

Also, we really need a sections key in our models.json so that we don't just have one huge list of models. Then we could overhaul the GUI to provide sections for models that are more specialized, right?
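To make the idea concrete, here is a rough sketch of how model entries could be grouped by such a key before the GUI renders them. The `section` key, the sample entries, and the `group_by_section` helper are all hypothetical; models.json has no such key today, and the real schema may end up looking different.

```python
from collections import defaultdict


def group_by_section(models, default="General"):
    """Group model names by a hypothetical `section` key, preserving list order."""
    grouped = defaultdict(list)
    for entry in models:
        # Entries without a section fall back to a catch-all group.
        grouped[entry.get("section", default)].append(entry["name"])
    return dict(grouped)


# Placeholder entries, for illustration only.
models = [
    {"name": "Model A", "section": "Specialized"},
    {"name": "Model B"},
    {"name": "Model C", "section": "Specialized"},
]
sections = group_by_section(models)
```

A fallback group keeps existing entries valid, so the key could be introduced without touching every model in the list at once.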

ThiloteE · 2024-07-28T15:04:14Z

Unfortunately, I am not fluent in Chinese. The original model card does not specify whether it targets Mainland or Taiwanese Chinese.

manyoso · 2024-07-29T13:56:45Z

@supersonictw can you comment on this model's Chinese abilities? Is it Traditional Chinese or Simplified? Wondering if we should advertise its purported bilingual abilities.

supersonictw · 2024-07-29T14:26:02Z

Yi is a Simplified Chinese based model.
Its developer goes by "零一万物" (01.ai).
The model is mainly aimed at Mainland China, though the company was founded by a Taiwanese scientist.

supersonictw · 2024-07-29T14:32:38Z

The model is very friendly for people in Mainland China.
But if you want to add more models for Mainland China, ~~it's better to add Qwen/Qwen2 models also~~ oh, I found they're already added, wow #2759.

People in Taiwan prefer to use LLaMa (or ChatGPT-4, lol 🤪); it's more general and widely accepted. The best Traditional Chinese model might be "TaiwanLLM", but it's not really required. The LLaMa model family is useful enough for us.

manyoso · 2024-08-01T11:54:39Z

> The model is very friendly for people in Mainland China. But if you want to add more models for Mainland China, ~~it's better to add Qwen/Qwen2 models also~~ oh, I found they're already added, wow #2759.
>
> People in Taiwan prefer to use LLaMa (or ChatGPT-4, lol 🤪); it's more general and widely accepted. The best Traditional Chinese model might be "TaiwanLLM", but it's not really required. The LLaMa model family is useful enough for us.

This one is larger than the Qwen models so I think it should probably be an addition, right?

ThiloteE · 2024-08-01T17:47:44Z

If this model is not good enough, I can also try to find a finetune of it, but it is hard to find good finetunes nowadays, since the Hugging Face Open LLM Leaderboard 2 has been quite inactive for weeks/months now.

My motivation for supporting this model specifically:

- I think that having a model in a language that billions of people speak is a good idea.
- The model claims a larger context than most of the models GPT4All currently supports. If Meta's llama-3.1-8b-instruct-128k turns out not to do well with larger context, or is still buggy in the next release of GPT4All, then at least we have this one for longer context.
- It is relatively high on the benchmarks.
- None of this model's quants out in the wild are compatible with GPT4All, so adding support for this model will save many users from trying bad quants.
- I had to start "somewhere" with fixing models and opening pull requests to GPT4All. I was planning to add more models, so this was just the first one I opened a pull request for.
- The model is from an organisation that is backed by a large corporation (Alibaba Cloud), so its reputation is high.
- The model's license is Apache 2.0.
- The model is still recent. Not too old yet.

ThiloteE · 2024-08-01T19:44:08Z

I will add a PR for Qwen2 as well, and maybe for one of its finetunes too. I think there are more finetunes for Qwen2.

ThiloteE · 2024-08-01T19:50:48Z

ThiloteE added labels: models, models.json ("This requires a change to the official model list.") Jul 26, 2024

ThiloteE changed the title ~~Models3.json Add Yi-1.5-9B-Chat-16K~~ Models: Add Yi-1.5-9B-Chat-16K Jul 26, 2024

ThiloteE added 2 commits July 27, 2024 01:58

Point to correct URL

0dd91cd

Signed-off-by: ThiloteE <[email protected]>

Fix empty space

666de6b

Signed-off-by: ThiloteE <[email protected]>

3Simplex reviewed Jul 27, 2024


ThiloteE requested a review from manyoso July 27, 2024 00:37

Fix wrong space in Systemprompt

c2cbe9e

Signed-off-by: ThiloteE <[email protected]>

ThiloteE mentioned this pull request Jul 27, 2024

Models: Add Qwen2-1.5B-Instruct #2759

Merged

5 tasks

ThiloteE mentioned this pull request Jul 28, 2024

[Feature] Disentangle "Explore models page" by adding tabs or additional preferences in search bar #2525

Open
