Adds a models3.json entry for Qwen2-1.5B-Instruct
Description of Model
It is a tiny bilingual (English/Chinese) model that, at the time of writing, posts very strong benchmark results for its parameter size. It supports a context of up to 32768 tokens. Because of its small size it responds very quickly, even when doing inference on CPU. This LLM is literally for everyone: since the model fits into 4 GB of RAM (just barely, if the operating system and other apps also need RAM), or alternatively into 3 GB of VRAM, it will be the workhorse of the desperate and hardware-poor.
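For context, an entry of this kind might look roughly like the sketch below. The field names follow existing entries in GPT4All's models3.json, but the concrete values here (order, md5sum, filename, filesize, requires, quant, type, url, promptTemplate) are illustrative placeholders rather than the values from this PR; the systemPrompt is the one discussed under Personal Impression below.

```json
{
  "order": "<position in the model list>",
  "md5sum": "<md5 checksum of the .gguf file>",
  "name": "Qwen2-1.5B-Instruct",
  "filename": "<name of the .gguf file, e.g. a q4_0 quantization>",
  "filesize": "<size of the .gguf file in bytes>",
  "requires": "<minimum GPT4All version>",
  "ramrequired": "4",
  "parameters": "1.5 billion",
  "quant": "q4_0",
  "type": "qwen2",
  "description": "<b>Tiny bilingual (English/Chinese) model with a 32768-token context</b>",
  "url": "<download URL for the .gguf file>",
  "promptTemplate": "<chat template with the usual %1 prompt placeholder>",
  "systemPrompt": "Below is an instruction that describes a task. Write a response that appropriately completes the request."
}
```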
Personal Impression:
I got the impression that the model is very task-focused, which is why I chose

Below is an instruction that describes a task. Write a response that appropriately completes the request.

as the system prompt. Since the model is relatively small, its responses may not seem very coherent or intelligent, but it works surprisingly well with GPT4All's LocalDocs feature; it is as if the model was made for RAG, and its long context adds to that. It will mainly appeal to English- and Chinese-speaking users.

Checklist before requesting a review