Jinja change broke my sideloaded models (TheBloke/OpenHermes-2.5-Mistral-7B-GGUF) Have Fallback solution for models without jinja template #3287
Here is our plan:
Almost all the problems encountered so far are the result of bugs or missing features in a new third-party dependency, jinja2cpp. The change to chat templates, and trying to work with the chat templates that are included in the GGUF, should in theory result in GPT4All working with more models that are sideloaded or downloaded from Hugging Face. Unfortunately, because of a few problems in jinja2cpp, that isn't the case yet, but we're working to fix and mitigate them. The intention here was not to break sideloaded models but to make them more likely to work out of the box.
Is it possible to support both the old behavior and the new one, so we could switch the template engine per model?
Original chat template in tokenizer_config.json of https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B:

```
{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}
```

Out of the box chat template in GPT4All 3.6.1 of https://huggingface.co/TheBloke/OpenHermes-2.5-Mistral-7B-GGUF: none (the quant does not contain a chat template; see below).

Working chat template in GPT4All 3.6.1 for TheBloke/OpenHermes-2.5-Mistral-7B-GGUF:

```
{%- for message in messages %}
{{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- endif %}
```

Out of the box working chat template in GPT4All 3.6.1 of https://huggingface.co/mradermacher/OpenHermes-2.5-Mistral-7B-i1-GGUF:

```
{%- for message in messages %}
{{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- endif %}
```

Basically, the problem here is that TheBloke's quants from one year ago do not contain a chat template at all. TheBloke's quants were created on 2 November 2023, but the chat template was added to teknium's repository on 3 November 2023 (this can be seen in the commit history). TheBloke was simply too fast at quantizing and didn't update the quant afterwards. For this reason, using newer quants (e.g. from mradermacher) is recommended.
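To illustrate what the working template above actually produces, here is a minimal sketch rendering it with the reference jinja2 Python package. Note that GPT4All itself uses jinja2cpp, not jinja2; this only shows the expected ChatML output, and the message content is an invented example.

```python
# Render the whitespace-trimmed ChatML template with the reference jinja2
# package to show the prompt it produces. (GPT4All uses jinja2cpp internally;
# this is purely an illustration of the expected output.)
from jinja2 import Template

CHATML_TEMPLATE = (
    "{%- for message in messages %}\n"
    "{{- '<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>' + '\\n' }}\n"
    "{%- endfor %}\n"
    "{%- if add_generation_prompt %}\n"
    "{{- '<|im_start|>assistant\\n' }}\n"
    "{%- endif %}"
)

messages = [{"role": "user", "content": "Hello!"}]
rendered = Template(CHATML_TEMPLATE).render(
    messages=messages, add_generation_prompt=True
)
print(rendered)
# <|im_start|>user
# Hello!<|im_end|>
# <|im_start|>assistant
```

The `{%-` / `{{-` markers strip the whitespace to their left, so the multi-line template yields the same compact prompt as teknium's original one-liner.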
If there are any other models you are using that are still broken with GPT4All 3.6.1, please post links to the models here.
Closing for now to reduce the number of open issues. If you have other models, feel free to reopen.
Actually, re-opening, as this is a perfect example of a case where GPT4All should have a fallback solution for models without a chat template.
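Such a fallback could be as simple as substituting a generic ChatML template when the GGUF metadata carries none. A minimal sketch, assuming the standard GGUF metadata key; the function name and dict-based metadata are illustrative, not GPT4All's actual code:

```python
# Hypothetical fallback: prefer the template embedded in the GGUF, otherwise
# use a generic ChatML default. Names here are illustrative, not GPT4All's API.
CHATML_FALLBACK = (
    "{%- for message in messages %}\n"
    "{{- '<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>' + '\\n' }}\n"
    "{%- endfor %}\n"
    "{%- if add_generation_prompt %}\n"
    "{{- '<|im_start|>assistant\\n' }}\n"
    "{%- endif %}"
)

def choose_chat_template(gguf_metadata: dict) -> str:
    """Return the model's own chat template if present, else the ChatML fallback."""
    # "tokenizer.chat_template" is the GGUF key llama.cpp uses for embedded templates.
    template = gguf_metadata.get("tokenizer.chat_template")
    return template if template else CHATML_FALLBACK

# A quant without a template (like TheBloke's) would get the fallback:
print(choose_chat_template({}) == CHATML_FALLBACK)  # True
```

With something like this, old quants that predate embedded templates would still load with sane ChatML defaults instead of failing outright.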
Metaissue: #3340
So, of the 12 models I use regularly, none work as of the last update. They all worked wonderfully under 3.4.2, and all are broken under 3.5.1.
Is every non-Nomic model being rendered useless and inoperable going to be addressed as a FLAW, or is it a new feature that I am supposed to be happy and joyous about at all times?
Basically, is there a plan to go back to some kind of jinja-based default prompt that just works with any sideloaded model? Lawyer talk and nanny gaslighting aside, the expected real-world behavior was that sideloaded models just worked once put into the model folder. It was compatible with many, many models, in any quant we wanted from 4 to 16 bits, for almost any model. Thousands, in fact.
That was the functionality up until 3.5, AFTER ALL. Truth be told, 99% of all Hugging Face models just worked out of the box with very little or NO fiddling. On the upside, will all those THOUSANDS of LLM models on Hugging Face that made your product popular, because they sideloaded EASILY, have to stop advertising as working with your product?
TheBloke's models are not compatible with the software all of a sudden?
TheBloke/OpenHermes-2.5-Mistral-7B-GGUF (purely a random example among thousands of others broken by your UPDATES): https://huggingface.co/TheBloke/OpenHermes-2.5-Mistral-7B-GGUF
I ran the script ... and it won't just find and install the correct template?
Nope. This is a lot of lost productivity.
TheBloke and folks like him ... MADE YOU ...
Those sideloaded models are why many folks use you. We ourselves only use this because it gave us the EASE and FREEDOM to use any model WE CHOSE on our machine, under OUR TERMS. 99 out of 100 AI models across 5 or 6 architectures used to work perfectly with the defaults, out of the box.
If that goes, your software gets uninstalled. I don't use your curated models daily at all, so whether they work matters not to me in the least. My freedom to use the model I want, like I could last week, does.