
Jinja change broke my sideloaded models (TheBloke/OpenHermes-2.5-Mistral-7B-GGUF) Have Fallback solution for models without jinja template #3287

Open
UserHub1973 opened this issue Dec 13, 2024 · 7 comments
Labels
bug-unconfirmed chat gpt4all-chat issues

Comments

@UserHub1973

UserHub1973 commented Dec 13, 2024

Metaissue: #3340

Out of the 12 models I use regularly, none of them work as of the last update. They all worked wonderfully under 3.4.2, and all are broken under 3.5.1.

Is every non-Nomic model being rendered useless and inoperable going to be addressed as a flaw, or is it a new feature that I am supposed to be happy and joyous about at all times?

Basically, is there a plan to go back to some kind of Jinja-based default prompt that just works with any sideloaded model? Lawyer talk and nanny gaslighting aside, the expected real-world behavior was that sideloaded models just worked once put into the model folder: compatible with many, many models at any quant we wanted, from 4-bit to 16-bit, across almost any model family. Thousands, in fact.

That was the functionality up until 3.5, after all. Truth be told, 99% of all Hugging Face models just worked out of the box with very little or no fiddling. Will all those thousands of LLM models on Hugging Face that made your product popular, because they sideloaded easily, have to stop advertising as working with your product?

TheBloke's models are not compatible with the software all of a sudden?

TheBloke/OpenHermes-2.5-Mistral-7B-GGUF (purely a random example among thousands of others broken by your updates): https://huggingface.co/TheBloke/OpenHermes-2.5-Mistral-7B-GGUF
I ran the script, and it won't just find and install the correct template?

Nope. This is a lot of lost productivity.

TheBloke and folks like him ... MADE YOU ...

Those sideloaded models are why many folks use you. We only use this software because it gave us the ease and freedom to use any model we chose, on our machine, on our terms. 99 out of 100 AI models across 5 or 6 architectures used to work perfectly with the defaults out of the box.

If that goes, your software gets uninstalled. I don't use your curated models daily at all, so whether they work matters not to me in the least. My freedom to use the model I want, like I could last week, does.

@UserHub1973 UserHub1973 added bug-unconfirmed chat gpt4all-chat issues labels Dec 13, 2024
@manyoso
Collaborator

manyoso commented Dec 13, 2024

Here is our plan:

  1. A new release today with three PRs, one of which will help communicate to users that models downloaded via Hugging Face from within GPT4All are not guaranteed to work: some mild setting of expectations and making things clear.

  2. A new PR for our documentation on chat templates that includes a known list of templates that work with different families of models, including the Mistral ones that a lot of people are having problems with.

  3. Communication to our users about this documentation, and a roadmap item for a heuristic that detects bad templates and swaps them with known-good ones.

  4. Opening up lines of communication with the jinja2cpp authors about the bugs we're encountering, to hopefully make this dependency handle a broader swath of the chat templates encountered in the wild.

  5. Making it clear, as new features are released, that they are a direct result of the new chat templates.
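The template-swapping heuristic in item 3 could look roughly like the following. This is a hypothetical sketch, not GPT4All's actual implementation: the names `choose_template` and `looks_broken`, and the specific "unsupported construct" check, are illustrative assumptions; the fallback string is the ChatML template discussed later in this thread.

```python
# Hypothetical sketch of a "detect bad templates and swap in a known-good one"
# heuristic. All names here are illustrative, not GPT4All's real API.

# Known-good ChatML template (the one that works for OpenHermes-style models).
CHATML_FALLBACK = (
    "{%- for message in messages %}\n"
    "    {{- '<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>' + '\\n' }}\n"
    "{%- endfor %}\n"
    "{%- if add_generation_prompt %}\n"
    "    {{- '<|im_start|>assistant\\n' }}\n"
    "{%- endif %}\n"
)

def looks_broken(template: str) -> bool:
    # Crude placeholder check: flag constructs the renderer is assumed
    # not to handle. The token list is purely illustrative.
    unsupported = ("raise_exception", "strftime_now")
    return any(token in template for token in unsupported)

def choose_template(gguf_template):
    # Fall back when the GGUF embeds no template at all, or when the
    # embedded one uses constructs the renderer cannot handle.
    if not gguf_template or looks_broken(gguf_template):
        return CHATML_FALLBACK
    return gguf_template
```

A real heuristic would presumably try to render the template first and only swap on failure; the string check above just illustrates the shape of the decision.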

@manyoso
Collaborator

manyoso commented Dec 13, 2024

Almost all the problems encountered so far are a result of bugs or missing features in a new third-party dependency: jinja2cpp.

The change to chat templates, and trying to work with the chat templates included in the GGUF, should in theory result in GPT4All working with more models that are sideloaded or downloaded from Hugging Face. Unfortunately, because of a few problems in jinja2cpp, that isn't yet the case, but we're working to fix and mitigate them. The intention here was not to break sideloaded models but to make them more likely to work out of the box.

@manyoso manyoso changed the title Jinja broke my models .... I don't care about your currated models. Jinja change broke my sideloaded models Dec 13, 2024
@manyoso manyoso marked this as a duplicate of #3281 Dec 13, 2024
@nomic-ai nomic-ai deleted a comment from UserHub1973 Dec 13, 2024
@nomic-ai nomic-ai deleted a comment from UserHub1973 Dec 13, 2024
@nomic-ai nomic-ai deleted a comment from UserHub1973 Dec 13, 2024
@nomic-ai nomic-ai deleted a comment from UserHub1973 Dec 13, 2024
@thingsiplay

Is it possible to support both the old behavior and the new one, so we could switch the template engine per model?

@nomic-ai nomic-ai deleted a comment from UserHub1973 Dec 20, 2024
@ThiloteE
Collaborator

ThiloteE commented Dec 20, 2024

Original chat template in tokenizer_config.json of https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B

{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}

Out of the box chat template in GPT4All 3.6.1 of https://huggingface.co/TheBloke/OpenHermes-2.5-Mistral-7B-GGUF : (none; the GGUF does not embed a chat template)

Working chat template in GPT4All 3.6.1 for TheBloke/OpenHermes-2.5-Mistral-7B-GGUF :

{%- for message in messages %}
    {{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n' }}
{%- endif %}
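For reference, here is what the working template above produces for a short exchange. This is a plain Python re-implementation of the ChatML layout for illustration only; GPT4All itself renders the Jinja template via jinja2cpp, and `render_chatml` is a name invented for this sketch.

```python
# Plain-Python equivalent of the ChatML template above, for illustration.
def render_chatml(messages, add_generation_prompt=True):
    out = ""
    for m in messages:
        # Mirrors: '<|im_start|>' + role + '\n' + content + '<|im_end|>' + '\n'
        out += "<|im_start|>" + m["role"] + "\n" + m["content"] + "<|im_end|>" + "\n"
    if add_generation_prompt:
        # Mirrors the {%- if add_generation_prompt %} branch.
        out += "<|im_start|>assistant\n"
    return out

prompt = render_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

The output is the familiar ChatML framing: each message wrapped in `<|im_start|>`/`<|im_end|>` markers, with a trailing `<|im_start|>assistant` turn for the model to complete.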

Out of the box working chat template in GPT4All 3.6.1 of https://huggingface.co/mradermacher/OpenHermes-2.5-Mistral-7B-i1-GGUF

{%- for message in messages %}
    {{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n' }}
{%- endif %}

Basically, the problem here is that TheBloke's quants from one year ago do not include a chat template at all. TheBloke's quants were created on 2 November 2023, but the chat template was added to teknium's repository on 3 November 2023 (this can be seen in the commit history). TheBloke was simply too fast at quantizing and didn't update the quant afterwards. For this reason, using newer quants (e.g. from mradermacher) is recommended.
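The missing-template situation can be checked in the upstream repo directly: a `chat_template` key only appears in `tokenizer_config.json` once it has been added. A minimal sketch, assuming the config has already been fetched as JSON text (`has_chat_template` is an invented helper; GGUF files store the template under a different metadata key, `tokenizer.chat_template`):

```python
import json

def has_chat_template(tokenizer_config_text: str) -> bool:
    # True if the tokenizer config declares a non-empty chat_template.
    cfg = json.loads(tokenizer_config_text)
    return bool(cfg.get("chat_template"))

# A config snapshot predating the template commit has no such key:
print(has_chat_template('{"model_max_length": 32768}'))  # → False

# After the 3 November 2023 commit, the key is present:
print(has_chat_template(
    '{"chat_template": "{% for m in messages %}...{% endfor %}"}'
))  # → True
```

A quant built from the earlier snapshot therefore carries no template into its GGUF metadata, which is exactly why a fallback template in GPT4All would help here.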

@ThiloteE ThiloteE changed the title Jinja change broke my sideloaded models Jinja change broke my sideloaded models (TheBloke/OpenHermes-2.5-Mistral-7B-GGUF) Dec 20, 2024
@ThiloteE
Collaborator

ThiloteE commented Dec 20, 2024

If there are any other models you are using that are still broken with GPT4All 3.6.1, please post links to the model here.

@ThiloteE
Collaborator

Closing for now to reduce the number of open issues. If you have other models, feel free to reopen.

@ThiloteE
Collaborator

Actually, re-opening, as this is a perfect example for a case where GPT4All should have a fallback solution for models without chat template.

@ThiloteE ThiloteE reopened this Dec 22, 2024
@ThiloteE ThiloteE changed the title Jinja change broke my sideloaded models (TheBloke/OpenHermes-2.5-Mistral-7B-GGUF) Jinja change broke my sideloaded models (TheBloke/OpenHermes-2.5-Mistral-7B-GGUF) Have Fallback solution for models without jinja template Dec 22, 2024