Hi, thanks for your work!
I have a question: the fixed policy templates are quite long, which can significantly slow down model inference. Have you considered any optimisations?
For example, is it possible to store the KV cache? For LlamaGuard, prefix KV caching can be used because the policy is a fixed prefix. (This may not be possible with the LLaVA architecture, where the prefix is an image rather than a fixed template, and the image tokens are not fixed. I was just wondering what you were thinking.)
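To make the idea concrete, here is a minimal sketch of prefix KV caching for a text-only guard model, following the cache-reuse pattern from the Hugging Face `transformers` docs. This is not LlavaGuard's actual inference path; the model id, policy text, and user message are placeholders, and it assumes a recent `transformers` version whose `generate()` accepts `past_key_values`.

```python
# Sketch: cache the KV states of a long fixed policy prefix once,
# then reuse them for every request so only the new tokens are
# processed. All names below are placeholders, not from this repo.
import copy
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/LlamaGuard-7b"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# 1) Run the long, fixed policy template once and keep its KV cache.
policy_prefix = "<long fixed policy template>"  # placeholder
prefix_inputs = tokenizer(policy_prefix, return_tensors="pt").to(model.device)
with torch.no_grad():
    prefix_cache = model(**prefix_inputs, use_cache=True).past_key_values

# 2) Per request: pass the full prompt plus a copy of the cached
#    prefix; generate() mutates the cache, hence the deepcopy.
#    Caveat: tokenizing prefix + message must reproduce the prefix
#    tokens exactly at the boundary for the cache to be valid.
user_message = "<content to classify>"  # placeholder
full_inputs = tokenizer(policy_prefix + user_message, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(
        **full_inputs,
        past_key_values=copy.deepcopy(prefix_cache),
        max_new_tokens=32,
    )
print(tokenizer.decode(output[0, full_inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```

For LLaVA-style models this only helps if the static policy text actually precedes the image tokens in the prompt: everything up to the first non-fixed token (e.g. the image) is cacheable, so the prompt would need to be reordered to put the policy first, which is the point discussed below.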
Thank you for the hint. Initially, we also thought about stating the policy within our system prompt. Unfortunately, the conversation templates are implemented relatively statically in LLaVA's training code. So far, we haven't had the chance to implement this, but the idea is very sensible, and we will probably include it in the next iteration of LlavaGuard.