
Will there be support for models with custom architecture (not only Mistral or GPT based)? #25

Open
Nishant-kirito opened this issue Mar 19, 2024 · 6 comments

Comments

@Nishant-kirito

No description provided.

@vgel
Owner

vgel commented Mar 26, 2024

Is there a specific model you'd be interested in? Adding support for a model isn't very hard in theory, so if there's a relatively popular one you'd like supported, I can take a look. If you have a truly custom model (one that is still decoder-only, and not a MoE), you can patch it to expose a Mistral-like interface (e.g., by writing a wrapper class that exposes the appropriate config / layers properties) and it should just work; the per-model support only exists to handle HF's lack of consistency in the model interface.
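For reference, here's a minimal sketch of the kind of wrapper described above. The names `custom_model` and its `.blocks` attribute are hypothetical stand-ins for whatever the custom architecture actually uses; the point is just to re-expose the decoder layers and config under the attribute names a Mistral-style HF model would have:

```python
import torch.nn as nn

class MistralLikeWrapper(nn.Module):
    """Hypothetical adapter: re-exposes a custom decoder-only model through
    the attribute names (`config`, `model.layers`) that Mistral-style HF
    models use, so per-layer hooks can be attached in the same way."""

    def __init__(self, custom_model):
        super().__init__()
        self.inner = custom_model
        # Re-expose the config (it should carry fields like
        # num_hidden_layers / hidden_size).
        self.config = custom_model.config
        # Re-expose the decoder blocks under a Mistral-like path.
        # `custom_model.blocks` is an assumed attribute name here.
        self.model = nn.Module()
        self.model.layers = custom_model.blocks

    def forward(self, *args, **kwargs):
        # Delegate the forward pass to the wrapped model unchanged.
        return self.inner(*args, **kwargs)
```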

@NanoCode012

NanoCode012 commented Mar 31, 2024

Hey @vgel, would it be possible to share how new models can be added?

I’m also interested in MoE models, which I saw you explicitly mention. What challenges are there in supporting them (for example, Mixtral)? I’ve tried running one with the current notebooks and unfortunately don't see much difference between responses.

@vgel
Owner

vgel commented Apr 6, 2024

@NanoCode012 the main issue with MoE models is that only a subset of experts are active for a given forward pass (by design). repeng runs a bunch of forward passes (~one per batch of examples), which means we mix activations from different experts and run PCA over them as if they all came from the same source.

The correct thing to do would be to move the extraction/control point down from after each transformer block to after each expert, then collect vectors by layer and expert, not just by layer, and apply them per-expert as well. Training datasets (and training time) would probably need to be larger to accommodate this. It would be nice to have, but I've been pretty busy, so I haven't been able to make it a priority.
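To make the per-expert extraction idea concrete, here is a rough sketch of the collection side only (not the PCA or the control application), assuming the Hugging Face Mixtral module layout of `model.model.layers[i].block_sparse_moe.experts[j]`:

```python
from collections import defaultdict

def collect_expert_activations(model):
    """Attach forward hooks to every expert MLP so activations are
    collected per (layer_idx, expert_idx) instead of per layer only.
    Assumes the HF Mixtral layout: layer.block_sparse_moe.experts."""
    activations = defaultdict(list)  # (layer_idx, expert_idx) -> [tensors]
    handles = []

    def make_hook(layer_idx, expert_idx):
        def hook(module, inputs, output):
            # Only the tokens routed to this expert pass through here,
            # so each expert's activations stay separated.
            activations[(layer_idx, expert_idx)].append(output.detach().cpu())
        return hook

    for layer_idx, layer in enumerate(model.model.layers):
        for expert_idx, expert in enumerate(layer.block_sparse_moe.experts):
            handles.append(
                expert.register_forward_hook(make_hook(layer_idx, expert_idx))
            )
    return activations, handles

# After running the forward passes over the training examples, remove the
# hooks (for h in handles: h.remove()) and run PCA separately on each
# (layer, expert) bucket rather than on one pooled set per layer.
```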

@ycros

ycros commented Apr 7, 2024

So how do you add support for more models? I tried a simple test where I took the experiments notebook and replaced mistralai/Mistral-7B-Instruct-v0.1 with mistralai/Mistral-7B-Instruct-v0.2, and running with v0.2 gives me garbage outputs. I would have thought they'd be similar enough that it should just work. What am I missing?

[screenshot]

@ndavidson19

@ycros Mistral-v0.2 uses a rope theta value of 1e6 and removes sliding-window attention; that should be easy to fix within the model config parameters.
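For anyone reproducing this, a hedged sketch of overriding those two config values at load time with vanilla transformers follows (do check the checkpoint's own config first, since v0.2 may already ship with these settings):

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Mistral-v0.2 uses rope_theta=1e6 and no sliding-window attention.
# Both are standard MistralConfig fields, so they can be overridden
# at load time if the notebook's defaults don't match.
config = AutoConfig.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
config.rope_theta = 1e6
config.sliding_window = None

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    config=config,
    torch_dtype="auto",
)
```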

@vgel I'm interested in getting this working for the Phi-2 architecture. I might take a stab at it, as this seems like an extremely powerful technique for anti-jailbreaking.

@davniko

davniko commented Jun 5, 2024

I've been playing around with this lib and got it working with MoEs. Training the vectors is quite slow compared to dense models when training on the full all_truncated_outputs.json, and the code probably needs some refactoring/optimization.

This is using a Mixtral model (the dolphin finetune):
[screenshot]

MoEs seem to handle larger coefficients better than other models in my brief early testing. Curiously, I could also get some decent results when training the happy vector with a dataset of just 18 example pairs:
[screenshot]

I haven't tested it extensively yet, so I don't know how robust or reliable it is, but I'll push the code and an example notebook up on my fork after cleaning it up a bit, for those who might be interested (and in case @vgel is not already working on this behind the scenes).
