
Will there be support for models with custom architecture (not only Mistral or GPT based)? #25

Open
Nishant-kirito opened this issue Mar 19, 2024 · 6 comments

Comments

@Nishant-kirito

No description provided.

@vgel
Owner

vgel commented Mar 26, 2024

Is there a specific model you'd be interested in? Adding support for a model isn't very hard in theory, so if there's a relatively popular one you'd like supported, I can take a look. If you have a truly custom model (one that is still decoder-only, and not a MoE), you can patch it to expose a Mistral-like interface (e.g., by writing a wrapper class that exposes the appropriate config / layers properties) and it should just work; the per-model support only exists to handle HF's lack of consistency in the model interface.
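For reference, here's a minimal sketch of the kind of wrapper described above. The names `custom_model` and its `.blocks` attribute are hypothetical stand-ins for whatever the custom architecture actually uses; the point is just to re-expose the decoder layers and config under the attribute names a Mistral-style HF model would have:

```python
import torch.nn as nn

class MistralLikeWrapper(nn.Module):
    """Hypothetical adapter: re-exposes a custom decoder-only model through
    the attribute names (`config`, `model.layers`) that Mistral-style HF
    models use, so per-layer hooks can be attached in the same way."""

    def __init__(self, custom_model):
        super().__init__()
        self.inner = custom_model
        # Re-expose the config (it should carry fields like
        # num_hidden_layers / hidden_size).
        self.config = custom_model.config
        # Re-expose the decoder blocks under a Mistral-like path.
        # `custom_model.blocks` is an assumed attribute name here.
        self.model = nn.Module()
        self.model.layers = custom_model.blocks

    def forward(self, *args, **kwargs):
        # Delegate the forward pass to the wrapped model unchanged.
        return self.inner(*args, **kwargs)
```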

@NanoCode012

NanoCode012 commented Mar 31, 2024

Hey @vgel, would it be possible to share how new models can be added?

I’m also interested in MoE models, which I saw you explicitly mention. What challenges are there in supporting them (for example, Mixtral)? I’ve tried running one with the current notebooks and unfortunately don't see much difference between responses.

@vgel
Owner

vgel commented Apr 6, 2024

@NanoCode012 the main issue with MoE models is that only a subset of experts are active for a given forward pass (by design). repeng runs a bunch of forward passes (~one per batch of examples), which means we mix activations from different experts and run PCA over them as if they all came from the same source.

The correct thing to do would be to move the extraction/control point down from after each transformer block to after each expert, then collect vectors by layer and expert, not just by layer, and apply them per-expert as well. Training datasets (and training time) would probably need to be larger to accommodate this. It would be nice to have, but I've been pretty busy, so I haven't been able to make it a priority.
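To make the per-expert extraction idea concrete, here is a rough sketch of the collection side only (not the PCA or the control application), assuming the Hugging Face Mixtral module layout of `model.model.layers[i].block_sparse_moe.experts[j]`:

```python
from collections import defaultdict

def collect_expert_activations(model):
    """Attach forward hooks to every expert MLP so activations are
    collected per (layer_idx, expert_idx) instead of per layer only.
    Assumes the HF Mixtral layout: layer.block_sparse_moe.experts."""
    activations = defaultdict(list)  # (layer_idx, expert_idx) -> [tensors]
    handles = []

    def make_hook(layer_idx, expert_idx):
        def hook(module, inputs, output):
            # Only the tokens routed to this expert pass through here,
            # so each expert's activations stay separated.
            activations[(layer_idx, expert_idx)].append(output.detach().cpu())
        return hook

    for layer_idx, layer in enumerate(model.model.layers):
        for expert_idx, expert in enumerate(layer.block_sparse_moe.experts):
            handles.append(
                expert.register_forward_hook(make_hook(layer_idx, expert_idx))
            )
    return activations, handles

# After running the forward passes over the training examples, remove the
# hooks (for h in handles: h.remove()) and run PCA separately on each
# (layer, expert) bucket rather than on one pooled set per layer.
```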

@ycros

ycros commented Apr 7, 2024

So how do you add support for more models? I tried a simple test where I took the experiments notebook and replaced mistralai/Mistral-7B-Instruct-v0.1 with mistralai/Mistral-7B-Instruct-v0.2, and running with v0.2 gives me garbage outputs. I would have thought they'd be similar enough that it should just work. What am I missing?

[screenshot]

@ndavidson19

@ycros Mistral-v0.2 uses a rope theta value of 1e6 and removes sliding-window attention; that should be easy to fix within the model config parameters.
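For anyone reproducing this, a hedged sketch of overriding those two config values at load time with vanilla transformers follows (do check the checkpoint's own config first, since v0.2 may already ship with these settings):

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Mistral-v0.2 uses rope_theta=1e6 and no sliding-window attention.
# Both are standard MistralConfig fields, so they can be overridden
# at load time if the notebook's defaults don't match.
config = AutoConfig.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
config.rope_theta = 1e6
config.sliding_window = None

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    config=config,
    torch_dtype="auto",
)
```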

@vgel I'm interested in getting this working for the Phi-2 architecture. I might take a stab at it, as this seems like an extremely powerful technique for anti-jailbreaking.

@davniko

davniko commented Jun 5, 2024

I've been playing around with this lib and got it working with MoEs. Training the vectors is quite slow compared to dense models when training on the full all_truncated_outputs.json, and the code probably needs some refactoring/optimization.

This is using a Mixtral model (the dolphin finetune):
[screenshot]

MoEs seem to handle larger coefficients better than other models in my brief early testing. Curiously, I could also get some decent results when training the happy vector with a dataset of just 18 example pairs:
[screenshot]

I haven't tested it extensively yet, so I don't know how robust or reliable it is, but I'll push the code and an example notebook up on my fork after cleaning it up a bit, for those who might be interested (and in case @vgel is not already working on this behind the scenes).
