MAJOR UPDATE: Remove CCS, VINC #292

AlexTMallen · 2023-09-21T22:13:40Z

This PR does several major things to the repository:

- Removes support for CCS and VINC, and removes the use of contrast pairs.
- Uses the InferenceServer class for inference, enabling FSDP (--fsdp) inference with much larger models.
  - The model only needs to be loaded once, and is not loaded at all if the dataset is cached.
- Supports only LR probes and LM predictions.
- LM predictions are taken in response to statement + suffix, where statement is the statement we're extracting hidden states from and suffix is a piece of text like "\n\nIs the above statement true or false?" that's optionally specified in the yaml template file.
- Optionally stores logprobs with --save_logprobs.
- Supports non-balanced datasets with --balance False.
- Adds tests for:
  - The pipeline that turns the raw HF dataset into the input_ids dataset passed to InferenceServer
  - InferenceServer
- Adds a default template called "_default", which uses the "statement" column as-is.
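
To make the statement + suffix convention concrete, here is a minimal sketch, not the code in this PR. The helper names (`build_prompt`, `lm_log_odds`) are hypothetical, and the scoring rule (log-probability of a " true" token minus that of a " false" token) is an assumption for illustration; the description above does not say how LM predictions are read out.

```python
from __future__ import annotations

import math

# Example suffix quoted in the PR description; in the repo it would come
# from the yaml template file and is optional.
EXAMPLE_SUFFIX = "\n\nIs the above statement true or false?"


def build_prompt(statement: str, suffix: str | None = None) -> str:
    """Return the text the LM is asked to respond to.

    Hidden states are extracted from `statement`; the optional `suffix`
    is only appended when asking the LM for a prediction.
    """
    return statement + (suffix or "")


def lm_log_odds(
    next_token_logprobs: dict[str, float],
    pos_token: str = " true",   # assumed answer token, hypothetical
    neg_token: str = " false",  # assumed answer token, hypothetical
) -> float:
    """Hypothetical scoring rule: log P(pos_token) - log P(neg_token)."""
    return next_token_logprobs[pos_token] - next_token_logprobs[neg_token]


prompt = build_prompt("The Eiffel Tower is in Paris.", EXAMPLE_SUFFIX)
score = lm_log_odds({" true": math.log(0.6), " false": math.log(0.2)})
```

A positive `score` would mean the LM favors "true" under this assumed rule. With no suffix, `build_prompt` returns the statement unchanged, which is roughly what the "_default" template is described as doing.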

elk/promptsource/templates/_default/templates.yaml

norabelrose

almost good to go

norabelrose · 2023-09-28T20:35:41Z

elk/training/classifier.py

@@ -185,7 +190,12 @@ def fit_cv(

    @classmethod
    def inlp(


I think we should probably just get rid of this. If we really want to explore the half-space of admissible predictors we should probably try the Conical Knowledge Decomposition thing.

elk/training/sweep.py

norabelrose · 2023-09-28T20:38:12Z

elk/training/train.py

+    """Whether to use LEACE to erase the paraphrase dimensions before training the
+    classifier."""
+
+    max_inlp_iter: int | None = None


let's drop this

elk/extraction/extraction.py

elk/extraction/generator.py

elk/promptsource/templates.py

elk/plotting/visualize.py

AlexTMallen · 2023-09-29T00:40:34Z

elk/training/classifier.py

@@ -185,7 +190,12 @@ def fit_cv(

    @classmethod
    def inlp(


I have actually been using this occasionally, basically it just gives me a higher recall way of searching for good probes. Do you still think we should get rid of it?
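
For context on the exchange above, INLP (iterative nullspace projection) can be sketched as: fit a linear probe, project its direction out of the data, and repeat, collecting one probe per round. The following is a minimal pure-Python sketch under stated assumptions, not the repository's `inlp` classmethod. It uses plain gradient-descent logistic regression, and the names `fit_logreg` and `inlp`, the step count, and the learning rate are all hypothetical. The `max_iter` argument only loosely mirrors the `max_inlp_iter` option discussed in this PR.

```python
from __future__ import annotations

import math


def dot(u: list[float], v: list[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logreg(xs, ys, steps: int = 200, lr: float = 0.5):
    """Fit logistic regression by batch gradient descent from zero init."""
    dim, n = len(xs[0]), len(xs)
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        gw, gb = [0.0] * dim, 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(dot(w, x) + b) - y
            gb += err
            for i in range(dim):
                gw[i] += err * x[i]
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def inlp(xs, ys, max_iter: int | None = None, tol: float = 1e-6):
    """Return a list of unit-norm probe directions, one per INLP round."""
    xs = [list(x) for x in xs]  # work on a copy
    dim = len(xs[0])
    limit = dim if max_iter is None else min(max_iter, dim)
    probes = []
    for _ in range(limit):
        w, _bias = fit_logreg(xs, ys)
        norm = math.sqrt(dot(w, w))
        if norm < tol:  # nothing linearly decodable is left
            break
        d = [wi / norm for wi in w]
        probes.append(d)
        # Project every example onto the nullspace of the probe direction.
        xs = [[xi - dot(x, d) * di for xi, di in zip(x, d)] for x in xs]
    return probes


xs = [[2.0, 1.0], [1.5, 0.5], [1.0, 1.5],
      [-2.0, -1.0], [-1.5, -0.5], [-1.0, -1.5]]
ys = [1, 1, 1, 0, 0, 0]
probes = inlp(xs, ys)
```

Because each round trains on data with the earlier directions removed, the returned probes are mutually orthogonal, which is one way to read the "higher recall" remark: the search yields several distinct candidate probes rather than a single one.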

elk/training/sweep.py

norabelrose

let's merge this bad boy

AlexTMallen added 20 commits August 2, 2023 21:38

save hiddens to disk

a08d3d2

remove contrast pairs

e7d0e73

fix tests

ca09c52

add LEACE to supervised

8623ac8

add assertion for multi-dataset erasure

a0c1f30

add blank template for statements

3f63b82

mvp working for llama

6471c77

inference server working with ids

8ff22b2

refactor extraction to use InferenceServer

c8850a6

mvp with inference server

edaaf07

fix caching

1a04ee8

don't load model when using cache

3bc362e

add default template

d528fbd

maybe unsqueeze

fc8053e

gutted elk; updated tests

5c1f656

save logprobs

1730303

add balance and max_inlp_iter args

75ec336

extract lm predictions

83c7642

lm preds

98a8dea

add encodings test, cleanup

d48519d

AlexTMallen requested a review from norabelrose September 21, 2023 22:14

ignore type issue

83d81d9

AlexTMallen commented Sep 25, 2023

elk/promptsource/templates/_default/templates.yaml (outdated)

norabelrose requested changes Sep 28, 2023

AlexTMallen commented Sep 29, 2023

AlexTMallen added 2 commits September 29, 2023 02:01

revisions from Nora's feedback; move output_hidden_states to model_kwargs, fix answer token being appended, fix viz, fix tqdm propagation

ebf95b0

cleanup

22a804b

AlexTMallen requested a review from norabelrose September 29, 2023 02:16

AlexTMallen added 2 commits October 3, 2023 17:12

re-fix tests

d3586be

fix save_logprobs

abbec97

AlexTMallen and others added 6 commits October 5, 2023 23:03

fix layer sorting in logprobs.pt

81b1ba3

mark gpu tests

b74388f

test logprobs

801d698

remove buggy viz test

71d4e8c

Merge branch 'main' into no-contrast-pairs

8b4f82b

[pre-commit.ci] auto fixes from pre-commit.com hooks

c01634d

for more information, see https://pre-commit.ci

norabelrose approved these changes Oct 23, 2023

norabelrose merged commit 70a3290 into main Oct 23, 2023
4 checks passed

norabelrose deleted the no-contrast-pairs branch October 23, 2023 01:18

norabelrose mentioned this pull request Oct 23, 2023

Add option to save probabilities #289

Closed
