Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update tests for transformers 4.36 #10858

Merged
merged 63 commits into from
May 24, 2024

Conversation

jenniew
Copy link
Contributor

@jenniew jenniew commented Apr 23, 2024

Update tests for transformers 4.36.2

Related issue: https://github.com/analytics-zoo/nano/issues/1289

Move mistral related test to main tests as mistral is working under transformers 4.36. Removed 4.34 tests

@jenniew
Copy link
Contributor Author

jenniew commented May 10, 2024

Examples are also verified. https://github.com/analytics-zoo/nano/issues/1120

workflow_dispatch:
workflow_call:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
# llm-cpp-build: # please uncomment it for PR tests
# uses: ./.github/workflows/llm-binary-build.yml
llm-cpp-build: # please uncomment it for PR tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert these changes before PR merge

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

Copy link
Contributor

@Oscilloscope98 Oscilloscope98 May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be other places changed for PR tests that should be reverted

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted

- 'Qwen/Qwen-7B-Chat'
- 'BAAI/AquilaChat-7B'
- 'baichuan-inc/Baichuan2-7B-Chat'
- 'baichuan-inc/Baichuan2-13B-Chat-4bit'
- 'bigscience/bloomz-7b1'
- 'fnlp/moss-moon-003-sft-4bit'
# - 'fnlp/moss-moon-003-sft-4bit' # moss-moon-003-sft cannot work on transformers 4.34+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this supposed to be fixed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the tokenizer issue of moss-moon-003-sft model, because moss-moon-003-sft haven't fix the tokenizer issue to compatible with transformers 4.34+, we can not fix it. See the issue: https://github.com/analytics-zoo/nano/issues/1145

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then shall we keep transformers 4.31 in the test as well for this model?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know whether we need to keep such test for transformers 4.31 since ipex-llm would be updated to support transformers 4.36. @jason-dai Do we need to keep tests which only works on transformers 4.31?

@@ -75,7 +75,7 @@ jobs:
echo "runner=$runner" >> $GITHUB_OUTPUT

llm-whisper-evaluation:
# if: ${{ github.event.schedule || github.event.inputs.artifact == 'llm-whisper-evaluation' || github.event.inputs.artifact == 'all' }} # please comment it for PR tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert this change to make this PR more clean.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted

@hkvision
Copy link
Contributor

http://10.239.44.136:8888/pr_perf_gpu/ Need to check the performance before the merge.

workflow_dispatch:
workflow_call:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
# llm-cpp-build: # please uncomment it for PR tests
# uses: ./.github/workflows/llm-binary-build.yml
llm-cpp-build: # please uncomment it for PR tests
Copy link
Contributor

@Oscilloscope98 Oscilloscope98 May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be other places changed for PR tests that should be reverted

.github/workflows/llm_performance_tests.yml Outdated Show resolved Hide resolved
@@ -18,10 +18,6 @@ export OMP_NUM_THREADS=$THREAD_NUM
python -m pytest -s ${LLM_INFERENCE_TEST_DIR}/test_transformers_api.py -v
python -m pytest -s ${LLM_INFERENCE_TEST_DIR}/test_optimize_model_api.py -v

python -m pip install transformers==4.34.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the testing for Transformers 4.34 is no longer necessary?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be, mistral is put in transformers 4.36, which replaces 4.34

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the testing for Transformers 4.34 is no longer necessary, since I move mistral test into general test cases.


llm-performance-test-on-arc:
if: ${{ github.event.schedule || github.event_name == 'workflow_dispatch' || github.event.inputs.artifact == 'llm-performance-test-on-arc' || github.event.inputs.artifact == 'all' }} # please comment it for PR tests
# needs: llm-cpp-build # please uncomment it for PR tests
#if: ${{ github.event.schedule || github.event_name == 'workflow_dispatch' || github.event.inputs.artifact == 'llm-performance-test-on-arc' || github.event.inputs.artifact == 'all' }} # please comment it for PR tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For arc/core/spr perf test, except for the code that needs to be reverted, other changes LGTM. For igpu test, please refer to yuwen’s comment.

self.assertTrue(res)

def test_transformers_auto_model_for_causal_lm_int4(self):
model_path = os.environ.get('ORIGINAL_REPLIT_CODE_PATH')
model_path = os.environ.get('ORIGINAL_CODESHELL_7B_PATH')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replit-code-v1-3b cannot run with transformers 4.36, so another code generation model is used here instead :)

@@ -99,7 +99,7 @@ jobs:
echo "LLAMA_ORIGIN_PATH=${ORIGIN_DIR}/llama-7b-hf" >> "$GITHUB_ENV"
echo "BLOOM_ORIGIN_PATH=${ORIGIN_DIR}/bloom-7b1" >> "$GITHUB_ENV"
echo "ORIGINAL_CHATGLM2_6B_PATH=${ORIGIN_DIR}/chatglm2-6b" >> "$GITHUB_ENV"
echo "ORIGINAL_REPLIT_CODE_PATH=${ORIGIN_DIR}/replit-code-v1-3b" >> "$GITHUB_ENV"
echo "ORIGINAL_CODESHELL_7B_PATH=${ORIGIN_DIR}/CodeShell-7B-Chat" >> "$GITHUB_ENV"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for this change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replit-code-v1-3b cannot run with transformers 4.36, so another code generation model is used here instead :)

@jenniew
Copy link
Contributor Author

jenniew commented May 22, 2024

http://10.239.44.136:8888/pr_perf_gpu/ Need to check the performance before the merge.

@hkvision @lalalapotter According to the performance report, seems mpt-7b-chat has some performance gap.
For mpt-7b-chat, we upgrade the model since the old model cannot be compatible with transformers 4.36, but the new model implementation has been changed. So we may need to update our optimization for the new model.

Copy link
Contributor

@hkvision hkvision left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Arc related changes LGTM.

paths:
- ".github/workflows/llm_performance_tests.yml"
- "python/llm/test/benchmark/**"
- "python/llm/dev/benchmark/all-in-one/**"
Copy link
Contributor

@Oscilloscope98 Oscilloscope98 May 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should revert the change in this file which was for PR tests; any only leave necessary changes :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted

Copy link
Contributor

@liu-shaojun liu-shaojun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hkvision hkvision merged commit 0a06a6e into intel-analytics:main May 24, 2024
50 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants