StackOverflow error when encoding a long text (len 1024*1024) with a large model's tokenizer #327

Open
YangQiangli opened this issue Jul 31, 2024 · 0 comments

When I pass a long text to a large model's tokenizer, specifically when the len of the text is 1024*1024, encoding fails with a StackOverflow error:

thread '<unnamed>' panicked at src/lib.rs:227:33:
called `Result::unwrap()` on an `Err` value: RuntimeError(StackOverflow)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "/home/yang/teststack_1m.py", line 37, in <module>
    encoded_tokens = your_object.encode_request(request_id=request_id, prompt=prompt)
  File "/home/yang/teststack_1m.py", line 19, in encode_request
    prompt_token_ids = self.tokenizer.encode(prompt, add_special_tokens=True)
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2600, in encode
    encoded_inputs = self.encode_plus(
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3008, in encode_plus
    return self._encode_plus(
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 719, in _encode_plus
    first_ids = get_input_ids(text)
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 686, in get_input_ids
    tokens = self.tokenize(text, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 617, in tokenize
    tokenized_text.extend(self._tokenize(token))
  File "/root/.cache/huggingface/modules/transformers_modules/tokenization_chatglm.py", line 88, in _tokenize
    ids = self.tokenizer.encode(text)
  File "/usr/local/lib/python3.10/site-packages/tiktoken/core.py", line 124, in encode
    return self._core_bpe.encode(text, allowed_special)
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: RuntimeError(StackOverflow)
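
A possible workaround, offered as a sketch and not as a confirmed fix: since the panic is raised inside `self._core_bpe.encode` and only shows up for very long input, one option is to split the prompt into smaller pieces and encode each piece separately. The helper names `split_text` and `encode_in_chunks` and the `max_chars` default below are hypothetical, chosen only for illustration. Note the assumption that cutting after whitespace is acceptable: tokens near a cut may differ from what a single `encode` call over the whole text would produce.

```python
from typing import Callable, Iterator, List


def split_text(text: str, max_chars: int = 100_000) -> Iterator[str]:
    """Yield consecutive pieces of `text`, each at most `max_chars` long.

    Hypothetical helper: prefers to cut just after a space or newline so
    that words are not split across two pieces.
    """
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Last whitespace inside the window, or -1 if there is none.
            cut = max(text.rfind(" ", start, end), text.rfind("\n", start, end))
            if cut > start:
                end = cut + 1  # keep the whitespace with the left piece
        yield text[start:end]
        start = end


def encode_in_chunks(
    encode: Callable[[str], List[int]],
    text: str,
    max_chars: int = 100_000,
) -> List[int]:
    """Encode `text` piece by piece and concatenate the token ids.

    `encode` is any callable mapping a string to a list of token ids.
    """
    ids: List[int] = []
    for piece in split_text(text, max_chars):
        ids.extend(encode(piece))
    return ids
```

With the tokenizer from the traceback, this might be called as `encode_in_chunks(lambda s: self.tokenizer.encode(s, add_special_tokens=False), prompt)`, with any special tokens added once around the combined result. Whether a given `max_chars` is small enough to avoid the overflow is untested here and would need to be checked against the actual model tokenizer.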