I got the following error when I run vllm_variable_size or naive_hf_variable_size.
1536 4 1000
~/AIML/llm-continuous-batching-benchmarks-winston ~/AIML/llm-continuous-batching-benchmarks-winston/benchmark_configs
Traceback (most recent call last):
File "/home/ubuntu/AIML/llm-continuous-batching-benchmarks-winston/./benchmark_throughput.py", line 597, in
main()
File "/home/ubuntu/AIML/llm-continuous-batching-benchmarks-winston/./benchmark_throughput.py", line 539, in main
prompts, prompt_lens = gen_random_prompts_return_lens(
File "/home/ubuntu/AIML/llm-continuous-batching-benchmarks-winston/./benchmark_throughput.py", line 479, in gen_random_prompts_return_lens
assert len(
AssertionError: Expected prompt to contain exactly 512 tokens, got len(encoded)=350
My environment:
Machine: g5.4xlarge
Model: meta-llama/Llama-2-7b-chat-hf
Any idea what is causing this error?
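
For context, here is a minimal sketch of what I assume gen_random_prompts_return_lens is doing, based on the assert in the traceback (the internal generation strategy is my guess, not the actual benchmark code): sample random token IDs, decode them to a prompt string, then re-encode and assert the length matches. With the Llama-2 tokenizer this round trip is not exact, which would explain getting 350 tokens instead of 512:

```python
# Minimal sketch (assumption, not the actual benchmark code) of a prompt
# generator that samples random token IDs, decodes them, then re-encodes
# and checks that the token count round-trips exactly.
import random
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
prompt_len = 512

# Sample random token IDs from the vocabulary (hypothetical strategy).
token_ids = random.sample(range(tokenizer.vocab_size), prompt_len)
prompt = tokenizer.decode(token_ids)

# Decoding and re-encoding is not an exact round trip: the tokenizer can
# merge or normalize pieces, so the re-encoded prompt can come out shorter
# (e.g. 350 tokens instead of 512), which would trip the assert.
encoded = tokenizer.encode(prompt)
print(len(encoded))
```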