
I can correctly obtain inference results with this command: “vila-infer \ --model-path /data/workspace/zhaoyong/model/weight_files/VILA1.5-3B \ --conv-mode vicuna_v1 \ --text "Please describe the video." \ --media /data/workspace/zhaoyong/data/安全帽.mp4”, but I get an error with this command: “python -W ignore server.py \ --port 8000 \ --model_path /data/workspace/zhaoyong/model/weight_files/VILA1.5-3B \ --conv_mode vicuna_v1”. Why is that, and how should I solve it? #163

Open
HAOYON-666 opened this issue Dec 18, 2024 · 0 comments

@HAOYON-666

[2024-12-18 17:36:31,349] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
INFO: Started server process [3865832]
INFO: Waiting for application startup.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████| 2/2 [00:01<00:00, 1.14it/s]
We've detected an older driver with an RTX 4000 series GPU. These drivers have issues with P2P. This can affect the multi-gpu inference when using accelerate device_map. Please make sure to update your driver to the latest version which resolves this.
ERROR: Traceback (most recent call last):
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/starlette/routing.py", line 693, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/contextlib.py", line 199, in __aenter__
    return await anext(self.gen)
  File "/data/workspace/zhaoyong/model/VILA/server.py", line 118, in lifespan
    tokenizer, model, image_processor, context_len = load_pretrained_model(model_path, model_name, None)
  File "/data/workspace/zhaoyong/model/VILA/llava/model/builder.py", line 115, in load_pretrained_model
    model = LlavaLlamaModel(config=config, low_cpu_mem_usage=True, **kwargs)
  File "/data/workspace/zhaoyong/model/VILA/llava/model/language_model/llava_llama.py", line 49, in __init__
    self.init_vlm(config=config, *args, **kwargs)
  File "/data/workspace/zhaoyong/model/VILA/llava/model/llava_arch.py", line 74, in init_vlm
    self.llm, self.tokenizer = build_llm_and_tokenizer(llm_cfg, config, *args, **kwargs)
  File "/data/workspace/zhaoyong/model/VILA/llava/model/language_model/builder.py", line 203, in build_llm_and_tokenizer
    tokenizer.stop_tokens = infer_stop_tokens(tokenizer)
  File "/data/workspace/zhaoyong/model/VILA/llava/utils/tokenizer.py", line 174, in infer_stop_tokens
    template = tokenize_conversation(DUMMY_CONVERSATION, tokenizer, overrides={"gpt": SENTINEL_TOKEN})
  File "/data/workspace/zhaoyong/model/VILA/llava/utils/tokenizer.py", line 110, in tokenize_conversation
    text = tokenizer.apply_chat_template(conversation, add_generation_prompt=add_generation_prompt, tokenize=False)
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1803, in apply_chat_template
    chat_template = self.get_chat_template(chat_template, tools)
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1967, in get_chat_template
    raise ValueError(
ValueError: Cannot use chat template functions because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating

ERROR: Application startup failed. Exiting.
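There is no reply on the thread yet, but the ValueError itself says that the tokenizer loaded by server.py has no chat_template set, so transformers' apply_chat_template refuses to run. The sketch below is one possible workaround, not a confirmed fix: it writes a Vicuna-v1-style chat template into the checkpoint's tokenizer config so the call in llava/utils/tokenizer.py can succeed. The template string and the assumption that the tokenizer files live in the checkpoint's llm/ subfolder are guesses that should be checked against llava/conversation.py and the actual directory layout.

```python
# Hedged workaround sketch (not an official VILA fix): give the tokenizer a
# chat_template so apply_chat_template() in llava/utils/tokenizer.py stops
# raising ValueError. The template below mimics the vicuna_v1 conversation
# format and is an assumption; compare it with llava/conversation.py.
from transformers import AutoTokenizer

# Path from the issue; VILA1.5 checkpoints typically keep the LLM tokenizer in
# an "llm" subfolder -- adjust if your layout differs (this is an assumption).
TOKENIZER_DIR = "/data/workspace/zhaoyong/model/weight_files/VILA1.5-3B/llm"

VICUNA_V1_TEMPLATE = (
    "{% for message in messages %}"
    "{% if message['role'] == 'system' %}"
    "{{ message['content'] }} "
    "{% elif message['role'] == 'user' %}"
    "USER: {{ message['content'] }} "
    "{% elif message['role'] == 'assistant' %}"
    "ASSISTANT: {{ message['content'] }}</s>"
    "{% endif %}"
    "{% endfor %}"
    "{% if add_generation_prompt %}ASSISTANT:{% endif %}"
)

tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_DIR, use_fast=False)
if tokenizer.chat_template is None:
    tokenizer.chat_template = VICUNA_V1_TEMPLATE
    # save_pretrained persists chat_template to tokenizer_config.json, so
    # server.py should pick it up the next time it loads the checkpoint.
    tokenizer.save_pretrained(TOKENIZER_DIR)
    print("chat_template written; restart server.py and retry")
else:
    print("tokenizer already has a chat_template; the error likely has another cause")
```

Either way, this only patches the symptom shown in the traceback; whether vila-infer and server.py are meant to resolve the vicuna_v1 conversation template through the same code path is a question for the maintainers.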
