I can obtain inference results correctly with this command:

vila-infer \
  --model-path /data/workspace/zhaoyong/model/weight_files/VILA1.5-3B \
  --conv-mode vicuna_v1 \
  --text "Please describe the video." \
  --media /data/workspace/zhaoyong/data/安全帽.mp4

but I get an error when starting the server with this command:

python -W ignore server.py \
  --port 8000 \
  --model_path /data/workspace/zhaoyong/model/weight_files/VILA1.5-3B \
  --conv_mode vicuna_v1

Why is that, and how can I fix it?
#163 · Open
HAOYON-666 opened this issue on Dec 18, 2024 · 0 comments
[2024-12-18 17:36:31,349] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
INFO: Started server process [3865832]
INFO: Waiting for application startup.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████| 2/2 [00:01<00:00, 1.14it/s]
We've detected an older driver with an RTX 4000 series GPU. These drivers have issues with P2P. This can affect the multi-gpu inference when using accelerate device_map. Please make sure to update your driver to the latest version which resolves this.
ERROR: Traceback (most recent call last):
File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/starlette/routing.py", line 693, in lifespan
async with self.lifespan_context(app) as maybe_state:
File "/home/user/miniconda3/envs/VILA/lib/python3.10/contextlib.py", line 199, in aenter
return await anext(self.gen)
File "/data/workspace/zhaoyong/model/VILA/server.py", line 118, in lifespan
tokenizer, model, image_processor, context_len = load_pretrained_model(model_path, model_name, None)
File "/data/workspace/zhaoyong/model/VILA/llava/model/builder.py", line 115, in load_pretrained_model
model = LlavaLlamaModel(config=config, low_cpu_mem_usage=True, **kwargs)
File "/data/workspace/zhaoyong/model/VILA/llava/model/language_model/llava_llama.py", line 49, in init
self.init_vlm(config=config, *args, **kwargs)
File "/data/workspace/zhaoyong/model/VILA/llava/model/llava_arch.py", line 74, in init_vlm
self.llm, self.tokenizer = build_llm_and_tokenizer(llm_cfg, config, *args, **kwargs)
File "/data/workspace/zhaoyong/model/VILA/llava/model/language_model/builder.py", line 203, in build_llm_and_tokenizer
tokenizer.stop_tokens = infer_stop_tokens(tokenizer)
File "/data/workspace/zhaoyong/model/VILA/llava/utils/tokenizer.py", line 174, in infer_stop_tokens
template = tokenize_conversation(DUMMY_CONVERSATION, tokenizer, overrides={"gpt": SENTINEL_TOKEN})
File "/data/workspace/zhaoyong/model/VILA/llava/utils/tokenizer.py", line 110, in tokenize_conversation
text = tokenizer.apply_chat_template(conversation, add_generation_prompt=add_generation_prompt, tokenize=False)
File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1803, in apply_chat_template
chat_template = self.get_chat_template(chat_template, tools)
File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1967, in get_chat_template
raise ValueError(
ValueError: Cannot use chat template functions because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating
ERROR: Application startup failed. Exiting.
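For anyone hitting the same ValueError: the server startup path reaches infer_stop_tokens, which calls tokenizer.apply_chat_template, and that fails because the tokenizer bundled with this VILA1.5-3B checkpoint has no chat_template set in its tokenizer_config.json. Below is a minimal sketch of one possible workaround: assigning a Vicuna-v1-style Jinja template to the tokenizer before the chat-template functions are used. The tokenizer subfolder path and the exact template string are assumptions for illustration, not taken from the VILA repo, so adjust them to your checkpoint and conversation format.

```python
from transformers import AutoTokenizer

# Hypothetical path: the LLM tokenizer inside the VILA checkpoint directory.
# Point this at wherever tokenizer_config.json actually lives in your setup.
tokenizer = AutoTokenizer.from_pretrained(
    "/data/workspace/zhaoyong/model/weight_files/VILA1.5-3B/llm"
)

# If this prints None, apply_chat_template will raise the ValueError shown above.
print(tokenizer.chat_template)

# Possible workaround (assumption): give the tokenizer a Vicuna-v1-style Jinja
# template so that tokenize_conversation / infer_stop_tokens can format the
# dummy conversation. Verify the format matches the conv_mode you pass to the server.
tokenizer.chat_template = (
    "{% for message in messages %}"
    "{% if message['role'] == 'user' %}USER: {{ message['content'] }}\n"
    "{% elif message['role'] == 'assistant' %}ASSISTANT: {{ message['content'] }}</s>"
    "{% endif %}{% endfor %}"
    "{% if add_generation_prompt %}ASSISTANT:{% endif %}"
)

messages = [{"role": "user", "content": "Please describe the video."}]
print(tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False))
```

If this removes the error, a more permanent fix may be to add the same chat_template entry to the checkpoint's tokenizer_config.json, or to pin transformers to the version listed in VILA's requirements, since older versions did not route this code through apply_chat_template; both of these are guesses to verify against your environment.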