Update on the development branch #2111
kaiyux
announced in
Announcements
Hi,
The TensorRT-LLM team is pleased to announce that we pushed an update to the development branch (and the Triton backend) on August 13, 2024.
This update includes:
- … `examples/exaone/README.md`.
- `ModelWeightsLoader` is enabled for LLaMA family models (experimental). … set `TRTLLM_DISABLE_UNIFIED_CONVERTER=1` to disable the model weights loader for those cases and fall back to the legacy path.
- … `executor` API, and it will be removed in a future release of TensorRT-LLM.
- … `max_seq_len` is not an integer. (llama 3.1 70B Instruct would not build engine "TypeError: set_shape(): incompatible function arguments." #2018)
- … `nvcr.io/nvidia/pytorch:24.07-py3`.
- … `nvcr.io/nvidia/tritonserver:24.07-py3`.
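For anyone who needs the legacy path mentioned above, here is a minimal sketch of setting `TRTLLM_DISABLE_UNIFIED_CONVERTER=1` for a single child process rather than the whole shell. Only the variable name comes from this announcement. The helper names `legacy_loader_env` and `run_with_legacy_loader` are hypothetical, and the command shown is a stand-in that just echoes the flag; substitute your own checkpoint conversion command.

```python
import os
import subprocess
import sys

# Environment variable named in the announcement.
DISABLE_FLAG = "TRTLLM_DISABLE_UNIFIED_CONVERTER"


def legacy_loader_env(base_env=None):
    """Return a copy of the environment with the flag set to "1".

    Hypothetical helper: the input mapping is copied, never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env[DISABLE_FLAG] = "1"
    return env


def run_with_legacy_loader(cmd):
    """Run cmd in a child process that sees the flag.

    The parent process environment is left untouched.
    """
    return subprocess.run(
        cmd,
        env=legacy_loader_env(),
        check=True,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Stand-in command (an assumption for illustration): it only prints
    # the flag value as seen by the child process.
    result = run_with_legacy_loader(
        [sys.executable, "-c",
         f"import os; print(os.environ['{DISABLE_FLAG}'])"]
    )
    print(result.stdout.strip())
```

Scoping the variable to one subprocess keeps the experimental loader enabled for everything else in the same session, which may make it easier to compare the two paths.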
Thanks,
The TensorRT-LLM Engineering Team