Release 2.48.0 corresponding to NGC container 24.07
What's Changed
- Removed explicit mode for multi-lora by @oandreeva-nv in #45
- test: Limiting multi-gpu tests to use Ray as distributed_executor_backend by @oandreeva-nv in #47
- perf: Improve vLLM backend performance by using a separate thread for responses by @Tabrizian in #46
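
As a rough illustration of the idea behind #46, the sketch below shows the general pattern of handing responses to a dedicated thread so that the producing loop is not blocked while each response is sent. It is not the backend's actual code: `make_response_sender`, `send_fn`, and the `None` sentinel are hypothetical names chosen for this example, and it uses only the Python standard library.

```python
import queue
import threading


def make_response_sender(send_fn):
    """Start a daemon thread that drains a queue and calls send_fn per item.

    Hypothetical helper for illustration; not part of the vLLM backend API.
    """
    responses = queue.Queue()

    def _drain():
        while True:
            item = responses.get()
            if item is None:  # sentinel: stop the worker
                responses.task_done()
                break
            try:
                send_fn(item)
            finally:
                responses.task_done()

    worker = threading.Thread(target=_drain, daemon=True)
    worker.start()
    return responses, worker


# Usage: the producer only enqueues; sending happens on the worker thread.
sent = []
responses, worker = make_response_sender(sent.append)
for i in range(3):
    responses.put(f"response-{i}")
responses.put(None)  # request shutdown
worker.join()
```

A single consumer thread reading from a FIFO queue preserves the order in which responses were enqueued, which is usually what a streaming response path needs.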
Full Changelog: v24.06...v24.07