Release 2.48.0 corresponding to NGC container 24.07

@nvda-mesharma released this 05 Aug 20:38
128abc3

What's Changed

  • Removed explicit mode for multi-lora by @oandreeva-nv in #45
  • test: Limiting multi-gpu tests to use Ray as distributed_executor_backend by @oandreeva-nv in #47
  • perf: Improve vLLM backend performance by using a separate thread for responses by @Tabrizian in #46
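
The performance change in #46 moves response handling onto a separate thread. The sketch below illustrates that general pattern only and is not the backend's actual code: the class `ResponseSender` and the `send_fn` callback are hypothetical names introduced for illustration. The idea is that the thread producing results enqueues each response and returns immediately, while a dedicated worker thread drains the queue and performs the potentially slow send.

```python
import queue
import threading

# Sentinel object that tells the worker thread to stop.
_SENTINEL = object()


class ResponseSender:
    """Hypothetical helper: hands responses to a dedicated worker thread."""

    def __init__(self, send_fn):
        # send_fn is an assumed callback that delivers one response.
        self._send_fn = send_fn
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Drain the queue in FIFO order until the sentinel arrives.
        while True:
            item = self._queue.get()
            if item is _SENTINEL:
                break
            self._send_fn(item)

    def submit(self, response):
        # Non-blocking for the caller: only enqueues the response.
        self._queue.put(response)

    def close(self):
        # Flush everything queued so far, then stop the worker.
        self._queue.put(_SENTINEL)
        self._thread.join()


if __name__ == "__main__":
    sent = []
    sender = ResponseSender(sent.append)
    for i in range(3):
        sender.submit(f"response-{i}")
    sender.close()
    print(sent)
```

Because a single worker consumes a FIFO queue, responses are delivered in the order they were submitted, and `close()` returns only after every queued response has been sent.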

Full Changelog: v24.06...v24.07