This repository has been archived by the owner on Dec 1, 2024. It is now read-only.
command 1:
python -m flexgen.flex_opt --model facebook/opt-30b --path _DUMMY_ --prompt-len 20 --gen-len 15 --percent 25 75 60 40 0 100 --gpu-batch-size 1 --num-gpu-batches 2 --cpu-cache-compute --debug fewer_batch
peak gpu mem: 6.0679 GB
command 2:
python -m flexgen.flex_opt --model facebook/opt-30b --path _DUMMY_ --prompt-len 20 --gen-len 15 --percent 30 70 60 40 0 100 --gpu-batch-size 1 --num-gpu-batches 2 --cpu-cache-compute --debug fewer_batch
result: GPU out of memory (OOM)
The only difference between command 2 and command 1 is the first --percent value, the share of the weights placed on the GPU, which increases from 25% to 30%.
My GPU has 24 GB of memory.
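For reference, here is a rough back-of-the-envelope estimate of how much extra GPU memory the 5-point change would nominally require. It is only a sketch under stated assumptions: about 30 billion parameters for opt-30b, FP16 weights (2 bytes each), and the first --percent value being the share of weights kept on the GPU. The function name `gpu_weight_gib` is made up for this illustration and is not part of FlexGen, and the real allocator may split weights differently.

```python
# Nominal GPU memory taken by the weight share, NOT FlexGen's actual allocation.
# Assumptions (not measured): ~30e9 parameters, 2 bytes per parameter (FP16).

GIB = 2 ** 30  # bytes per GiB


def gpu_weight_gib(num_params: float, bytes_per_param: int, gpu_percent: float) -> float:
    """Return the nominal size in GiB of the weight share placed on the GPU."""
    return num_params * bytes_per_param * gpu_percent / 100 / GIB


NUM_PARAMS = 30e9      # assumed parameter count for opt-30b
BYTES_PER_PARAM = 2    # assumed FP16 storage

at_25 = gpu_weight_gib(NUM_PARAMS, BYTES_PER_PARAM, 25)
at_30 = gpu_weight_gib(NUM_PARAMS, BYTES_PER_PARAM, 30)

print(f"nominal weights on GPU at 25%: {at_25:.2f} GiB")
print(f"nominal weights on GPU at 30%: {at_30:.2f} GiB")
print(f"nominal increase:              {at_30 - at_25:.2f} GiB")
```

Under these assumptions the nominal increase is a bit under 3 GiB, which by itself would not take a 6.07 GB peak past 24 GB. That is why the OOM in command 2 looks unexpected.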