There is a parameter named `dende_decoding` in `minference/models_patch.py`, and the variables `dense_k` and `dense_v` in `inf_llm.py` don't appear in the original InfLLM implementation.
Could you please explain more details on the designs? Thanks a lot!
Best wishes
Since MInference focuses only on pre-filling speedup, we aligned all baselines to perform sparse pre-filling with dense decoding. Therefore, we re-implemented InfLLM's dense decoding ourselves to ensure a fair comparison, which improves upon the original implementation of InfLLM.
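For readers following along, here is a minimal sketch of what "sparse pre-filling with dense decoding" can look like. It is not the MInference or InfLLM code: `toy_forward`, `local_causal_mask`, the sliding-window pattern, and the `dense_decoding` flag as used here are all illustrative assumptions. The only point is the branch: multi-token (pre-fill) calls use a sparse mask, while single-token (decoding) calls attend to the full KV cache.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v, mask=None):
    """Single-head scaled dot-product attention; mask is True where attending is allowed."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if mask is not None:
        scores = np.where(mask, scores, -np.inf)
    return softmax(scores) @ v

def local_causal_mask(q_len, kv_len, window):
    """Toy sparse pattern (an assumption): each query sees only its last `window` keys."""
    q_pos = np.arange(kv_len - q_len, kv_len)[:, None]
    k_pos = np.arange(kv_len)[None, :]
    return (k_pos <= q_pos) & (k_pos > q_pos - window)

def toy_forward(q, k_cache, v_cache, window=4, dense_decoding=True):
    """Hypothetical forward pass: sparse during pre-fill, optionally dense during decoding."""
    q_len, kv_len = q.shape[0], k_cache.shape[0]
    is_decoding = q_len == 1
    if is_decoding and dense_decoding:
        # Decoding step: the single new token attends to every cached key/value.
        return attention(q, k_cache, v_cache)
    # Pre-fill (or sparse decoding): restrict attention to the sparse pattern.
    mask = local_causal_mask(q_len, kv_len, window)
    return attention(q, k_cache, v_cache, mask)

rng = np.random.default_rng(0)
q = rng.normal(size=(8, 16))
k = rng.normal(size=(8, 16))
v = rng.normal(size=(8, 16))

prefill_out = toy_forward(q, k, v)                               # sparse pre-fill
dense_step = toy_forward(q[-1:], k, v, dense_decoding=True)      # dense decoding
sparse_step = toy_forward(q[-1:], k, v, dense_decoding=False)    # sparse decoding
```

With the flag on, a decoding step sees all eight cached positions; with it off, the same step sees only the last four, so the two outputs generally differ. Fixing the decoding side to dense for every baseline means any differences come from the pre-filling strategy alone.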