[Question]: How to understand dense_decoding? #94

lemyx · 2024-12-13T12:25:44Z

Describe the issue

Dear author,

Thanks for your outstanding work!

There is a parameter named dense_decoding in minference/models_patch.py, and the variables dense_k and dense_v in inf_llm.py do not appear in the original InfLLM implementation.

Could you please explain the design in more detail? Thanks a lot!

Best wishes


iofu728 · 2024-12-15T05:12:51Z

Hi @lemyx, thanks for your interest in our work.

Since MInference focuses only on pre-filling speedup, we aligned all baselines to perform sparse pre-filling with dense decoding. To ensure a fair comparison, we therefore re-implemented InfLLM with dense decoding ourselves, and this version improves upon the original implementation of InfLLM.
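To make the "sparse pre-filling with dense decoding" split concrete, here is a minimal pure-Python sketch. It is an illustration only, not the code in minference/models_patch.py or inf_llm.py: the functions `attend`, `sparse_prefill`, and `dense_decode_step` are hypothetical, the sparse pattern (first tokens plus a local window) is just a stand-in, and the names `dense_k` / `dense_v` are borrowed from the question on the assumption that they refer to the full key/value cache used at decode time.

```python
import math

def attend(q, keys, values):
    """Softmax attention of one query vector over the given keys/values."""
    scale = math.sqrt(len(q))
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]

def sparse_prefill(queries, keys, values, window=2, n_init=1):
    """Prefill with a toy sparse pattern (hypothetical): each position sees
    the first n_init tokens plus a causal local window, not the full prefix."""
    outputs = []
    for i, q in enumerate(queries):
        init = set(range(min(n_init, i + 1)))
        local = set(range(max(0, i - window + 1), i + 1))
        idx = sorted(init | local)
        outputs.append(attend(q, [keys[j] for j in idx],
                              [values[j] for j in idx]))
    return outputs

def dense_decode_step(q, dense_k, dense_v):
    """Decode one token: the query attends to every cached key/value."""
    return attend(q, dense_k, dense_v)

# Toy prompt of 6 tokens with 2-dimensional keys/values.
prompt = [[float(i), 1.0] for i in range(6)]
prefill_out = sparse_prefill(prompt, prompt, prompt)

# Assumption: the full (dense) KV cache is kept for decoding.
dense_k, dense_v = list(prompt), list(prompt)
decode_out = dense_decode_step([0.5, 0.5], dense_k, dense_v)
```

In this sketch only the prefill loop restricts which positions a query can see; the decode step always uses the whole cache, which is the alignment across baselines that the reply describes.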

lemyx added the question label Dec 13, 2024

iofu728 self-assigned this Dec 15, 2024
