
[Question]: How to understand dense_decoding? #94

Open
lemyx opened this issue Dec 13, 2024 · 1 comment
lemyx commented Dec 13, 2024

Describe the issue

Dear author,

Thanks for your outstanding work!

There is a parameter named dense_decoding in minference/models_patch.py, and the variables dense_k and dense_v in inf_llm.py do not appear in the original InfLLM implementation.

Could you please explain more details on the designs? Thanks a lot!

Best wishes

@lemyx lemyx added the question Further information is requested label Dec 13, 2024
@iofu728 iofu728 self-assigned this Dec 15, 2024
iofu728 (Contributor) commented Dec 15, 2024

Hi @lemyx, thanks for your interest in our work.

Since MInference focuses only on pre-filling speedup, we aligned all baselines to perform sparse pre-filling with dense decoding. Therefore, we re-implemented InfLLM's dense decoding ourselves to ensure a fair comparison, which improves upon the original implementation of InfLLM.
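
To make the baseline setup concrete, here is a minimal conceptual sketch of "sparse pre-filling with dense decoding". It is not MInference's or InfLLM's actual code; the toy local-window sparse pattern and the names sparse_prefill_attention, dense_decode_step, dense_k, and dense_v are illustrative assumptions. The point is that pre-fill attention is approximated, while the full KV cache is kept so that every decoding step attends densely over it.

```python
# Conceptual sketch (not MInference's code): sparse pre-fill, dense decoding.
# Pre-fill uses a sparse attention pattern for speed, but the FULL key/value
# tensors are still cached (the "dense_k"/"dense_v" idea). Decoding then
# attends over that full cache, so only the pre-filling stage is approximated.
import torch
import torch.nn.functional as F


def sparse_prefill_attention(q, k, v, window=128):
    """Toy sparse pattern: each query attends only to a local causal window."""
    n = q.shape[-2]
    idx = torch.arange(n)
    mask = (idx[:, None] - idx[None, :]).abs() <= window  # local band
    mask &= idx[None, :] <= idx[:, None]                   # causal
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v


def dense_decode_step(q_new, dense_k, dense_v):
    """One decoding step: full (dense) attention over the entire KV cache."""
    scores = q_new @ dense_k.transpose(-2, -1) / q_new.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ dense_v


# Pre-fill: sparse attention, but the KV cache is not sparsified.
seq_len, d = 1024, 64
q = torch.randn(1, seq_len, d)
k = torch.randn(1, seq_len, d)
v = torch.randn(1, seq_len, d)
_ = sparse_prefill_attention(q, k, v)
dense_k, dense_v = k, v  # full cache kept for decoding

# Decoding: each new token sees the whole cache (dense decoding).
q_new = torch.randn(1, 1, d)
out = dense_decode_step(q_new, dense_k, dense_v)
```

Under this setup, all baselines share the same dense decoding path, so any measured speedup or accuracy difference is attributable to the pre-filling stage alone.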
