Support minicpm3-4b #2465
Conversation
It is OK to copy the weight into lm_head in load_weight (at the cost that the same weight would be duplicated).
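For illustration, a minimal sketch of that approach, assuming a load_weights-style hook that receives (name, tensor) pairs; the class, method, and parameter names below are hypothetical and not lmdeploy's actual code:

```python
from torch import nn


class TinyCausalLM(nn.Module):
    """Toy model with a tied lm_head (illustration only, not the PR's code)."""

    def __init__(self, vocab_size: int = 32, hidden_size: int = 8):
        super().__init__()
        self.embed_tokens = nn.Embedding(vocab_size, hidden_size)
        self.lm_head = nn.Linear(hidden_size, vocab_size, bias=False)

    def load_weights(self, weights):
        """Copy checkpoint tensors into parameters.

        If the checkpoint ties embeddings and ships no lm_head.weight,
        reuse embed_tokens.weight for lm_head (the data is duplicated).
        """
        params = dict(self.named_parameters())
        loaded = set()
        for name, tensor in weights:
            if name in params:
                params[name].data.copy_(tensor)
                loaded.add(name)
        if 'lm_head.weight' not in loaded:
            # tied embeddings: duplicate the embedding weight into lm_head
            params['lm_head.weight'].data.copy_(params['embed_tokens.weight'])
```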
Please also update the supported models list.
@zhulinJulia24 Please add this model to the test set.
from .builder import AutoModelConfigBuilder


class DeepseekV2ModelConfigBuilder(AutoModelConfigBuilder):
Rename this class.
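A possible shape of the rename, assuming the builder keeps the classmethod pattern from the surrounding diff; the condition() signature and the model_type check are assumptions, and the relative import only makes sense inside the repo's configurations package:

```python
from .builder import AutoModelConfigBuilder


class MiniCPM3ModelConfigBuilder(AutoModelConfigBuilder):
    """Config builder for MiniCPM3 (renamed from the copied DeepseekV2 class)."""

    @classmethod
    def condition(cls, hf_config):
        # assumed check: pick this builder when the HF config is MiniCPM3
        return hf_config.model_type == 'minicpm3'
```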
lmdeploy/pytorch/models/minicpm3.py (outdated)
class MiniCPMLongRoPE(MiniCPMRotaryEmbedding):
LongRoPEScaling = auto()
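For context on what a LongRoPEScaling rope type would have to cover, here is a rough sketch of LongRoPE-style scaling: per-dimension rescaled inverse frequencies plus an attention scale for extended contexts. The formula follows the common Phi-3/MiniCPM-style recipe and is an assumption, not the lmdeploy op:

```python
import math

import torch


def longrope_inv_freq(dim: int, base: float, ext_factors: torch.Tensor) -> torch.Tensor:
    """Rescale each rotary frequency by its extrapolation factor.

    ext_factors has dim // 2 entries (short_factor or long_factor,
    chosen by the current sequence length).
    """
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
    return inv_freq / ext_factors


def longrope_attention_scale(max_pos: int, original_max_pos: int) -> float:
    """Extra scale applied to cos/sin when the context window is extended."""
    scale = max_pos / original_max_pos
    if scale <= 1.0:
        return 1.0
    return math.sqrt(1.0 + math.log(scale) / math.log(original_max_pos))
```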
Can we use MLA?
I tried MLA and RoPE scaling the same way as deepseekv2, but the result seemed wrong.
lmdeploy/pytorch/models/minicpm3.py (outdated)
    return torch.cat((-x2, x1), dim=-1)


def apply_rotary_pos_emb(q, k, cos, sin, position_ids, unsqueeze_dim=1):
We have provided apply-rotary and rotary-embedding ops in pytorch.nn, which are fused and would give better performance (the unfused reference implementation is sketched below for comparison).
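The unfused eager-mode reference that a fused op would replace typically looks like this (standard Hugging Face-style rotary application, shown as a sketch rather than the exact diff contents):

```python
import torch


def rotate_half(x: torch.Tensor) -> torch.Tensor:
    """Rotate half of the hidden dims of the input."""
    x1 = x[..., : x.shape[-1] // 2]
    x2 = x[..., x.shape[-1] // 2:]
    return torch.cat((-x2, x1), dim=-1)


def apply_rotary_pos_emb(q, k, cos, sin, position_ids, unsqueeze_dim=1):
    """Apply rotary embeddings to q/k using precomputed cos/sin tables."""
    cos = cos[position_ids].unsqueeze(unsqueeze_dim)
    sin = sin[position_ids].unsqueeze(unsqueeze_dim)
    q_embed = (q * cos) + (rotate_half(q) * sin)
    k_embed = (k * cos) + (rotate_half(k) * sin)
    return q_embed, k_embed
```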
Less than 1000 blocks can be allocated for kv caches with
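The block budget follows directly from the per-block cache size; a back-of-the-envelope sketch of the arithmetic is below. All concrete numbers (free memory, layer/head counts, block size) are placeholders, not MiniCPM3's actual values:

```python
def num_kv_cache_blocks(free_mem_bytes: int,
                        num_layers: int,
                        num_kv_heads: int,
                        head_dim: int,
                        block_size: int,
                        dtype_bytes: int = 2) -> int:
    """How many paged-KV blocks fit: K and V are stored for every layer."""
    bytes_per_block = 2 * num_layers * num_kv_heads * head_dim * block_size * dtype_bytes
    return free_mem_bytes // bytes_per_block


# placeholder numbers: 8 GiB free memory and a fairly large per-token cache
print(num_kv_cache_blocks(8 << 30, num_layers=62, num_kv_heads=40,
                          head_dim=96, block_size=64))  # 140 blocks
```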
    dtype=dtype,
    device=device,
    quant_config=quantization_config,
    is_tp=True,
If tp=1 is passed to PytorchEngineConfig, do we still set is_tp=True here?
It seems all the is_tp flags are set to True in the pytorch engine, regardless of whether tp=1 or not.
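A sketch of the behaviour being discussed, i.e. whether is_tp=True should still shard a layer when the world size is 1. The guard below is purely illustrative and is not lmdeploy's layer-building code:

```python
from torch import nn


def build_colwise_linear(in_features: int,
                         out_features: int,
                         world_size: int,
                         is_tp: bool = True) -> nn.Linear:
    """Shard the output dim only when TP is requested and world_size > 1."""
    do_shard = is_tp and world_size > 1
    local_out = out_features // world_size if do_shard else out_features
    return nn.Linear(in_features, local_out, bias=False)


# with tp=1 the flag is effectively a no-op: the layer keeps its full width
layer = build_colwise_linear(2560, 2560, world_size=1, is_tp=True)
print(layer.weight.shape)  # torch.Size([2560, 2560])
```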
The update_weights function is not obvious to users. @grimoire