fix(gguf): Ensure Gemma2 configs have hidden_act for backward compatibility #30411
kitaekatt wants to merge 1 commit into vllm-project:main
Code Review
This pull request introduces several fixes and improvements related to GGUF model loading. The primary fix ensures backward compatibility for Gemma2 models by setting the hidden_act attribute in the configuration, resolving a potential AttributeError. Additionally, the PR enhances dtype handling for quantized models by automatically selecting a compatible dtype when a conflict arises, which improves user experience by avoiding crashes. It also addresses hardware-specific precision issues on Blackwell GPUs for GGUF models by disabling bfloat16. Finally, it correctly handles tied word embeddings in GGUF models. The changes are well-implemented and improve the robustness of GGUF model support in vLLM.
Force-pushed from 04ceef5 to 126db87.
This pull request has merge conflicts that must be resolved before it can be merged.
Force-pushed from 126db87 to 8108f82.
Force-pushed from 8108f82 to a9fb4d7.
…bility

GGUF-loaded configs may only have hidden_activation from config.json, but the Gemma2MLP model code expects a hidden_act attribute. This adds a post-processing step to copy hidden_activation to hidden_act when needed.

Fixes AttributeError: 'Gemma2Config' object has no attribute 'hidden_act' when loading Gemma2 GGUF models.

Signed-off-by: Christina <truffle@gmail.com>
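The post-processing step described in the commit message could look roughly like the sketch below. It is an illustration only: `ensure_hidden_act` is a hypothetical helper name, the config is a plain `SimpleNamespace` stand-in for a GGUF-loaded `Gemma2Config`, and `"gelu_pytorch_tanh"` is just an example value. The actual change lives in `ModelConfig.__init__` and may be structured differently.

```python
from types import SimpleNamespace


def ensure_hidden_act(hf_config):
    """Copy hidden_activation to hidden_act when only the former is set.

    Hypothetical helper for illustration; not the name used in the PR.
    """
    # Only touch Gemma2 configs; leave every other model type unchanged.
    if getattr(hf_config, "model_type", None) != "gemma2":
        return hf_config
    has_activation = hasattr(hf_config, "hidden_activation")
    has_act = hasattr(hf_config, "hidden_act")
    if has_activation and not has_act:
        hf_config.hidden_act = hf_config.hidden_activation
    return hf_config


# Stand-in for a GGUF-loaded Gemma2 config that lacks hidden_act.
gguf_config = SimpleNamespace(
    model_type="gemma2",
    hidden_activation="gelu_pytorch_tanh",  # example value, assumed
)
patched = ensure_hidden_act(gguf_config)
print(patched.hidden_act)
```

Because the helper only adds the attribute when it is missing, a config that already defines `hidden_act` is left untouched.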
Force-pushed from a9fb4d7 to eb974b4.
yewentao256 left a comment
LGTM, thanks for the work!
hmellor left a comment
- Model specific edge cases in config classes should be avoided if possible
- Why can't we just update `gemma2.py` to access `hidden_activation`?
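The alternative raised in this review, reading the activation on the model side instead of patching the config, might be sketched as follows. `resolve_activation` is a hypothetical helper, and the configs are `SimpleNamespace` stand-ins with example values; how `gemma2.py` would actually be changed is an assumption here, not something the thread specifies.

```python
from types import SimpleNamespace


def resolve_activation(config):
    """Return the activation name, preferring hidden_activation.

    Hypothetical helper; the real Gemma2MLP code may be organized differently.
    """
    act = getattr(config, "hidden_activation", None)
    if act is None:
        # Fall back to the older attribute name if it is the only one present.
        act = getattr(config, "hidden_act", None)
    if act is None:
        raise ValueError(
            "config defines neither hidden_activation nor hidden_act"
        )
    return act


# Works whether the config carries the new name, the old name, or both.
new_style = SimpleNamespace(hidden_activation="gelu_pytorch_tanh")
old_style = SimpleNamespace(hidden_act="gelu_pytorch_tanh")
print(resolve_activation(new_style), resolve_activation(old_style))
```

This keeps the model-specific handling inside the model code, which is the concern the first bullet raises about edge cases in config classes.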
Summary
Fixes `AttributeError: 'Gemma2Config' object has no attribute 'hidden_act'` when loading Gemma2 GGUF models.

Changes
- In `ModelConfig.__init__`, if model_type is "gemma2" and the config has `hidden_activation` but not `hidden_act`, copy the value

Root Cause
- `Gemma2Config` only defines `hidden_activation`, not `hidden_act`
- `gemma2.py` directly accesses `config.hidden_act` without a fallback

Testing
Tested with `bartowski/gemma-2-2b-it-GGUF` - model resolves architecture correctly.