Video-Text-to-Text
Transformers
Safetensors
English
videochat_flash_qwen
feature-extraction
multimodal
custom_code
Instructions to use OpenGVLab/VideoChat-Flash-Qwen2_5-7B_InternVideo2-1B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use OpenGVLab/VideoChat-Flash-Qwen2_5-7B_InternVideo2-1B with Transformers:
```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "OpenGVLab/VideoChat-Flash-Qwen2_5-7B_InternVideo2-1B",
    trust_remote_code=True,
    dtype="auto",
)
```
- Notebooks
- Google Colab
- Kaggle
The name of the vision_tower weight in the safetensors file and in the model do not match.
#2
by xf2022 - opened
```python
import torch
import torch.nn as nn

class LayerScale(nn.Module):
    def __init__(self, dim, init_values=1e-5, inplace=False, force_fp32=False):
        super().__init__()
        self.inplace = inplace
        self.weight = nn.Parameter(init_values * torch.ones(dim))
        self.force_fp32 = force_fp32
```
In the model and in the model.safetensors.index.json file, the LayerScale weight is named model.vision_tower.vision_tower.blocks.0.ls1.weight.
However, the model-00003-of-00004.safetensors file only contains model.vision_tower.vision_tower.blocks.0.ls1.gamma.
In versions of transformers prior to 4.49.0, there was logic that fixed up legacy weight keys when loading a state dict.
This logic is gone in newer versions, so with a newer transformers release, model.vision_tower.vision_tower.blocks.0.ls1.weight cannot be initialized from the safetensors file.
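As a workaround, the renaming can be done manually before loading. A minimal sketch, assuming the mismatch follows the legacy gamma-to-weight (and beta-to-bias) convention that older transformers releases fixed automatically; the helper names below are hypothetical:

```python
# Hypothetical helpers: rewrite legacy parameter names so that a checkpoint
# saved with ".gamma"/".beta" keys matches a model that expects
# ".weight"/".bias" (e.g. the LayerScale parameter discussed above).

def fix_legacy_key(key: str) -> str:
    # Rename trailing ".gamma" -> ".weight" and ".beta" -> ".bias".
    if key.endswith(".gamma"):
        return key[: -len("gamma")] + "weight"
    if key.endswith(".beta"):
        return key[: -len("beta")] + "bias"
    return key

def fix_state_dict(state_dict: dict) -> dict:
    # Apply the renaming to every entry of a loaded shard.
    return {fix_legacy_key(k): v for k, v in state_dict.items()}

print(fix_legacy_key("model.vision_tower.vision_tower.blocks.0.ls1.gamma"))
# → model.vision_tower.vision_tower.blocks.0.ls1.weight
```

With safetensors installed, the same mapping could be applied shard by shard via safetensors.torch.load_file and save_file before calling from_pretrained.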