Freeze multimodal embedding when training.