Conversation

@dienruei123

Training TASTE w/o vq, using llama tokenizer (Llama-3.2-1B)

  • Modify distil-whisper's decoder embedding dimensions to fit the Llama tokenizer's vocabulary size (see the sketch after this list)
  • Modify the tokenize stage in data processing (whisper --> llama)
  • Borrow the weights from the pretrained text-only_baseline model and replace the embedding layer
  • Create a temporary training config (taste_no_vq_llama.yaml)
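
A minimal sketch of the embedding change described above; the checkpoint names and the baseline path are assumptions for illustration, not the exact code in this PR:

import torch
from transformers import AutoTokenizer, WhisperForConditionalGeneration

llama_tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
whisper = WhisperForConditionalGeneration.from_pretrained("distil-whisper/distil-large-v3")

# Resize the decoder token embedding (and the tied output head) to the Llama vocab size.
whisper.resize_token_embeddings(len(llama_tokenizer))

# Borrow weights from the pretrained text-only baseline, skipping parameters whose
# shapes changed (the resized embedding layers), which are replaced separately.
baseline_state = torch.load("exp/text_only_baseline/model.pt", map_location="cpu")  # hypothetical path
own_state = whisper.state_dict()
compatible = {k: v for k, v in baseline_state.items()
              if k in own_state and v.shape == own_state[k].shape}
whisper.load_state_dict(compatible, strict=False)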

@GitYCC changed the title from "Training TASTE w/o vq using llama tokenizer (text)" to "[WIP] Training TASTE w/o vq using llama tokenizer (text)" on Jul 30, 2025
@GitYCC changed the title from "[WIP] Training TASTE w/o vq using llama tokenizer (text)" to "WIP: Training TASTE w/o vq using llama tokenizer (text)" on Jul 30, 2025
make_v_proj_identity: bool = False,
is_word_level: bool = False,
skip_prefix_idx: Optional[int] = None,
vocab_size: int = None,
Member

Add a new argument new_vocab_size=None:

if new_vocab_size is not None:
    ...  # do your work
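
A sketch of how that guarded argument might look; the wrapper class and attribute names below are placeholders, since the surrounding module isn't shown here:

from typing import Optional

import torch.nn as nn

class SpeechDecoderWrapper(nn.Module):  # placeholder name, not necessarily the class in this PR
    def __init__(
        self,
        decoder: nn.Module,  # assumed to expose its token embedding as `embed_tokens`
        skip_prefix_idx: Optional[int] = None,
        new_vocab_size: Optional[int] = None,  # None keeps the original Whisper vocab untouched
    ):
        super().__init__()
        self.decoder = decoder
        self.skip_prefix_idx = skip_prefix_idx
        # Only replace the embedding when a new vocab size is explicitly requested,
        # so the default Whisper-tokenizer path is left completely unaffected.
        if new_vocab_size is not None:
            old_embed = self.decoder.embed_tokens
            self.decoder.embed_tokens = nn.Embedding(
                new_vocab_size, old_embed.embedding_dim, padding_idx=old_embed.padding_idx
            )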

postfix_token_to_wrap = [tokenizer.eos_token_id] if add_eos else []
_skip_prefix_idx = len(prefix_token_to_wrap)
logging.info(f"Tokenizer is from transformers `WhisperTokenizerFast` of transformers. Decoder prefix ids: {forced_decoder_ids}.")
if whisper_tokenizer_name_or_fpath.endswith("Llama-3.2-1B"):
Member

Any tokenizer that is not Whisper should take this path.
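
One way to implement that, assuming the tokenizer object is already constructed at this point (the other names come from the snippet above), is to branch on the tokenizer class rather than on the checkpoint path suffix:

import logging

from transformers import WhisperTokenizer, WhisperTokenizerFast

if isinstance(tokenizer, (WhisperTokenizer, WhisperTokenizerFast)):
    ...  # keep the existing Whisper-specific handling (forced decoder ids, etc.)
else:
    # Every non-Whisper tokenizer (Llama or anything else) takes this generic path.
    prefix_token_to_wrap = [tokenizer.bos_token_id] if add_bos else []
    postfix_token_to_wrap = [tokenizer.eos_token_id] if add_eos else []
    _skip_prefix_idx = len(prefix_token_to_wrap)
    logging.info(f"Using non-Whisper tokenizer from {whisper_tokenizer_name_or_fpath}")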

prefix_token_to_wrap = [tokenizer.bos_token_id] if add_bos else []
postfix_token_to_wrap = [tokenizer.eos_token_id] if add_eos else []
_skip_prefix_idx = len(prefix_token_to_wrap)
logging.info(f"Using Llama tokenizer from {whisper_tokenizer_name_or_fpath}")
Member

Remember to fix this at the same time.

$RTSLM_WORK_DIR/CosyVoice/cosyvoice/bin/train.py \
--train_engine $train_engine \
--config $conf_fpath \
--train_data ./data/train.data.list \
Member

Does this also work with the LibriTTS data format?
