Hi! For all datasets in the DialoGLUE benchmark, I can reproduce similar results except for MultiWOZ.
For ConvBERT-DG, your reported joint goal accuracy is around 58, but I can only get 56, which matches what the original TripPy paper reported.
I wonder if you used different hyperparameters for TripPy? If so, could you share them?
Thank you!
The original hyperparameters for TripPy are as follows:

```shell
--do_lower_case \
--learning_rate=1e-4 \
--num_train_epochs=10 \
--max_seq_length=180 \
--per_gpu_train_batch_size=48 \
--per_gpu_eval_batch_size=1 \
--output_dir=${OUT_DIR} \
--save_epochs=2 \
--logging_steps=10 \
--warmup_proportion=0.1 \
--eval_all_checkpoints \
--adam_epsilon=1e-6 \
--label_value_repetitions \
--swap_utterances \
--append_history \
--use_history_labels \
--delexicalize_sys_utts \
--class_aux_feats_inform \
--class_aux_feats_ds \
```
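For context, these flags would be passed to the TripPy training script as part of a full invocation. A minimal sketch is below; the script name `run_dst.py` and the `--task_name`, `--data_dir`, `--dataset_config`, and `--model_*` arguments are assumptions based on the public TripPy codebase, so adapt them to your checkout:

```shell
# Hypothetical full TripPy training command; only the flags listed in this
# thread are confirmed, the rest are placeholders for a typical setup.
OUT_DIR=results/multiwoz21.trippy

python3 run_dst.py \
    --task_name="multiwoz21" \
    --data_dir="data/MULTIWOZ2.1" \
    --dataset_config="dataset_config/multiwoz21.json" \
    --model_type="bert" \
    --model_name_or_path="bert-base-uncased" \
    --do_train \
    --do_eval \
    --do_lower_case \
    --learning_rate=1e-4 \
    --num_train_epochs=10 \
    ...   # remaining flags exactly as listed above
```

Note that `--per_gpu_train_batch_size=48` assumes a fairly large GPU; with less memory you would typically lower it and compensate with gradient accumulation.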