Description
I trained on a single RTX 4090 but was not able to reach the metrics reported in the paper. My results are 62.4 on CROHME 2014, 60.7 on CROHME 2016, and 63.6 on CROHME 2019. Could the authors share whether any other hyperparameters were used? The configuration I used is as follows:
```yaml
seed_everything: 7
trainer:
  checkpoint_callback: true
  callbacks:
    - class_path: pytorch_lightning.callbacks.LearningRateMonitor
      init_args:
        logging_interval: epoch
    - class_path: pytorch_lightning.callbacks.ModelCheckpoint
      init_args:
        save_top_k: 1
        monitor: val_ExpRate
        mode: max
        filename: '{epoch}-{step}-{val_ExpRate:.4f}'
  default_root_dir: 'lightning_logs/version_0'
  gpus: 1
  check_val_every_n_epoch: 2
  max_epochs: 300
  deterministic: true
  num_sanity_val_steps: 1
model:
  d_model: 256
  # encoder
  growth_rate: 24
  num_layers: 16
  # decoder
  nhead: 8
  num_decoder_layers: 3
  dim_feedforward: 1024
  dropout: 0.3
  dc: 32
  cross_coverage: true
  self_coverage: true
  # beam search
  beam_size: 10
  max_len: 200
  alpha: 1.0
  early_stopping: false
  temperature: 1.0
  # training
  learning_rate: 0.08
  patience: 20
data:
  zipfile_path: data_crohme.zip
  test_year: '2014'
  train_batch_size: 8
  eval_batch_size: 4
  num_workers: 4
  scale_aug: true
```
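
For context, a config with top-level `trainer:`, `model:`, and `data:` sections like this is typically consumed by PyTorch Lightning's `LightningCLI`, which instantiates all three from the YAML. Below is a minimal sketch of such an entry point, assuming a PL 1.x setup; the `LitCoMER` and `CROHMEDatamodule` import paths are illustrative assumptions, not necessarily this repo's exact module names.

```python
# Minimal sketch of a LightningCLI entry point that consumes a YAML config
# like the one above. Import paths below are illustrative assumptions.
from pytorch_lightning.utilities.cli import LightningCLI  # PL 1.x location

from comer.datamodule import CROHMEDatamodule  # hypothetical module path
from comer.lit_comer import LitCoMER           # hypothetical module path

if __name__ == "__main__":
    # Parses the config and wires up the Trainer, LightningModule, and
    # LightningDataModule from the trainer:, model:, and data: sections:
    #   python train.py --config config.yaml
    cli = LightningCLI(LitCoMER, CROHMEDatamodule)
```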