
Multi-Agent Dual Learning #113

@kweonwooj

Description


Abstract

  • proposes a multi-agent dual learning framework to boost the performance of neural machine translation
  • dual learning leverages the duality between the primal task (X -> Y) and the dual task (Y -> X)
  • SOTA on WMT 2014 En->De: 30.67 BLEU (+2.2 over Transformer Big)

Details

Introduction

  • Dual Learning
    • formulated as a two-agent system: the primal model learns the mapping f : X -> Y and the dual model learns g : Y -> X
    • given x in X, the reconstruction loss delta(x, g(f(x))) provides the training signal
    • in theory, a monolingual corpus alone is sufficient to train an NMT model in the dual learning framework
    • see the original dual learning paper (Xia et al., 2016), accepted at NIPS 2016, for details
  • Multi-Agent Dual Learning
    • instead of a single f and g, the multi-agent system uses N - 1 additional agents on each side, pre-trained on the parallel corpus with different random seeds. The ensemble effect improves the quality of the feedback signal.
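The reconstruction signal above can be sketched in a few lines. This is my own toy illustration, not the paper's code: `f` and `g` are stand-in callables (a real system would use NMT models), and `delta` is a simple per-position mismatch rate.

```python
# Toy sketch of the dual-learning reconstruction signal.
# f : X -> Y (primal "translator"), g : Y -> X (dual "translator");
# delta(x, g(f(x))) is the training signal -- no parallel data needed.

def delta(x, x_rec):
    """Reconstruction loss: fraction of positions where x and g(f(x)) disagree."""
    mismatches = sum(a != b for a, b in zip(x, x_rec))
    mismatches += abs(len(x) - len(x_rec))  # penalize length mismatch
    return mismatches / max(len(x), len(x_rec), 1)

def dual_reconstruction_loss(f, g, monolingual_x):
    """Average delta(x, g(f(x))) over a monolingual corpus of X-side sentences."""
    return sum(delta(x, g(f(x))) for x in monolingual_x) / len(monolingual_x)

# Hypothetical toy "translators": uppercasing as the primal task,
# lowercasing as the dual task, so reconstruction is perfect.
f = str.upper
g = str.lower
corpus = ["hello world", "dual learning"]
print(dual_reconstruction_loss(f, g, corpus))  # -> 0.0
```

A mismatched pair of models (e.g. `g` that does not invert `f`) would yield a positive loss, which is exactly the signal used to update the models.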

Algorithm

[screenshot: the paper's multi-agent dual learning algorithm]
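The core of the multi-agent feedback can be sketched as an ensemble average of the dual agents' scores. This is a hedged illustration under my own assumptions (agents reduced to callables returning log-probabilities), not the paper's implementation:

```python
import math

# Sketch of the multi-agent feedback signal: instead of a single dual
# model scoring the reconstruction, the feedback is averaged over N dual
# agents g_0..g_{N-1}, giving a smoother, lower-variance training signal.

def ensemble_log_prob(dual_agents, x, y):
    """Average log P_{g_i}(x | y) over all dual agents."""
    return sum(g(x, y) for g in dual_agents) / len(dual_agents)

# Hypothetical toy agents: each returns a made-up log-probability of
# reconstructing x from y (real agents would be NMT models trained
# with different random seeds).
agents = [
    lambda x, y: math.log(0.8),
    lambda x, y: math.log(0.6),
    lambda x, y: math.log(0.7),
]
score = ensemble_log_prob(agents, "x", "y")
```

The design intuition: any single agent's score is noisy, but averaging over independently seeded agents reduces that noise, which is the "ensemble effect" the summary refers to.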

Results

  • Experimental Settings
    • Model : Transformer Big
    • compared against Knowledge Distillation (KD), Back Translation (BT), and two-agent Dual Learning (Dual), each in single- and multi-agent variants
  • IWSLT En <-> De
    • KD yields only a small BLEU gain, BT has no effect, and Dual-5 improves BLEU the most
      [screenshot: IWSLT En <-> De results table]
  • IWSLT Es, Ru, He -> En
    • results are consistent across the various IWSLT language pairs
      [screenshot: IWSLT Es, Ru, He -> En results table]
  • WMT 2014 En <-> De Bilingual
    • KD yields only a small BLEU gain, BT has no effect, and Dual-5 improves BLEU the most (SOTA)
      [screenshot: WMT 2014 En <-> De bilingual results table]
  • WMT 2014 En <-> De Monolingual
    • also performs best in the unsupervised NMT setting (SOTA)
      [screenshot: WMT 2014 En <-> De monolingual results table]

Image Translation

  • compares Multi-Agent Dual Learning (MADL) with CycleGAN on image translation; MADL produces more robust and cleaner translated images

Personal Thoughts

  • multi-agent pre-trained models provide a good initialization and improve the quality of the feedback signal
  • existing dual learning seemed to have mainly theoretical merit and little practical impact, but this paper demonstrates its practical value as well
  • the approach appears to work across a variety of languages

Link : https://openreview.net/pdf?id=HyGhN2A5tm
Authors : Anonymous
