Text style transfer is the task of converting a formal piece of text into an informal one, or vice versa. Training such models is hampered by a lack of parallel corpora, which limits task performance. In this project, we aim to improve the performance of a formality style transfer model using in-domain data augmentation methods, namely synonym replacement and round-trip translation. Both methods rely on in-domain training data, which avoids losing generality and assures the quality of the augmented data. Synonym replacement swaps out selected words of a sentence, while round-trip translation translates a sentence into another language and back. Our baseline model is trained on the GYAFC (Grammarly's Yahoo Answers Formality Corpus) dataset.
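The synonym replacement step can be sketched as follows. This is a minimal illustration, not our training pipeline: the `SYNONYMS` table here is a hypothetical stand-in for synonyms mined from the in-domain corpus or a thesaurus.

```python
import random

# Hypothetical in-domain synonym table; a real system would draw
# synonyms from the training corpus or a resource such as WordNet.
SYNONYMS = {
    "good": ["great", "fine"],
    "buy": ["purchase", "acquire"],
    "help": ["assist", "aid"],
}

def synonym_replace(sentence, n_replacements=1, seed=0):
    """Replace up to n_replacements words that have known synonyms."""
    rng = random.Random(seed)
    tokens = sentence.split()
    # Positions of words we know synonyms for.
    candidates = [i for i, t in enumerate(tokens) if t.lower() in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n_replacements]:
        tokens[i] = rng.choice(SYNONYMS[tokens[i].lower()])
    return " ".join(tokens)

augmented = synonym_replace("please help me buy a good laptop", n_replacements=2)
```

Round-trip translation works analogously at the sentence level, pushing the input through a translation system into a pivot language and back to produce a paraphrase.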
Our model follows a sequence-to-sequence (Seq2Seq) neural architecture with an attention mechanism. The language model estimates the likelihood that a sentence belongs to the target domain and predicts the next word. Furthermore, we explore three attention scoring functions (dot, general, and concat) and evaluate our augmented model against the baseline using BLEU score as our metric.
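The three scoring variants can be sketched in NumPy as below. This is an illustrative sketch of Luong-style attention scores, with randomly initialised parameters standing in for learned weights; names like `W_a`, `W_c`, and `v_a` are our own notation, not from our codebase.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                               # hidden size (illustrative)
h_t = rng.normal(size=d)            # current decoder hidden state
h_s = rng.normal(size=(5, d))       # 5 encoder hidden states

# Parameters for the "general" and "concat" variants
# (randomly initialised here; learned in a real model).
W_a = rng.normal(size=(d, d))
W_c = rng.normal(size=(d, 2 * d))
v_a = rng.normal(size=d)

def score_dot(h_t, h_s):
    return h_s @ h_t                         # score = h_s . h_t

def score_general(h_t, h_s):
    return h_s @ (W_a @ h_t)                 # score = h_s . (W_a h_t)

def score_concat(h_t, h_s):
    # score = v_a . tanh(W_c [h_t; h_s])
    concat = np.concatenate([np.tile(h_t, (len(h_s), 1)), h_s], axis=1)
    return np.tanh(concat @ W_c.T) @ v_a

def attention_weights(scores):
    e = np.exp(scores - scores.max())        # softmax over source positions
    return e / e.sum()

for score in (score_dot, score_general, score_concat):
    weights = attention_weights(score(h_t, h_s))
    context = weights @ h_s                  # weighted sum of encoder states
```

In each case the scores are normalised with a softmax and used to build a context vector, which is what lets the decoder look back at all encoder states instead of a single fixed vector.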
Major Contributions
Improved model performance with data augmentation methods
Explored formality style transfer datasets and models
Compared results of different scoring variants
Applied the attention mechanism to address the bottleneck problem of Seq2Seq
Wrote a short full-length paper on our findings and experiments
Conducted baseline, augmentation, and ablation experiments
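The BLEU metric used to compare these experiments can be sketched as a clipped n-gram precision with a brevity penalty. This is a simplified single-reference, unsmoothed version for illustration; in practice a standard implementation should be used.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(reference, hypothesis, max_n=2):
    """Simplified sentence-level BLEU: geometric mean of clipped
    n-gram precisions, times a brevity penalty (no smoothing)."""
    ref, hyp = reference.split(), hypothesis.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each hypothesis n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    brevity = min(1.0, math.exp(1 - len(ref) / len(hyp)))
    return brevity * math.exp(log_avg)

score = bleu("the cat sat on the mat", "the cat sat on the mat")
# identical sentences score 1.0
```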