PAM: Paraphrase AMR-Centric Evaluation Metric

This repo contains the code for the paper PAM: Paraphrase AMR-Centric Evaluation Metric, by Afonso Sousa & Henrique Lopes Cardoso (ACL Findings 2025).

Paraphrasing is rooted in semantics, which makes evaluating paraphrase generation systems hard. Current paraphrase generators are typically evaluated with metrics borrowed from adjacent text-to-text tasks, such as machine translation or text summarization. These metrics tend to be tied to the surface form of the reference text, which is not ideal for paraphrases: we typically want lexical variation while preserving semantics. To address this problem, and inspired by learned similarity evaluation on plain text, we propose PAM, a Paraphrase AMR-Centric Evaluation Metric. PAM uses AMR graphs extracted from the input text; these semantic structures are agnostic to the surface form, making the resulting metric more robust to variations in syntax or lexicon. Additionally, we evaluated PAM on several semantic textual similarity datasets and found that it improves correlation with human semantic scores compared to other AMR-based metrics.

Installation

First, to create a fresh conda environment with all required dependencies, run:

conda env create -f environment.yml

Additionally, for most scripts you will need the pretrained AMR parser. We used parse_xfm_bart_large from here. Download it, rename it to amr_parser, and place it in the root directory.
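The expected placement can be sketched as follows. Note that the extracted directory name below is only a stand-in; use whatever directory the downloaded parse_xfm_bart_large archive actually unpacks to.

```shell
# Stand-in for the extracted parser checkpoint; in practice this directory
# comes from unpacking the parse_xfm_bart_large download.
mkdir -p parse_xfm_bart_large

# The scripts look for the model at ./amr_parser in the repository root.
mv parse_xfm_bart_large amr_parser
```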

Preprocess data

Follow the instructions in data/README to extract the third-party data into the /data folder.

data
└── dataset_name
    └── main
        └── raw
            ├── src.dev.amr
            ├── src.test.amr
            ├── tgt.dev.amr
            └── tgt.test.amr
Then use merge_dataset.sh to merge the information into a JSON file. For the aforementioned example, the output file should be placed under /main.
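The layout above can be reproduced with a short shell sketch; here dataset_name is a placeholder for your actual dataset, and the .amr files are empty stand-ins for the extracted third-party data.

```shell
# Build the directory tree the preprocessing step expects
# (dataset_name is a placeholder).
mkdir -p data/dataset_name/main/raw

# Empty stand-ins for the source/target AMR splits; the real files
# come from extracting the third-party data.
touch data/dataset_name/main/raw/src.dev.amr
touch data/dataset_name/main/raw/src.test.amr
touch data/dataset_name/main/raw/tgt.dev.amr
touch data/dataset_name/main/raw/tgt.test.amr
```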

Train and test models

To train/test PAM or any other model referred to in the paper, run the corresponding script. For example:

sh ./scripts/train_pam.sh
sh ./scripts/test_pam.sh

Further finetune

To further finetune the trained model on Quora Question Pairs (QQP), run:

sh ./scripts/paraphrase_finetune.sh

Other experiments reported in the paper

For many experiments reported in the paper, we used third-party libraries integrated into our source code. These require you to extract them into the root directory and, in some cases, install the respective packages (for example, AlignScore).

Others, like WWLK, were computed using the original source code.

Some files were used for smaller, one-off experiments.

Acknowledgements

This project uses code from, and took inspiration from, several open-source projects.

License

This project is released under the MIT License.
