ExplainableTextSimplification

A workflow that improves the readability of complex texts by performing lexical text simplification. It consists of the following steps:

Classification of the input text.
If the text is classified as complex, detection of its complex parts using feature-importance interpretability techniques.
An algorithm that iteratively masks the complex parts and replaces them with simpler alternatives.
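
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: the names `token_importance`, `simplify`, `toy_score`, and `toy_fill` are hypothetical, leave-one-out occlusion stands in for whichever feature-importance technique the project actually uses, and the toy scorer and filler stand in for the fine-tuned classifier and mask-filler models.

```python
from typing import Callable, List

MASK = "[MASK]"

# Assumed interfaces (not taken from the repository):
#   Scorer: tokens -> complexity score in [0, 1]
#   Filler: (masked tokens, masked position, original token) -> replacement
Scorer = Callable[[List[str]], float]
Filler = Callable[[List[str], int, str], str]


def token_importance(tokens: List[str], score: Scorer) -> List[float]:
    """Leave-one-out importance: how much the score drops without each token."""
    base = score(tokens)
    return [base - score(tokens[:i] + tokens[i + 1:]) for i in range(len(tokens))]


def simplify(text: str, score: Scorer, fill: Filler,
             threshold: float = 0.5, max_steps: int = 10) -> str:
    """Iteratively mask the most important token and replace it."""
    tokens = text.split()
    tried = set()  # positions already replaced, so each is visited once
    for _ in range(max_steps):
        # Step 1: classify; stop once the text is no longer complex.
        if score(tokens) < threshold:
            break
        # Step 2: find the token contributing most to the complexity score.
        importance = token_importance(tokens, score)
        candidates = [i for i in range(len(tokens)) if i not in tried]
        if not candidates:
            break
        target = max(candidates, key=lambda i: importance[i])
        tried.add(target)
        # Step 3: mask that token and ask the filler for a replacement.
        original = tokens[target]
        masked = tokens[:target] + [MASK] + tokens[target + 1:]
        tokens[target] = fill(masked, target, original)
    return " ".join(tokens)


# Toy stand-ins for the fine-tuned models (purely illustrative).
SIMPLER = {"administered": "gave", "hypertension": "high blood pressure"}


def toy_score(tokens: List[str]) -> float:
    """Fraction of tokens that appear in the 'complex word' dictionary."""
    return sum(t in SIMPLER for t in tokens) / len(tokens) if tokens else 0.0


def toy_fill(masked: List[str], position: int, original: str) -> str:
    """Dictionary lookup in place of a masked language model."""
    return SIMPLER.get(original, original)


result = simplify("The doctor administered drugs for hypertension",
                  toy_score, toy_fill, threshold=0.1)
print(result)  # The doctor gave drugs for high blood pressure
```

In a real setting, `score` would wrap one of the classifiers and `fill` one of the mask-fillers listed under Models; the loop structure stays the same.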

The proposed workflow is presented in the following graph:

Models

The models fine-tuned for this task are BioBERT and DistilBERT. Each was trained to take either a sentence or a paragraph as input. All the models are available at this link.

As a result, for the classifier we have four models:

Name	Base Model	Class of Training Data	Type of Input Data
bert_par_all	BioBERT	all	paragraphs
dist_par_all	DistilBERT	all	paragraphs
bert_sent_all	BioBERT	all	sentences
dist_sent_all	DistilBERT	all	sentences

And for the mask-filler we have eight models:

Name	Base Model	Class of Training Data	Type of Input Data
bert_par_all	BioBERT	all	paragraphs
dist_par_all	DistilBERT	all	paragraphs
bert_par_plain	BioBERT	plain	paragraphs
dist_par_plain	DistilBERT	plain	paragraphs
bert_sent_all	BioBERT	all	sentences
dist_sent_all	DistilBERT	all	sentences
bert_sent_plain	BioBERT	plain	sentences
dist_sent_plain	DistilBERT	plain	sentences
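
The model names in the two tables follow a `<base>_<input>_<class>` pattern. As a small convenience sketch (the helpers `describe` and `model_names` are hypothetical and not part of the repository), the pattern can be decoded and enumerated like this:

```python
from itertools import product

# Abbreviations as they appear in the model names above.
BASES = {"bert": "BioBERT", "dist": "DistilBERT"}
INPUTS = {"par": "paragraphs", "sent": "sentences"}
TRAIN_CLASSES = ("all", "plain")


def describe(name: str) -> dict:
    """Split a name such as 'bert_par_all' into its three components."""
    base, input_type, train_class = name.split("_")
    return {
        "base_model": BASES[base],
        "input_type": INPUTS[input_type],
        "training_class": train_class,
    }


def model_names(train_classes) -> list:
    """Enumerate every name for the given classes of training data."""
    return [f"{b}_{i}_{c}" for i, c, b in product(INPUTS, train_classes, BASES)]


classifiers = model_names(("all",))        # four classifier models
mask_fillers = model_names(TRAIN_CLASSES)  # eight mask-filler models
print(describe("dist_sent_plain"))
```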

For the masking experiments, we also tried the base models without any fine-tuning on this data.

Project Structure

The project has the following structure:

code: includes multiple scripts created for the different parts of the project, from data processing to fine-tuning models and evaluation of results.
data: the datasets used for this project.
