This repository contains a ParTAGe-compliant, PyTorch-based implementation of a TAG/TWG supertagger.
The tool requires Python 3.8+. If you use conda, you can set up an appropriate
environment using the following commands (substituting `<env-name>` for the
name of the environment):

```
conda create --name <env-name> python=3.8
conda activate <env-name>
```

Then, to install the tool (together with its dependencies), run:

```
pip install .
```

Finally, install disco-dop from its GitHub repository.
The tool supports the same data format as partage.
The model configuration (embedding size, BiLSTM depth, etc.) and the training
configuration (number of epochs, learning rates, etc.) are currently hard-coded
in `supertagger/config.py`. They can be overridden during training by providing
appropriate `.json` configuration files.
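As a rough sketch of how such a JSON override could work, the snippet below merges a user-supplied `.json` file into a hard-coded default dictionary. The key names used here (`embedding_size`, `lstm_depth`, `epochs`, `learning_rate`) are hypothetical illustrations, not the actual names defined in `supertagger/config.py`:

```python
import json

# Hypothetical defaults mirroring the kinds of settings mentioned above;
# the real keys and values live in supertagger/config.py and may differ.
DEFAULT_CONFIG = {
    "embedding_size": 300,   # must match the fastText model's dimension
    "lstm_depth": 2,
    "epochs": 60,
    "learning_rate": 0.001,
}

def load_config(path=None):
    """Return the default config, with values overridden by a JSON file if given."""
    config = dict(DEFAULT_CONFIG)
    if path is not None:
        with open(path) as f:
            config.update(json.load(f))
    return config
```

With a file containing `{"epochs": 10}`, `load_config` would keep all defaults except `epochs`.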
To train a supertagging model, you will need:

* `fastText.bin`: a binary fastText model (important: the embedding size of the model must be specified in the configuration)
* `train.supertags`: a training dataset (see data format)
* `dev.supertags` (optional): a development dataset
Then, to train a model and save it in `model.pth`:

```
python -m supertagger train -f fastText.bin -t train.supertags -d dev.supertags --save model.pth
```

See `python -m supertagger train --help` for additional training options.
To use an existing model to supertag a given `input.supertags` file:

```
python -m supertagger tag -f fastText.bin -i input.supertags
```

To remove supertagging information from a given supertagging file, use the `blind` command, e.g.:

```
python -m supertagger blind -i input.supertags > input.blind.supertags
```