Higher-order Relation Schema Induction using Tensor Factorization with Back-off and Aggregation

TFBA

Code for Tensor Factorization with Back-off and Aggregation (TFBA).

Prerequisites:

Install sktensor (https://github.com/mnick/scikit-tensor).

This package contains the following files:

  • dataGen.py -- Generates tensors from the set of tuples.
  • factorize.py -- Joint tensor factorization.
  • cliqueMine.py -- Constrained clique mining.

Running Instructions:

  • python2.7 dataGen.py <tuples_file> <output_dir>
    --- Each line in the input file is a tab-separated 4-tuple followed by its frequency, in the format subject "\t" relation "\t" object "\t" other "\t" frequency.
    --- 3-tuples can be provided in the same file along with 4-tuples; for a 3-tuple, use the string "" for other.
    --- This script creates pkl files in the output directory.

  • python2.7 factorize.py <data_dir> <output_dir> [other options]
    --- Performs the factorization and stores the latent factor matrices and core tensors in the <output_dir> directory.
    --- <data_dir> should be the same as the <output_dir> of dataGen.py.
    optional arguments:
    -h, --help show this help message and exit
    --minLambda MINLAMBDA [MINLAMBDA ...]
    ** Enter the min lambda (list), default = 0.1 0.1 0.1
    --maxLambda MAXLAMBDA [MAXLAMBDA ...]
    ** Enter the max lambda (list), needed only for grid search. If no grid search, provide only minLambda option.
    --step STEP Enter the step size for grid search (default = 0.5)
    --maxIters MAXITERS Enter the maximum iterations (default = 10)
    --rank1 RANK1 Enter rank1 (default = 10)
    --rank2 RANK2 Enter rank2 (default = 10)
    --rank3 RANK3 Enter rank3 (default = 10)
    --fit FIT Y/N, default = N. Give Y for fit computation.
    --cores CORES Number of Threads

  • python2.7 cliqueMine.py <data_dir> <output_dir> --rank r1 r2 r3
    --- Performs constrained clique mining and stores the schemas in <output_dir>
    --- <data_dir> should be the same as the <data_dir> used to run factorize.py.
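
The input format described for dataGen.py above can be illustrated with a small helper that writes a tuples file. This is only a sketch based on one reading of that format: the helper names (format_tuple_line, write_tuples_file), the example tuples, and the file name tuples.tsv are all hypothetical and are not part of this package. It assumes that a 3-tuple is written with an empty "other" field, so that every line still has five tab-separated fields.

```python
import os
import tempfile

# Hypothetical example rows: (subject, relation, object, other, frequency).
# The second row is a 3-tuple, so its "other" field is the empty string.
EXAMPLE_TUPLES = [
    ("player", "scored", "goal", "match", 12),
    ("company", "acquired", "startup", "", 5),
]


def format_tuple_line(subject, relation, obj, other, frequency):
    """Join the five fields with tabs, in the order listed above."""
    return "\t".join([subject, relation, obj, other, str(frequency)])


def write_tuples_file(path, tuples):
    """Write one tab-separated tuple per line to `path`."""
    with open(path, "w") as handle:
        for row in tuples:
            handle.write(format_tuple_line(*row) + "\n")


if __name__ == "__main__":
    # Write to a temporary directory so the sketch leaves no stray files behind.
    out_path = os.path.join(tempfile.mkdtemp(), "tuples.tsv")
    write_tuples_file(out_path, EXAMPLE_TUPLES)
    print("Wrote %d tuples to %s" % (len(EXAMPLE_TUPLES), out_path))
```

A file written this way could then be passed as <tuples_file> to dataGen.py. Whether dataGen.py accepts exactly this layout for 3-tuples should be checked against the script itself.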

References:

[1] Madhav Nimishakavi, Manish Gupta and Partha Talukdar. Higher-order Relation Schema Induction using Tensor Factorization with Back-off and Aggregation. Proceedings of the 2018 Conference of the Association for Computational Linguistics (ACL 2018).
