Code Repository for the Thesis "Fairness-Aware Multilingual Information Retrieval"

This repository contains the code for my master's thesis at the University of Stavanger (June 2024). It includes a conda environment file (environment.yml), this README file, and three folders: indexers, experiments, and demo.

Repository Structure

  • indexers

    • Li_indexer_eng.py
    • Li_indexer_fas.py
    • Li_indexer_rus.py
    • Li_indexer_zho.py
  • experiments

    • Li_CLIR_Fas_1000_Retrieve_Translate_Rank.ipynb
    • Li_CLIR_Zho_1000_Retrieve_Translate_Rank.ipynb
    • mlir_experiment.py
    • Li_CLIR_Rus_1000_Retrieve_Translate_Rank.ipynb
    • Li_get_dict_docid_lang.ipynb
  • demo

    • clirnews.py

Installation

To run the code, you need PyLucene installed. The build instructions below are taken from the official PyLucene website:

Building PyLucene

PyLucene is completely code-generated by JCC, whose sources are included with the PyLucene sources.

Requirements

To build PyLucene, a Java Development Kit (JDK) is required; use of the resulting PyLucene binaries requires only a Java Runtime Environment (JRE). A recent C/C++ compiler is also required.

  • Starting with release 9.x, Lucene requires Java 11 or above.
  • Starting with release 6.x, Lucene requires Java 1.8.

On macOS and Linux, the Temurin JDK is recommended. See "Notes for Linux" on the PyLucene install page for installation instructions on Linux Debian 11.

On any system, if you're upgrading your Java installation, please rebuild JCC as well. You must use the same version of Java for both JCC and PyLucene.
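Before building, it can help to confirm which Java version is on your PATH. The following is a minimal sketch, not part of this repository: the helper names `parse_java_major` and `installed_java_major` are hypothetical, and the threshold of 11 assumes you are building against a Lucene 9.x release.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major Java version from `java -version` output.

    Handles the legacy scheme ("1.8.0_292" -> 8) and the
    modern scheme ("11.0.2" -> 11, "17" -> 17).
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Legacy releases are reported as 1.x, where x is the major version.
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major() -> Optional[int]:
    """Return the major version of the `java` on PATH, or None if absent."""
    if shutil.which("java") is None:
        return None
    # `java -version` writes its banner to stderr, not stdout.
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("No usable Java found on PATH.")
    elif major < 11:  # assumption: targeting Lucene 9.x
        print(f"Java {major} found; Lucene 9.x needs Java 11 or above.")
    else:
        print(f"Java {major} found; sufficient for Lucene 9.x.")
```

Run the same check again after any Java upgrade, since JCC and PyLucene must be built with the same Java version.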

A modern version of setuptools is required for building JCC in shared mode. See JCC's installation instructions for more information.

Quick Installation Guide

pushd jcc
# Edit setup.py to match your environment
python setup.py build
sudo python setup.py install
popd
# Edit Makefile to match your environment
make
make test # (look for failures)
sudo make install

For more detailed instructions, please refer to the official documentation.
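Once the build has finished, a quick import check can confirm that PyLucene is usable from the active Python environment. This is a minimal sketch under the assumption that PyLucene was installed as the `lucene` module; `module_available` and `check_pylucene` are hypothetical helper names, not part of this repository.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def check_pylucene() -> str:
    """Report whether PyLucene can be imported and its JVM started."""
    if not module_available("lucene"):
        return "PyLucene is not importable in this environment."
    import lucene  # imported lazily so the check works without PyLucene

    # The embedded JVM must be started once before any Lucene class is used;
    # skip initVM() if a JVM is already running in this process.
    if lucene.getVMEnv() is None:
        lucene.initVM(vmargs=["-Djava.awt.headless=true"])
    return f"PyLucene {lucene.VERSION} is installed and the JVM is running."


if __name__ == "__main__":
    print(check_pylucene())
```

If the check reports that PyLucene is not importable, make sure the interpreter that ran the installation is the same one you are using now.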

Usage

  1. Clone the repository:
    git clone https://github.com/linool/clirnews.git
  2. Navigate to the repository directory:
    cd clirnews
  3. Create and activate the conda environment:
    conda env create -f environment.yml
    conda activate clir
  4. Run the desired Jupyter notebooks or Python scripts from the respective folders (indexers, experiments, or demo).
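To see what step 4 offers, the sketch below lists the scripts and notebooks found in the three folders. It is an illustration only: `list_entry_points` is a hypothetical helper, and it assumes you run it from the repository root.

```python
from pathlib import Path
from typing import Dict, List

# Folder names as described in the Repository Structure section.
FOLDERS = ("indexers", "experiments", "demo")
RUNNABLE_SUFFIXES = {".py", ".ipynb"}


def list_entry_points(repo_root: Path) -> Dict[str, List[str]]:
    """Map each known folder to the sorted scripts and notebooks it contains."""
    found: Dict[str, List[str]] = {}
    for folder in FOLDERS:
        directory = repo_root / folder
        if not directory.is_dir():
            # Missing folders are reported as empty rather than raising.
            found[folder] = []
            continue
        found[folder] = sorted(
            path.name
            for path in directory.iterdir()
            if path.is_file() and path.suffix in RUNNABLE_SUFFIXES
        )
    return found


if __name__ == "__main__":
    for folder, names in list_entry_points(Path.cwd()).items():
        print(folder)
        for name in names:
            print(f"  {name}")
```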

License

This project is licensed under the GNU General Public License v3.0. See the LICENSE file for details.
