Package Overview

Module	Purpose
erllm	Root package. Contains installation, documentation generation and helper code.
erllm.calibration	Calibration analysis on entity matching LLM predictions.
erllm.dataset	Covers entity representation, dataset loading and downsampling.
erllm.dataset.dbpedia	Handles DBPedia data including loading raw data into SQLite, interaction, and generation of labeled datasets using token blocking for benchmarking.
erllm.dataset.ditto	Convert existing datasets to DITTO format.
erllm.discarder	Explores the similarity-based discarder in isolation. Computes set-based and embedding-based similarities for pairs of entities, saves the results and computation time into similarity files, computes various discarder statistics, and generates visualizations.
erllm.discarding_matcher	Simulates and evaluates the similarity-based discarding matcher. Contains generation of performance plots, and analysis of time/performance trade-off.
erllm.discarding_selective_matcher	Implements the discarding selective matcher. It includes functionalities for assessing classification performance, generating comparison tables and creating contour plots.
erllm.ditto	Support for configuring DITTO to run on the DITTO datasets and subsequent evaluation and comparison to selective matcher.
erllm.llm_matcher	Contains code to create prompts from datasets and get responses via OpenAI's API. These are saved into run files which serve as cache for all composite matchers. Also contains code to run and evaluate the LLM matcher.
erllm.selective_classifier	Supports running selective classification on various datasets, evaluating the performance over ranges of threshold/coverage parameters, and generating tables and plots to visualize the classification performance.
erllm.selective_matcher	Implements and evaluates the selective matcher and random labeling. Supports running both methods across parameter ranges and datasets and generating comparison tables.
erllm.serialization_cmp	Compares entity serialization schemes, evaluating their performance with and without attribute names. Also evaluates the impact of data errors.

Package: erllm

Module	Purpose
erllm_setup.py	Add .pth file to the site-packages directory of the current Python interpreter to make erllm discoverable.
gen_docs.py	Generate a package overview table and a table for each package's subfiles in a markdown file.
utils.py	Utility functions for various tasks including file operations, mathematical calculations, and data manipulation.

Package: erllm.calibration

Module	Purpose
calibration_plots.py	Performs calibration analysis on language model predictions for different datasets, calculating the Brier score and Expected Calibration Error (ECE).
confidence_hist.py	Generate histograms of confidence scores per outcome (TP, TN, FP, FN).
reliability_diagrams.py	Third party code from https://github.com/hollance/reliability-diagrams with some small changes. Calibration computation and visualization using reliability diagrams.
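
As an illustration of the two metrics named above, here is a minimal sketch in plain Python. It is not the code from `calibration_plots.py`; the function names, the equal-width binning, and the default of 10 bins are assumptions made for this example.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted match probability and the 0/1 label."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width bins (an assumed binning scheme).

    confidences: confidence of each prediction, in [0, 1]
    correct:     1 if the prediction was right, else 0
    """
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += len(members) / n * abs(accuracy - avg_conf)
    return ece
```

A perfectly calibrated and perfectly accurate matcher scores 0 on both metrics; larger values indicate worse calibration.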

Package: erllm.dataset

Module	Purpose
entity.py	Contains Entity and OrderedEntity classes to represent entities and serialize them into strings for use in prompts.
load_ds.py	Provides functions for loading benchmark data from CSV files into pandas DataFrames or lists of tuples representing entity pairs.
sample_ds.py	Provides a function for sampling elements from a dataset while preserving the label ratio.
stats_ds.py	This module provides functions to compute dataset statistics like the number of pairs.
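
Sampling while preserving the label ratio, as described for `sample_ds.py`, could look roughly like the sketch below. The function name, the `(left, right, label)` tuple layout, and the rounding behaviour are assumptions for illustration, not the module's actual interface.

```python
import random


def sample_preserving_label_ratio(pairs, n, seed=0):
    """Draw about n pairs so that each label keeps its share of the dataset.

    pairs: list of (left_entity, right_entity, label) tuples (assumed layout)
    """
    rng = random.Random(seed)

    # Group the pairs by label.
    by_label = {}
    for pair in pairs:
        by_label.setdefault(pair[2], []).append(pair)

    sample = []
    for group in by_label.values():
        # Per-label quota; rounding can make the total differ slightly from n.
        quota = round(n * len(group) / len(pairs))
        sample.extend(rng.sample(group, min(quota, len(group))))

    rng.shuffle(sample)
    return sample
```

With 20 matching and 80 non-matching pairs, asking for 10 samples yields 2 matches and 8 non-matches.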

Package: erllm.dataset.dbpedia

Module	Purpose
access_dbpedia.py	Access the DBPedia SQLite database after it has been created by load_dbpedia.py.
load_dbpedia.py	Reads data from a .txt file and loads it into SQLite tables. The primary tables store DBpedia entities with key-value pairs, and an additional table stores matching pairs.
sample_dbpedia.py	Provides functions for generating a sample dataset of entity pairs from the DBPedia database. The dataset includes both matching and non-matching pairs of entities. The matching pairs are generated based on known matches, non-matching pairs are generated by token blocking on random entities.
token_blocking.py	Provides functions for token blocking and clean token blocking in entity resolution tasks.
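
For readers unfamiliar with token blocking, the following sketch shows the general idea: entities that share at least one token land in the same block, and only pairs that co-occur in a block become candidates. The data layout (a dict of entity id to attribute dict), the whitespace tokenizer, and the function names are assumptions; `token_blocking.py` may differ in all of them.

```python
from collections import defaultdict
from itertools import combinations


def token_blocking(entities):
    """Map each token to the set of entity ids whose attribute values contain it.

    entities: dict of entity id -> dict of attribute -> value (assumed layout)
    """
    blocks = defaultdict(set)
    for entity_id, attributes in entities.items():
        for value in attributes.values():
            for token in str(value).lower().split():
                blocks[token].add(entity_id)
    # Blocks with a single entity cannot produce any pair, so drop them.
    return {token: ids for token, ids in blocks.items() if len(ids) >= 2}


def candidate_pairs(blocks):
    """Collect every distinct pair of entities that shares at least one block."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs
```

Clean token blocking additionally assumes two duplicate-free sources and would only emit pairs that cross the two sources; that variant is not shown here.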

Package: erllm.dataset.ditto

Module	Purpose
to_ditto.py	Provides functions for converting labeled pairs of entities to Ditto format and splitting them into train, validation, and test sets.
to_ditto_runner.py	Generates Ditto datasets from existing datasets.

Package: erllm.discarder

Module	Purpose
discarder.py	This module provides functions for computing set-based and embedding-based similarities for pairs of entities within a given dataset. The set-based similarities include Jaccard, Overlap, Monge-Elkan, and Generalized Jaccard, while embedding-based similarities use cosine and Euclidean distance metrics. Saves the results and computation time into similarity files which serve as cache for composite matchers including a discarder.
discarder_eval.py	Computes various functions from similarity files, such as the number of false negatives as function of the number of discarded pairs.
discarder_vis.py	Generates plots to visualize discarder evaluation statistics. It includes functions to plot specific relations for a given dataset and to generate combined plots for multiple datasets, covering metrics such as false negatives, risk, false negative rate, and coverage.
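
Two of the set-based similarities listed for `discarder.py`, Jaccard and Overlap, can be sketched as follows. The whitespace tokenizer and the convention of returning 0.0 for empty sets are assumptions for this example, not necessarily what the package does.

```python
def tokenize(text):
    """Lowercased whitespace tokens as a set (assumed tokenization)."""
    return set(text.lower().split())


def jaccard(a, b):
    """|A ∩ B| / |A ∪ B|; defined here as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def overlap_coefficient(a, b):
    """|A ∩ B| / min(|A|, |B|); defined here as 0.0 when either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))
```

For example, the serialized entities `"apple iphone 12"` and `"Apple iPhone 12 Pro"` share three of four distinct tokens, so their Jaccard similarity is 0.75 and their overlap coefficient is 1.0.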

Package: erllm.discarding_matcher

Module	Purpose
discarding_matcher.py	This module provides functions for evaluating the performance of a discarding matcher utilizing run and similarity files. It calculates classification, cost and duration metrics.
discarding_matcher_duration_cmp.py	Calculates speedup factor of discarding matcher over LLM matcher.
discarding_matcher_runner.py	Runs the discarding matcher algorithm on multiple datasets with different threshold values. It calculates various performance metrics such as accuracy, precision, recall, F1 score, cost, and duration.
discarding_matcher_tradeoff.py	Generate and analyze performance/cost trade-off for the discarding matcher based on F1 decrease thresholds. Calculates F1 decrease, relative cost, and relative duration for each dataset and threshold.
discarding_matcher_tradeoff_abs.py	Create tables of absolute cost and time required to run the LLM matcher and the discarding matcher at various F1 decrease thresholds.
discarding_matcher_vis.py	Generates performance comparison plots for the discarding matcher.
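
The simulation idea behind the discarding matcher can be sketched as below: pairs whose similarity falls under a threshold are discarded and treated as non-matches, and only the remaining pairs use the cached LLM prediction. This is an assumed reading of the descriptions above, with hypothetical function names; the actual modules also track cost and duration, which this sketch reduces to a count of LLM calls.

```python
def discarding_matcher(similarities, llm_predictions, threshold):
    """Combine a similarity-based discarder with cached LLM predictions.

    Returns the final predictions and the number of pairs sent to the LLM.
    """
    final = []
    llm_calls = 0
    for similarity, prediction in zip(similarities, llm_predictions):
        if similarity < threshold:
            # Discarded pairs are labeled non-match without consulting the LLM.
            final.append(False)
        else:
            final.append(prediction)
            llm_calls += 1
    return final, llm_calls


def f1_score(predictions, labels):
    """F1 of boolean predictions against boolean ground-truth labels."""
    tp = sum(p and y for p, y in zip(predictions, labels))
    fp = sum(p and not y for p, y in zip(predictions, labels))
    fn = sum(not p and y for p, y in zip(predictions, labels))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Raising the threshold lowers the number of LLM calls but risks discarding true matches, which is the trade-off the `discarding_matcher_tradeoff*.py` modules quantify.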

Package: erllm.discarding_selective_matcher

Module	Purpose
discarding_selective_matcher.py	Implements the discarding selective matcher and includes functions for evaluating its classification performance, cost and duration.
discarding_selective_matcher_allstats_table.py	Creates a table for comparing different matcher architectures based on their discarding error, cost, time and classification metrics.
discarding_selective_matcher_contour.py	Create contour plots which map the discard and label fractions to the mean F1, precision, recall.
discarding_selective_matcher_eval.py	Calculates the mean values across datasets for specified metrics, based on the results obtained by running the discarding selective matcher.
discarding_selective_matcher_metric_table.py	Create a table which shows one metric like mean F1 across different label and discard fractions.
discarding_selective_matcher_runner.py	Runs and evaluates the discarding selective matcher for various configurations.
discarding_selective_matcher_sample_vs_full.py	Combines and compares the results on the full dataset and the sampled version using different configurations of the discarding selective matcher (DSM). It generates a comparison table of the classification performance for each configuration.

Package: erllm.ditto

Module	Purpose
add_to_ditto_configs.py	Copy the datasets in DITTO format to the ditto folder into the subfolder data/erllm. Add the new datasets to the configs.json file in the ditto folder.
ditto_combine_predictions.py	Based on the stats of train and valid set and the results of running ditto, calculate the precision, recall, and F1 score for the total dataset.
sm_ditto_comparison.py	Creates a table containing F1 scores for DITTO and SM across all datasets.

Package: erllm.llm_matcher

Module	Purpose
cost.py	Provides cost calculations for language models based on specified configurations, including input and output costs.
evalrun.py	Methods for reading run files, deriving classification decisions, and calculating classification and calibration metrics.
gpt.py	Module for obtaining completions from the older OpenAI Completions API.
gpt_chat.py	Module for obtaining completions from the newer OpenAI Chat Completions API.
llm_matcher.py	Provides functions to evaluate the performance of the LLM matcher on a set of run files obtained from OpenAI's API. It calculates various classification metrics, entropies, and calibration results.
prompt_data.py	Handles serialization of labeled entity pairs and saves result into JSON.
prompts.py	Combines serialized entities from JSON file with prompt prefix/postfix to create full prompts passed to OpenAI's API.

Package: erllm.selective_classifier

Module	Purpose
selective_classifier.py	Run and evaluate selective classification.
selective_classifier_runner.py	Runs selective classification over ranges of threshold/coverage parameters on multiple datasets.
selective_classifier_tab.py	Create table of F1 scores per dataset for different coverages.
selective_classifier_vis.py	Generates classification performance comparison plots for selective classification.
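
Selective classification with a threshold or a target coverage can be illustrated by the sketch below, in which the classifier abstains on low-confidence predictions. The names, the use of `None` for abstention, and the way a coverage target is turned into a threshold are assumptions for this example.

```python
def selective_classify(predictions, confidences, threshold):
    """Keep a prediction if its confidence reaches the threshold, else abstain (None)."""
    return [
        pred if conf >= threshold else None
        for pred, conf in zip(predictions, confidences)
    ]


def coverage(decisions):
    """Fraction of inputs on which the classifier did not abstain."""
    return sum(d is not None for d in decisions) / len(decisions)


def threshold_for_coverage(confidences, target_coverage):
    """Confidence of the k-th most confident prediction, k = target share of inputs.

    Ties at the threshold can push the realized coverage above the target.
    """
    ranked = sorted(confidences, reverse=True)
    k = max(1, round(target_coverage * len(ranked)))
    return ranked[k - 1]
```

Sweeping the threshold (or the target coverage) and computing F1 on the accepted predictions gives the kind of coverage/performance curve that the runner and visualization modules report.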

Package: erllm.selective_matcher

Module	Purpose
random_table.py	Create a LaTeX comparison table of F1 scores between LLM matcher and random labeling at different label fractions.
random_table_with_sd.py	Creates a LaTeX table displaying std. dev. of F1 scores for different fractions of random labeling.
selective_matcher.py	Implements the selective matcher and the labeling of randomly chosen predictions. It applies these to predictions on different datasets and calculates various classification metrics.
selective_matcher_runner.py	This script runs and evaluates the selective matcher across parameter ranges and datasets.
selective_matcher_vs_base_table.py	Create a LaTeX comparison table of F1 scores between LLM matcher and selective matcher at different label fractions.
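
The contrast between the selective matcher and random labeling might be sketched as follows. This assumes that the selective matcher replaces the least confident predictions with ground-truth labels up to a given label fraction, while random labeling picks the same number of predictions uniformly at random; both the interpretation and the function names are assumptions, not taken from `selective_matcher.py`.

```python
import random


def selective_matcher(predictions, confidences, labels, label_fraction):
    """Replace the least confident predictions with their true labels."""
    n_label = round(label_fraction * len(predictions))
    # Indices ordered from least to most confident.
    order = sorted(range(len(predictions)), key=lambda i: confidences[i])
    final = list(predictions)
    for i in order[:n_label]:
        final[i] = labels[i]
    return final


def random_labeling(predictions, labels, label_fraction, seed=0):
    """Baseline: replace a random subset of the same size with true labels."""
    rng = random.Random(seed)
    n_label = round(label_fraction * len(predictions))
    final = list(predictions)
    for i in rng.sample(range(len(predictions)), n_label):
        final[i] = labels[i]
    return final
```

If errors concentrate among low-confidence predictions, the confidence-ordered variant corrects more mistakes per label than the random baseline, which is the comparison the tables above are built around.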

Package: erllm.serialization_cmp

Module	Purpose
attribute_comparison.py	Creates per-dataset and mean comparison tables for comparing entity serialization schemes with and without attribute names.
data_errors.py	Generate comparison table of LLM matcher's mean F1, precision and recall across datasets in presence of data errors.
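
To make the "with and without attribute names" comparison concrete, here is a minimal sketch of two serialization schemes. The `key: value` format, the separator, and the function name are assumptions for illustration; the actual `Entity` classes in `erllm.dataset.entity` may serialize differently.

```python
def serialize(entity, with_attribute_names=True, separator=", "):
    """Turn an attribute dict into a single string for use in a prompt.

    entity: dict of attribute name -> value, in the desired attribute order
    """
    if with_attribute_names:
        parts = [f"{name}: {value}" for name, value in entity.items()]
    else:
        parts = [str(value) for value in entity.values()]
    return separator.join(parts)
```

For `{"title": "iPhone 12", "price": "799"}`, the first scheme yields `title: iPhone 12, price: 799` and the second yields `iPhone 12, 799`.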