SyMANTIC: An Efficient Symbolic Regression Method for Interpretable and Parsimonious Model Discovery in Science and Beyond

SyMANTIC is a novel symbolic regression (SR) algorithm that efficiently identifies a low-dimensional feature set from an enormous set of candidates through a combination of mutual information-based feature selection, adaptive feature expansion, and recursively applied $\ell_0$-based sparse regression. Additionally, it employs an information-theoretic measure to produce a set of Pareto-optimal equations, each offering the best accuracy for a given complexity. This open-source implementation of SyMANTIC is built on the PyTorch ecosystem.
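To make the idea of "the best accuracy for a given complexity" concrete, here is a minimal sketch of how a Pareto front can be filtered from a list of candidate equations. It is not SyMANTIC's internal code: the `Candidate` tuple layout, the integer complexity scores, and the function name `pareto_front` are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical candidate layout: (complexity, rmse, expression string).
Candidate = Tuple[int, float, str]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that are not dominated in (complexity, error)."""
    # Sort by complexity first, then by error, so the most accurate model
    # at each complexity level is visited before its less accurate peers.
    ordered = sorted(candidates, key=lambda c: (c[0], c[1]))
    front: List[Candidate] = []
    best_error = float("inf")
    for cand in ordered:
        # A candidate survives only if it is strictly more accurate than
        # every simpler (or equally complex) candidate seen so far.
        if cand[1] < best_error:
            front.append(cand)
            best_error = cand[1]
    return front


# Made-up candidates, purely for illustration.
candidates = [
    (1, 0.90, "x0"),
    (2, 0.50, "x0*x1"),
    (2, 0.70, "x0 + x1"),
    (3, 0.55, "sin(x0) + x1"),
    (4, 0.10, "x0*x1 + exp(x2)"),
]
front = pareto_front(candidates)
for complexity, rmse, expr in front:
    print(complexity, rmse, expr)
```

In this toy list, the candidates with complexity 2 and RMSE 0.70 and with complexity 3 and RMSE 0.55 are dropped, because a simpler or equally simple candidate is already more accurate.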

Quick Start

Install SyMANTIC and its dependencies:

pip install symantic

Import your data and use the following code to fit a SyMANTIC model and analyze the Pareto front:

# import SyMANTIC model class along with other useful packages
from symantic import SymanticModel
from sympy import symbols
import numpy as np
import pandas as pd
# create dataframe composed of targets "y" and primary features "X"
# (y and X are your own arrays and must be defined beforehand)
data = np.column_stack((y, X))
df = pd.DataFrame(data)
# create model object to construct full Pareto front using default parameters
# quantile_value, rmse and r2 below are placeholders for numbers you choose
model = SymanticModel(df=df,  # defines the dataframe
                      operators=['+', '-', '*', '/', 'exp', 'sin', 'cos'],  # defines the set of operators for feature engineering
                      n_expansion=None,  # (default) defines the number of feature expansions, if a value is provided then
                      n_term=None,  # defines the sparsity that needs to be considered for building models
                      sis_features=20,  # (default) defines the number of features to be screened from the expanded feature space
                      dimensionality=['u1', 'u2', 'u3'],  # defines the units of the feature variables as strings, which are later converted into sympy format for meaningful feature construction
                      relational_units=[(symbols('u1')*symbols('u2'), symbols('u3'))],  # defines the list of tuples where each tuple represents a relational transformation
                      output_dim=(symbols('u1')*symbols('u1')),  # defines the units of the target variable, which helps in narrowing down the space for regularization
                      initial_screening=["mi", quantile_value],  # "mi" or "spearman": defines the feature screening option for high dimensional and 1-quantile_value defines
                      metrics=[rmse, r2],  # defines the values of RMSE and R^2 that are used to do the adaptive expansions and number of terms
                      disp=True  # True or False: defines whether to print progress statements
                      )
# run SyMANTIC algorithm to fit model and return dictionary "res" and "full_pareto" frontier
res, full_pareto = model.fit()
# generate plot of Pareto front obtained during the fit process
model.plot_pareto_front()
# extract symbolic model at utopia point and relevant metrics
# (stored under a new name so the fitted model object is not overwritten)
expression = res['utopia']['expression']
rmse = res['utopia']['rmse']
r2 = res['utopia']['r2']
complexity = res['utopia']['complexity']
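The snippet above reads the model stored under the `'utopia'` key. As a rough illustration of what a utopia point can mean, the sketch below picks the Pareto-front entry closest to the ideal corner (lowest complexity, lowest error) after min-max scaling both axes. This is one common convention and an assumption here; SyMANTIC's own selection rule may differ. The `(complexity, rmse)` tuple layout and the name `utopia_index` are hypothetical.

```python
import math
from typing import List, Tuple

# Hypothetical layout for a Pareto-front entry: (complexity, rmse).
Point = Tuple[float, float]


def utopia_index(front: List[Point]) -> int:
    """Index of the front entry closest to the ideal corner after scaling."""
    xs = [p[0] for p in front]
    ys = [p[1] for p in front]
    # Fall back to a span of 1.0 when all values on an axis coincide,
    # which avoids dividing by zero.
    x_min, x_span = min(xs), (max(xs) - min(xs)) or 1.0
    y_min, y_span = min(ys), (max(ys) - min(ys)) or 1.0

    def distance(p: Point) -> float:
        # Euclidean distance to (0, 0) in the min-max scaled plane.
        return math.hypot((p[0] - x_min) / x_span, (p[1] - y_min) / y_span)

    return min(range(len(front)), key=lambda i: distance(front[i]))


# Made-up front values, purely for illustration.
front = [(1.0, 0.90), (2.0, 0.50), (4.0, 0.10)]
best = utopia_index(front)
print(front[best])
```

Scaling both axes before measuring distance keeps the choice from being dominated by whichever quantity happens to have the larger numeric range.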

Examples of SyMANTIC can be found in the examples folder and in the Colab notebook SyMANTIC Examples.

Citation

Coming soon
