14 changes: 14 additions & 0 deletions README.md
@@ -1,8 +1,22 @@
# Project 1

Team Members
Rohit kumar Mamidi - A20541036
Venkata Siva Satya Pavan Ganesh Maddula - A20541032

Put your README here. Answer the following questions.

* What does the model you have implemented do and when should it be used?
* The Elastic Net regression model combines L1 (Lasso) and L2 (Ridge) regularization to improve on plain linear regression and to reduce the effect of multicollinearity. The L1 penalty promotes sparse coefficients and the L2 penalty stabilizes them, so the model is most useful when the predictors are correlated.
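
As a reading aid, the objective below is inferred from the gradient update in `fit` in this PR; the authors do not state it, and the factors of 1/2 are an assumption chosen so that the gradient matches the code term by term. Here $n$ is `len(y)`, $\alpha$ is `alpha`, and $\rho$ is `l1_ratio`.

$$
J(w) = \frac{1}{2n}\lVert Xw - y \rVert_2^2 + \frac{\alpha(1-\rho)}{2}\lVert w \rVert_2^2 + \alpha\rho\,\lVert w \rVert_1
$$

$$
\nabla J(w) = \frac{1}{n}X^\top (Xw - y) + \alpha(1-\rho)\,w + \alpha\rho\,\operatorname{sign}(w)
$$

The last term is a subgradient, since $\lVert w \rVert_1$ is not differentiable where a coefficient is exactly zero. There is no intercept term in either expression, which matches the code.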
* How did you test your model to determine if it is working reasonably correctly?
* I used RMSE and R2 as metrics to evaluate the model. The test fits the model on the CSV test data and reports both metrics on that same data.
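
The two metrics can be sketched as small NumPy helpers. The function names `rmse` and `r2_score` are hypothetical (the PR computes both inline in the test), and the four-point dataset is made up for illustration.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - RSS / TSS."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rss = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    tss = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return float(1.0 - rss / tss)


# Toy data: three exact predictions and one that is off by 1.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.0, 2.0, 3.0, 5.0])

print("RMSE:", rmse(y_true, y_pred))
print("R2:", r2_score(y_true, y_pred))
```

For this toy data the residual sum of squares is 1 and the total sum of squares is 5, so RMSE is 0.5 and R2 is 0.8. Lower RMSE and R2 closer to 1 indicate a better fit.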
* What parameters have you exposed to users of your implementation in order to tune performance? (Also perhaps provide some basic usage examples.)
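* `alpha`: Controls the overall strength of regularization. Higher values lead to more regularization (default 1.0).
* `l1_ratio`: Specifies the balance between the L1 and L2 penalties. A value of 0 gives a pure L2 (Ridge) penalty and 1 gives a pure L1 (Lasso) penalty (default 0.3).
* `lrate`: The learning rate. It controls how much the weights change on each gradient descent step (default 0.02).
* `n_iters`: The number of gradient descent iterations. The loop always runs this many steps; there is no early stopping on convergence (default 200).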
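
A basic usage example might look like the following. To keep it runnable on its own, the block repeats a compact copy of the two classes from this PR instead of importing `elasticnet.models.ElasticNet`; the synthetic data and the chosen hyperparameter values are assumptions for illustration only.

```python
import numpy as np


class ElasticNetModel:
    """Compact copy of the class added in this PR, repeated so the example is standalone."""

    def __init__(self, alpha=1.0, l1_ratio=0.3, lrate=0.02, n_iters=200):
        self.alpha = alpha
        self.l1_ratio = l1_ratio
        self.lrate = lrate
        self.n_iters = n_iters
        self.weights = None

    def fit(self, X, y):
        self.weights = np.zeros(X.shape[1])
        for _ in range(self.n_iters):
            errors = X @ self.weights - y
            gradient = (
                X.T @ errors / len(y)
                + self.alpha * (1 - self.l1_ratio) * self.weights
                + self.alpha * self.l1_ratio * np.sign(self.weights)
            )
            self.weights -= self.lrate * gradient
        return ElasticNetModelResults(self.weights)


class ElasticNetModelResults:
    def __init__(self, coef):
        self.weights = coef

    def predict(self, X):
        return X @ self.weights


# Synthetic data with known weights and no intercept (the model does not fit one).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=200)

# Light regularization, a slightly larger step, and more iterations than the defaults.
model = ElasticNetModel(alpha=0.01, l1_ratio=0.5, lrate=0.05, n_iters=1000)
results = model.fit(X, y)
preds = results.predict(X)

print("weights:", results.weights)
```

With the default `alpha=1.0` the coefficients are shrunk heavily toward zero, so a smaller `alpha` is used here to keep the fitted weights close to the generating ones.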

* Are there specific inputs that your implementation has trouble with? Given more time, could you work around these or is it fundamental?
* The model has trouble with highly imbalanced datasets, severe multicollinearity, and missing values, all of which may lead to biased predictions and unreliable coefficient estimates. Given more time, these could be worked around with preprocessing (feature scaling and handling of missing data), a more advanced optimization method, and systematic hyperparameter tuning through Grid Search or Random Search, which should improve accuracy.
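
The preprocessing mentioned above could be sketched as follows. This is not part of the PR: `impute_and_standardize` is a hypothetical helper, and mean imputation is only one of several reasonable ways to handle missing values.

```python
import numpy as np


def impute_and_standardize(X):
    """Fill NaNs with column means, then scale each column to zero mean and unit variance.

    Hypothetical helper; returns the transformed matrix plus the statistics
    needed to apply the same transform to new data.
    """
    X = np.asarray(X, dtype=float)
    col_means = np.nanmean(X, axis=0)                 # per-column mean, ignoring NaNs
    X_filled = np.where(np.isnan(X), col_means, X)    # mean imputation
    mean = X_filled.mean(axis=0)
    std = X_filled.std(axis=0)
    std[std == 0] = 1.0                               # avoid dividing by zero for constant columns
    return (X_filled - mean) / std, mean, std


# Small made-up matrix with one missing entry in the second column.
X_raw = np.array([
    [1.0, 10.0],
    [2.0, np.nan],
    [3.0, 30.0],
])

X_scaled, mean, std = impute_and_standardize(X_raw)
print(X_scaled)
```

Scaling matters for this model in particular because a single `alpha` penalizes all coefficients equally, and a single `lrate` has to suit every feature.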
35 changes: 26 additions & 9 deletions elasticnet/models/ElasticNet.py
@@ -1,17 +1,34 @@

+import csv
+import numpy as np

 class ElasticNetModel():
-    def __init__(self):
-        pass

+    def __init__(self, alpha=1.0, l1_ratio=0.3, lrate=0.02, n_iters=200):
+        self.alpha = alpha
+        self.l1_ratio = l1_ratio
+        self.lrate = lrate
+        self.n_iters = n_iters
+        self.weights = None

     def fit(self, X, y):
-        return ElasticNetModelResults()
+        n_features = X.shape[1]
+        self.weights = np.zeros(n_features)
+
+        for _ in range(self.n_iters):
+            pred = X.dot(self.weights)
+
+            errors = pred - y
+            gradient = (X.T.dot(errors) / len(y)) + \
+                       (self.alpha * (1 - self.l1_ratio) * self.weights) + \
+                       (self.alpha * self.l1_ratio * np.sign(self.weights))
+
+            self.weights -= self.lrate * gradient
+
+        return ElasticNetModelResults(self.weights)


 class ElasticNetModelResults():
-    def __init__(self):
-        pass
+    def __init__(self, coef):
+        self.weights = coef

-    def predict(self, x):
-        return 0.5
+    def predict(self, X):
+        return X @ self.weights
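
One way to sanity-check the update in `fit` is a finite-difference gradient check. The sketch below is not part of the PR. It assumes the objective is mean squared error over 2 plus the L2 and L1 penalties (an inferred form, with the 1/2 factors chosen to match the code), and the function names and random data are hypothetical.

```python
import numpy as np


def objective(w, X, y, alpha, l1_ratio):
    """Assumed elastic net objective whose gradient matches the update in fit."""
    resid = X @ w - y
    return (
        resid @ resid / (2 * len(y))
        + 0.5 * alpha * (1 - l1_ratio) * (w @ w)
        + alpha * l1_ratio * np.abs(w).sum()
    )


def analytic_gradient(w, X, y, alpha, l1_ratio):
    """Same expression as the gradient computed inside fit."""
    return (
        X.T @ (X @ w - y) / len(y)
        + alpha * (1 - l1_ratio) * w
        + alpha * l1_ratio * np.sign(w)
    )


def numerical_gradient(f, w, eps=1e-6):
    """Central finite differences, one coordinate at a time."""
    grad = np.zeros_like(w)
    for i in range(len(w)):
        step = np.zeros_like(w)
        step[i] = eps
        grad[i] = (f(w + step) - f(w - step)) / (2 * eps)
    return grad


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = rng.normal(size=50)
w = np.array([0.5, -1.0, 2.0])  # kept away from zero, where |w| is not differentiable

analytic = analytic_gradient(w, X, y, alpha=1.0, l1_ratio=0.3)
numeric = numerical_gradient(lambda v: objective(v, X, y, 1.0, 0.3), w)
max_diff = np.max(np.abs(analytic - numeric))
print("max abs difference:", max_diff)
```

The check only holds at points where no weight is zero. At exactly zero `np.sign` returns 0, which is a valid subgradient but means plain gradient descent rarely lands on exactly sparse coefficients.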
27 changes: 21 additions & 6 deletions elasticnet/tests/test_ElasticNetModel.py
@@ -1,9 +1,8 @@
 import csv

-import numpy

+import numpy
 from elasticnet.models.ElasticNet import ElasticNetModel

+# import matplotlib.pyplot as plt
 def test_predict():
     model = ElasticNetModel()
     data = []
@@ -12,8 +11,24 @@ def test_predict():
         for row in reader:
             data.append(row)

-    X = numpy.array([[v for k,v in datum.items() if k.startswith('x')] for datum in data])
-    y = numpy.array([[v for k,v in datum.items() if k=='y'] for datum in data])
+    X = numpy.array([[float(v) for k,v in datum.items() if k.startswith('x')] for datum in data])
+    y = numpy.array([[float(v) for k,v in datum.items() if k=='y'] for datum in data]).flatten()
+    # print(X.shape)
     results = model.fit(X,y)
     preds = results.predict(X)
-    assert preds == 0.5
+    rmse = numpy.sqrt(numpy.mean((preds - y) ** 2))
+
+    print("Root Mean Square Error (RMSE):", rmse)
+
+    tss = numpy.sum((y - numpy.mean(y)) ** 2)
+    rss = numpy.sum((y - preds) ** 2)
+
+    r_2 = 1 - (rss / tss)
+    print("R2 score: ",r_2)
+    # plt.plot( y, color='blue')
+    # plt.plot( preds, color='orange')
+    # plt.show()
+    return preds
+
+# plt.show()  # kept commented out: the matplotlib import above is disabled, so this call would raise NameError
+predss=test_predict()
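
The updated test prints RMSE and R2 but no longer asserts anything, so it passes whatever the model predicts. A minimal sketch of an assertion-based check follows. It is not part of the PR: `fit_elastic_net` is a hypothetical standalone copy of the gradient descent loop, and the synthetic data, hyperparameters, and thresholds are assumptions.

```python
import numpy as np


def fit_elastic_net(X, y, alpha=1.0, l1_ratio=0.3, lrate=0.02, n_iters=200):
    """Hypothetical standalone copy of the gradient descent loop in ElasticNetModel.fit."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        errors = X @ w - y
        grad = (
            X.T @ errors / len(y)
            + alpha * (1 - l1_ratio) * w
            + alpha * l1_ratio * np.sign(w)
        )
        w -= lrate * grad
    return w


def r2_score(y, preds):
    """Coefficient of determination: 1 - RSS / TSS."""
    rss = np.sum((y - preds) ** 2)
    tss = np.sum((y - np.mean(y)) ** 2)
    return 1 - rss / tss


def test_recovers_known_weights():
    # Synthetic data with known weights and no intercept, since the model fits none.
    rng = np.random.default_rng(42)
    true_w = np.array([2.0, -1.0, 0.5])
    X = rng.normal(size=(300, 3))
    y = X @ true_w + 0.1 * rng.normal(size=300)

    # Light regularization so shrinkage does not dominate the comparison.
    w = fit_elastic_net(X, y, alpha=0.01, l1_ratio=0.5, lrate=0.05, n_iters=1000)

    assert np.allclose(w, true_w, atol=0.1)
    assert r2_score(y, X @ w) > 0.95


test_recovers_known_weights()
```

Asserting on a tolerance around known weights and on a minimum R2 lets the test fail when the optimizer diverges or the gradient is wrong, which printed metrics alone cannot do.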