GraphBLAS backend by Transurgeon · Pull Request #42 · Transurgeon/cvxpy

Transurgeon · 2023-10-08T04:41:01Z

This PR attempts to add a new backend written with python-graphblas, which will be an optional dependency.
The next steps before this is ready are:

- Refactor the operations so that they always return a gb.Matrix.
- Add extra backend-specific test cases.
- Verify that all unit tests pass and make any further changes they require.

Once the matrix-order feature is added to graphBLAS, we can update the non-parametrised operations to stay in row-major order.

eriknw · 2023-10-16T15:54:43Z

👋 👀

Transurgeon · 2024-01-03T03:08:52Z

I have tested the performance of this backend against the SciPy one on the following example (taken from the benchmark suite, where it showed a 13x slowdown):

import cvxpy as cp
import numpy as np
import scipy.stats as st
from cvxpy.settings import GRAPHBLAS_CANON_BACKEND

rs = np.random.RandomState(123)
N = 50
T = 350
cov = rs.rand(N, N) * 1.5 - 0.5
cov = cov @ cov.T / 1000 + np.diag(rs.rand(N) * 0.7 + 0.3) / 1000
mean = np.zeros(N) + 1 / 1000
returns = st.multivariate_normal.rvs(mean=mean, cov=cov, size=T, random_state=rs)

d = cp.Variable((int(T * (T - 1) / 2), 1))
w = cp.Variable((N, 1))
constraints = []
ret_w = cp.Variable((T, 1))
constraints.append(ret_w == returns @ w)
mat = np.zeros((d.shape[0], T))
"""
We need to create a vector that has the following entries:
    ret_w[i] - ret_w[j]
for j in range(T), for i in range(j+1, T).
We do this by building a numpy array of mostly 0's and 1's.
(It would be better to use SciPy sparse matrix objects.)
"""
ell = 0
for j in range(T):
    for i in range(j + 1, T):
        # write to mat so that (mat @ ret_w)[ell] == var_i - var_j
        mat[ell, i] = 1
        mat[ell, j] = -1
        ell += 1
all_pairs_ret_diff = mat @ ret_w
constraints += [d >= all_pairs_ret_diff,
                d >= -all_pairs_ret_diff,
                w >= 0,
                cp.sum(w) == 1]
risk = cp.sum(d) / ((T - 1) * T)
objective = cp.Minimize(risk * 1000)
problem = cp.Problem(objective, constraints)
#print(problem.get_problem_data(solver=cp.CLARABEL, canon_backend=GRAPHBLAS_CANON_BACKEND))
#print(problem.get_problem_data(solver=cp.CLARABEL, canon_backend=cp.SCIPY_CANON_BACKEND))
print(problem.solve(solver=cp.CLARABEL, canon_backend=GRAPHBLAS_CANON_BACKEND))
#print(problem.solve(solver=cp.CLARABEL, canon_backend=cp.SCIPY_CANON_BACKEND))
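
As the docstring in the example notes, the pairwise-difference matrix would be better built as a SciPy sparse matrix. A minimal sketch of that, assuming the same row ordering as the nested loop above. The helper name pairwise_difference_matrix is made up for illustration and is not part of the benchmark or of cvxpy:

```python
import numpy as np
import scipy.sparse as sp


def pairwise_difference_matrix(T):
    """Sparse M such that (M @ x)[ell] == x[i] - x[j] for every pair j < i.

    Hypothetical helper: rows follow the same (j, i) order as the nested loop.
    """
    # np.triu_indices(T, k=1) yields all (j, i) with j < i in row-major order,
    # matching `for j in range(T): for i in range(j + 1, T)`.
    j_idx, i_idx = np.triu_indices(T, k=1)
    n_pairs = j_idx.size
    rows = np.arange(n_pairs)
    # Each row gets a +1 in column i and a -1 in column j.
    data = np.concatenate([np.ones(n_pairs), -np.ones(n_pairs)])
    row_idx = np.concatenate([rows, rows])
    col_idx = np.concatenate([i_idx, j_idx])
    return sp.csr_matrix((data, (row_idx, col_idx)), shape=(n_pairs, T))


mat_sparse = pairwise_difference_matrix(350)
print(mat_sparse.shape, mat_sparse.nnz)
```

In the benchmark this could stand in for mat, for example as cp.Constant(mat_sparse) @ ret_w. That variant is untested here, and it would also change what the benchmark measures, since the dense NumPy constant is what triggered the slowdown.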

cvxpy/lin_ops/canon_backend.py

PTNobel · 2024-01-03T23:13:31Z

Okay, after my exploration it looks like the issue is the following: when constructing the constant_data lin_op of a NumPy matrix that is actually sparse, the SciPy backend is correctly filtering out the zero entries, but the graphblas backend is recording them as actual values.

Here's a script that shows the difference in behavior between the two libraries:

import graphblas as gb
import scipy.sparse as sp
import numpy as np

A = np.zeros((100, 100))

print('scipy:', sp.csc_matrix(A).nnz)
print('graphblas:', gb.Matrix.from_dense(A).nvals)

Transurgeon · 2024-01-08T17:01:05Z

@eriknw good news! We have fixed one of the major slowdowns for the graphBLAS backend.
It was my mistake: I forgot to pass missing_value=0 in the gb.Matrix.from_dense call. This created a huge "sparse" matrix with 20M+ stored entries. Thanks to Parth, we were able to trace the issue back to its root.
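
For reference, a small sketch of where the 20M+ figure comes from and of what missing_value=0 changes. The sizes come from the benchmark above (T = 350); the 4x4 array is only a stand-in for the real constant, and the graphblas import is guarded because it is an optional dependency:

```python
import numpy as np

try:
    import graphblas as gb  # optional dependency
except ImportError:
    gb = None

# Size of `mat` in the benchmark: one row per pair (j, i) with j < i.
T = 350
n_pairs = T * (T - 1) // 2
total_entries = n_pairs * T        # every cell, zeros included
nonzero_entries = 2 * n_pairs      # one +1 and one -1 per row
print(n_pairs, total_entries, nonzero_entries)

# Small stand-in array (not the benchmark matrix) with two nonzeros.
A = np.zeros((4, 4))
A[0, 1] = 1.0
A[2, 3] = -1.0

if gb is not None:
    # Without missing_value every cell is stored, zeros included.
    all_stored = gb.Matrix.from_dense(A)
    # With missing_value=0 the zero cells are dropped.
    zeros_dropped = gb.Matrix.from_dense(A, missing_value=0)
    print("stored without missing_value:", all_stored.nvals)
    print("stored with missing_value=0:", zeros_dropped.nvals)
```

With these sizes, mat has about 21.4 million cells, of which only about 122 thousand are nonzero, so storing every cell is where the "20m+" entries come from.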
There are still 2-3 more benchmarks that are slower with graphBLAS, so I will try to dig into those this coming week.
See the image below for the new benchmark comparison vs SciPy

I think we might be able to release this backend as an optional dependency in a future version of cvxpy!

P.S. I'm just curious: why did you choose to add the additional parameter missing_value? Is there a use case where it would be something other than 0? I feel it might be better to default it to 0, for people who don't read the docs carefully, like me :). Let me know your thoughts on this.

eriknw · 2024-01-10T01:50:36Z

That's great! I'm eager to get my hands dirty with this. Let me know if there's anything specific you'd like me to look at.

> P.S. I'm just curious: why did you choose to add the additional parameter missing_value? Is there a use case where it would be something other than 0? I feel it might be better to default it to 0, for people who don't read the docs carefully, like me :). Let me know your thoughts on this.

That missing values are not assumed to be zero is a fundamental principle of GraphBLAS. Also, "dense" is in the name ;). How can we update the docstring to make it clearer?

Here's the PR where we added this... and specifically where missing_value is used:
https://github.com/python-graphblas/python-graphblas/pull/382/files#diff-030428fe7505f6ad25207ca3e12b2fdc197af1dd4cf219ca6e743712a3f6340eR1452-R1453
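
On the question of when a missing value would be something other than 0: a common illustration (not taken from this thread) is shortest paths under the min-plus semiring, where a stored 0 means a zero-cost edge and a missing entry means no edge at all. A library-free sketch using plain dicts as a stand-in for a sparse matrix; relax is a hypothetical helper, not a GraphBLAS call:

```python
import math


def relax(dist, edges):
    """One min-plus relaxation step that only looks at stored edges."""
    new = dict(dist)
    for (u, v), w in edges.items():
        if u in dist:
            new[v] = min(new.get(v, math.inf), dist[u] + w)
    return new


# Edge (0, 1) is stored with weight 0: it exists and costs nothing.
edges = {(0, 1): 0.0, (1, 2): 5.0}
# Treating 0 as "missing" silently deletes that edge.
dropped = {k: w for k, w in edges.items() if w != 0}

start = {0: 0.0}
with_zero = relax(relax(start, edges), edges)
without_zero = relax(relax(start, dropped), dropped)
print(with_zero)
print(without_zero)
```

With the zero-weight edge kept, vertices 1 and 2 are reachable from vertex 0. With zeros dropped, neither is. For cvxpy's linear-algebra use case dropping zeros is the right call, which is why the fix passes missing_value=0 explicitly.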

Transurgeon added 3 commits October 7, 2023 12:03

adding backend from previous branch

c9d2b2e

making all unit-tests pass

14630fc

updating gp backend for all tests

1ff4d0f

adding helper output to findout reasons for slowdown

85fef05

adding mask to remove explicit zeros

4e15dc8

adding missing-value=0 for GB matrix from dense

29cab81

eriknw mentioned this pull request Feb 5, 2024

Encoding and manipulating N-dimensional tensors python-graphblas/python-graphblas#534

Open

Transurgeon and others added 12 commits May 25, 2024 01:35

Merge branch 'master' into graphblas-backend

6ed6490

reverting some changes for benchmarking

2c108d8

Merge branch 'graphblas-backend' of https://github.com/Transurgeon/cvxpy into graphblas-backend

2ca88cd

adding backend instance test

7662fc8

copying scipy tests over but still need to change them to graphblas specific

e54a2b7

merging new changes from 1.6

4b7a4b1

passing all the tests

d498d00

adding try catch around importing graphblas

f7cacad

changing default to see if tests pass

c313848

adding try-catch for tests too

05970a0

changing back default to graphblas

542d582

removing usage of ensure_type in gb.core.utils

21d3f5b

Transurgeon wants to merge 18 commits into master from graphblas-backend

3 participants
