## Proposal
I would like to see a symbolic logic system for the mathematical properties of tensors implemented in pytensor. The name "assumptions" comes from the sympy system that does what I'm asking for. I guess it's actually a "type" system, but that name is quite overloaded.
Under this proposal, variable constructors would allow for a range of assumptions / types to be passed in. For example:
```python
x = pt.tensor('x', shape=(None,), positive=True)
```
By assumption, values of `x` will always be positive. Other tags include non-negative, zero, non-zero, negative, and non-positive. Under the sympy system, each flag takes one of three values: `True`, `False`, or `None`, where `None` is interpreted as unknown or non-determinable.
The assumptions of a variable should be stored in an Assumptions Op, the same way the shape of a variable is computed by a Shape Op. Other Ops can then implement `infer_assumptions`, which tells us how they act on the various assumptions. Likewise, `make_node` will need to be expanded to include assumption logic.
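As a sketch of what an `infer_assumptions` rule might look like (all names here are hypothetical, not an existing PyTensor API), elementwise addition could combine the three-valued flags of its inputs like this:

```python
# Hypothetical sketch: an `infer_assumptions` rule for elementwise addition.
# Flags are three-valued (True / False / absent-meaning-unknown), following sympy.

def infer_add_assumptions(x_assumptions, y_assumptions):
    """Combine the assumption dicts of the two inputs of an Add node."""
    out = {}
    # x + y is strictly positive only if both inputs are known positive
    if x_assumptions.get('positive') and y_assumptions.get('positive'):
        out['positive'] = True
    # symmetrically, x + y is strictly negative if both are known negative
    if x_assumptions.get('negative') and y_assumptions.get('negative'):
        out['negative'] = True
    return out

infer_add_assumptions({'positive': True}, {'positive': True})  # {'positive': True}
infer_add_assumptions({'positive': True}, {})                  # {} -- knowledge destroyed
```

Note the asymmetry: an unknown flag on either input silently drops the output flag, which matches the "destroy knowledge" behaviour in the examples below.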
The objective here is to have a symbolic "assumptions graph" that can be evaluated the same way the shape graph is evaluated. Calling `x.assumptions` should return a symbolic assumptions graph. Calling `x.type.assumptions` should return known static assumptions, or `None` where assumptions are not statically known.
Assumptions should have internal logic to maintain consistency. For example, a variable cannot be both positive and negative at the same time.
## Elementwise Examples
If no assumptions are provided, nothing is returned.
```python
# No assumptions given -- everything is unknown
x = pt.tensor('x', shape=(None,))
x.assumptions.eval()  # {}
```
Some Ops can set assumptions even if the assumptions of their inputs are unknown:
```python
x = pt.tensor('x', shape=(None,))
z = pt.exp(x)
z.assumptions.eval()  # {'positive': True}
```
Ops with multiple inputs should reason over the assumptions of both inputs:
```python
x = pt.tensor('x', shape=(None,), positive=True)
y = pt.tensor('y', shape=(None,), positive=True)
z = x + y
# If x and y are both known to be positive, z can be assumed to be strictly positive
z.assumptions.eval()  # {'positive': True}
```
Some Ops will destroy knowledge:
```python
x = pt.tensor('x', shape=(None,), positive=True)
y = pt.tensor('y', shape=(None,))
z = x + y
# y is unknown, so we can't assume that x + y is strictly positive
z.assumptions.eval()  # {}
```
## More Interesting Examples
The motivating examples for this feature relate to assumptions about matrices: for example, triangular, positive definite, negative definite, positive semi-definite, banded, diagonal, Toeplitz, etc.
Users can declare these manually:
```python
x = pt.tensor('x', shape=(None, None), positive=True, semidefinite=True)
```
Or they can be the result of an Op:
```python
x = pt.tensor('x', shape=(None, None))
L = pt.linalg.cholesky(x)
L.assumptions.eval()  # {'lower_triangular': True}

# Question: should/can we propagate information back *up* the graph?
# Although x is not tagged, we can assume it's PSD because it went into cholesky
x.assumptions.eval()  # {'positive': True, 'semidefinite': True} ??
```
Multi-input Ops should reason about the output assumptions:
```python
x = pt.tensor('x', shape=(None,))
d = pt.diag(x)
d.assumptions.eval()  # {'diagonal': True}

y = pt.tensor('y', shape=(None, None))
z = d * y
z.assumptions.eval()  # {'diagonal': True}
```
## Other possible features
We could imagine using the assumptions to track the domain of a variable. This could help in rewrites involving conditions.
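A domain rule for `maximum` could be sketched as simple interval arithmetic. This is a minimal sketch with hypothetical names (not an existing API), assuming domains are represented as an interval plus inclusivity flags, as in the example that follows:

```python
# Hypothetical sketch: propagating a domain interval through maximum(x, c).
# A domain is ((low, high), left_inclusive, right_inclusive).

def maximum_domain(x_domain, constant):
    """Domain of maximum(x, c) given the domain of x and a scalar constant c."""
    (low, high), left_inc, right_inc = x_domain
    if constant >= high:
        # the constant dominates everywhere: the output is always exactly c
        return ((constant, constant), True, True)
    new_low = max(low, constant)
    # if the constant falls inside the interval, the lower bound is attained exactly
    new_left_inc = left_inc if constant < low else True
    return ((new_low, high), new_left_inc, right_inc)

# x ~ Beta(1, 1) has open domain (0, 1); clipping below at 0.5 closes the left end
maximum_domain(((0.0, 1.0), False, False), 0.5)  # ((0.5, 1.0), True, False)
```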
```python
x = pt.random.beta(alpha=1, beta=1)
x.assumptions.eval()  # {'domain': (0, 1), 'left_inclusive': False, 'right_inclusive': False}

z = pt.maximum(x, 0.5)
z.assumptions.eval()  # {'domain': (0.5, 1), 'left_inclusive': True, 'right_inclusive': False}
```
## Open Questions
- How should we handle matrix assumptions on higher-order tensors? For example, what is the meaning of `x = pt.tensor(shape=(None, None, None), positive=True, semidefinite=True)`? The first dim is always a batch dim? Or?
- Should this be a `COp` (like shape)? Or just do it in Python?
- Should the assumption system duplicate type information? For example, should we have `real` and `imaginary` assumptions, or should assumptions also look at the dtype separately?
- How complex should the reasoning in the assumption system be? On every update, sympy updates every flag to be internally consistent. So when `positive` becomes `True`, `nonnegative` also becomes `True`, and `negative` becomes `False`.
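For reference, sympy's existing (old-style) assumptions already behave this way; declaring one flag makes the related flags consistent automatically:

```python
import sympy

x = sympy.Symbol('x', positive=True)

# Declaring positive=True forces the related flags to consistent values
print(x.is_positive)     # True
print(x.is_nonnegative)  # True
print(x.is_negative)     # False
print(x.is_zero)         # False
```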
Finally, the sympy team experimented with "new assumptions", presumably because there were flaws in the old assumptions, so copying their setup might not be the right way to go. I'm curious how this is handled in e.g. Maple and Mathematica.