Skip to content

Document linearize and have complete and separated mul decomposition#808

Draft
tias wants to merge 173 commits intomasterfrom
linearize_mul
Draft

Document linearize and have complete and separated mul decomposition#808
tias wants to merge 173 commits intomasterfrom
linearize_mul

Conversation

@tias
Copy link
Collaborator

@tias tias commented Dec 22, 2025

So I wanted to better characterize linearize and add documentation.

Then I felt like 'mul' should be treated separately

Then I added multiplication of int*int as well.

But it is not finished... tests are choking on bool*bool especially, which I kept at bool level (half undone already) but then there are bools popping up at places where there is a (trivial) comparison.

Some other notes:

  • I wonder if we should make mul between two vars a global constraint after all... then we can take this out of linearize? e.g. there might be a CP solver that does not implement mul...

  • We switched away from doing 'decompose_comparison' on GlobalFunctions, in the hope of avoiding unnecessary auxliaries in nested expressions, but I now realize that this way we may include unneeded auxiliary variables for toplevel comparisons, e.g. now b*v >= 5 will be decomposed as decomp-of(b*v = aux), aux >= 5...

  • Maybe the previous point is not too bad, because aux>=5 is just a bound update, not a live constraint. But for b*v = r we would do b*v = aux, aux=r, which happens when flatten flattens some sum that contains b*v, which is another argument for making mul of two vars a global constraint: then we can decompose it before flatten?

Copy link
Collaborator

@IgnaceBleukx IgnaceBleukx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the doc, mostly minor stuff.
Code-wise: I think we are missing a CSE opportunity in the helper function; looks good otherwise.

for translating to Pseudo-Boolean or CNF formats.

There are a number of components to getting a good linearisation:
- **Decomposing global constraints/functions** in a 'linear friendly' way.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but we currently don't do this yet in this PR? (Only for AllDiff)

There are a number of components to getting a good linearisation:
- **Decomposing global constraints/functions** in a 'linear friendly' way.
A common pattern is that constraints that enforce domain consistency on integer variables,
that they should be decomposed over a Boolean representation of the domain. This is the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

broken sentence

- **Decomposing global constraints/functions** in a 'linear friendly' way.
A common pattern is that constraints that enforce domain consistency on integer variables,
that they should be decomposed over a Boolean representation of the domain. This is the
case for :class:`~cpmpy.expressions.globalconstraints.AllDifferent`, :class:`~cpmpy.expressions.globalfunctions.Element`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

certainly also for Count


- **Implications** e.g. `B -> sum(X) <= 4` should be linearised with the Big-M method.

- **Domain encodings** e.g. of `A=cp.intvar(0,4)` would create binary variables and constraints
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit surprising to see in the docs here for me.
The other points are "constraint xxx is not linear, and should be linearized as follows...."; but here there is no "input" constraint?

- **Implications** e.g. `B -> sum(X) <= 4` should be linearised with the Big-M method.

- **Domain encodings** e.g. of `A=cp.intvar(0,4)` would create binary variables and constraints
like `B0 -> A=0`, `B1 -> A=1`, `B2 -> A=2`, etc and `sum(B0..4) = 1`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

B0..B4

and is decomposed as such if not in `supported`.
Any other unsupported global constraint should be decomposed using
:func:`cpmpy.transformations.decompose_global.decompose_in_tree()`.
reified: Whether the constraint is fully reified. When True, nested implications
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not fully reified? We only support half-reified in linearize

- `const`: The difference between the original expression and the new expression,
i.e. a constant term that must be added to pos_expr to be an equivalent linear expression.
Replaces a var/sum/wsum expression containing :class:`~cpmpy.expressions.variables.NegBoolView`
with an equivalent expression using only :class:`~cpmpy.expressions.variables.BoolVar`,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BoolVarImpl_


It might add a constant term to the expression. If you want the constant separately,
use :func:`only_positive_bv_wsum_const`.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if this doc needs an example.

"E.g., the weighed sum 3*~x + 4*y is transformed to 3 + -3*x + 4*y. If you want the constant 3 to be separate, use only_positive_bv_wsum_const instead."

lin_cnt = cp.Model(lin_cons).solveAll(display=assert_cons_is_true(cons))
cons_cnt = cp.Model(cons).solveAll(display=assert_cons_is_true(cons))
self.assertEqual(lin_cnt, cons_cnt)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also test != and reified mult?

@IgnaceBleukx
Copy link
Collaborator

I enabled multiplication and related globals in test_constraints (which was quite satisfying to do); but it seems that reified multiplications are going wrong.
I suppose this is because the encoding is posted as part of the implication constraint in linearize, instead of posted toplevel.
To fix this, we either need to change linearize such that it can also post stuff toplevel, or we make Mul a global after all

@tias
Copy link
Collaborator Author

tias commented Dec 28, 2025

Indeed, we should probably do both. Neither is to be taken lightly.

I have some ideas for a refactor of linearize_constraint, but it might be a bumpy ride.

@tias
Copy link
Collaborator Author

tias commented Jan 3, 2026

First extracting mul as a global (and fixing it) would have been the easier route retrospectively...

This rewrite should be a good basis for further improvements that will simplify it (e.g. doing globals before linearize, incl mul).

Some things that surfaced:

  • should linearize output a single boolvar or only a sum? it already is allowed to output a boolval, so I think it shoudl also be alloed to output a boolvar, will take away some silly recursions
  • int2bool does not use csemap, that should be fixed...
  • should implications also be in csemap, or in a separate cremap... we should make a decision at some point
  • flatten should only half-reify when positive, that will make the mul decompositions (the output) notably simpler

The implimentation style of having a function with a 'condition' argument is quite clean... its what Henk used for pindakaas. We could consider it when rewriting flatten too. (perhaps we should rename it from 'implication_literal' to 'condition' indeed, or actually in pindakaas its 'conditions' as multiple are accepted, which makes sense...)

I don't think we should add more to this PR though

@tias
Copy link
Collaborator Author

tias commented Jan 3, 2026

ah, also, 83000+ tests that took 7 hours+ on my machine, and I don't even have cplex/gurobi installed at the moment...

(and 17 failed, pindakaas bug #810 )

@IgnaceBleukx
Copy link
Collaborator

That seems like a really really long time to run all tests...
I ran them on my machine too, there seems to be something going on with pindakaas and the linearization of mult.
These are the slowest 10 tests:

148.02s call     tests/test_constraints.py::test_comparison_constraints[pindakaas-(boolval(True)) != (pow(x,3))]
101.22s call     tests/test_constraints.py::test_comparison_constraints[pindakaas-(pow(x,3)) != (p)]
94.67s call     tests/test_constraints.py::test_comparison_constraints[pindakaas-(p) != (pow(x,3))]
75.20s call     tests/test_constraints.py::test_comparison_constraints[pindakaas-(pow(x,3)) != (boolval(True))]
74.91s call     tests/test_constraints.py::test_comparison_constraints[pindakaas-1 != pow(x,3)]
74.78s call     tests/test_constraints.py::test_comparison_constraints[pindakaas-pow(x,3) != 1]
69.19s call     tests/test_constraints.py::test_comparison_constraints[pindakaas-(pow(x,3)) != (l)]
68.47s call     tests/test_constraints.py::test_comparison_constraints[pindakaas-(l) != (pow(x,3))]
64.49s call     tests/test_constraints.py::test_reify_imply_constraints[pindakaas-((p) <= (pow(x,3))) == (p)]
64.18s call     tests/test_constraints.py::test_reify_imply_constraints[pindakaas-(p) -> ((p) <= (pow(x,3)))]

@tias
Copy link
Collaborator Author

tias commented Jan 5, 2026

Time to transform the first case is <0.01 for both pindakaas and pysat. Time to solve is 2.5s for pysat and 122s for pindakaas

x,y,z = cp.intvar(-3, 5, shape=3, name=list("xyz"))
C=[(cp.BoolVal(True)) != (pow(x,3))]
t0 = time.time()
print("Out:", cp.SolverLookup.get("pysat", cp.Model(C)).solve())
print("Time:", time.time()-t0)
t0 = time.time()
print("Out:", cp.SolverLookup.get("pindakaas", cp.Model(C)).solve())
print("Time:", time.time()-t0)

@tias
Copy link
Collaborator Author

tias commented Jan 5, 2026

merged #811 in, now only the 4 panics from #810 remain.

The slowness of pindakaas also remains; pysat is notably faster, but still takes quite some time on some tests too.

However, another thing I notice is memory use: when running test/test_constraints with only pysat, memory climbs up linearly, at 30% it was around 15Gb or so... so something is not cleaning up properly. Is it that the python garbage collector is not being run, so it is not cleaning up all our Booleans? or is it because maybe we are not closing the solver objects properly, so they keep their memory alive? or they keep our python objects alive?

I have 0 experience with memory and Python, if somebody knows how to look into this, by all means... (running the tests with ortools stays at less then a Gb...)

@IgnaceBleukx
Copy link
Collaborator

IgnaceBleukx commented Jan 6, 2026

I've done some digging into the memory issues, and there are several parts to it:

  1. Indeed, there is no garabage collection happening when running parameterized tests (as discussed here). There are a couple of options: either we run the garbage collector manually, or we wrap the solve-call in a context manager. E.g., something like:
with cp.SolverLookup(solver, model) as solver:
    assert solver.solve()

For this, we need to implement the context manager protocol for our solvers... https://book.pythontips.com/en/latest/context_managers.html

  1. The power-tests indeed use A LOT of memory... Using pytest-memray reports the top-n memory-heavy tests, which outputs:
Allocation results for tests/test_constraints.py::test_comparison_constraints[pysat-(pow(x,3)) != (boolval(True))] at the high watermark

	 📦 Total memory allocated: 275.0MiB
	 📏 Total allocations: 49
	 📊 Histogram of allocation sizes: |█▇▃▁▃|
	 🥇 Biggest allocating functions:
		- _encode:/Users/ignace/miniforge3/envs/cpmpy/lib/python3.12/site-packages/pysat/pb.py:346 -> 93.0MiB
		- add_clause:/Users/ignace/miniforge3/envs/cpmpy/lib/python3.12/site-packages/pysat/solvers.py:4133 -> 74.8MiB
		- _encode:/Users/ignace/miniforge3/envs/cpmpy/lib/python3.12/site-packages/pysat/pb.py:343 -> 49.9MiB
		- _encode:/Users/ignace/miniforge3/envs/cpmpy/lib/python3.12/site-packages/pysat/pb.py:343 -> 20.2MiB
		- _encode:/Users/ignace/miniforge3/envs/cpmpy/lib/python3.12/site-packages/pysat/pb.py:346 -> 12.0MiB
  1. I think we should really wait with mergin this PR until the csemap is used in int2bool. We somehow are introducing 18 integer variables when transforming (pow(x,3)) != (boolval(True) for PySAT. I will look further into this where these are coming from

@tias tias marked this pull request as draft January 12, 2026 12:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants