[Join the chat on Gitter](https://gitter.im/JuliaDiffEq/Lobby?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
[Build Status](https://travis-ci.org/JuliaDiffEq/SparseDiffTools.jl)

This package exploits sparsity in Jacobians and Hessians to accelerate their
computation. Matrix-free Jacobian-vector product and Hessian-vector product
operators are provided that are compatible with AbstractMatrix-based libraries
like IterativeSolvers.jl for easy and efficient Newton-Krylov implementations.
Both automatic and numerical differentiation are supported. In addition, the
package can automatically detect the sparsity pattern of a function, perform
matrix coloring, and use that coloring to construct Jacobians and Hessians.

## Example

Suppose we had the function

```julia
fcalls = 0
function f(dx,x)
  global fcalls += 1
  for i in 2:length(x)-1
    dx[i] = x[i-1] - 2x[i] + x[i+1]
  end
  dx[1] = -2x[1] + x[2]
  dx[end] = x[end-1] - 2x[end]
  nothing
end
```

For this function, we know that the sparsity pattern of the Jacobian is a
`Tridiagonal` matrix. We represent our sparsity pattern by that matrix:

```julia
sparsity_pattern = Tridiagonal(ones(29),ones(30),ones(29))
```

Now we call `matrix_colors` to get the color vector for that matrix:

```julia
colors = matrix_colors(sparsity_pattern)
```

Since `maximum(colors)` is 3, this means that finite differencing can now
compute the Jacobian in just 4 `f`-evaluations:

```julia
J = DiffEqDiffTools.finite_difference_jacobian(f, rand(30), color=colors)
@show fcalls # 4
```

In addition, a faster forward-mode autodiff call can be used:

```julia
x = rand(30)
forwarddiff_color_jacobian!(sparsity_pattern, f, x, color = colors)
```

If one only needs to compute products, one can use the operators. For example,

```julia
u = rand(30)
J = JacVec(f,u)
```

makes `J` into a matrix-free operator which calculates `J*v` products. For
example:

```julia
using LinearAlgebra
v = rand(30)
res = similar(v)
mul!(res,J,v)
```

makes `res = J*v`. Additional operators exist for Hessian-vector products:
`HesVec` and `HesVecGrad`, the latter allowing one to supply a gradient function.
These operators are compatible with iterative solver libraries like
IterativeSolvers.jl, meaning the following performs the Newton-Krylov update
iteration:

```julia
using IterativeSolvers
gmres!(res,J,v)
```

## Documentation

### Matrix Coloring

Matrix coloring allows you to reduce the number of `f` calls required by finite
differencing to `maximum(colors)+1`, or the number of partials required by
forward-mode automatic differentiation to `maximum(colors)`. Since these values
are normally `length(x)`, this can be a significant savings.

The API for computing the color vector is:

```julia
matrix_colors(A::AbstractMatrix,alg::ColoringAlgorithm = GreedyD1Color())
```

The first argument is the abstract matrix which represents the sparsity pattern
of the Jacobian. The second argument is the optional choice of coloring algorithm.
It defaults to a greedy distance-1 coloring, though if your special matrix type
has more structural information, such as being a `Tridiagonal` or `BlockBandedMatrix`,
the color vector will be computed analytically instead.

The result is a vector which assigns a color to each column of the matrix.
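
For instance, on the tridiagonal sparsity pattern from the example above, the
color vector cycles through three colors, so `maximum(colors)` is 3. A minimal
sketch (the exact color assignment is an implementation detail):

```julia
using SparseDiffTools, LinearAlgebra

A = Tridiagonal(ones(29), ones(30), ones(29))
colors = matrix_colors(A)     # e.g. [1, 2, 3, 1, 2, 3, ...]
@assert maximum(colors) == 3  # only 3 groups of columns are needed
```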

### Color-Assisted Differentiation

Color-assisted differentiation is provided for numerical differentiation by
DiffEqDiffTools.jl and for automatic differentiation by ForwardDiff.jl.

For DiffEqDiffTools.jl, one simply has to use the provided `color` keyword
argument. See [the DiffEqDiffTools Jacobian documentation](https://github.com/JuliaDiffEq/DiffEqDiffTools.jl#jacobians) for more details.

For forward-mode automatic differentiation, use of a color vector is provided
by the following function:

```julia
forwarddiff_color_jacobian!(J::AbstractMatrix{<:Number},
                            f,
                            x::AbstractArray{<:Number};
                            dx = nothing,
                            color = eachindex(x))
```

This call will allocate the cache variables each time. To avoid allocating the
cache, construct the cache in advance:

```julia
ForwardColorJacCache(f,x,_chunksize = nothing;
                     dx = nothing,
                     color=1:length(x))
```

and utilize the following signature:

```julia
forwarddiff_color_jacobian!(J::AbstractMatrix{<:Number},
                            f,
                            x::AbstractArray{<:Number},
                            jac_cache::ForwardColorJacCache)
```
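
For instance, a minimal sketch of reusing the cache across repeated Jacobian
evaluations, with `f` and the tridiagonal sparsity pattern from the example
above (the loop stands in for, e.g., a Newton iteration):

```julia
using SparseDiffTools, LinearAlgebra

x = rand(30)
J = Tridiagonal(ones(29), ones(30), ones(29))
colors = matrix_colors(J)

cache = ForwardColorJacCache(f, x, color = colors)  # build the cache once
for _ in 1:100
    forwarddiff_color_jacobian!(J, f, x, cache)     # reuse it every evaluation
end
```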

### Jacobian-Vector and Hessian-Vector Products

Matrix-free implementations of Jacobian-vector and Hessian-vector products are
provided in both operator and function form. Each function comes in an in-place
and an out-of-place version, and the in-place versions can be passed cache
vectors to be non-allocating. When in-place, the function signature for
Jacobians is `f!(du,u)`, while out-of-place has `du=f(u)`. For Hessians, all
functions must be of the form `f(u)` and return a scalar.

The functions for Jacobians are:

```julia
auto_jacvec!(du, f, x, v,
             cache1 = ForwardDiff.Dual{DeivVecTag}.(x, v),
             cache2 = ForwardDiff.Dual{DeivVecTag}.(x, v))

auto_jacvec(f, x, v)

# If compute_f0 is false, then `f(cache1,x)` will be computed
num_jacvec!(du,f,x,v,cache1 = similar(v),
            cache2 = similar(v);
            compute_f0 = true)
num_jacvec(f,x,v,f0=nothing)
```
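
For example, with the in-place function `f` from the example above, a minimal
sketch of computing `J*v` with both the AD and finite-difference variants (the
two results agree up to finite-differencing error):

```julia
using SparseDiffTools

x = rand(30)
v = rand(30)
du = similar(v)
du2 = similar(v)

auto_jacvec!(du, f, x, v)   # forward-mode AD J*v
num_jacvec!(du2, f, x, v)   # finite-difference J*v approximation
```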

For Hessians, the following are provided:

```julia
num_hesvec!(du,f,x,v,
            cache1 = similar(v),
            cache2 = similar(v),
            cache3 = similar(v))

num_hesvec(f,x,v)

numauto_hesvec!(du,f,x,v,
                cache = ForwardDiff.GradientConfig(f,v),
                cache1 = similar(v),
                cache2 = similar(v))

numauto_hesvec(f,x,v)

autonum_hesvec!(du,f,x,v,
                cache1 = similar(v),
                cache2 = ForwardDiff.Dual{DeivVecTag}.(x, v),
                cache3 = ForwardDiff.Dual{DeivVecTag}.(x, v))

autonum_hesvec(f,x,v)
```
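
As an illustration, consider a simple scalar-valued function whose Hessian is
the identity matrix (a sketch; `quad` is an illustrative name, not part of the
package):

```julia
using SparseDiffTools

quad(x) = sum(abs2, x) / 2   # Hessian of quad is the identity

x = rand(30)
v = rand(30)
du = similar(v)
numauto_hesvec!(du, quad, x, v)   # du ≈ v since H = I
```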

`numauto` and `autonum` both mix numerical and automatic differentiation; the
former is almost always more efficient and is thus recommended. In addition,
the following forms allow you to provide a gradient function `g(dx,x)` or
`dx=g(x)` respectively:

```julia
num_hesvecgrad!(du,g,x,v,
                cache2 = similar(v),
                cache3 = similar(v))

num_hesvecgrad(g,x,v)

auto_hesvecgrad!(du,g,x,v,
                 cache2 = ForwardDiff.Dual{DeivVecTag}.(x, v),
                 cache3 = ForwardDiff.Dual{DeivVecTag}.(x, v))

auto_hesvecgrad(g,x,v)
```
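
For example, with an analytical in-place gradient of the hypothetical `quad`
function above (`quad_grad!` is an illustrative name, not part of the package):

```julia
quad_grad!(dx, x) = (dx .= x)   # gradient of quad(x) = sum(abs2, x)/2

x = rand(30)
v = rand(30)
du = similar(v)
num_hesvecgrad!(du, quad_grad!, x, v)   # du ≈ v since H = I
```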

#### J*v and H*v Operators

The following produce matrix-free operators which are used for calculating
Jacobian-vector and Hessian-vector products where the differentiation takes
place at the vector `u`:

```julia
JacVec(f,u::AbstractArray;autodiff=true)
HesVec(f,u::AbstractArray;autodiff=true)
HesVecGrad(g,u::AbstractArray;autodiff=false)
```

These all have the same interface, where `J*v` utilizes the out-of-place
Jacobian-vector or Hessian-vector function, whereas `mul!(res,J,v)` utilizes
the appropriate in-place versions. To update the location of differentiation
in the operator, simply mutate the vector `u`: `J.u .= ...`.
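
As a small sketch of the operator interface, reusing the hypothetical `quad`
function from the Hessian example above:

```julia
using SparseDiffTools, LinearAlgebra

u = rand(30)
v = rand(30)
res = similar(v)

H = HesVec(quad, u)   # matrix-free Hessian operator at the point u
mul!(res, H, v)       # in-place Hessian-vector product
H.u .= rand(30)       # move the point of differentiation
res2 = H * v          # out-of-place product at the new point
```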