Advanced Options

This section covers solver knobs and patterns: backend selection, factorization cache, dtypes, and when to tell the solver that the sparsity pattern of \(A\) has changed.

Prefer cuSOLVER Dense (prefer_dense)

By default, the solver tries cuDSS first for general, symmetric, and SPD square systems. If you prefer to use cuSOLVER Dense instead (e.g. to avoid cuDSS or to work around rectangular cuDSS limitations), set prefer_dense=True:

from cudass import CUDASparseSolver, MatrixType

solver = CUDASparseSolver(
    matrix_type=MatrixType.SPD,
    use_cache=True,
    prefer_dense=True,
)
# ... update_matrix, solve ...

The solver will choose the cusolver_dn backend for supported matrix types, which densifies \(A\) and uses cuSOLVER (via PyTorch).

Forcing a specific backend (force_backend)

You can force a backend and disable fallback with force_backend:

solver = CUDASparseSolver(
    matrix_type=MatrixType.GENERAL,
    force_backend="cusolver_dn",   # or "cudss", "cusolver_sp"
)
  • "cudss" — Use cuDSS only; raises if cuDSS bindings are missing or cuDSS returns “not supported”.

  • "cusolver_dn" — Use cuSOLVER Dense; works for all supported matrix types.

  • "cusolver_sp" — Reserved for cuSolverSp (currently a stub).

Use force_backend when you need reproducibility, when debugging a specific backend, or when fallback is undesirable (e.g. you want to detect that cuDSS is unavailable).

Note

If you set force_backend="cudss" and cuDSS is not built or returns “not supported” for the matrix, the solver will raise. Use force_backend="cusolver_dn" for a robust fallback.

Factorization cache (use_cache)

CUDASparseSolver(..., use_cache=True) (the default) caches factorizations so that when you call update_matrix again with the same sparsity pattern and matrix shape, only the values are updated and refactorization can be faster. The cache has a limited size; old entries are evicted.

Set use_cache=False to disable caching (e.g. to reduce memory or if you never reuse the same structure):

solver = CUDASparseSolver(matrix_type=MatrixType.SPD, use_cache=False)

Float32 vs float64

You can use torch.float32 or torch.float64 for value and b. Pass dtype when creating the solver if you want to enforce a specific precision; otherwise it is inferred from the first update_matrix.

import torch
from cudass import CUDASparseSolver, MatrixType

# Use float32; index, m, n as in other examples
index = torch.tensor([[0, 0, 1, 1], [0, 1, 0, 1]], device="cuda", dtype=torch.int64)
value = torch.tensor([4.0, 1.0, 1.0, 3.0], device="cuda", dtype=torch.float32)
m, n = 2, 2
b = torch.tensor([1.0, 2.0], device="cuda", dtype=torch.float32)

solver = CUDASparseSolver(
    matrix_type=MatrixType.SPD,
    dtype=torch.float32,   # optional; can be inferred from value
)
solver.update_matrix((index, value, m, n))
x = solver.solve(b)  # same dtype as b

b and the solution \(x\) must use the same dtype as the matrix values. For ill-conditioned or large systems, float64 can improve robustness.

Structure change vs value-only update

When you call update_matrix again because \(A\) has changed, you can tell the solver whether only the values changed or the sparsity pattern (or shape) changed:

# Only values changed (same non-zero pattern, same m, n) — faster update
solver.update_matrix((index_new, value_new, m, n), structure_changed=False)

# Sparsity pattern or shape changed — full refactorization
solver.update_matrix((index_new, value_new, m_new, n_new), structure_changed=True)

If you omit structure_changed (or pass None), the solver auto-detects by comparing with the previous \(A\): same shape and same index → value-only; otherwise → structure change.

Tip

For value-only updates, passing structure_changed=False avoids unnecessary checks. For a fully new matrix, structure_changed=True or None is fine.

Device

The solver uses the device of the tensors you pass to update_matrix. You can also set device=torch.device("cuda:0") at construction to fix the device; update_matrix and solve will require tensors on that device.

import torch
from cudass import CUDASparseSolver, MatrixType

solver = CUDASparseSolver(
    matrix_type=MatrixType.SPD,
    device=torch.device("cuda:1"),  # pin to GPU 1
)
# index, value, b must be on cuda:1

If device=None (the default), the device is taken from the first update_matrix call.

Inspecting the backend

After the first update_matrix (or before, when no backend is chosen yet), you can inspect which backend is in use:

print(solver.backend_name)  # 'cudss', 'cusolver_dn', 'cusolver_sp', or 'stub'

'stub' appears only before the first update_matrix that triggers factorization. After that, it is one of the real backend names.

Complete example: all options

import torch
from cudass import CUDASparseSolver, MatrixType

index = torch.tensor([[0, 0, 1, 1], [0, 1, 0, 1]], device="cuda", dtype=torch.int64)
value = torch.tensor([4.0, 1.0, 1.0, 3.0], device="cuda", dtype=torch.float64)
m, n = 2, 2
b = torch.tensor([1.0, 2.0], device="cuda", dtype=torch.float64)

solver = CUDASparseSolver(
    matrix_type=MatrixType.SPD,
    use_cache=True,
    dtype=torch.float64,
    device=None,
    prefer_dense=False,
    force_backend=None,
)
solver.update_matrix((index, value, m, n))
x = solver.solve(b)
print(solver.backend_name, x)

For more details, see the API reference and the docstrings of cudass.CUDASparseSolver.