.. _advanced_options:

Advanced Options
----------------

This section covers solver knobs and patterns: backend selection, factorization
cache, dtypes, and when to tell the solver that the sparsity pattern of :math:`A`
has changed.

.. _prefer_dense:

Prefer cuSOLVER Dense (``prefer_dense``)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, the solver tries **cuDSS** first for general, symmetric, and SPD
square systems. If you prefer to use **cuSOLVER Dense** instead (e.g. to avoid
cuDSS or to work around rectangular cuDSS limitations), set ``prefer_dense=True``:

.. code-block:: python

   from cudass import CUDASparseSolver, MatrixType

   solver = CUDASparseSolver(
       matrix_type=MatrixType.SPD,
       use_cache=True,
       prefer_dense=True,
   )
   # ... update_matrix, solve ...

The solver will choose the ``cusolver_dn`` backend for supported matrix types,
which densifies :math:`A` and uses cuSOLVER (via PyTorch).

.. _force_backend:

Forcing a specific backend (``force_backend``)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can force a backend and disable fallback with ``force_backend``:

.. code-block:: python

   solver = CUDASparseSolver(
       matrix_type=MatrixType.GENERAL,
       force_backend="cusolver_dn",   # or "cudss", "cusolver_sp"
   )

* ``"cudss"`` — Use cuDSS only; raises if cuDSS bindings are missing or
  cuDSS returns "not supported".
* ``"cusolver_dn"`` — Use cuSOLVER Dense; works for all supported matrix types.
* ``"cusolver_sp"`` — Reserved for cuSolverSp (currently a stub).

Use ``force_backend`` when you need reproducibility, when debugging a specific
backend, or when fallback is undesirable (e.g. you want to detect that cuDSS
is unavailable).

.. note::

   If you set ``force_backend="cudss"`` and cuDSS is not built or returns
   "not supported" for the matrix, the solver will raise. Use
   ``force_backend="cusolver_dn"`` for a robust fallback.

Factorization cache (``use_cache``)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``CUDASparseSolver(..., use_cache=True)`` (the default) caches factorizations so
that when you call ``update_matrix`` again with the **same sparsity pattern** and
matrix shape, only the values are updated and refactorization can be faster. The
cache has a limited size; old entries are evicted.

Set ``use_cache=False`` to disable caching (e.g. to reduce memory or if you
never reuse the same structure):

.. code-block:: python

   solver = CUDASparseSolver(matrix_type=MatrixType.SPD, use_cache=False)

Float32 vs float64
~~~~~~~~~~~~~~~~~~

You can use ``torch.float32`` or ``torch.float64`` for ``value`` and ``b``. Pass
``dtype`` when creating the solver if you want to enforce a specific precision;
otherwise it is inferred from the first ``update_matrix``.

.. code-block:: python

   import torch
   from cudass import CUDASparseSolver, MatrixType

   # Use float32; index, m, n as in other examples
   index = torch.tensor([[0, 0, 1, 1], [0, 1, 0, 1]], device="cuda", dtype=torch.int64)
   value = torch.tensor([4.0, 1.0, 1.0, 3.0], device="cuda", dtype=torch.float32)
   m, n = 2, 2
   b = torch.tensor([1.0, 2.0], device="cuda", dtype=torch.float32)

   solver = CUDASparseSolver(
       matrix_type=MatrixType.SPD,
       dtype=torch.float32,   # optional; can be inferred from value
   )
   solver.update_matrix((index, value, m, n))
   x = solver.solve(b)  # same dtype as b

``b`` and the solution :math:`x` must use the same dtype as the matrix values.
For ill-conditioned or large systems, ``float64`` can improve robustness.

Structure change vs value-only update
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When you call ``update_matrix`` again because :math:`A` has changed, you can tell
the solver whether only the **values** changed or the **sparsity pattern** (or
shape) changed:

.. code-block:: python

   # Only values changed (same non-zero pattern, same m, n) — faster update
   solver.update_matrix((index_new, value_new, m, n), structure_changed=False)

   # Sparsity pattern or shape changed — full refactorization
   solver.update_matrix((index_new, value_new, m_new, n_new), structure_changed=True)

If you omit ``structure_changed`` (or pass ``None``), the solver **auto-detects**
by comparing with the previous :math:`A`: same shape and same ``index``
→ value-only; otherwise → structure change.

.. tip::

   For value-only updates, passing ``structure_changed=False`` avoids
   unnecessary checks. For a fully new matrix, ``structure_changed=True`` or
   ``None`` is fine.

Device
~~~~~~

The solver uses the device of the tensors you pass to ``update_matrix``. You can
also set ``device=torch.device("cuda:0")`` at construction to fix the device;
``update_matrix`` and ``solve`` will require tensors on that device.

.. code-block:: python

   import torch
   from cudass import CUDASparseSolver, MatrixType

   solver = CUDASparseSolver(
       matrix_type=MatrixType.SPD,
       device=torch.device("cuda:1"),  # pin to GPU 1
   )
   # index, value, b must be on cuda:1

If ``device=None`` (the default), the device is taken from the first
``update_matrix`` call.

Inspecting the backend
~~~~~~~~~~~~~~~~~~~~~~

After the first ``update_matrix`` (or before, when no backend is chosen yet),
you can inspect which backend is in use:

.. code-block:: python

   print(solver.backend_name)  # 'cudss', 'cusolver_dn', 'cusolver_sp', or 'stub'

``'stub'`` appears only before the first ``update_matrix`` that triggers
factorization. After that, it is one of the real backend names.

Complete example: all options
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   import torch
   from cudass import CUDASparseSolver, MatrixType

   index = torch.tensor([[0, 0, 1, 1], [0, 1, 0, 1]], device="cuda", dtype=torch.int64)
   value = torch.tensor([4.0, 1.0, 1.0, 3.0], device="cuda", dtype=torch.float64)
   m, n = 2, 2
   b = torch.tensor([1.0, 2.0], device="cuda", dtype=torch.float64)

   solver = CUDASparseSolver(
       matrix_type=MatrixType.SPD,
       use_cache=True,
       dtype=torch.float64,
       device=None,
       prefer_dense=False,
       force_backend=None,
   )
   solver.update_matrix((index, value, m, n))
   x = solver.solve(b)
   print(solver.backend_name, x)

For more details, see the :doc:`API reference <../api/index>` and the docstrings
of :class:`cudass.CUDASparseSolver`.