API Reference

class cudass.CUDASparseSolver(matrix_type: MatrixType, use_cache: bool = True, dtype: dtype = torch.float64, device: device | None = None, prefer_dense: bool = False, force_backend: str | None = None)

Bases: object

High-performance CUDA sparse linear solver.

Supports multiple matrix types through three backends: cuDSS (primary), cuSOLVER Dense (fallback for singular and rectangular matrices), and cuSolverSp (out-of-memory fallback).

Parameters:

matrix_type – Matrix type (must be specified explicitly).
use_cache – Whether to cache factorizations.
dtype – Floating point precision (torch.float32 or torch.float64).
device – CUDA device (auto-detected from inputs if None).
prefer_dense – If True, prefer cusolver_dn over cudss when applicable.
force_backend – If set to ‘cudss’, ‘cusolver_dn’, or ‘cusolver_sp’, use that backend and do not fall back to another.

Raises:

RuntimeError – If CUDA is not available.

property backend_name: str

Backend in use: ‘cudss’, ‘cusolver_dn’, ‘cusolver_sp’, or ‘stub’ before the first solve.

Returns:

Backend name.

Return type:

str

solve(b: Tensor) → Tensor

Solve Ax = b using the matrix A set by the most recent update_matrix call.

Parameters:

b – RHS vector/matrix, shape [m] or [m, k], same dtype as A, on CUDA.

Returns:

Solution x, shape [n] or [n, k], same dtype and device as b.

Raises:

RuntimeError – If the backend solver fails or a CUDA error occurs.
ValueError – If no matrix has been set (call update_matrix first), or if shapes, dtypes, or devices are invalid.

update_matrix(A_sparse: Tuple[Tensor, Tensor, int, int], structure_changed: bool | None = None) → None

Set or update the matrix A, then factorize it and cache the result.

Call before the first solve; call again whenever A changes.

Parameters:

A_sparse – Sparse matrix tuple (index, value, m, n).
structure_changed – If True, the sparsity pattern changed (full refactorization); if False, only the values changed (fast update); if None, the change is auto-detected by comparing against the previous A.

Raises:

ValueError – If the matrix shape or type is incompatible with the solver, or if dtypes or devices are invalid.
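The set-then-solve workflow described above can be sketched as follows. This is a minimal illustration, not a tested recipe: it assumes that cudass and a CUDA build of PyTorch are installed, and that the index tensor is a COO index of shape [2, nnz] with dtype int64 (the layout documented for cudass.cuda.kernels.sparse_to_dense). The helper names tridiagonal_coo and solve_tridiagonal are hypothetical.

```python
def tridiagonal_coo(n, diag=2.0, off=-1.0):
    """Return (rows, cols, vals) of an n x n tridiagonal SPD matrix as plain lists."""
    rows, cols, vals = [], [], []
    for i in range(n):
        for j, v in ((i - 1, off), (i, diag), (i + 1, off)):
            if 0 <= j < n:
                rows.append(i)
                cols.append(j)
                vals.append(v)
    return rows, cols, vals


def solve_tridiagonal(n=4):
    """Solve A x = b on the GPU; needs cudass and a CUDA build of PyTorch."""
    # Imported lazily so the COO helper above stays usable without a GPU.
    import torch
    from cudass import CUDASparseSolver, MatrixType

    rows, cols, vals = tridiagonal_coo(n)
    device = torch.device("cuda")
    # Assumption: index is a [2, nnz] int64 COO index tensor.
    index = torch.tensor([rows, cols], dtype=torch.int64, device=device)
    value = torch.tensor(vals, dtype=torch.float64, device=device)

    solver = CUDASparseSolver(MatrixType.SPD, dtype=torch.float64)
    solver.update_matrix((index, value, n, n))  # first call: set A and factorize
    b = torch.ones(n, dtype=torch.float64, device=device)
    x = solver.solve(b)  # shape [n]

    # Same sparsity pattern, new values: request the value-only update.
    solver.update_matrix((index, 2.0 * value, n, n), structure_changed=False)
    x_scaled = solver.solve(b)
    return x, x_scaled, solver.backend_name
```

Passing structure_changed=False skips the auto-detection step; leaving it as None lets the solver compare the new matrix with the previous one.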

class cudass.MatrixProperties(shape: Tuple[int, int], is_square: bool, is_overdetermined: bool, is_underdetermined: bool, is_singular: bool | None = None)

Bases: object

Validated or derived matrix properties (e.g. from shape). For validation only.

is_overdetermined: bool

is_singular: bool | None = None

is_square: bool

is_underdetermined: bool

shape: Tuple[int, int]

class cudass.MatrixType(value)

Bases: Enum

Matrix type for solver selection. Real matrices only (float32/float64).

The user must specify the matrix type explicitly; it is never inferred. Shape constraints: square types require m == n; GENERAL_RECTANGULAR and GENERAL_RECTANGULAR_SINGULAR require m != n.

GENERAL = 'general'

GENERAL_RECTANGULAR = 'general_rectangular'

GENERAL_RECTANGULAR_SINGULAR = 'general_rectangular_singular'

GENERAL_SINGULAR = 'general_singular'

SPD = 'spd'

SYMMETRIC = 'symmetric'

SYMMETRIC_SINGULAR = 'symmetric_singular'

Matrix type enumeration and properties for the CUDA sparse solver.

class cudass.types.MatrixProperties(shape: Tuple[int, int], is_square: bool, is_overdetermined: bool, is_underdetermined: bool, is_singular: bool | None = None)

Bases: object

Validated or derived matrix properties (e.g. from shape). For validation only.

is_overdetermined: bool

is_singular: bool | None = None

is_square: bool

is_underdetermined: bool

shape: Tuple[int, int]

class cudass.types.MatrixType(value)

Bases: Enum

Matrix type for solver selection. Real matrices only (float32/float64).

The user must specify the matrix type explicitly; it is never inferred. Shape constraints: square types require m == n; GENERAL_RECTANGULAR and GENERAL_RECTANGULAR_SINGULAR require m != n.

GENERAL = 'general'

GENERAL_RECTANGULAR = 'general_rectangular'

GENERAL_RECTANGULAR_SINGULAR = 'general_rectangular_singular'

GENERAL_SINGULAR = 'general_singular'

SPD = 'spd'

SYMMETRIC = 'symmetric'

SYMMETRIC_SINGULAR = 'symmetric_singular'

cudass.types.validate_matrix_type_shape(matrix_type: MatrixType, m: int, n: int) → None

Validate that matrix_type is consistent with shape (m, n).

Parameters:

matrix_type – The declared matrix type.
m – Number of rows.
n – Number of columns.

Raises:

ValueError – If square types have m != n or rectangular types have m == n.
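The shape rule stated above (square types require m == n, the two rectangular types require m != n) can be sketched in plain Python. The enum below is a stand-in that mirrors the documented MatrixType members so the example runs without cudass, and check_type_shape is a hypothetical name, not the library function.

```python
from enum import Enum


class MatrixType(Enum):
    """Stand-in mirroring the documented cudass.MatrixType members."""
    GENERAL = "general"
    GENERAL_RECTANGULAR = "general_rectangular"
    GENERAL_RECTANGULAR_SINGULAR = "general_rectangular_singular"
    GENERAL_SINGULAR = "general_singular"
    SPD = "spd"
    SYMMETRIC = "symmetric"
    SYMMETRIC_SINGULAR = "symmetric_singular"


RECTANGULAR_TYPES = {
    MatrixType.GENERAL_RECTANGULAR,
    MatrixType.GENERAL_RECTANGULAR_SINGULAR,
}


def check_type_shape(matrix_type, m, n):
    """Raise ValueError when the declared type contradicts the shape (m, n)."""
    if matrix_type in RECTANGULAR_TYPES:
        if m == n:
            raise ValueError(f"{matrix_type.name} requires m != n, got ({m}, {n})")
    elif m != n:
        raise ValueError(f"{matrix_type.name} requires m == n, got ({m}, {n})")
```

Calling check_type_shape(MatrixType.SPD, 3, 4) raises ValueError, while check_type_shape(MatrixType.GENERAL_RECTANGULAR, 4, 3) returns None.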

Main solver interface for the CUDA sparse linear solver.

class cudass.solver.CUDASparseSolver(matrix_type: MatrixType, use_cache: bool = True, dtype: dtype = torch.float64, device: device | None = None, prefer_dense: bool = False, force_backend: str | None = None)

Bases: object

High-performance CUDA sparse linear solver.

Supports multiple matrix types through three backends: cuDSS (primary), cuSOLVER Dense (fallback for singular and rectangular matrices), and cuSolverSp (out-of-memory fallback).

Parameters:

matrix_type – Matrix type (must be specified explicitly).
use_cache – Whether to cache factorizations.
dtype – Floating point precision (torch.float32 or torch.float64).
device – CUDA device (auto-detected from inputs if None).
prefer_dense – If True, prefer cusolver_dn over cudss when applicable.
force_backend – If set to ‘cudss’, ‘cusolver_dn’, or ‘cusolver_sp’, use that backend and do not fall back to another.

Raises:

RuntimeError – If CUDA is not available.

property backend_name: str

Backend in use: ‘cudss’, ‘cusolver_dn’, ‘cusolver_sp’, or ‘stub’ before the first solve.

Returns:

Backend name.

Return type:

str

solve(b: Tensor) → Tensor

Solve Ax = b using the matrix A set by the most recent update_matrix call.

Parameters:

b – RHS vector/matrix, shape [m] or [m, k], same dtype as A, on CUDA.

Returns:

Solution x, shape [n] or [n, k], same dtype and device as b.

Raises:

RuntimeError – If the backend solver fails or a CUDA error occurs.
ValueError – If no matrix has been set (call update_matrix first), or if shapes, dtypes, or devices are invalid.

update_matrix(A_sparse: Tuple[Tensor, Tensor, int, int], structure_changed: bool | None = None) → None

Set or update the matrix A, then factorize it and cache the result.

Call before the first solve; call again whenever A changes.

Parameters:

A_sparse – Sparse matrix tuple (index, value, m, n).
structure_changed – If True, the sparsity pattern changed (full refactorization); if False, only the values changed (fast update); if None, the change is auto-detected by comparing against the previous A.

Raises:

ValueError – If the matrix shape or type is incompatible with the solver, or if dtypes or devices are invalid.

Backends

Backend selection: select_backend and create_backend.

cudass.backends.factory.create_backend(matrix_type: MatrixType, shape: Tuple[int, int], device: device, dtype: dtype, use_cache: bool = True, cache: object | None = None, prefer_dense: bool = False, force_backend: str | None = None) → BackendBase

Instantiate the backend for the given matrix type and shape.

Parameters:

matrix_type – MatrixType for the linear system.
shape – (m, n) matrix dimensions.
device – CUDA device for tensors.
dtype – torch.float32 or torch.float64.
use_cache – Whether to cache factorizations.
cache – FactorizationCache instance (or None).
prefer_dense – If True, select cusolver_dn when otherwise cudss would be chosen.
force_backend – If set to ‘cudss’, ‘cusolver_dn’, or ‘cusolver_sp’, use that backend and do not fall back (e.g. missing cudss bindings raise an error).

Returns:

cudss, cusolver_dn, or cusolver_sp backend instance.

Return type:

BackendBase

Raises:

ValueError – If force_backend or the selected backend name is not one of ‘cudss’, ‘cusolver_dn’, or ‘cusolver_sp’.
RuntimeError – If force_backend is ‘cudss’ and cudss_bindings are not available (there is no fallback when force_backend is set).

cudass.backends.factory.select_backend(matrix_type: MatrixType, shape: Tuple[int, int], prefer_dense: bool = False) → str

Choose backend name from matrix type and shape.

Parameters:

matrix_type – MatrixType (GENERAL, SYMMETRIC, SPD, GENERAL_RECTANGULAR, etc.).
shape – (m, n) matrix dimensions.
prefer_dense – If True, prefer cusolver_dn over cudss when applicable.

Returns:

“cudss”, “cusolver_dn”, or “cusolver_sp”.
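The exact selection rules are not spelled out in this reference, so the following is only a guess at their general shape, built from two statements elsewhere in these docs: cuDSS is the primary backend with cuSOLVER Dense as the fallback for singular and rectangular matrices, and prefer_dense selects cusolver_dn where cudss would otherwise be chosen. The function pick_backend is hypothetical and takes the MatrixType value string; the real select_backend may differ, for example by routing GENERAL_RECTANGULAR to cuDSS, which the cuDSS backend lists as tentatively supported.

```python
# Assumption: these MatrixType values are routed to the dense fallback.
SINGULAR_OR_RECTANGULAR = {
    "general_singular",
    "symmetric_singular",
    "general_rectangular",
    "general_rectangular_singular",
}


def pick_backend(matrix_type_value, shape, prefer_dense=False):
    """Guess a backend name from a MatrixType value string and (m, n)."""
    m, n = shape
    # Singular or non-square systems go to the dense backend in this sketch.
    if matrix_type_value in SINGULAR_OR_RECTANGULAR or m != n:
        return "cusolver_dn"
    # prefer_dense overrides the default sparse direct solver.
    return "cusolver_dn" if prefer_dense else "cudss"
```

This sketch never returns "cusolver_sp": that backend is described as an out-of-memory fallback, which would be decided at run time, not from the type and shape alone.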

class cudass.backends.base.BackendBase

Bases: ABC

Abstract base class for solver backends, to be implemented by the cudss, cusolver_dn, and cusolver_sp backends.

abstract property backend_name: str

Backend identifier: ‘cudss’, ‘cusolver_dn’, or ‘cusolver_sp’.

abstract property device: device

abstract property dtype: dtype

abstractmethod solve(b: Tensor) → Tensor

abstractmethod update_matrix(A_sparse: Tuple[Tensor, Tensor, int, int], structure_changed: bool = False) → None

cuDSS backend: COO-to-CSR conversion, ANALYSIS/FACTORIZATION/SOLVE phases, multiple right-hand sides.

Prefers our cudss_bindings; falls back to nvmath’s cudss only if our bindings are not built/importable.
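The COO-to-CSR step mentioned above can be illustrated with a small reference conversion in plain Python. It shows the standard counting approach (count entries per row, prefix-sum into row pointers, scatter) and is not the backend's GPU implementation; coo_to_csr is a hypothetical helper operating on lists.

```python
def coo_to_csr(rows, cols, vals, m):
    """Convert COO triplets to CSR (row_ptr, col_idx, data) for an m-row matrix."""
    # Count entries per row, then prefix-sum the counts into row pointers.
    row_ptr = [0] * (m + 1)
    for r in rows:
        row_ptr[r + 1] += 1
    for i in range(m):
        row_ptr[i + 1] += row_ptr[i]

    # Scatter each entry into its row's next free slot, keeping input order per row.
    next_slot = row_ptr[:-1].copy()
    col_idx = [0] * len(vals)
    data = [0.0] * len(vals)
    for r, c, v in zip(rows, cols, vals):
        k = next_slot[r]
        col_idx[k] = c
        data[k] = v
        next_slot[r] += 1
    return row_ptr, col_idx, data
```

For the 2 x 3 matrix [[1, 0, 2], [0, 3, 0]], row i occupies positions row_ptr[i] to row_ptr[i + 1] - 1 of col_idx and data. Column indices within a row stay in input order here, and duplicate entries are kept as separate entries.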

class cudass.backends.cudss_backend.CUDSSBackend(matrix_type: MatrixType, device: device, dtype: dtype, use_cache: bool = True, cache: Any | None = None)

Bases: BackendBase

cuDSS backend for GENERAL, SYMMETRIC, SPD, and tentative GENERAL_RECTANGULAR.

property backend_name: str

Backend identifier: ‘cudss’, ‘cusolver_dn’, or ‘cusolver_sp’.

property device: device

property dtype: dtype

solve(b: Tensor) → Tensor

update_matrix(A_sparse: Tuple[Tensor, Tensor, int, int], structure_changed: bool = False) → None

cuSOLVER Dense backend: densify, potrs/sytrs/getrs/gels/gesvd/syevd.

class cudass.backends.cusolver_dn_backend.CusolverDnBackend(matrix_type: MatrixType, device: device, dtype: dtype, use_cache: bool = True, cache: Any | None = None)

Bases: BackendBase

Dense backend using torch.linalg (cuSOLVER/cuBLAS on CUDA).

property backend_name: str

Backend identifier: ‘cudss’, ‘cusolver_dn’, or ‘cusolver_sp’.

property device: device

property dtype: dtype

solve(b: Tensor) → Tensor

update_matrix(A_sparse: Tuple[Tensor, Tensor, int, int], structure_changed: bool = False) → None

cudass.backends.cusolver_dn_backend.select_cusolver_routine(matrix_type: MatrixType, shape: Tuple[int, int]) → str

Select routine: potrs, sytrs, syevd, getrs, gels, gesvd.

Parameters:

matrix_type – MatrixType (SPD, SYMMETRIC, GENERAL, etc.).
shape – (m, n) matrix dimensions.

Returns:

One of ‘potrs’, ‘sytrs’, ‘syevd’, ‘getrs’, ‘gels’, ‘gesvd’.

Return type:

str
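This reference names the six routines but not which matrix type maps to which. The table below is a plausible mapping based only on what each LAPACK-style routine computes (potrs: Cholesky solve, sytrs: symmetric indefinite solve, getrs: LU solve, gels: least squares, gesvd: singular value decomposition, syevd: symmetric eigendecomposition). The mapping itself and the name guess_routine are assumptions; the shape argument of the real function is ignored here.

```python
# Assumed mapping from MatrixType value to routine name; not confirmed by the docs.
ROUTINE_BY_TYPE = {
    "spd": "potrs",                           # Cholesky solve
    "symmetric": "sytrs",                     # symmetric indefinite solve
    "symmetric_singular": "syevd",            # eigendecomposition based solve
    "general": "getrs",                       # LU solve
    "general_singular": "gesvd",              # SVD based solve
    "general_rectangular": "gels",            # least squares
    "general_rectangular_singular": "gesvd",  # SVD copes with rank deficiency
}


def guess_routine(matrix_type_value):
    """Return the assumed routine name for a MatrixType value string."""
    if matrix_type_value not in ROUTINE_BY_TYPE:
        raise ValueError(f"unknown matrix type: {matrix_type_value!r}")
    return ROUTINE_BY_TYPE[matrix_type_value]
```

Every value in the table is one of the six routine names listed in the return description above.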

cuSolverSp backend: out-of-memory (OOM) fallback. Stub until Phase 3.

class cudass.backends.cusolver_sp_backend.CusolverSpBackend(matrix_type: MatrixType, device: device, dtype: dtype, use_cache: bool = True, cache: Any | None = None)

Bases: BackendBase

Stub: to be implemented in cusolver_sp_backend (Phase 3).

property backend_name: str

Backend identifier: ‘cudss’, ‘cusolver_dn’, or ‘cusolver_sp’.

property device: device

property dtype: dtype

solve(b: Tensor) → Tensor

update_matrix(A_sparse: Tuple[Tensor, Tensor, int, int], structure_changed: bool = False) → None

Factories and utilities

Factorization cache: get/put/clear, LRU, device-aware, thread-safe.

class cudass.factorization.cache.FactorizationCache(max_size: int = 100)

Bases: object

Cache for solver factorizations. Device-aware, LRU eviction, thread-safe.

clear(device: device | None = None) → None

get(cache_key: str, device: device) → Any | None

put(cache_key: str, factorization: Any, device: device) → None
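To make the three stated properties concrete (device-aware, LRU eviction, thread-safe), here is a toy cache with the same get/put/clear method signatures. It is a sketch, not the library's implementation: LRUFactorizationCache is a hypothetical name, devices are compared by their string form, and max_size is assumed to be a single limit shared across all devices.

```python
import threading
from collections import OrderedDict


class LRUFactorizationCache:
    """Toy stand-in for a device-aware, thread-safe LRU cache."""

    def __init__(self, max_size=100):
        self._max_size = max_size
        self._lock = threading.Lock()
        self._entries = OrderedDict()  # (device string, cache_key) -> factorization

    def get(self, cache_key, device):
        with self._lock:
            key = (str(device), cache_key)
            if key not in self._entries:
                return None
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]

    def put(self, cache_key, factorization, device):
        with self._lock:
            key = (str(device), cache_key)
            self._entries[key] = factorization
            self._entries.move_to_end(key)
            while len(self._entries) > self._max_size:
                self._entries.popitem(last=False)  # evict least recently used

    def clear(self, device=None):
        with self._lock:
            if device is None:
                self._entries.clear()
                return
            # Drop only the entries that belong to the given device.
            for key in [k for k in self._entries if k[0] == str(device)]:
                del self._entries[key]
```

Keying on the (device, cache_key) pair keeps a factorization computed on one GPU from being returned for another, and a single lock around every operation is the simplest way to make the reordering done by get safe under concurrent use.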

RefactorizationManager: should_refactorize (value-only vs structure change).

class cudass.factorization.refactorization.RefactorizationManager

Bases: object

Determines if refactorization is needed and whether structure changed.

should_refactorize(old_A: Tuple[Tensor, Tensor, int, int] | None, new_A: Tuple[Tensor, Tensor, int, int]) → Tuple[bool, bool]

Return (needs_refactorization, structure_changed).

If old_A is None: (True, True).
If (m,n) or indices differ: (True, True).
If indices equal, values differ: (True, False).
If same structure and values: (False, False).

Parameters:

old_A – Previous (index, value, m, n) or None.
new_A – Current (index, value, m, n).

Returns:

(needs_refactorization, structure_changed).

Return type:

Tuple[bool, bool]
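The four documented cases translate directly into a decision function. The sketch below uses plain Python lists in place of tensors so that it runs without PyTorch; with real tensors the equality checks would need an element-wise comparison such as torch.equal. The standalone function form is an illustration, not the class method itself.

```python
def should_refactorize(old_A, new_A):
    """Return (needs_refactorization, structure_changed) for (index, value, m, n) tuples."""
    # No previous matrix: everything has to be computed from scratch.
    if old_A is None:
        return True, True
    old_index, old_value, old_m, old_n = old_A
    new_index, new_value, new_m, new_n = new_A
    # A different shape or sparsity pattern forces a full refactorization.
    if (old_m, old_n) != (new_m, new_n) or old_index != new_index:
        return True, True
    # Same pattern, different values: the value-only update is enough.
    if old_value != new_value:
        return True, False
    # Identical matrix: the existing factorization can be reused.
    return False, False
```

The order of the checks matters: comparing shapes and indices before values means a changed pattern is never misreported as a value-only change.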

Kernels

cudass.cuda.kernels.sparse_to_dense(index: Tensor, value: Tensor, m: int, n: int, out: Tensor | None = None) → Tensor

Convert a COO matrix (index [2, nnz], value [nnz]) to a dense [m, n] tensor on the GPU.

Parameters:

index – COO indices [2, nnz], int64, CUDA.
value – COO values [nnz], float32/float64, CUDA.
m – Number of rows.
n – Number of columns.
out – Optional output tensor [m, n]; if None, allocated.

Returns:

Dense tensor [m, n], same dtype and device as value (or out).

Raises:

RuntimeError – If sparse_to_dense kernel is not built.
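As a CPU reference for what the kernel computes, the conversion can be written in a few lines of plain Python. The docs do not say how duplicate (row, column) entries are handled; this sketch assumes they are summed, which is the usual COO convention. The function coo_to_dense is a hypothetical helper working on nested lists, not the CUDA kernel.

```python
def coo_to_dense(index, value, m, n):
    """Build a dense m x n nested list from COO index ([rows, cols]) and values."""
    rows, cols = index
    dense = [[0.0] * n for _ in range(m)]
    for r, c, v in zip(rows, cols, value):
        # Assumption: duplicate coordinates accumulate instead of overwriting.
        dense[r][c] += v
    return dense
```

A reference like this is mainly useful for checking GPU output on small matrices, where the dense result can be compared entry by entry.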