Quick Start

This page walks through a minimal example: a 2×2 symmetric positive definite system \(A x = b\). You will see how to build the sparse matrix, create the solver, and obtain the solution.

The example

We solve:

\[\begin{split}\begin{pmatrix} 4 & 1 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} x_0 \\ x_1 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}\end{split}\]
 1  import torch
 2  from cudass import CUDASparseSolver, MatrixType
 3
 4  # 1. Sparse matrix A in COO: (index [2,nnz], value [nnz], m, n)
 5  #    A[0,0]=4, A[0,1]=1, A[1,0]=1, A[1,1]=3
 6  index = torch.tensor([[0, 0, 1, 1], [0, 1, 0, 1]], device="cuda", dtype=torch.int64)
 7  value = torch.tensor([4.0, 1.0, 1.0, 3.0], device="cuda", dtype=torch.float64)
 8  m, n = 2, 2
 9
10  # 2. Right-hand side b [m]
11  b = torch.tensor([1.0, 2.0], device="cuda", dtype=torch.float64)
12
13  # 3. Create solver, set A, solve
14  solver = CUDASparseSolver(matrix_type=MatrixType.SPD, use_cache=True)
15  solver.update_matrix((index, value, m, n))
16  x = solver.solve(b)
17
18  print(x)                    # tensor([0.0909, 0.6364], device='cuda:0', dtype=torch.float64)
19  print(solver.backend_name)  # 'cudss' or 'cusolver_dn'
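
The printed solution can be checked by hand: for a 2×2 system, Cramer's rule gives \(x\) directly. The following pure-Python check (no GPU or cudass required) confirms the values shown above:

```python
# Sanity check: solve the same 2x2 system A x = b by Cramer's rule.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

det = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # 4*3 - 1*1 = 11
x0 = (b[0] * A[1][1] - A[0][1] * b[1]) / det  # (1*3 - 1*2) / 11 = 1/11
x1 = (A[0][0] * b[1] - b[0] * A[1][0]) / det  # (4*2 - 1*1) / 11 = 7/11

print(round(x0, 4), round(x1, 4))  # 0.0909 0.6364
```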

What each part does

Lines 6–8 — Sparse matrix in COO

  • index[0] = row indices [0, 0, 1, 1], index[1] = column indices [0, 1, 0, 1].

  • value = [4, 1, 1, 3] for entries (0,0), (0,1), (1,0), (1,1).

  • m, n = 2, 2: 2×2 matrix. All tensors must be on CUDA and index must be int64.
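
To make the COO layout concrete, here is a small pure-Python sketch (plain lists standing in for the CUDA tensors) that scatters the same (index, value) triplets into a dense m×n array:

```python
# COO layout: index[0] holds row indices, index[1] holds column indices,
# and value[k] is the entry at position (index[0][k], index[1][k]).
index = [[0, 0, 1, 1], [0, 1, 0, 1]]
value = [4.0, 1.0, 1.0, 3.0]
m, n = 2, 2

dense = [[0.0] * n for _ in range(m)]
for row, col, val in zip(index[0], index[1], value):
    dense[row][col] = val

print(dense)  # [[4.0, 1.0], [1.0, 3.0]]
```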

Line 11 — Right-hand side

  • b has shape [m] = [2]. For multiple right-hand sides, use shape [m, k].
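
A batched right-hand side of shape [m, k] is equivalent to k independent solves, one per column. A pure-Python sketch of that equivalence for this 2×2 \(A\) (the helper solve2x2 is purely illustrative, not part of cudass):

```python
def solve2x2(A, b):
    """Solve a 2x2 system by Cramer's rule (illustration only)."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

A = [[4.0, 1.0], [1.0, 3.0]]
B = [[1.0, 0.5],   # column 0 is the b from the example,
     [2.0, 1.5]]   # column 1 is a second right-hand side

# Solving with an [m, k] right-hand side is column-by-column:
X = [solve2x2(A, [B[0][j], B[1][j]]) for j in range(2)]
print(X[0])  # solution for column 0, approximately [0.0909, 0.6364]
print(X[1])  # solution for column 1: [0.0, 0.5]
```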

Lines 14–16 — Solver and solve

  • CUDASparseSolver(matrix_type=MatrixType.SPD, ...): we declare \(A\) as SPD so the solver can use Cholesky (via cuDSS or cuSOLVER).

  • update_matrix((index, value, m, n)): sets \(A\) and factorizes it. Call again whenever \(A\) (or its sparsity) changes.

  • solve(b): returns \(x\) with shape [n] on the same device and dtype as b.
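
For intuition about what the SPD declaration buys: a Cholesky factorization writes \(A = L L^T\) with lower-triangular \(L\), and the solve reduces to two triangular substitutions. A minimal pure-Python sketch for this 2×2 case (an illustration of the idea, not the cudss/cuSOLVER internals):

```python
import math

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

# Cholesky factor A = L @ L.T for a 2x2 SPD matrix.
l00 = math.sqrt(A[0][0])            # 2.0
l10 = A[1][0] / l00                 # 0.5
l11 = math.sqrt(A[1][1] - l10**2)   # sqrt(2.75)

# Forward substitution: L y = b
y0 = b[0] / l00
y1 = (b[1] - l10 * y0) / l11

# Back substitution: L.T x = y
x1 = y1 / l11
x0 = (y0 - l10 * x1) / l00

print(x0, x1)  # approximately 0.0909 0.6364
```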

Line 19 — Backend

  • solver.backend_name is 'cudss' if the cuDSS backend is in use, or 'cusolver_dn' if it fell back to cuSOLVER Dense.

Running it

Save the script (e.g. quickstart.py) and run:

python quickstart.py

Ensure PyTorch sees CUDA (torch.cuda.is_available() == True) and that cudass is installed (see Getting Started).

Reusing the factorization

If \(A\) stays the same and only \(b\) changes, call solve multiple times without calling update_matrix again:

x1 = solver.solve(b)
b2 = torch.tensor([0.5, 1.5], device="cuda", dtype=torch.float64)
x2 = solver.solve(b2)

The factorization from the first update_matrix is reused (and cached when use_cache=True).
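
This is the classic factor-once, solve-many idiom: the expensive step depends only on \(A\), so each new \(b\) costs only cheap substitutions. A pure-Python sketch of the pattern (for a 2×2 matrix the "cached factorization" can simply be the precomputed inverse; a real sparse solver caches a factorization instead):

```python
A = [[4.0, 1.0], [1.0, 3.0]]

# "Factorize" once: precompute the 2x2 inverse from the determinant.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A_inv = [[ A[1][1] / det, -A[0][1] / det],
         [-A[1][0] / det,  A[0][0] / det]]

def solve(b):
    """Reuse the cached 'factorization' (here: A_inv) for any new b."""
    return [A_inv[0][0] * b[0] + A_inv[0][1] * b[1],
            A_inv[1][0] * b[0] + A_inv[1][1] * b[1]]

x1 = solve([1.0, 2.0])   # approximately [0.0909, 0.6364]
x2 = solve([0.5, 1.5])   # [0.0, 0.5]
```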

Next steps