
[BUG] sample in mps.jl broken when using CUDA #92

Closed
tipfom opened this issue Nov 13, 2024 · 4 comments · Fixed by #93
tipfom (Contributor) commented Nov 13, 2024

The following minimal working example crashes with a MethodError: no method matching DenseVector{Float64}(::UndefInitializer, ::Tuple{Int64}) in ITensorMPS version 0.3.1:

using CUDA
using ITensors, ITensorMPS, NDTensors

N = 10
sites = siteinds("Qubit", N)
ψ = NDTensors.cu(MPS(sites, repeat(["+"], N)))

sample!(ψ)

The error can be mitigated by copying the MPS to the CPU first, or by adapting the sample function in the following way:

function sample(rng::AbstractRNG, m::MPS)
  N = length(m)

  if orthocenter(m) != 1
    error("sample: MPS m must have orthocenter(m)==1")
  end
  if abs(1.0 - norm(m[1])) > 1E-8
    error("sample: MPS is not normalized, norm=$(norm(m[1]))")
  end

  result = zeros(Int, N)
  A = m[1]

  for j in 1:N
    s = siteind(m, j)
    d = dim(s)
    # Compute the probability of each state
    # one-by-one and stop when the random
    # number r is below the total prob so far
    pdisc = 0.0
    r = rand(rng)
    # Will need n,An, and pn below
    n = 1
    An = NDTensors.cu(ITensor()) # this line is changed
    pn = 0.0
    while n <= d
      projn = ITensor(s)
      projn[s => n] = 1.0
      An = A * dag(NDTensors.cu(projn)) # this line is changed
      pn = real(scalar(dag(An) * An))
      pdisc += pn
      (r < pdisc) && break
      n += 1
    end
    result[j] = n
    if j < N
      A = m[j + 1] * An
      A *= (1.0 / sqrt(pn))
    end
  end
  return result
end

I suspect there is a more performant fix, since the implementation above is rather slow.

mtfishman (Member) commented:

Thanks for the report. It looks like we are implicitly constructing some tensors on CPU inside that code. The fix you wrote is reasonable, though it can be made more general (covering other GPU backends as well) by using something like adapt(datatype(A), projn), as seen here: https://github.com/ITensor/ITensorMPS.jl/blob/v0.3.1/src/mps.jl#L818.
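A minimal standalone sketch of the adapt pattern being suggested (using only Adapt.jl on CPU arrays; in the actual fix, the target type would come from the NDTensors datatype helper applied to the MPS tensor, and the converted object would be the projector ITensor):

```julia
using Adapt

# adapt(T, x) rewraps x's data in array type T without changing values.
# The same call works whether T is a CuArray, a Metal array, or a plain
# Array, which is why it generalizes the CUDA-only NDTensors.cu fix.
v = Float64[1.0, 2.0, 3.0]
w = adapt(Array{Float32}, v)  # CPU-to-CPU conversion; same pattern as GPU

@assert w isa Vector{Float32}
@assert w == Float32[1.0, 2.0, 3.0]
```

In the sample code this would replace dag(NDTensors.cu(projn)) with dag(adapt(datatype(A), projn)), so the projector always lands on the same backend as A.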

mtfishman (Member) commented:

@kmp5VT could you make a PR?

mtfishman (Member) commented:

Also, I see a lot of places in that code that implicitly use double precision, like https://github.com/ITensor/ITensorMPS.jl/blob/v0.3.1/src/mps.jl#L685 (1.0 is implicitly Float64 in Julia). We should make that function more generic/agnostic about the scalar type in order to preserve the precision of the input.
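A small plain-Julia sketch of the scalar-type-generic style being suggested (the function name here is illustrative, not part of ITensorMPS):

```julia
# Instead of hard-coding Float64 literals like 1.0, derive constants from
# the input's element type, so Float32 inputs stay Float32 throughout.
function normalize_weight(pn::T) where {T<:Real}
    return one(T) / sqrt(pn)  # one(T) instead of the literal 1.0
end

@assert normalize_weight(0.25f0) isa Float32  # precision preserved
@assert normalize_weight(0.25) === 2.0       # 1 / sqrt(0.25) == 2.0
```

Applying this idea to sample would mean, for example, replacing A *= (1.0 / sqrt(pn)) with a constant built from the tensor's scalar type.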

tipfom (Contributor, author) commented Nov 13, 2024

Hi @mtfishman and @kmp5VT, I created a pull request incorporating the desired changes, which fixes the issue in my workflow (an A100 GPU with CUDA.jl). I also tested it with standard CPU-based MPS.
