Shifted MINRES Allocations Issue #937

farhadrclass · 2024-12-11T16:28:45Z

Description:

The MINRES solver unexpectedly allocates memory when using shifted computations, even though it should not. My tests indicate that the solver does not maintain zero allocations as expected, specifically when running the MINRES_lambda test case.

Steps to Reproduce:

Below is the test code used to verify memory allocation for the shifted MINRES solver:

for FC in (Float32, Float64)
  @testset "Data Type: $FC" begin

    A   = FC.(get_div_grad(18, 18, 18))  # Dimension m x n
    m, n = size(A)
    k   = div(n, 2)
    Au  = A[1:k,:]          # Dimension k x n
    Ao  = A[:,1:k]          # Dimension m x k
    b   = Ao * ones(FC, k)  # Dimension m
    c   = Au * ones(FC, n)  # Dimension k
    λ = FC(1.3)

    T = real(FC)
    shifts = T[1; 2; 3; 4; 5]
    nshifts = 5
    nbits_FC = sizeof(FC)  # 8 bits for ComplexF32 and 16 bits for ComplexF64
    nbits_T = sizeof(T) 

    @testset "MINRES" begin
      # MINRES needs:
      # 6 n-vectors: x, r1, r2, w1, w2, y
      storage_minres_bytes(n) = nbits_FC * 6 * n

      expected_minres_bytes = storage_minres_bytes(n)
      minres(A, b)  # warmup
      actual_minres_bytes = @allocated minres(A, b)
      @test expected_minres_bytes ≤ actual_minres_bytes ≤ 1.02 * expected_minres_bytes

      solver = MinresSolver(A, b)
      minres!(solver, A, b)  # warmup
      inplace_minres_bytes = @allocated minres!(solver, A, b)
      @test inplace_minres_bytes == 0
    end

    @testset "MINRES_lambda" begin
      # MINRES needs:
      # 6 n-vectors: x, r1, r2, w1, w2, y
      storage_minres_bytes(n) = nbits_FC * 6 * n

      expected_minres_bytes = storage_minres_bytes(n)
      minres(A, b)  # warmup
      actual_minres_bytes = @allocated minres(A, b)
      @test expected_minres_bytes ≤ actual_minres_bytes ≤ 1.02 * expected_minres_bytes

      solver = MinresSolver(A, b)
      minres!(solver, A, b, λ = λ)  # warmup
      inplace_minres_bytes = @allocated minres!(solver, A, b, λ = λ)
      @test inplace_minres_bytes == 0
    end
  end
end

Observed Results:

Testing Running tests...
MINRES_lambda: Test Failed at C:\Users\Farhad\Documents\Github\Farhad-Phd\Krylov.jl\test\test_allocations.jl:149
  Expression: inplace_minres_bytes == 0
   Evaluated: 160 == 0

Test Summary:           | Pass  Fail  Error  Total  Time
allocations             |   12     2      2     16  9.7s
  Data Type: Float32    |    3     1             4  3.8s
    MINRES              |    2                   2  1.4s
    MINRES_lambda       |    1     1             2  0.5s
  Data Type: Float64    |    3     1             4  1.8s
    MINRES              |    2                   2  0.9s
    MINRES_lambda       |    1     1             2  0.2s

Expected Results:

The solver should allocate no additional memory when using minres! with a shift parameter (λ). Specifically, the inplace_minres_bytes should equal 0.

Impact:

This issue affects the expected performance and efficiency of the solver, particularly in memory-constrained environments.

Additional Information:

The observed allocation of 160 bytes occurs consistently during the MINRES_lambda test case for both Float32 and Float64 data types.

The text was updated successfully, but these errors were encountered:

amontoison · 2024-12-11T17:39:28Z

Impact:

This issue affects the expected performance and efficiency of the solver, particularly in memory-constrained environments.

I will fix it but it's just 160 bits Farhad. :)
The impact should be 0.000001% on the performance.

amontoison · 2024-12-11T17:50:37Z

@farhadrclass
The function doesn't allocate with a shift parameter, the allocations that you see are related to the named tuple used to pass keyword arguments to a Julia function.
You can use the macro @ballocated for more advanced profiling.
I also suppose that you didn't tested too much before saying that this issue affects the performance.

using BenchmarkTools

for FC in (Float32, Float64)
  @testset "Data Type: $FC" begin

    A   = FC.(get_div_grad(18, 18, 18))  # Dimension m x n
    m, n = size(A)
    k   = div(n, 2)
    Au  = A[1:k,:]          # Dimension k x n
    Ao  = A[:,1:k]          # Dimension m x k
    b   = Ao * ones(FC, k)  # Dimension m
    c   = Au * ones(FC, n)  # Dimension k
    λ = FC(1.3)

    T = real(FC)
    shifts = T[1; 2; 3; 4; 5]
    nshifts = 5
    nbits_FC = sizeof(FC)  # 8 bits for ComplexF32 and 16 bits for ComplexF64
    nbits_T = sizeof(T) 

    @testset "MINRES" begin
      # MINRES needs:
      # 6 n-vectors: x, r1, r2, w1, w2, y
      storage_minres_bytes(n) = nbits_FC * 6 * n

      expected_minres_bytes = storage_minres_bytes(n)
      minres(A, b)  # warmup
      actual_minres_bytes = @allocated minres(A, b)
      @test expected_minres_bytes ≤ actual_minres_bytes ≤ 1.02 * expected_minres_bytes

      solver = MinresSolver(A, b)
      minres!(solver, A, b)  # warmup
      inplace_minres_bytes = @allocated minres!(solver, A, b)
      @test inplace_minres_bytes == 0
    end

    @testset "MINRES_lambda" begin
      # MINRES needs:
      # 6 n-vectors: x, r1, r2, w1, w2, y
      storage_minres_bytes(n) = nbits_FC * 6 * n

      expected_minres_bytes = storage_minres_bytes(n)
      minres(A, b)  # warmup
      actual_minres_bytes = @ballocated minres($A, $b)
      @test expected_minres_bytes ≤ actual_minres_bytes ≤ 1.02 * expected_minres_bytes

      solver = MinresSolver(A, b)
      minres!(solver, A, b; λ)  # warmup
      inplace_minres_bytes = @ballocated minres!($solver, $A, $b, λ=$λ)
      @test inplace_minres_bytes == 0
    end
  end
end

Test Summary:      | Pass  Total   Time
Data Type: Float32 |    4      4  21.6s
Test Summary:      | Pass  Total   Time
Data Type: Float64 |    4      4  24.6s

farhadrclass · 2024-12-11T19:40:03Z

Thanks @amontoison

The issue I have is that when I use MINRES in my solver: https://github.com/Farhad-phd/JSOSolvers.jl/blob/iR2/src/R2N.jl
I get allocation when I use minres and cg_solvers in general

Subsolver	Time (μs)	Allocations	Memory
ShiftedLBFGSSolver	72.800	268	9.50 KiB
CgSolver	94.000	899	63.62 KiB
MinresSolver	49.800	443	40.17 KiB

I still need to clean the code a bit but the allocation is there

amontoison · 2024-12-11T19:47:37Z

The allocation could be in Krylov.jl but you have to prove it :)
Please profile the code and ping me again if you can isolate the culprit.
You can see that you also have allocations when it's not a KrylovSolver so I suspect that the issue is in your code.

farhadrclass · 2024-12-11T19:51:29Z

@amontoison exactly, I am trying to remove my allocation part of the code, however the 3 algorithm uses the same code except the call to subsolver, so I assumed 268 is from there. For CgSolver the main issue is that I need to shift the H with \sgima*I which creates the memory and I am fixing that right now. I was expecting the MinresSolver however to have around 268 or less allocation but that is not the case.

Would it be possible to check on Minres end and see if allocate when using shift ?

amontoison · 2024-12-11T19:53:45Z

I checked it with my unit tests above (no allocation with @ballocated) and I verified quickly the code, there is no reason that it allocates.

amontoison · 2024-12-23T18:19:07Z

@farhadrclass
The number of outer iterations in your optimization method depends on how accurately the linear subproblem is solved.
You should also have 268 allocations with CG, MINRES or Shifted-LBFGS if you perform EXACTLY the same number of iterations in your outer loop.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Shifted MINRES Allocations Issue #937

Shifted MINRES Allocations Issue #937

farhadrclass commented Dec 11, 2024

amontoison commented Dec 11, 2024 •

edited

Loading

amontoison commented Dec 11, 2024 •

edited

Loading

farhadrclass commented Dec 11, 2024

amontoison commented Dec 11, 2024

farhadrclass commented Dec 11, 2024

amontoison commented Dec 11, 2024

amontoison commented Dec 23, 2024 •

edited

Loading

Shifted MINRES Allocations Issue #937

Shifted MINRES Allocations Issue #937

Comments

farhadrclass commented Dec 11, 2024

Description:

Steps to Reproduce:

Observed Results:

Expected Results:

Impact:

Additional Information:

amontoison commented Dec 11, 2024 • edited Loading

amontoison commented Dec 11, 2024 • edited Loading

farhadrclass commented Dec 11, 2024

amontoison commented Dec 11, 2024

farhadrclass commented Dec 11, 2024

amontoison commented Dec 11, 2024

amontoison commented Dec 23, 2024 • edited Loading

amontoison commented Dec 11, 2024 •

edited

Loading

amontoison commented Dec 11, 2024 •

edited

Loading

amontoison commented Dec 23, 2024 •

edited

Loading