Request for help #12

Open · goedman opened this issue Feb 2, 2019 · 17 comments

goedman commented Feb 2, 2019

Tamas, I was wondering if you would mind taking a look at m10.04d.jl?

Until chapter 10 in StatisticalRethinking.jl I've had no trouble adding the DynamicHMC versions in addition to the Turing and CmdStan versions, e.g. m10.04t.jl and m10.04s.jl for model 10.4.

I followed the approach I'd used successfully for m10.02d1.jl, which works fine, but in this case I get

julia> include("/Users/rob/.julia/dev/StatisticalRethinking/scripts/10/m10.04d.jl")

MCMC, adapting ϵ (75 steps)
0.011 s/step ...done
MCMC, adapting ϵ (25 steps)
0.0088 s/step ...done
MCMC, adapting ϵ (50 steps)
ERROR: LoadError: ArgumentError: isfinite(value) && all(isfinite, gradient) || value == -Inf must hold.
Stacktrace:
 [1] LogDensityProblems.ValueGradient{Float64,Array{Float64,1}}(::Float64, ::Array{Float64,1}) at /Users/rob/.julia/packages/ArgCheck/BUMkA/src/checks.jl:165
 [2] Type at /Users/rob/.julia/packages/LogDensityProblems/F34aW/src/LogDensityProblems.jl:73 [inlined]
 [3] logdensity(::Type{LogDensityProblems.ValueGradient}, ::LogDensityProblems.ForwardDiffLogDensity{TransformedLogDensity{TransformVariables.TransformNamedTuple{(:β, :α),Tuple{TransformVariables.ArrayTransform{TransformVariables.Identity,1},TransformVariables.ArrayTransform{TransformVariables.Identity,1}}},m_10_04d_model{Array{Int64,1},Array{Int64,2},Array{Int64,1}}},ForwardDiff.GradientConfig{ForwardDiff.Tag{getfield(LogDensityProblems, Symbol("##3#4")){TransformedLogDensity{TransformVariables.TransformNamedTuple{(:β, :α),Tuple{TransformVariables.ArrayTransform{TransformVariables.Identity,1},TransformVariables.ArrayTransform{TransformVariables.Identity,1}}},m_10_04d_model{Array{Int64,1},Array{Int64,2},Array{Int64,1}}}},Float64},Float64,9,Array{ForwardDiff.Dual{ForwardDiff.Tag{getfield(LogDensityProblems, Symbol("##3#4")){TransformedLogDensity{TransformVariables.TransformNamedTuple{(:β, :α),Tuple{TransformVariables.ArrayTransform{TransformVariables.Identity,1},TransformVariables.ArrayTransform{TransformVariables.Identity,1}}},m_10_04d_model{Array{Int64,1},Array{Int64,2},Array{Int64,1}}}},Float64},Float64,9},1}}}, ::Array{Float64,1}) at /Users/rob/.julia/packages/LogDensityProblems/F34aW/src/AD_ForwardDiff.jl:45

<snipped>

Different starting values in θ = (β = [1.0, 0.0], α = [-1.0, 10.0, -1.0, -1.0, -1.0, 0.0, 2.0]) do not solve the issue, so I wonder whether this way of modeling such problems is correct.

Thanks for your help (if doable!),
Rob

tpapp (Owner) commented Feb 2, 2019

Will look into it today.

tpapp (Owner) commented Feb 2, 2019

I think it is an issue with AD producing NaNs or similar. Ideally one would investigate, but I added a quick fix in the form of a wrapper and opened a PR in the repo.
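A minimal sketch of how the wrapper would be applied, assuming the p model struct and the problem_transformation helper from the m10.04d.jl script (the surrounding code here is only illustrative, based on snippets elsewhere in this thread):

# sketch only: wrap the ForwardDiff gradient so evaluations that error out are
# rejected by the sampler instead of aborting the whole run
P = TransformedLogDensity(problem_transformation(p), p)
∇P = LogDensityRejectErrors(ADgradient(:ForwardDiff, P))
chain, NUTS_tuned = NUTS_init_tune_mcmc(∇P, 1000)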

Please review and let me know if this works.

tpapp (Owner) commented Feb 2, 2019

Also, you can investigate corner cases of log densities with LogDensityProblems.stresstest.

goedman (Author) commented Feb 2, 2019

As I mentioned in the SR pull request comment, I will try several more models and compare the results. I wonder if rejecting the cases where AD produces NaNs might influence the resulting densities. Maybe stress testing will help out here.

If that's ok, I'll try some more models before closing this issue.

tpapp (Owner) commented Feb 2, 2019

That's fine, keep it open as long as you like and feel free to ask if there is anything I can help with.

goedman (Author) commented Feb 10, 2019

Tamas, in trying to dig a bit deeper into the posterior draws from m10.4d.jl, I have applied stresstest:

LogDensityProblems.stresstest(P, N=1000, scale=1.0)
0-element Array{Array{Float64,1},1}

P in this case is defined as:

P = TransformedLogDensity(problem_transformation(p), p)
#∇P = LogDensityRejectErrors(ADgradient(:ForwardDiff, P));
∇P = ADgradient(:ForwardDiff, P);

Interestingly, interchanging the ∇P options in m12.6d.jl gives spot-on (and unchanged) results, which I interpret as some support for my earlier hypothesis that rejecting the errors might create a bias.

Am I correct to assume that stresstest, if applied correctly, would show parameter values where NaNs occur? I was hoping to be able to plot those regions as part of the debugging process.

tpapp (Owner) commented Feb 10, 2019

You should try stresstest with the ∇P (plain vanilla ADgradient, no LogDensityRejectErrors). It should indeed return the vector of locations that are not OK (e.g. evaluate to NaNs). Plotting is a good strategy.
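A sketch of how that might look, reusing the call shape already shown in this thread (the Plots.jl scatter call is only an illustrative assumption):

# sketch: stress-test the plain AD gradient and inspect the failing locations
∇P = ADgradient(:ForwardDiff, P)   # no LogDensityRejectErrors wrapper
bad = LogDensityProblems.stresstest(∇P, N=1000, scale=1.0)

# each element of bad is a parameter vector at which the value/gradient is not OK;
# plotting two of the coordinates can show whether the failures cluster in a region
using Plots
scatter(getindex.(bad, 1), getindex.(bad, 2), xlabel="θ[1]", ylabel="θ[2]")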

goedman (Author) commented Feb 10, 2019

As you also suggest, I had commented out the LogDensityRejectErrors line and replaced it with the vanilla ADgradient:

P = TransformedLogDensity(problem_transformation(p), p)
#∇P = LogDensityRejectErrors(ADgradient(:ForwardDiff, P));
∇P = ADgradient(:ForwardDiff, P);

and it fails on the NUTS_init_tune_mcmc(∇P, 1000) line.
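One way to pin this down might be to evaluate the value and gradient directly at a single point, using the call signature visible in the stack trace earlier in this issue; the zero vector below is just an arbitrary trial point (this model has 9 parameters: 2 in β and 7 in α):

# sketch: evaluate value and gradient at one 9-dimensional point;
# at a problematic point this throws the same error shown above,
# which identifies the offending parameter values
x = zeros(9)
vg = LogDensityProblems.logdensity(LogDensityProblems.ValueGradient, ∇P, x)
vg.value, vg.gradient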

I'll experiment a bit more.

goedman (Author) commented Mar 11, 2019

Tamas, I noticed (and tested) your updates to DynamicHMC and LogDensityProblems. Could I ask for a little push to get stresstest() working, so I can try to resolve the only model that continues to give different results in DynamicHMC vs. Turing & CmdStan?

In m10.4d1.jl (results illustrated in m10.4d1) I've tried to use stresstest() and different AD methods.

If I enable do_stresstest it will show:

julia> include("/Users/rob/.julia/dev/DynamicHMCModels/scripts/10/m10.04d1.jl")
MCMC, adapting ϵ (75 steps)
ERROR: Error while loading expression starting at /Users/rob/.julia/dev/DynamicHMCModels/scripts/10/m10.04d1.jl:87
caused by [exception 1]
InvalidLogDensityException: value is Dual{ForwardDiff.Tag{getfield(LogDensityProblems, Symbol("##3#4")){TransformedLogDensity{TransformVariables.TransformNamedTuple{(:β, :α),Tuple{TransformVariables.ArrayTransform{TransformVariables.Identity,1},TransformVariables.ArrayTransform{TransformVariables.Identity,1}}},m_10_04d_model{Array{Int64,1},Array{Int64,2},Array{Int64,1}}}},Float64}}(NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN)

Using LogDensityRejectErrors provides very stable results, and the plots of the chains look reasonable (I've also tried 3 and 4 chains). It is true that the pooled estimate of a[2] is a bit higher in scale, but it is surprising that this is the only model where I've seen a problem.

In particular the sd values are very high; in Turing, using NUTS, that is usually an indication that the adaptation draws are included, but I don't think the DynamicHMC posterior includes those.

tpapp (Owner) commented Mar 11, 2019

If I understand correctly, there are two issues:

  1. the script errors,

  2. the results are different.

Did these happen after the recent (minor) updates to LogDensityProblems and DynamicHMC, or did you have them before?

goedman (Author) commented Mar 11, 2019

Correct. W.r.t. issue 2, normally (for all other models) I use LogDensityRejectErrors(ADgradient(:ForwardDiff, P)) and the results are spot on.

W.r.t. issue 1, I have experimented with stresstest (by setting do_stresstest = true) only for the m10.4d1.jl model. For the last few weeks it has failed on the ADgradient() call with the script error above.

tpapp (Owner) commented Mar 11, 2019

Are you using the latest LogDensityProblems and DynamicHMC?

I could not reproduce the errors above.

Also, I am not sure how I would detect the discrepancy; should I look at the plots?

goedman (Author) commented Mar 11, 2019

In the script the results from Turing and CmdStan are included:

rethinking = "
      mean   sd  5.5% 94.5% n_eff Rhat
a[1] -0.74 0.27 -1.19 -0.31  2899    1
a[2] 10.77 5.20  4.60 20.45  1916    1
a[3] -1.05 0.28 -1.50 -0.62  3146    1
a[4] -1.05 0.28 -1.50 -0.61  3525    1
a[5] -0.73 0.28 -1.17 -0.28  3637    1
a[6]  0.22 0.27 -0.21  0.67  3496    1
a[7]  1.82 0.41  1.21  2.50  3202    1
bp    0.83 0.27  0.42  1.27  2070    1
bpC  -0.13 0.31 -0.62  0.34  3430    1
";

I tried this morning with:

[[DynamicHMC]]
deps = ["ArgCheck", "DataStructures", "DocStringExtensions", "LinearAlgebra", "LogDensityProblems", "Parameters", "Random", "Statistics", "StatsFuns", "Test", "TransformVariables"]
git-tree-sha1 = "f9c9e6e703afccb02a1179e4886c97134ff20b55"
uuid = "bbc10e6e-7c05-544b-b16e-64fede858acb"
version = "1.0.4"

...

[[LogDensityProblems]]
deps = ["ArgCheck", "BenchmarkTools", "DiffResults", "DocStringExtensions", "Parameters", "Pkg", "Random", "Requires", "Test", "TransformVariables"]
git-tree-sha1 = "1ce5ebd16b5026a35cbf4daec66b2433b83d0d23"
uuid = "6fdf6af0-433a-55f7-b3ed-c6c6e0b8df7c"
version = "0.8.0"

in Manifest.toml.

goedman (Author) commented Mar 11, 2019

The results I'm getting:

Log evidence      = 0.0
Iterations        = 1:3000
Thinning interval = 1
Chains            = 1
Samples per chain = 3000
parameters        = bp, bpC

Empirical Posterior Estimates
───────────────────────────────────────────
parameters
     Mean    SD   Naive SE  MCSE     ESS   
 bp 1.5337 3.6815   0.0672 0.0886 1728.4017
bpC 0.4149 0.2460   0.0045 0.0038 3000.0000

Quantiles
───────────────────────────────────────────
parameters
      2.5%    25.0%   50.0%  75.0%  97.5% 
 bp -10.7577 -1.0319 1.4902 4.0067 16.9960
bpC  -0.4675  0.2435 0.4137 0.5796  1.3324

Log evidence      = 0.0
Iterations        = 1:3000
Thinning interval = 1
Chains            = 1
Samples per chain = 3000
pooled            = a[1], a[2], a[3], a[4], a[5], a[6], a[7]

Empirical Posterior Estimates
─────────────────────────────────────────────
pooled
       Mean    SD   Naive SE  MCSE     ESS   
a[1] -1.9824 3.6843   0.0673 0.0887 1727.1929
a[2]  9.8785 6.0428   0.1103 0.1973  937.6438
a[3] -2.2879 3.6866   0.0673 0.0899 1681.8172
a[4] -2.2809 3.6940   0.0674 0.0893 1712.3698
a[5] -1.9804 3.6927   0.0674 0.0896 1698.3607
a[6] -1.0489 3.6885   0.0673 0.0885 1736.5395
a[7]  0.5142 3.6860   0.0673 0.0908 1647.0961

Quantiles
─────────────────────────────────────────────
pooled
       2.5%    25.0%   50.0%   75.0%   97.5% 
a[1] -17.1055 -4.4397 -1.9079  0.5685 10.6230
a[2]  -7.1898  5.7305  9.3678 13.3436 34.4317
a[3] -17.9862 -4.7485 -2.2723  0.2305 10.1737
a[4] -17.6815 -4.7411 -2.2326  0.2594 10.6519
a[5] -17.6976 -4.3925 -1.9362  0.5755  9.9957
a[6] -16.6376 -3.5379 -1.0127  1.5214 11.4556
a[7] -14.9577 -1.9840  0.5571  3.0386 12.6625

or, just printing the means of the posterior vector (from another run):

2-element Array{Array{Float64,1},1}:
 [1.3940051607581527, 0.4210166348245524]                                                                                                     
 [-1.844585401458457, 10.05388176750645, -2.1519882289247705, -2.14490281190503, -1.8458711437273487, -0.9082955910437809, 0.6615688096440819]
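(For reference, a hypothetical sketch of how such per-parameter means could be computed, assuming posterior is the vector of (β, α) named tuples obtained from the transformed chain; the variable name is an assumption:)

using Statistics
# element-wise means over the posterior draws of each parameter block
[mean(getfield.(posterior, :β)), mean(getfield.(posterior, :α))]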

goedman (Author) commented Mar 12, 2019

Tamas, when you say "I could not reproduce the errors above", does that path (with do_stresstest=true) work on your system? Or, if you use the LogDensityRejectErrors path, do you see estimates like those in my previous post?

tpapp (Owner) commented Apr 9, 2019

Apologies for not getting back to you about this yet; this is a busy time for me.

I really appreciate this test case. Various other users have reported subtle (or not-so-subtle) maladaptation and bias issues. I have a couple of guesses about this:

  1. bugs in AD, or just numerical corner cases, for which I created "(stress)test derivatives" (LogDensityProblems.jl#42),
  2. overly eager dense matrix adaptation, which can be spurious, to be addressed in "adapt a diagonal mass matrix by default" (DynamicHMC.jl#35),
  3. a possible bug in the adaptation code,
  4. a possible bug in the tree-building and/or acceptance rate calculation code.

To investigate 3 and 4, I will be working on tpapp/DynamicHMC.jl#30. I ask for your patience in the meantime. The test cases are greatly appreciated.

goedman (Author) commented Apr 9, 2019

Thank you Tamas. Absolutely no apologies needed! I simply can't think of anyone contributing as much to the Julia community as you do!

Any time I notice an update to one of the various packages I will continue to try it. For now I will always compare the results to at least Turing and Stan. When I started the work on StatisticalRethinking I had a time frame of about 2 years in mind, so from that point of view I'm not even halfway.
