Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ValueSupport to allow non-integer discrete support #941

Closed
wants to merge 36 commits into from
Closed

Fix ValueSupport to allow non-integer discrete support #941

wants to merge 36 commits into from

Conversation

richardreeve
Copy link
Contributor

@richardreeve richardreeve commented Jul 26, 2019

I've made this PR to provide a better way of handling non-integer discrete support, since it doesn't currently exist in Distributions. There are cleverer things that could be done, but this seems to work and so far provides all of the functionality people are talking about that I've seen.

This PR is now split, with the ValueSupport changes in #945 and the new distributions using the new type hierarchy here. This PR contains both, but the important stuff is the latter, as the former will go once #945 is merged.

@DilumAluthge
Copy link

I like it

@codecov-io
Copy link

codecov-io commented Jul 26, 2019

Codecov Report

Merging #941 into master will decrease coverage by 1.27%.
The diff coverage is 54.32%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #941      +/-   ##
==========================================
- Coverage   77.31%   76.03%   -1.28%     
==========================================
  Files         112      114       +2     
  Lines        5369     5508     +139     
==========================================
+ Hits         4151     4188      +37     
- Misses       1218     1320     +102
Impacted Files Coverage Δ
src/multivariate/mvtdist.jl 58.69% <ø> (-0.45%) ⬇️
src/multivariate/mvlognormal.jl 96.77% <ø> (-0.06%) ⬇️
src/multivariate/mvnormal.jl 71.08% <ø> (-0.18%) ⬇️
src/multivariate/mvnormalcanon.jl 80.43% <ø> (-0.42%) ⬇️
src/multivariate/dirichletmultinomial.jl 98.55% <ø> (ø) ⬆️
src/multivariate/dirichlet.jl 59.35% <ø> (-0.22%) ⬇️
src/Distributions.jl 100% <ø> (ø) ⬆️
src/univariate/discrete/poisson.jl 62.5% <0%> (ø) ⬆️
src/univariate/discrete/binomial.jl 73.68% <0%> (ø) ⬆️
src/univariate/discrete/geometric.jl 79.68% <0%> (ø) ⬆️
... and 26 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a15c8ba...45809a4. Read the comment docs.

@matbesancon
Copy link
Member

@richardreeve could you merge master into your branch? Very sorry for the mess, it's a big PR that just got merged (but boilerplaty so the diff should be trivial)

@richardreeve
Copy link
Contributor Author

Done. Any comments welcome by the way @matbesancon and others! I need to put in some tests for the new distributions, but apart from that I’m pretty happy with this and I think it does what I need, and I think @DilumAluthge is also okay with it...

@matbesancon matbesancon requested a review from mschauer July 30, 2019 21:17
src/common.jl Outdated Show resolved Hide resolved
@@ -169,11 +169,11 @@ invlogccdf(d::Normal, lp::Real) = xval(d, -norminvlogcdf(lp))
function quantile(d::Normal, p::Real)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait what? Some the code wrong before? If so, could you add a test fixed by this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code previously was giving the right answer by accident as the final line of code (the xval() call) worked on its own anyway even in these edge cases so there's nothing to test. I just corrected the earlier code that was not previously returning anything. So two mistakes cancelled each other out and there's nothing to test.

pdf(d::Bernoulli, x::Bool) = x ? succprob(d) : failprob(d)
pdf(d::Bernoulli, x::Int) = x == 0 ? failprob(d) :
pmf(d::Bernoulli, x::Bool) = x ? succprob(d) : failprob(d)
pmf(d::Bernoulli, x::Int) = x == 0 ? failprob(d) :
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to add two methods nor to hard-code Int:
Bool <: Integer and true == T(1) for any T <: Integer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this isn't my code - all I've done is searched and replaced pdf with pmf for discrete distributions. I'm sure there are lots of problems with the existing code, but I felt that I was already being ambitious enough!

@matbesancon
Copy link
Member

ok before going any further, this PR is huge, should probably be chunked in multiple ones, there should be one for the type hierarchy, then we'll see for the rest of it

@richardreeve
Copy link
Contributor Author

I did wonder about that, but it's worth pointing out that what you're spotting up there are not my introductions. Although the changes touch a lot of files, it's mostly just search and replace of pdf for pmf...

@richardreeve
Copy link
Contributor Author

It's also true that the Dirac work has been sitting around as a #861 for a while and needed updating to this new hierarchy, and the same is true in #916 for the DiscreteNonParametric and Categorical distributions - I've just merged and updated these, so they have been largely reviewed already. Also, the type hierarchy is all in common.jl (except for exports in Distributions.jl and the doc updates), so it can be reviewed almost entirely in isolation. However, I can split it out if you really think it will be better as I obviously want this to be merged.

@matbesancon
Copy link
Member

@richardreeve maybe wait for #945 before continuing this? Otherwise you risk running into merge conflicts

@richardreeve
Copy link
Contributor Author

Absolutely. I'm just keeping the two synced by merging in rr/countable and checking it still works...

@richardreeve
Copy link
Contributor Author

richardreeve commented Aug 3, 2019

Keeping up to date with #945, so we now have ContiguousSupport as a subtype of CountableSupport.

@richardreeve
Copy link
Contributor Author

This PR is now split, with the ValueSupport changes in #945 and the new distributions using the new type hierarchy here. This PR contains both, but the important stuff is the latter, as the former will go once #945 is merged.

@cscherrer
Copy link

Can you give some more detail of the benefit of splitting pdf/pmf? Formally, a pmf is still a pdf over counting measure. Breaking them apart forces everything to consider two separate cases without any obvious benefit. This can be a pain point for PPL implementation in particular.

I can see a benefit if you're describing something that's singular over the base measure like a sample from a Dirichlet process, but I don't see that addressed here.

@richardreeve
Copy link
Contributor Author

Once you start having arbitrary mixtures of distributions, that includes mixtures of continuous and discrete distributions. If you do that, you get infinite spikes in the (pdf) density that represent discrete probability masses (in the pmf). However, in the pdf there is no way of quantifying the size of the spikes (as far as I'm aware).

For cumulative densities this isn't a problem, but I don't see a way around having two functions for the density and mass functions if you want to generally be able to describe the density and the mass of distributions with discontinuous cdfs but continuous support (like spike-and-slab and hurdle distributions).

@cscherrer
Copy link

Once you start having arbitrary mixtures of distributions, that includes mixtures of continuous and discrete distributions. If you do that, you get infinite spikes in the (pdf) density that represent discrete probability masses (in the pmf). However, in the pdf there is no way of quantifying the size of the spikes (as far as I'm aware).

Ok, so this addresses the problem of singularities. That makes sense. Say you have a mixture like this, and d is in the support of the discrete component, while c is outside it. Does your implementation return

pdf(mix, d) == Inf
pdf(mix, c) == continuousWeight * pdf(continuousComponent, c)
pmf(mix, d) == discreteWeight * pmf(discreteComponent, d)
pmf(mix, c) == 0

? Is that what it "should" return, or am I thinking about it wrong?

For cumulative densities this isn't a problem, but I don't see a way around having two functions for the density and mass functions if you want to generally be able to describe the density and the mass of distributions with discontinuous cdfs but continuous support (like spike-and-slab and hurdle distributions).

Me neither. I hadn't considered carefully representing mixed continuous/discrete distributions as even being a possibility, but it would be pretty great. It seems like it would also allow things like MvNormals with positive semidefinite covariance, which could help for working with low-rank models.

Have you seen such general representation of distributions in other systems? It would be helpful to know what trouble people have had to be sure we're learning from their mistakes.

@richardreeve
Copy link
Contributor Author

Ok, so this addresses the problem of singularities. That makes sense. Say you have a mixture like this, and d is in the support of the discrete component, while c is outside it. Does your implementation return

pdf(mix, d) == Inf
pdf(mix, c) == continuousWeight * pdf(continuousComponent, c)
pmf(mix, d) == discreteWeight * pmf(discreteComponent, d)
pmf(mix, c) == 0

? Is that what it "should" return, or am I thinking about it wrong?

Yes, that's exactly what I was thinking of! SpikeSlab and CompoundDistribution in this PR implement this in fact, though I haven't put any testing in yet to check as I got distracted by creating #945.

For cumulative densities this isn't a problem, but I don't see a way around having two functions for the density and mass functions if you want to generally be able to describe the density and the mass of distributions with discontinuous cdfs but continuous support (like spike-and-slab and hurdle distributions).

Me neither. I hadn't considered carefully representing mixed continuous/discrete distributions as even being a possibility, but it would be pretty great. It seems like it would also allow things like MvNormals with positive semidefinite covariance, which could help for working with low-rank models.

Great, I hadn't thought of that. Obviously it's important that this extension to the type hierarchy is as extensible as possible - though it doesn't necessarily have to be able to handle everything on day one. At the moment, I'm not at all that confident that the UnionSupport type that I suggest is the right way to go, so I'm open to other suggestions?

Have you seen such general representation of distributions in other systems? It would be helpful to know what trouble people have had to be sure we're learning from their mistakes.

Unfortunately not. It's only having the ability to work on Distributions.jl in Julia directly that gave me the idea that I might be able to do this at all...

@cscherrer
Copy link

I guess to be really rigorous about it we should be working (eventually) in terms of the refinement to Lebesgue's decomposition.

@richardreeve
Copy link
Contributor Author

I may have to take your word for that!

@richardreeve
Copy link
Contributor Author

In the meanwhile, it would be good to know what you think of this first step...

@cscherrer
Copy link

I think it's definitely a step in the right direction. And if it's really non-breaking, I say go for it. But I'm having trouble seeing how this can be non-breaking. If we have mixed discrete/continuous distributions, then

pdf(mix, d) == Inf
pdf(mix, c) == continuousWeight * pdf(continuousComponent, c)
pmf(mix, d) == discreteWeight * pmf(discreteComponent, d)
pmf(mix, c) == 0

seems sensible. But say we have some simple example of this, like mix is a mixture of a standard Normal() and a Bernoulli(0.5). Then we'd have

pdf(mix, 0.0) == Inf
pdf(Bernoulli(0.5)) == 0.5

This would seem to imply that pdf(Normal(),0.0) == Inf, which of course isn't right. The problem is that we're implicitly switching base measures.

@richardreeve
Copy link
Contributor Author

richardreeve commented Aug 4, 2019

You're right, but it's non-breaking in the sense that everything behaves exactly as it did before except for new distributions, which obviously can't be breaking however you code them. And up till now it hasn't been possible to make these mixed distributions at all, so I'm safe...

My feeling is that we ought to move to a system where pdf and pmf are distinct though, and so the pdf(d::CountableDistribution) = pmf(d) compatibility bit should be dropped, for exactly the reason you bring up.

The only other option I can think of would be to have a third (mostly optional) argument to pdf, that allowed you to choose the type of density to return. By default it would return the appropriate density:

pdf(d::CountableDistribution, x) = pdf(d, x, Mass())
pdf(d::ContinuousDistribution, x) = pdf(d, x, Density())
pdf(::ContinuousDistribution, _, ::Mass) = 0.0
pdf(d::CountableDistribution, x, ::Density) = insupport(d, x) ? Inf : 0.0

But this would not be defined for mixed distributions, so you'd have to select the appropriate density type explicitly. I don't have any feeling for whether that's a good idea, but it would allow us to maintain backward compatibility indefinitely...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants