deprecation warnings in LDA example code #125
Comments
Relatedly, what's the correct / recommended way to rewrite the sum over the gammas as written?
The code is way out of date. The current marginalization over the topic computes each `gamma[k]` inside the word loop and then takes `log_sum_exp(gamma)`. That can be reduced by pulling the logs of `theta` and `phi` out of the loop. It'd be even better to define `log_theta` and `log_phi` once, so the logs are computed a single time per density evaluation.
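Something along these lines (a sketch, assuming the declarations from the manual's LDA example: `w[n]` is the n-th word instance, `doc[n]` its document, `theta[m]` a K-simplex of topic weights, `phi[k]` a V-simplex of word probabilities):

```stan
model {
  // ... Dirichlet priors on theta and phi as before ...

  // current form: per-word marginalization over the topic
  // for (n in 1:N) {
  //   vector[K] gamma;
  //   for (k in 1:K)
  //     gamma[k] = log(theta[doc[n], k]) + log(phi[k, w[n]]);
  //   target += log_sum_exp(gamma);
  // }

  // reduced form: take the logs once, outside the word loop
  {
    vector[K] log_theta[M];   // log topic weights per doc
    matrix[K, V] log_phi;     // log word probs per topic
    for (m in 1:M)
      log_theta[m] = log(theta[m]);
    for (k in 1:K)
      log_phi[k] = log(phi[k])';   // fill row k
    for (n in 1:N)
      target += log_sum_exp(log_theta[doc[n]] + col(log_phi, w[n]));
  }
}
```

This way the logs are evaluated M + K times per density evaluation instead of once per word, which is most of the savings when N is large.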
@cartazio: Feel free to submit a pull request. And a warning---you can't really do Bayesian inference for LDA because of the multimodality. You'll see that you won't satisfy convergence diagnostics running multiple chains, and not just because of label switching.
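("Label switching" here is the standard permutation non-identifiability of mixtures: relabeling the topics leaves the LDA likelihood unchanged, so every mode comes in K! symmetric copies. In the example model's notation, for any permutation $\sigma$ of $\{1, \dots, K\}$,

$$p(w \mid \theta, \phi) = p(w \mid \theta^\sigma, \phi^\sigma), \quad \text{where } \theta^\sigma_{m,k} = \theta_{m,\sigma(k)}, \;\; \phi^\sigma_k = \phi_{\sigma(k)}.$$

Chains failing to mix even after accounting for these symmetric copies is what signals genuinely distinct local optima.)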
@bob-carpenter thanks! That's super helpful. By multimodality you mean there are different local optima when viewed as an optimization problem, i.e. things are nonconvex? (i.e. vary the priors and there will be different local optima in the posterior?) I had to google around to figure out what you meant; https://scholar.harvard.edu/files/dtingley/files/multimod.pdf seemed the clearest exposition, despite the double-spaced formatting :) Is there any good reading / are there references on how the "variational" formulations such as Mallet / Vowpal Wabbit deal with that issue? Or is it just one of those things that tends to stay hidden in folklore common knowledge?
Yes. I meant local optima by "mode".
Nobody can deal with the issue. It's computationally intractable (at least unless P = NP). Run multiple times with different random initializations, get different answers. Usually it's only used for exploratory data analysis or to generate features for something else, so the multiple answers aren't a big deal---you just choose one, either randomly or with human guidance.
Some of the later literature tries to add more informative priors to guide solutions. Some of the early work by Griffiths and Steyvers tried to measure just how different the different modes were that the algorithms found with random inits.
thanks! i'll have to do a bit of digging into this :)
intractability is no surprise; i was slightly imagining it might be interesting to look at the topology of how the different inits / regions of answers connect.
also, what does the term "label switching" mean here?
I don't know of any work characterizing that topology, even for simpler mixtures than LDA.
Considering that would veer into topological data analysis / computational topology and likely be #P-hard, I'm not surprised. :)
What's the relabeling stuff you mentioned?
For the Griffiths and Steyvers experiment on relating topics across initializations of LDA:
• Steyvers, Mark and Tom Griffiths. 2007. Probabilistic topic models. In Thomas K. Landauer, Danielle S. McNamara, Simon Dennis, and Walter Kintsch (eds.), Handbook of Latent Semantic Analysis. Lawrence Erlbaum.
They use a greedy empirical KL-divergence for alignment, which is crude, but useful.
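Roughly, the alignment greedily pairs each topic from one run with its nearest still-unmatched topic in the other run under empirical KL divergence (a sketch of the idea, not their exact procedure):

$$\sigma(k) = \underset{j \notin \{\sigma(1), \dots, \sigma(k-1)\}}{\arg\min} \, \mathrm{KL}\big(\hat{\phi}^{(1)}_k \,\big\Vert\, \hat{\phi}^{(2)}_j\big), \qquad \mathrm{KL}(p \Vert q) = \sum_{v=1}^{V} p_v \log \frac{p_v}{q_v},$$

where $\hat{\phi}^{(r)}_k$ is the estimated word distribution for topic $k$ in run $r$.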
hey @bob-carpenter, @lizzyagibson and I have been looking at the LDA example code (it's nice how closely it maps to the generative description in the LDA journal paper), and there are a few deprecation warnings related to the `<-` assignment operator, along with the `increment_log_prob(log_sum_exp(gamma))` target-update expression. you may wanna update them :) thanks for the lovely examples!
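concretely, the warnings point at lines like these (a sketch; the exact lines in the repo may differ):

```stan
// deprecated:
//   gamma[k] <- log(theta[doc[n], k]) + log(phi[k, w[n]]);
//   increment_log_prob(log_sum_exp(gamma));
// current replacements:
gamma[k] = log(theta[doc[n], k]) + log(phi[k, w[n]]);  // '=' replaces '<-'
target += log_sum_exp(gamma);  // 'target +=' replaces increment_log_prob
```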