Various fixes in word cloud (spacing, library update, ...) #493

Merged: 3 commits merged into biolab:master from the word-cloud branch on Jan 17, 2020

Conversation

PrimozGodec
Collaborator

@PrimozGodec PrimozGodec commented Jan 8, 2020

Issue

Fixes #399
Fixes #225
Fixes #390 as much as possible

Description of changes

This PR makes various fixes to the Word Cloud widget:

  • Updated the wordcloud2.js library.
  • Removed the extra space in the projection.
  • Words with smaller weights no longer disappear.
  • The issue from Word Cloud: don't update on resize (#390) is addressed but not fully fixed. On resize the word cloud must still replot, because the distribution of words on the canvas has to change (otherwise it would look odd). Words now keep the same color after a replot.
  • Words still overlap and are sometimes drawn inside one another. This must be fixed in the library; I cannot provide a temporary fix.
  • Topic weights that are negative are now treated as positive in the cloud visualization (WordCloud Keep Weights Order for Topics, #225). Words are colored orange when the weight is negative and green when it is positive (the same coloring as in the topic widget).
  • Moved the info box to the status row.
  • Code refactoring.
  • Additional tests.
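
The handling of negative topic weights described above can be sketched roughly as follows. This is a minimal illustration, not the widget's actual code: the function name `size_and_color` and the color constants are hypothetical, and the only facts taken from the PR are that negative weights are displayed as positive and that the sign decides between orange and green.

```python
from typing import Dict, List, Tuple

# Hypothetical color names; the PR only states "orange" for negative
# and "green" for positive weights.
NEGATIVE_COLOR = "orange"
POSITIVE_COLOR = "green"


def size_and_color(weights: Dict[str, float]) -> List[Tuple[str, float, str]]:
    """Return (word, display_weight, color) tuples, largest weight first.

    The display weight is the magnitude of the topic weight, so a strongly
    negative word is drawn as large as an equally strong positive one.
    The original sign is kept only in the color.
    """
    items = [
        (word, abs(weight), NEGATIVE_COLOR if weight < 0 else POSITIVE_COLOR)
        for word, weight in weights.items()
    ]
    # Sort by magnitude so the most important words are placed first.
    return sorted(items, key=lambda item: item[1], reverse=True)
```

For example, `size_and_color({"good": 0.5, "bad": -0.8})` places "bad" first with display weight 0.8 and the orange color.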

TODO:

  • Documentation (screenshots) will be updated once we agree on the changes.

Includes
  • Code changes
  • Tests
  • Documentation

@codecov-io

codecov-io commented Jan 8, 2020

Codecov Report

Merging #493 into master will increase coverage by 0.94%.
The diff coverage is 92.78%.

@@            Coverage Diff             @@
##           master     #493      +/-   ##
==========================================
+ Coverage   62.02%   62.96%   +0.94%     
==========================================
  Files          59       59              
  Lines        6199     6265      +66     
  Branches      809      823      +14     
==========================================
+ Hits         3845     3945     +100     
+ Misses       2218     2181      -37     
- Partials      136      139       +3

else 0)
input_numbers = f"{cor_output_len or 0}|{n_selected or 0}|{cc_len}"
input_string = f"{cor_output_len or 0} documents\n" \
f"{n_selected or 0} selected words" \
Collaborator

Add line break.

@ajdapretnar
Collaborator

Two things. First, the font size is now exceedingly big; it should be smaller.
Second, the font type should be the same as in other Text / Orange widgets. I think this is TNR, but we use... dunno, Lato?

Overlap still occurs. To reproduce, use election-tweets-2016 with Preprocess Text and Word Cloud. Perhaps this can't be solved.
Screen Shot 2020-01-09 at 09 29 30

Also, is there any use for the 'Regenerate word cloud' button? Does the visualization change at all?

@ajdapretnar
Collaborator

I really appreciate this PR, because WC is mostly a mess. One thing that would be amazing if solved:
Screen Shot 2020-01-09 at 09 31 31
The 'bleeding' / protruding parts of the underlying visualization. See bottom and right side of the visualization. This is beyond annoying.

@PrimozGodec PrimozGodec force-pushed the word-cloud branch 2 times, most recently from 56ee7aa to d819853 on January 13, 2020 12:27
@ajdapretnar
Collaborator

Comments:

  • When I connect Topic Modeling to WC, I get green words for all topic models, even for LDA and HDP, which always give positive weights. I don't mind the words being green, but if I deselect 'Color words', the words should become gray.
  • Sometimes I resize the window to have a nice ellipsoid word cloud. Then I select 'Regenerate word cloud' or even deselect 'Color words' (which prompts replotting) and I get the square version of WC. Very strange.
  • Trying to fit all the words for, say, book-excerpts, I need 1/3 of the screen. I think this is a bit excessive. I know we talked about this, but have a look at how this appears on my screen. I have put a couple of widgets alongside for comparison. Perhaps there's no optimal solution here.

Screen Shot 2020-01-13 at 15 57 49

- [IDEA] Could we limit the number of words shown in WC by counting the characters of all words? We would then set a threshold, say fewer than 400 characters in WC. 🤔 What do you think?
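
A rough sketch of this idea, assuming the words arrive as (word, weight) pairs and that the highest-weighted words should be kept first. The function name `limit_by_char_budget` is hypothetical, and the 400-character default is only the figure suggested above, not a tested value.

```python
from typing import List, Tuple


def limit_by_char_budget(
    words: List[Tuple[str, float]], max_chars: int = 400
) -> List[Tuple[str, float]]:
    """Keep the highest-weighted words whose total length fits max_chars."""
    shown: List[Tuple[str, float]] = []
    used = 0
    # Visit words from the largest weight to the smallest.
    for word, weight in sorted(words, key=lambda pair: pair[1], reverse=True):
        if used + len(word) > max_chars:
            break  # the budget is exhausted; drop this and all lighter words
        shown.append((word, weight))
        used += len(word)
    return shown
```

Counting characters alone ignores font size, so a cloud of a few very large words could still overflow the canvas.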

@PrimozGodec
Collaborator Author

Thanks for the comments:

  • Topics are my fault. I forgot that the other two methods do not produce negative values. So now we have green/red only for the first method and normal colors for the others. Also, when the first method is used, there is special black/white coloring for topics.
  • I also noticed that behavior when resizing. Resizing is handled by wordcloud2.js itself. It seems that it optimizes resizing and does not recompute all parameters on every resize. It didn't seem too problematic to me, since the word cloud still looks nice.
  • I decreased the font sizes so that the cloud fits in a smaller screen area. I also added a parameter that combines word lengths with their sizes (word length alone does not solve the problem when, for example, many words have high weights). This parameter is now included in the word size calculation. The difference is most visible with LDA topics, which give similar or identical values to all words (before, all of them were shown in huge fonts). It works on all the examples I have tried; I hope it works on others too.
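
One way to picture a size calculation that combines word length with weight is sketched below. This is not the formula used in the PR, which is not shown in this conversation. The function `font_sizes`, the "load" measure (sum of word length times relative weight), and all default values are assumptions made for illustration, and weights are assumed to be positive.

```python
from typing import Dict


def font_sizes(
    weights: Dict[str, float],
    max_size: float = 60.0,
    min_size: float = 8.0,
    target_load: float = 150.0,
) -> Dict[str, float]:
    """Map positive word weights to font sizes, shrinking crowded clouds."""
    if not weights:
        return {}
    top = max(weights.values())
    if top <= 0:
        raise ValueError("weights are assumed to be positive")
    # Relative weight in (0, 1]; the heaviest word gets 1.0.
    relative = {word: value / top for word, value in weights.items()}
    # "Load" grows when many long words have high weights at the same time.
    load = sum(len(word) * rel for word, rel in relative.items())
    # Shrink all fonts only when the load exceeds the assumed target.
    scale = min(1.0, target_load / load) if load > 0 else 1.0
    return {
        word: max(min_size, max_size * rel * scale)
        for word, rel in relative.items()
    }
```

With a measure like this, a topic whose words all have nearly equal weights yields a large load and therefore smaller fonts, while a cloud with only a few heavy words keeps its full size.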

@PrimozGodec PrimozGodec changed the title from "[WIP] Various fixes in word cloud (spacing, library update, ...)" to "Various fixes in word cloud (spacing, library update, ...)" Jan 13, 2020
@ajdapretnar ajdapretnar merged commit 3300abe into biolab:master Jan 17, 2020
@PrimozGodec PrimozGodec deleted the word-cloud branch March 29, 2023 10:38
Successfully merging this pull request may close these issues.

  • [ENH] Reimplement Word Cloud
  • Word Cloud: don't update on resize
  • WordCloud Keep Weights Order for Topics

3 participants