[ENH] Corpus & Bow: Improve sparsity handling according to Orange>=3.8.0 #281

Merged
merged 1 commit into from
Dec 5, 2017

nikicc
Contributor

@nikicc nikicc commented Jul 19, 2017

Issue

BoW features coming from compute values were dense when the corpus already contained some features. The root cause was faulty logic for deciding on sparsity, which was fixed in biolab/orange3#2341 and released in Orange 3.8.0.

Description of changes
  • Remove the preference for a sparse matrix when X is empty.
  • Mark all BoW features as sparse; this ensures that X becomes sparse even when the data comes from compute values.
  • This also fixes the tests that are currently failing on master because of changes in Orange 3.8.0.
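
The effect of the first two changes can be illustrated with a small sketch. It does not use Orange's API: `Feature` and `should_be_sparse` are hypothetical names, and the majority rule below is an assumption made for illustration, not the logic confirmed by biolab/orange3#2341.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Feature:
    """Hypothetical stand-in for a feature descriptor with a sparsity flag."""
    name: str
    sparse: bool = False


def should_be_sparse(features: Sequence[Feature]) -> bool:
    """Assumed rule: store X sparsely when most features are flagged sparse.

    An empty feature list yields False, i.e. there is no preference for
    sparse storage when X is empty.
    """
    if not features:
        return False
    n_sparse = sum(f.sparse for f in features)
    return n_sparse / len(features) > 0.5


# Two pre-existing dense features plus three BoW features.
existing = [Feature("length"), Feature("n_sentences")]
bow_unflagged = [Feature(w) for w in ("human", "graph", "tree")]
bow_flagged = [Feature(w, sparse=True) for w in ("human", "graph", "tree")]

print(should_be_sparse([]))                        # False
print(should_be_sparse(existing + bow_unflagged))  # False
print(should_be_sparse(existing + bow_flagged))    # True
```

Under this assumed rule, the decision depends only on the flags carried by the features, so it gives the same answer whether X is filled directly or derived later through compute values.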

IMPORTANT: The Orange version in requirements.txt was deliberately not bumped to 3.8.0, and the code was written in a backwards-compatible manner so that the add-on also works with older versions of Orange. Note, however, that updating Orange to >=3.8.0 is highly recommended, since as of this PR the BoW data on older Orange versions will become dense!
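
One way such backwards compatibility could be achieved is sketched below. This is an assumption about the general pattern, not the code in this PR: `OldStyleVariable`, `NewStyleVariable`, and `mark_sparse` are hypothetical names that stand in for variable classes with and without a `sparse` constructor argument.

```python
class OldStyleVariable:
    """Hypothetical variable class whose constructor has no `sparse` argument."""

    def __init__(self, name):
        self.name = name


class NewStyleVariable:
    """Hypothetical variable class whose constructor accepts `sparse`."""

    def __init__(self, name, sparse=False):
        self.name = name
        self.sparse = sparse


def mark_sparse(variable):
    """Flag a variable as sparse without relying on the constructor signature.

    Setting the attribute after construction works for both classes; a
    library that never reads the flag simply ignores it.
    """
    variable.sparse = True
    return variable


old = mark_sparse(OldStyleVariable("graph"))
new = mark_sparse(NewStyleVariable("graph"))

# Readers of the flag should tolerate variables that were never marked.
print(getattr(old, "sparse", False))                        # True
print(getattr(new, "sparse", False))                        # True
print(getattr(OldStyleVariable("tree"), "sparse", False))   # False
```

With this pattern, a library version that ignores the flag would still accept the variables but would be free to build a dense matrix, which matches the warning above about older Orange versions.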

Includes
  • Code changes
  • Tests
  • Documentation

@nikicc nikicc force-pushed the sparsity-fixups branch 4 times, most recently from ba5be6b to 334d624 on December 4, 2017 13:14
@codecov-io

codecov-io commented Dec 4, 2017

Codecov Report

Merging #281 into master will decrease coverage by 0.33%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #281      +/-   ##
==========================================
- Coverage   85.26%   84.92%   -0.34%     
==========================================
  Files          33       33              
  Lines        1866     1864       -2     
  Branches      337      336       -1     
==========================================
- Hits         1591     1583       -8     
- Misses        239      242       +3     
- Partials       36       39       +3

@nikicc nikicc changed the title from "Corpus & Bow: Improve sparsity handling" to "[ENH] Corpus & Bow: Improve sparsity handling according to Orange>=3.8.0" Dec 4, 2017
@nikicc nikicc merged commit fc6b45d into biolab:master Dec 5, 2017
@nikicc nikicc deleted the sparsity-fixups branch December 5, 2017 14:50