Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Corpus refactoring #767

Merged
merged 2 commits into from
May 9, 2022
Merged

Conversation

PrimozGodec
Copy link
Collaborator

@PrimozGodec PrimozGodec commented Jan 12, 2022

Issue

Originally I started to solve an issue that Corpus.from_file didn't load pickles together with preprocessing, since Corpus was transformed to Table after loading and then back.

It led to the decision to refactor Corpus to use Table's __new__ class method and to match constructors signatures to Tables. There was no other nice way to fix the first error.

Description of changes

Corpus constructor refactoring. All old ways of initializing corpus should still work (the user is warned that it will change in future).

Edit:
After a fix in Orange biolab/orange3#5812 it would even not be necessary to refactor the Corpus, but it is done already (since before this change there was no other way). I suggest that we refactor it anyway since it is implemented already.

This PR needs Orange to be released (it needs biolab/orange3#5812). Tests will not pass before version 3.32 or 3.31.2 is available.

Includes
  • Code changes
  • Tests
  • Documentation

@PrimozGodec PrimozGodec marked this pull request as draft January 12, 2022 15:54
@PrimozGodec PrimozGodec force-pushed the fix-load-pickle branch 4 times, most recently from 45679de to dbcf83f Compare January 26, 2022 10:46
@PrimozGodec PrimozGodec force-pushed the fix-load-pickle branch 4 times, most recently from 200a5b9 to a6f3085 Compare February 8, 2022 09:32
@PrimozGodec PrimozGodec marked this pull request as ready for review April 29, 2022 11:02
@PrimozGodec PrimozGodec changed the title [FIX] Corpus refactoring [ENH] Corpus refactoring Apr 29, 2022
@ajdapretnar
Copy link
Collaborator

This doesn't solve the issue of missing preprocessing for me. I tried deerwester, used default preprocessing and saved it as a pickle. Upon loading it with Corpus, the preprocessing is missing.

@PrimozGodec
Copy link
Collaborator Author

This doesn't solve the issue of missing preprocessing for me. I tried deerwester, used default preprocessing and saved it as a pickle. Upon loading it with Corpus, the preprocessing is missing.

It should be fixed now

@ajdapretnar ajdapretnar merged commit fb0cbd7 into biolab:master May 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants