[ENH] Import documents: Import from URL #637
08e93cf to 6ca6394
Codecov Report
@@            Coverage Diff             @@
##           master     #637      +/-   ##
==========================================
+ Coverage   71.29%   71.60%   +0.31%
==========================================
  Files          66       66
  Lines        7806     7939     +133
  Branches     1027     1037      +10
==========================================
+ Hits         5565     5685     +120
- Misses       2038     2048      +10
- Partials      203      206       +3
One more comment so that we do not forget: this implementation is using
True, I'll raise the version.
737cdb4 to d58b7e0
from Orange.util import Registry

from orangecontrib.text.corpus import Corpus


DefaultFormats = ("docx", "odt", "txt", "pdf", "xml")
Should .yaml also be specified here?
No, it should not. These are 'document' formats: each file in one of these formats represents a single document. Metadata is handled differently, since it appends data to each document. Its formats are hardcoded in _read_meta_data().
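To illustrate the distinction, here is a minimal sketch of how files could be sorted into documents and metadata by extension. `DefaultFormats` is taken from the diff above; `ASSUMED_META_FORMATS` and `split_by_role` are hypothetical names used only for this example, and the metadata list is a guess, since the actual formats are hardcoded in _read_meta_data().

```python
import os

# Taken from the diff above: each file of these formats is one document.
DefaultFormats = ("docx", "odt", "txt", "pdf", "xml")

# Assumption for illustration only; the real list is hardcoded
# in _read_meta_data() and may differ.
ASSUMED_META_FORMATS = ("yaml",)


def split_by_role(paths):
    """Separate document files from metadata files by extension.

    Hypothetical helper, not part of the add-on. Files with any
    other extension are ignored.
    """
    documents, metadata = [], []
    for path in paths:
        # Normalise ".PDF" -> "pdf" before comparing.
        ext = os.path.splitext(path)[1].lstrip(".").lower()
        if ext in DefaultFormats:
            documents.append(path)
        elif ext in ASSUMED_META_FORMATS:
            metadata.append(path)
    return documents, metadata
```

Under these assumptions, a .yaml file would never be counted as a document; it would only contribute extra data for the documents found next to it.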
I noticed there's a YAML reader and a function that should recognize metadata. But I don't see how this is reflected in the widget.
It is not. It just reads the metadata, if there is any. Do you think we should add something to the GUI? What do you propose?
776982f to 72ed961
72ed961 to c23a605
c23a605 to f1d0dcf
Issue
Description of changes
URL option to Import Documents widget
Includes