Describe the bug
For documents in the corpus that contain the string 'hci' (short for human-computer interaction), Bag of Words changes 'hci' to 'hcus' in its sparse-matrix representation.
To Reproduce
See the attached workflow. The dataset is shared through a Google Drive link.
Expected behavior
'hci' should be kept as 'hci' in the sparse data. Could this be caused by the lemmatizer automatically converting Latin plurals ending in '-i' to singulars ending in '-us' (such as nuclei -> nucleus)?
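To make the suspicion concrete, here is a minimal sketch of the kind of suffix rule that would produce this output. It is purely illustrative: `naive_latin_singular` and the `protected` parameter are hypothetical names made up for this report, and this is not the code of any lemmatizer used by Orange or the Text add-on.

```python
def naive_latin_singular(token, protected=frozenset()):
    """Hypothetical rule: rewrite a trailing '-i' as '-us'.

    This only illustrates the suspected behaviour; it is an assumption,
    not the actual lemmatizer implementation.
    """
    # Tokens listed in `protected` (e.g. acronyms) are returned unchanged.
    if token in protected:
        return token
    # Blind suffix rewrite: correct for 'nuclei', wrong for acronyms like 'hci'.
    if len(token) > 2 and token.endswith("i"):
        return token[:-1] + "us"
    return token


for word in ("nuclei", "hci"):
    print(word, "->", naive_latin_singular(word))

# With an exception list, the acronym would survive unchanged.
print("hci ->", naive_latin_singular("hci", protected={"hci"}))
```

If a rule like this is what is being applied, an exception list for acronyms (or a dictionary check before rewriting) would keep 'hci' intact.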
Orange version:
3.36.2
Text add-on version:
1.15.0
Screenshots
If applicable, add screenshots to help explain your problem.
Operating system:
macOS 14.3.1
Example workflow
hcus bug.ows.zip