Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation 805 #905

Open
wants to merge 14 commits into
base: master
Choose a base branch
from
Open

Implementation 805 #905

wants to merge 14 commits into from

Conversation

J007X
Copy link
Collaborator

@J007X J007X commented Nov 18, 2022

This PR fixes [https://github.com//issues/805].

Description of changes

According to discussion and requirements in the ticket and , a new test NLP pipeline using Forte is created in a new test class "NLP_Pipeline_Performance_Test" to cover typical scenarios relating to entry/token, such as serialization, POS tagging and NER as mentioned in the ticket. Also to make sure 0.2.0 and current code base are run in exactly some way/code, some code in copied from some referenced (but changed over the version) packages, to make sure we perform an apple to apple test

Possible influences of this PR.

This can be used in several iterations of profiling-improvement efforts, that adding profiling coverage on key/typical scenarios.

Test Conducted

Profiling test on serialization, POS tagging and NER as requested in [https://github.com/J007X/forte/tree/implementation_805].

J007X added 6 commits October 17, 2022 17:12
…nltk or fortex.nltk to prevent version conflicts initally and use all-in-one approach to make sure the test code will be run the same way with 0.2.0 and new version > 0.3.0.
… neck related to nested generator/sortedlist area in DataPack.
…ne) : please provide conll data directory to the commented out "input_dir" parameter in code.
…ovide output local dir name in serialization test case.
@codecov
Copy link

codecov bot commented Nov 18, 2022

Codecov Report

Merging #905 (8068e13) into master (6e2d6ea) will decrease coverage by 0.07%.
The diff coverage is 72.48%.

@@            Coverage Diff             @@
##           master     #905      +/-   ##
==========================================
- Coverage   81.05%   80.99%   -0.07%     
==========================================
  Files         256      257       +1     
  Lines       19851    19999     +148     
==========================================
+ Hits        16091    16198     +107     
- Misses       3760     3801      +41     
Impacted Files Coverage Δ
tests/forte/data/data_pack_profiling_test.py 72.29% <72.29%> (ø)
forte/data/ontology/ontology_code_generator.py 89.75% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@J007X J007X requested a review from hunterhector November 24, 2022 11:19
Copy link
Member

@hunterhector hunterhector left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Maybe we also want to produce some numbers for the test?
  2. How to run this test on actual data (like ontonotes), we shoud provide instructions.
  3. The first line "PR fixes" message is not correctly formatted. Github won't associate the PR to the issue like this.

tests/forte/data/data_pack_profiling_test.py Outdated Show resolved Hide resolved
tests/forte/data/data_pack_profiling_test.py Outdated Show resolved Hide resolved
tests/forte/data/data_pack_profiling_test.py Outdated Show resolved Hide resolved
tests/forte/data/data_pack_profiling_test.py Outdated Show resolved Hide resolved
tests/forte/data/data_pack_profiling_test.py Show resolved Hide resolved
tests/forte/data/data_pack_profiling_test.py Outdated Show resolved Hide resolved
tests/forte/data/data_pack_profiling_test.py Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants