-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Direct answers #45
Direct answers #45
Conversation
archive_query_log/orm.py
Outdated
@@ -190,6 +190,18 @@ class Snippet(SnippetId): | |||
text: str | None = Text() | |||
|
|||
|
|||
class DirectAnswerId(InnerDocument): | |||
id: str = Keyword() | |||
rank: int = Integer() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does a direct answer really have an interpretable "rank" (i.e., are the direct answers ordered in some way)?
- removed boxes and added url and text in DirectAnswer - changed xpath str to xpaths List[str]
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## elastic #45 +/- ##
========================================
Coverage 89.68% 89.68%
========================================
Files 61 61
Lines 2724 2724
========================================
Hits 2443 2443
Misses 281 281 ☔ View full report in Codecov by Sentry. |
archive_query_log/orm.py
Outdated
@@ -461,9 +459,8 @@ class WarcDirectAnswerParser(BaseDocument): | |||
url_pattern_regex: str | None = Keyword() | |||
priority: float | None = RankFeature(positive_score_impact=True) | |||
parser_type: WarcDirectAnswerParserType = Keyword() | |||
xpath: str | None = Keyword() | |||
xpaths: List[str] | None = Keyword() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned on Discord, we can just use one XPath with alternatives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice Julian! I think this is almost ready to merge 👍
Only two very minor fixes left.
archive_query_log/imports/yaml.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Julian, actually, we don't need to have an importer for direct answer parsers as there are no such parsers in the "old" AQL. So you can just revert the changes in this file (which should fix most of the type-safety issues here).
else: | ||
raise ValueError(f"Invalid parser type: {parser_type}") | ||
WarcDirectAnswersParser.init(using=config.es.client) | ||
add_warc_direct_answers_parser( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type-checker complains that a parameter is missing here.
No description provided.