Bump nutch from 1.11 to 1.18 in /autoextractor-spark #3

Open
wants to merge 1 commit into
base: master
@dependabot dependabot bot commented on behalf of github Mar 21, 2022

Bumps nutch from 1.11 to 1.18.

Changelog

Sourced from nutch's changelog.

Nutch Change Log

Nutch 1.18 Release 14/01/2021 (dd/mm/yyyy) Release Report: https://s.apache.org/lqara

Breaking Changes

- As part of NUTCH-2805, the plugin urlfilter-domainblacklist has been renamed to urlfilter-domaindenylist, and the fields required for the plugin, urlfilter.domainblacklist.rules and urlfilter.domainblacklist.file, have been replaced with urlfilter.domaindenylist.rules and urlfilter.domaindenylist.file respectively. See NUTCH-2802 for more details.
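
For anyone carrying the old names across this bump, the rename described above could be applied to an existing Nutch configuration roughly as follows. This is only a sketch under assumptions: the helper name `migrate_config` is hypothetical, the configuration is assumed to be a Hadoop-style XML file such as nutch-site.xml, and the plugin is assumed to be enabled through the `plugin.includes` property. Whether this repository sets any of these properties at all has not been checked here.

```python
import xml.etree.ElementTree as ET

# Old -> new property names, taken from the 1.18 breaking-change note.
PROPERTY_RENAMES = {
    "urlfilter.domainblacklist.rules": "urlfilter.domaindenylist.rules",
    "urlfilter.domainblacklist.file": "urlfilter.domaindenylist.file",
}


def migrate_config(xml_text: str) -> str:
    """Return the config XML with the renamed properties and plugin id applied.

    Hypothetical helper, not part of Nutch.
    """
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        name = prop.find("name")
        value = prop.find("value")
        if name is None or name.text is None:
            continue
        key = name.text.strip()
        if key in PROPERTY_RENAMES:
            name.text = PROPERTY_RENAMES[key]
        elif key == "plugin.includes" and value is not None and value.text:
            # Assumption: the plugin id appears inside the plugin.includes regex,
            # either spelled out or as an alternative like urlfilter-(regex|...).
            value.text = value.text.replace("domainblacklist", "domaindenylist")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = """<configuration>
  <property>
    <name>urlfilter.domainblacklist.file</name>
    <value>denied-domains.txt</value>
  </property>
  <property>
    <name>plugin.includes</name>
    <value>protocol-http|urlfilter-(regex|domainblacklist)|parse-html</value>
  </property>
</configuration>"""
    print(migrate_config(sample))
```

Note that ElementTree drops XML comments when parsing, so a hand edit may be preferable for a heavily commented config file.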

Sub-task

[NUTCH-2671] - Upgrade ant ivy library
[NUTCH-2672] - Ant build erroneously installs *-test.jar instead of *.jar for target "nightly"
[NUTCH-2805] - Rename plugin urlfilter-domainblacklist
[NUTCH-2809] - Upgrade any23 plugin dependency to 2.4
[NUTCH-2816] - Add Spotbugs target to ant build
[NUTCH-2817] - Avoid check for equality of URL path and file part using ==/!=
[NUTCH-2829] - Fix ant target "clean-cache"

Bug

[NUTCH-2669] - Reliable solution for javax.ws packaging.type
[NUTCH-2697] - Upgrade Ivy to fix the issue of an unset packaging.type property
[NUTCH-2801] - RobotsRulesParser command-line checker to use http.robots.agents as fall-back
[NUTCH-2810] - FreeGenerator to actually apply configured number of fetch lists
[NUTCH-2813] - MoreIndexingFilter - can't parse erroneous date - 2019-07-03T10:28:14
[NUTCH-2814] - HttpDateFormat's internal time zone may change after parsing a date
[NUTCH-2818] - Ant build: upgrade Apache Rat report task
[NUTCH-2823] - IllegalStateException in IndexWriters.describe() when validating url param for SolrIndexer
[NUTCH-2824] - urlnormalizer-basic to unescape percent-encoded host names

Improvement

[NUTCH-1190] - MoreIndexingFilter refactor: move data formats used to parse "lastModified" to a config file.
[NUTCH-2582] - Set pool size of XML SAX parsers used for MIME detection in Tika 1.19
[NUTCH-2730] - SitemapProcessor to treat sitemap URLs as Set instead of List
[NUTCH-2782] - protocol-http / lib-http: support TLSv1.3
[NUTCH-2796] - Upgrade to crawler-commons 1.1
[NUTCH-2799] - Add .asf.yaml file
[NUTCH-2833] - Upgrade to Tika 1.25
[NUTCH-2835] - Upgrade commons-jexl from 2 --> 3
[NUTCH-2836] - Upgrade various commons dependencies
[NUTCH-2837] - Update multiple dependencies
[NUTCH-2841] - Upgrade xercesImpl dependency

Wish

[NUTCH-2834] - Deduplication mode via command line in crawl script

Task

... (truncated)

Commits

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
  • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
  • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
  • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

You can disable automated security fix PRs for this repo from the Security Alerts page.

Bumps [nutch](https://github.com/apache/nutch) from 1.11 to 1.18.
- [Release notes](https://github.com/apache/nutch/releases)
- [Changelog](https://github.com/apache/nutch/blob/master/CHANGES.txt)
- [Commits](https://github.com/apache/nutch/commits/release-1.18)

---
updated-dependencies:
- dependency-name: org.apache.nutch:nutch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
@dependabot dependabot bot added the dependencies Pull requests that update a dependency file label Mar 21, 2022