Performances issues with API v2 bulk_search #1681

Open
tdruez opened this issue Nov 22, 2024 · 1 comment

tdruez commented Nov 22, 2024

The current implementation of the packages bulk_search endpoint in API v2 is not usable.
A request for a single purl takes over 2 minutes (138 seconds in the run below), and a request for 15 purls ends in a timeout.

from timeit import default_timer as timer
import requests

purls = [
    "pkg:alpm/archlinux/[email protected]",
    "pkg:composer/guzzlehttp/[email protected]",
    "pkg:composer/guzzlehttp/[email protected]",
    "pkg:maven/org.apache.commons/[email protected]",
    "pkg:maven/org.elasticsearch/[email protected]",
    "pkg:nuget/[email protected]",
    "pkg:nuget/[email protected]",
    "pkg:nuget/[email protected]",
    "pkg:pypi/[email protected]",
    "pkg:pypi/[email protected]",
    "pkg:pypi/[email protected]",
    "pkg:pypi/[email protected]",
    "pkg:pypi/[email protected]",
    "pkg:pypi/[email protected]",
    "pkg:pypi/[email protected]",
]
data = {
    "purls": purls,
}

# v1, 15 purls
url = "https://public.vulnerablecode.io/api/packages/bulk_search"
start_time = timer()
response = requests.post(url, data)
print(timer() - start_time)  # 7 seconds
print(len(response.json())) # -> 15 entries

# v2, 15 purls
url = "https://public.vulnerablecode.io/api/v2/packages/bulk_search"
start_time = timer()
response = requests.post(url, data) # -> TIMEOUT
print(timer() - start_time)

# v2, single purl
url = "https://public.vulnerablecode.io/api/v2/packages/bulk_search"
data = {"purls": "pkg:pypi/[email protected]"}
start_time = timer()
response = requests.post(url, data)
print(timer() - start_time)  # -> 138 seconds
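
To make the v1/v2 comparison easier to repeat, the timing logic above could be wrapped in a small helper that sets an explicit client-side timeout and reports how each request ended. This is only a sketch: `time_bulk_search` and its `post` parameter are hypothetical names introduced here, and it assumes the JSON body supports `len()`, as the v1 response does in the script above.

```python
from timeit import default_timer as timer


def time_bulk_search(url, purls, post, timeout=30.0):
    """Time one bulk_search POST and report how it ended.

    `post` is any callable with the same calling convention as
    `requests.post` (hypothetical parameter, injected so the helper
    can be exercised without network access).
    Returns a tuple (elapsed_seconds, outcome_string).
    """
    start = timer()
    try:
        # Same form-encoded payload shape as the script above.
        response = post(url, data={"purls": purls}, timeout=timeout)
        response.raise_for_status()
        # Assumption: the decoded JSON body is a sized container.
        outcome = f"{len(response.json())} entries"
    except Exception as exc:  # timeouts, connection errors, HTTP errors
        outcome = f"failed: {type(exc).__name__}"
    return timer() - start, outcome
```

With `requests` installed, a call such as `time_bulk_search(url, purls, requests.post, timeout=180)` would time one endpoint. Passing the `post` callable in, rather than importing `requests` inside the helper, is a design choice that keeps the sketch runnable offline with a stub.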

@yawningwinner

Hi, I'm interested in helping out. Could you please provide more details?
