-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: update ScannerCapability, enabled_capabilities, HarborSbomRepor… #17
Conversation
Hi @knqyf263 we have some updates on the scanner spec 1.2. Please help to review it. Thanks! Summary of change points:
cc: @wy65701436 , @stonezdj |
cc: @chlins |
data: | ||
type: array | ||
items: | ||
$ref: '#/components/schemas/SbomData' | ||
additionalProperties: true | ||
description: 'The raw data of the sbom generated by the scanner and its format.' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not easy for us to implement this API. Currently, each report is stored separately. Trivy doesn't generate both SPDX and CycloneDX together. We prefer returning one report at /scan/{scan_request_id}/report
. Or it is better to return all reports (vulnerability and SBOM) together.
This suggestion is as follows:
- Vulnerability and SBOM reports are separately returned
- SPDX and CycloneDX reports are aggregated.
It looks inconsistent. I'd suggest
- Vulnerability, SPDX SBOM and CycloneDX SBOM are returned separately.
Or
- Vulnerability, SPDX SBOM and CycloneDX SBOM are returned together
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API doesn't require the backend implementation to store the information with requestID, you could stored each report with the key <requestID>:<media_type>
. then you can store whatever vulnerability, sbom/spdx or sbom/cyclonedx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@knqyf263 , the /scan/{scan_request_id}/report
is identified by scan_request_id
and Accept header
(example: application/vnd.security.vulnerability.report; version=1.1
or application/vnd.security.sbom.report+json; version=1.0
). That's the reason why SPDX SBOM report and CycloneDX SBOM report are aggregated.
We do understand that "Trivy doesn't generate both SPDX and CycloneDX together". For example, when /scan
is requested like below:
{
"enabled_capabilities": [
{
"type": "sbom",
"produces_mime_types": [
"application/vnd.security.sbom.report+json; version=1.0"
],
"accept_media_types": [
"application/spdx+json",
"application/vnd.cyclonedx+json"
]
}
]
}
trivy-adapter may need to run trivy CLI twice passing different parameters to generate different format of SBOM report.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API doesn't require the backend implementation to store the information with requestID, you could stored each report with the key :<media_type>. then you can store whatever vulnerability, sbom/spdx or sbom/cyclonedx
They all have different statuses, like spdx: completed
, cyclonedx: in progress
, but the API forces scanners to aggregate results. The specification should define what should be returned in that case. There are more considerations to take into account, so it is closer to a new API rather than an extension of the current API.
That's the reason why SPDX SBOM report and CycloneDX SBOM report are aggregated.
We can do that, but it complicates implementation. If you pass the media type to /scan/{scan_request_id}/report
, it simply returns one report. Does Harbor benefit from this specification? If you have a big advantage on this API, I'm ok, but it makes implementation more complicated for scanners and brings more considerations as mentioned above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@knqyf263 if the /scan
request asks for both SPDX sbom and Cyclonedx sbom, like below:
{
"enabled_capabilities": [
{
"type": "sbom",
"produces_mime_types": [
"application/vnd.security.sbom.report+json; version=1.0"
],
"accept_media_types": [
"application/spdx+json",
"application/vnd.cyclonedx+json"
]
}
]
}
and if we pass the Accept-Media-Type
header (application/spdx+json
or application/vnd.cyclonedx+json
) to /scan/{scan_request_id}/report
so that it simply returns one report at one http request to /scan/{scan_request_id}/report
, there will be one {scan_request_id}
used in two requests to /scan/{scan_request_id}/report
. Can the one {scan_request_id}
be used two times?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scan_request_id
is already used multiple times for vulnerabilities and SBOM.
If I understand correctly, the current suggestion is as follows (Option 1). The same scan_request_id
is used twice.
scan_request_id
(e.g. ABCDE12345) +mime_type
- ABCDE12345 +
application/vnd.security.vulnerability.report+json; version=1.1
- Return one report
- ABCDE12345 +
application/vnd.security.sbom.report+json; version=1.0
- Return two reports (SPDX and CycloneDX) in one request
- ABCDE12345 +
Would the following approach (Option 2) not work?
scan_request_id
(e.g. ABCDE12345) +mime_type
(+media_type
)- ABCDE12345 +
application/vnd.security.vulnerability.report+json; version=1.1
- Return one report
- ABCDE12345 +
application/vnd.security.sbom.report+json; version=1.0
+application/spdx+json
- Return one report (SPDX SBOM)
- ABCDE12345 +
application/vnd.security.sbom.report+json; version=1.0
+application/vnd.cyclonedx+json
- Return one report (CycloneDX SBOM)
- ABCDE12345 +
The same ID can be used three times.
Or aggregate all reports (Option 3). scan_request_id
will be unique.
scan_request_id
(e.g. ABCDE12345)- ABCDE12345
- Return three reports (vuln, SPDX and CycloneDX)
- ABCDE12345
I'm just curious Option 1 has any advantages over Option 2 and Option 3. Option 2 is simpler from the implementation perspective (it is already done in aquasecurity/harbor-scanner-trivy#422). But again, if you see benefits in Option 1, I'm ok with the approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@knqyf263 if we go with the Option 2
you mentioned above, but having media_type
(application/spdx+json
or application/vnd.cyclonedx+json
) as optional query parameter in /scan/{scan_request_id}/report
, for example /scan/{scan_request_id}/report?media_type=application/spdx+json
(the /
in application/spdx+json
might be encoded), is it okay with you?
- /scan/{scan_request_id}/report
- return one report for vulnerabilities
- /scan/{scan_request_id}/report?media_type=application/spdx+json
- return one report of SPDX SBOM
- /scan/{scan_request_id}/report?media_type=application/vnd.cyclonedx+json
- return one report of CycloneDX
Once you confirm it, i'll update the PR.
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it works. I tried to express "optional" by (+ media_type
). But all of the above options work. You can decide it. I just tried to understand the pros/cons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@knqyf263 , Thank you for the confirmation! We think your suggestion of Option 2
makes sense and it makes the process of getting scan report request much simpler. Updated the PR accordingly.
Please help to review it. Appreciated!
31db260
to
1cbaa54
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
} | ||
] | ||
properties: | ||
$ref: "#/components/schemas/ScannerProperties" | ||
description: | | ||
Represents metadata of a Scanner Adapter which allows Harbor to lookup a scanner capable | ||
Represents metadata of a Scanner Adapter which allows Harbor to lookup a scanner capabilities |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
capability
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
776efab
@knqyf263 for awareness, made a few subtle changes after your approval on this PR: |
and add /scan/{scan_request_id}/report 400 response code Signed-off-by: Shengwen Yu <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
After I started implementing this spec, another question came to my mind.
pluggable-scanner-spec/api/spec/scanner-adapter-openapi-v1.2.yaml Lines 314 to 323 in 776efab
Is there any reason another parameter name ( pluggable-scanner-spec/api/spec/scanner-adapter-openapi-v1.2.yaml Lines 359 to 368 in 776efab
It is not a big deal, but I feel like using the same name is more straightforward because their relationship is "supported capabilities" and "enabled capabilities". They refer to essentially the same thing. What do you think? @zyyw |
Hi @knqyf263 , regarding
The main reason of using two different terms refer to the same value is that |
Yes, it makes sense. |
the |
Thanks for your explanation. OK, I'll implement it. |
…t and sbom data spec