Harbor opa agent #135

Open
wants to merge 34 commits into
base: main
Changes from 30 commits
3d2a3a2
Initial commit for the Harbor OPA integration proposal#
prahaladdarkin Mar 15, 2020
3066765
Text summarization
prahaladdarkin Mar 15, 2020
6757627
Adjusting text sections between Abstract and Background
prahaladdarkin Mar 15, 2020
8ed60b6
Adding Harbor Policy layer images
prahaladdarkin Mar 15, 2020
b3b1361
Embedding Harbor Policy Agent Component view
prahaladdarkin Mar 15, 2020
e8fbb03
Correcting an image path
prahaladdarkin Mar 15, 2020
4dc472e
Uploading sequence diagrams and low level architecture diagrams
prahaladdarkin Mar 15, 2020
2432fa7
Image formatting
prahaladdarkin Mar 15, 2020
1d9526c
Modifying low level component diagram and sequence diagram to also hi…
prahaladdarkin Mar 15, 2020
22bf648
Adding a section for deployment of the Harbor Policy Agent
prahaladdarkin Mar 16, 2020
930f3a6
Adding section for use cases
prahaladdarkin Mar 24, 2020
e99c2ac
Updating proposal with Use cases, user workflow flowcharts and modifi…
prahaladdarkin Mar 26, 2020
816c7b5
Adding some separation text
prahaladdarkin Mar 26, 2020
287503b
Merge branch 'master' into harbor_opa_agent
prahaladdarkin Mar 26, 2020
bfcf10e
docs[P2P]: add meeting minutes for 03/12
steven-zou Mar 12, 2020
e7f6842
Initial commit for the Harbor OPA integration proposal#
prahaladdarkin Mar 15, 2020
3ac2c69
Text summarization
prahaladdarkin Mar 15, 2020
b7dceab
Adjusting text sections between Abstract and Background
prahaladdarkin Mar 15, 2020
803e054
Adding Harbor Policy layer images
prahaladdarkin Mar 15, 2020
1aa9325
Embedding Harbor Policy Agent Component view
prahaladdarkin Mar 15, 2020
aeb89a5
Correcting an image path
prahaladdarkin Mar 15, 2020
a2920b4
Uploading sequence diagrams and low level architecture diagrams
prahaladdarkin Mar 15, 2020
490a75f
Image formatting
prahaladdarkin Mar 15, 2020
1ab40dc
Modifying low level component diagram and sequence diagram to also hi…
prahaladdarkin Mar 15, 2020
dd8d47a
Adding a section for deployment of the Harbor Policy Agent
prahaladdarkin Mar 16, 2020
13f658f
Adding section for use cases
prahaladdarkin Mar 24, 2020
4e400c2
Updating proposal with Use cases, user workflow flowcharts and modifi…
prahaladdarkin Mar 26, 2020
49dd8a7
Adding some separation text
prahaladdarkin Mar 26, 2020
67ec26d
Merge branch 'harbor_opa_agent' of https://github.com/prahaladdarkin/…
prahaladdarkin Mar 26, 2020
7a5648a
Merge branch 'master' into harbor_opa_agent
prahaladdarkin May 6, 2020
b24062a
Round 1 - Addressed review comments. Still more to go
prahaladdarkin May 28, 2020
def0217
Merge branch 'harbor_opa_agent' of https://github.com/prahaladdarkin/…
prahaladdarkin May 28, 2020
b06b565
Adding a new screen mock-up to illustrate how Project Owners can uplo…
prahaladdarkin May 29, 2020
22b878c
Merge branch 'master' into harbor_opa_agent
prahaladdarkin Jun 8, 2020
159 changes: 159 additions & 0 deletions proposals/new/opaintegration.md
@@ -0,0 +1,159 @@
# Proposal: Integrate Harbor with Open Policy Agent

Author: Prahalad Deshpande

## Abstract
This proposal introduces an integration between Harbor and the [Open Policy Agent](https://www.openpolicyagent.org/). The integration will allow users to evaluate and enforce custom policies on images stored in and retrieved from Harbor. It would also allow a variety of security and compliance checks to be enforced as part of build and deploy pipelines by exposing a uniform set of APIs.
The motivation behind this proposal is the Common Vulnerability Schema Specification for the Cloud Native Workload. Integrating Harbor with OPA will introduce rich policy evaluation capabilities within Harbor and open the door to integrations with tools that enforce IT GRC compliance in the cloud native ecosystem.

## Background
Harbor currently supports scanning images for OS vulnerabilities using the [pluggable scanner framework](https://github.com/goharbor/pluggable-scanner-spec). Using the framework, end users can use OS vulnerability scanners of their choice to understand the OS vulnerabilities present within the system. However, Harbor's reporting capabilities with respect to the security and compliance posture of the images persisted within it are minimal and restricted to a summarized aggregate of the High, Medium and Low vulnerabilities within the images. There are other crucial limitations, such as:
* The scanning for vulnerabilities is scoped to images within a project of which the end user is a member
* The scanning is restricted to OS vulnerabilities.
Contributor

This limitation comes from the scanners themselves and is not related to compliance checking. It does not seem suitable to put it here.

Contributor Author

Got it. I have removed this line.

* There is no mechanism for end users to act on the results of a scan, for example by quarantining the image or preventing the creation of Pod workloads from these images in Kubernetes clusters.
* There is also no mechanism right now to "slice and dice" the results of the vulnerability evaluation.

The above limitations leave critical security use cases unaddressed for the enterprise security administrator, who is concerned about the security and compliance posture of the deployed applications and of the container registry itself (including identifying the projects that contain the offending images) rather than about individual projects.
Additionally, from the DevOps perspective, it is not possible to define deployment criteria that prevent deploying a service based on an image that does not satisfy the acceptance criteria, or that pre-empt the creation of deployment artifacts (e.g. Helm charts) when the deployments use such images.
As can be seen from the use cases above, there is a requirement to persist the results of a scan or vulnerability evaluation in a format that supports ad-hoc querying and can be presented within a report.
Contributor

Should "supports ad-hoc querying as well as presentable within a report" be covered by OPA? It seems like a function of some metrics service such as Elasticsearch.

Contributor Author

This should actually be a separate proposal. It relates to the Common Vulnerability Schema documentation that I shared previously, wherein the results of the compliance evaluation as well as the policy evaluation would be stored in an RDBMS or Elasticsearch. I will create another proposal for that and am removing this statement from the current proposal.

Additionally, there is a critical requirement for end users to be able to author complex policies that evaluate the results of an image scan and flag the image as matching or failing the acceptance criteria, and to share these policies across departments so that a set of best practices is implemented and enforced uniformly.
Contributor

I think this is the real requirement of this feature.


To address the above requirements and use cases, an integration between Harbor and [Open Policy Agent](https://www.openpolicyagent.org/) is proposed. Open Policy Agent (OPA) is a policy authoring and evaluation framework that is being widely adopted within the Cloud Native Computing Foundation ecosystem. Refer to [OPA Integrations](https://www.openpolicyagent.org/docs/latest/ecosystem/) for a set of compelling and interesting integrations.
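
As a rough illustration of how a client could ask OPA for a policy decision, the sketch below builds a request against OPA's REST Data API (`POST /v1/data/<policy path>` with an `input` document). The OPA address, the policy path `harbor/images/acceptance` and the shape of the scan summary are assumptions made for this example only; none of them is defined by this proposal.

```python
import json
from urllib import request

OPA_URL = "http://localhost:8181"          # assumption: a locally running OPA server
POLICY_PATH = "harbor/images/acceptance"   # hypothetical policy package path


def build_opa_query(scan_summary: dict) -> request.Request:
    """Build (but do not send) a Data API request asking OPA to evaluate a policy."""
    body = json.dumps({"input": scan_summary}).encode("utf-8")
    return request.Request(
        url=f"{OPA_URL}/v1/data/{POLICY_PATH}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical scan summary for one image.
req = build_opa_query({"image": "library/photon:3.0", "critical": 0, "high": 2})
# Sending it would look like: json.load(request.urlopen(req))["result"]
```

The request is only constructed here so that the example runs without an OPA server; the decision itself would be whatever the uploaded Rego policy returns under the `result` key.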

## Use Cases

### Security Admin Persona
* What are the **Critical** vulnerabilities present in my **Harbor registry**?
Contributor

This already exists, but we need the ability to display these across all projects that the user is entitled to, instead of having to dig into each project to view them.

Contributor Author

Got it. Some of the use cases mentioned here can also be satisfied by the current implementation in Harbor with enhancements to the user interface. I was listing the use cases for completeness with respect to policy integration.

Contributor

Disagree. IMO, OPA integration should focus on policy enforcement, not data querying or reorganization.

This can be covered by other tools/services.

Contributor Author

@prahaladdarkin prahaladdarkin May 28, 2020

@steven-zou I agree with your point in principle. The above use case can be satisfied by reorganizing the data within the Harbor database to conform to the schema mentioned in the original document shared on the Cloud Native Common Vulnerability Evaluation and Reporting Framework. I will create a separate proposal for that.

However, the very nature of OPA is that it allows users to write policies that need not be decision-making policies. If my understanding is correct, Harbor would eventually also function as a policy store for image compliance policies (@xaleeks please correct me if I am wrong), and these policies need not be limited to those that return True or False or a single scalar value for decision making. Other tools and services can also query the policy data store to answer the type of questions in the use case above.

This type of integration would open up the possibility for the user to run custom analytics and queries against the Harbor OPA data store. @xaleeks please let us know what you think

Contributor Author

@steven-zou I agree with you that, in principle, the above use case can be achieved using data schema reorganization as described in the original document shared on the Cloud-Native Common Vulnerability Specification.
That said, OPA allows users to write policies that return data other than scalars (true, false or a number). Since we would be storing this data within the Harbor OPA data store, other tools and services could query that data store to answer such queries and take further actions or decisions.
The use case was mentioned here to highlight the possibility of users running ad-hoc queries and analytics against the data in the Harbor OPA data store.

* Which images are impacted by **CVE-12345** which has been flagged as business critical?
* Which Helm charts use images with **Critical** vulnerabilities?
* Where can I get access to a summary report on a regular cadence?
* What **Services** out in the field use an image containing **CVE-12345**?
Contributor

I don't know if Harbor itself can answer that question, because we don't track things 'out in the field'. This seems like a question for something like Sonobuoy, which can work in conjunction with Harbor.

Contributor Author

Yes, agreed. This is not the functionality of core Harbor. However, the Harbor OPA integration would store the OPA policy evaluation results and hence could act as the data store used by frameworks like Sonobuoy.
Similar to the pluggable scanner framework, Sonobuoy could query the Harbor OPA data store with the name of the vulnerability, retrieve the results or the names of the images that contain it, and then take further action as appropriate.
The report for such a query, however, would come from Sonobuoy.

Contributor

Same concerns as @xaleeks.

I don't think Harbor has answers to those questions either.

Contributor Author

@steven-zou As mentioned in my response above, Harbor need not answer this as its core functionality. However, the OPA data store in Harbor would be able to answer this query for external tools. @xaleeks please comment on what you think about this particular approach.

* I have a set of enterprise-wide "Acceptance for Use" criteria that must be satisfied. How can I identify images that satisfy these criteria and those that do not?
* How can I share the best practice checks that I have designed with my enterprise partner organizations so all maintain the same level of compliance?
Contributor

what is this referring to? replicating opa policies to another harbor instance?

Contributor Author

Yes. We can replicate the best practices and compliance checks to maintain a uniform level of compliance across the board. It also allows for centralized policy authoring.

* I want to evaluate images against standard IT-GRC policies (PCI, HIPAA) and score images against these policies.
Contributor

how would the scoring be accomplished?

Contributor Author

Using OPA policies. The user would write custom OPA policies in Rego (the policy language used by OPA) where the scoring logic can be implemented. The scores can then be retrieved from the Harbor OPA data store.

Contributor

This sounds like statistical calculation. Does it need to depend on OPA?

Contributor Author

@steven-zou There is a need to depend on OPA for this. At the base level, the task involves querying data from the data store and aggregating it using some calculations. These calculations would need to be done in OPA on the fetched data if we want to come up with standard scoring policies that can be shared across the board.
The way I envision it working:

  1. Fetch data of compliance evaluation from scanners
  2. Evaluate OPA policies containing the statistical and mathematical calculations against this data
  3. Store the results of the evaluation in the Harbor OPA datastore.

The results in step 3 can then be queried by external tools and services to take further actions or decisions.
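
The three steps above can be sketched roughly as follows. This is a minimal, in-memory illustration and not the proposed implementation: the `ResultStore` class stands in for the Harbor OPA data store, a plain Python function stands in for a Rego scoring policy, and the scoring rule itself (start at 10, subtract 3 per critical and 1 per high finding) is invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A "policy" here is a plain Python callable standing in for a Rego policy.
Policy = Callable[[dict], dict]


@dataclass
class ResultStore:
    """In-memory stand-in for the proposed Harbor OPA data store."""
    results: Dict[str, Dict[str, dict]] = field(default_factory=dict)

    def save(self, image: str, policy_name: str, result: dict) -> None:
        self.results.setdefault(image, {})[policy_name] = result


def pci_score(scan: dict) -> dict:
    """Hypothetical scoring rule: start at 10, subtract per finding, floor at 0."""
    score = 10 - 3 * scan.get("critical", 0) - 1 * scan.get("high", 0)
    return {"score": max(score, 0)}


def run_pipeline(scans: List[dict], policies: Dict[str, Policy], store: ResultStore) -> None:
    for scan in scans:                                     # step 1: data fetched from scanners
        for name, policy in policies.items():              # step 2: evaluate each policy
            store.save(scan["image"], name, policy(scan))  # step 3: persist the result


store = ResultStore()
run_pipeline(
    [
        {"image": "payments/api:1.2", "critical": 1, "high": 2},
        {"image": "library/photon:3.0", "critical": 0, "high": 0},
    ],
    {"pci": pci_score},
    store,
)
```

Because the policy returns a document rather than a bare boolean, external tools reading `store.results` could apply their own thresholds to the stored scores.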

* I want to apply a custom scoring policy to images based on my organization's acceptance criteria. I then want to ensure that images with a low score are not permitted to be used when launching any workloads.


### Dev Ops Persona
Contributor

is this equivalent to the system admin?

Contributor Author

Nope. This persona refers to someone or something (say, a Kubernetes Operator) responsible for creating services and pods in production Kubernetes or Docker Swarm clusters from the images stored in Harbor.


* Do not deploy a service S which uses images containing a **Critical** vulnerability **V**
* Fail the creation of Helm charts if they use an image with **Critical** vulnerabilities.
Contributor

there's no building of helm charts on harbor atm

Contributor Author

Yes. However, as mentioned previously, if Harbor OPA were to store the results of the policy scanning, then the deployment scripts would be able to query this data and decide on the next steps.

Contributor

Does the Helm CLI provide such an interface for integration? Or does it depend on the user's behavior?

Contributor

What you mentioned here is not about preventing pulls from Harbor, right? Do you want to provide a query service that can be called?

Contributor Author

I was referring to the Helm Plug-ins (https://helm.sh/docs/topics/plugins/) which can be written in any language as per the documentation. Users have the ability to write their own plug-ins that will take actions based on the data queried from the Harbor OPA data store. Here is a workflow that can be achieved with Helm:

  1. The Helm plug-in queries the Harbor OPA data store to check for any specific policy violations.
  2. If policy violations are found, do not pull the image from Harbor.

As mentioned in the design below, a reporting API layer would expose the Harbor OPA data store to be queried by the external world.
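
The decision logic such a plug-in might apply can be sketched as below. The function names are hypothetical, and the `evaluations` dictionary stands in for whatever the reporting API would return. Treating an image with no stored evaluation as blocked (fail closed) is a design choice made for this example, not something the proposal specifies.

```python
from typing import Dict, Iterable, List


def violating_images(images: Iterable[str], evaluations: Dict[str, List[str]]) -> List[str]:
    """Return images that have recorded violations or no evaluation at all.

    `evaluations` maps image name -> list of violation descriptions.
    Images without an entry are treated as blocked (fail closed).
    """
    blocked = []
    for image in images:
        violations = evaluations.get(image)
        if violations is None or len(violations) > 0:
            blocked.append(image)
    return blocked


def may_install(chart_images: Iterable[str], evaluations: Dict[str, List[str]]) -> bool:
    """Step 2 of the workflow: refuse to proceed if any image is blocked."""
    return not violating_images(chart_images, evaluations)


# Hypothetical data as it might come back from the reporting API.
evaluations = {
    "payments/api:1.2": ["CVE-12345 present"],
    "library/photon:3.0": [],
}
```

A chart that references only `library/photon:3.0` would be allowed, while one that also references `payments/api:1.2` would be rejected before any pull is attempted.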


### Application Developer/Owner Persona
* Pull images that have a PCI compliance policy evaluation score greater than 8 out of 10
Contributor

I assume 'pull' and 'deploy a service' is the same thing?

Contributor Author

Actually I was referring to the Harbor functionality where-in we can fetch images from another registry like say Docker. The intent here is to ensure that when images are pulled from Docker, they would be scanned and evaluated for compliance using the Harbor OPA agent prior to the operation being marked as successful/failed.


### Project Owner Persona
* Quarantine images whose PCI compliance policy evaluation score is less than 8
Contributor

what does quarantine mean in harbor? We don't have this concept

Contributor Author

Yes, Harbor currently does not have this functionality. What this implies, however, is that a workflow can be developed wherein Harbor OPA flags non-compliant images and Harbor then either runs a Garbage Collection on them or prevents the use and replication of such flagged images.

Contributor

Per my understanding, quarantine == prevent pulling, correct?

Contributor Author

@steven-zou by the term quarantine I meant isolating these images from other compliant images and making sure that images are never pulled or replicated. Such images would end up eating storage space and hence it would be also good if the Garbage Collector would simply delete them.

* Do not replicate images having vulnerability **CVE-12345** to any destination.
* Do not replicate images with low PCI scores to my production Harbor registry.
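
The two replication use cases above amount to a filter over stored evaluation results. The sketch below assumes a hypothetical `Evaluation` record holding a PCI score on a 0 to 10 scale and the set of CVEs found in the image; the threshold of 8 mirrors the quarantine use case and is otherwise arbitrary.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Evaluation:
    image: str
    pci_score: int          # assumed 0-10 scale, as in the use cases
    cves: FrozenSet[str]    # CVE identifiers reported for the image


def replicable(evals: List[Evaluation], blocked_cves: FrozenSet[str], min_score: int) -> List[str]:
    """Keep images that meet the minimum score and contain no blocked CVE."""
    return [
        e.image
        for e in evals
        if e.pci_score >= min_score and not (e.cves & blocked_cves)
    ]


evals = [
    Evaluation("a:1", 9, frozenset()),
    Evaluation("b:1", 9, frozenset({"CVE-12345"})),
    Evaluation("c:1", 5, frozenset()),
]
allowed = replicable(evals, frozenset({"CVE-12345"}), min_score=8)
```

Here only `a:1` passes: `b:1` contains the blocked CVE and `c:1` scores below the threshold.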




As can be seen from the use cases above, there is a requirement to persist the results of a scan or vulnerability evaluation in a format that supports ad-hoc querying and can be presented within a report.

Additionally, there is a critical requirement for end users to be able to author complex policies that evaluate the results of an image scan and flag the image as matching or failing the acceptance criteria, and to share these policies across departments so that a set of best practices is implemented and enforced uniformly.

## Proposal

The next sections describe the user workflows, low level architectures and messaging flow that enable the Harbor - OPA integration.

### Harbor Policy Agent User Workflow

In keeping with the multi-tenancy tenets of Harbor, OPA policies would be scoped at a per-user level and would be available for use across **all** projects of which the user is a member. This level of scoping will satisfy the following scenarios:
Contributor

Scoped means users can author their own policies, right? Not enforced at the user level; that would be messy.

Contributor Author

Correct. The term scoped has been used to refer to the fact that the user can author and upload/update his/her policies. It is not enforcement at the user level.

* Users can author policies relevant to the projects that they manage or are a member of, and ensure that images within those projects can be evaluated against the policies.
* It eases migration to a cloud-based deployment wherein each user would be a tenant and could own multiple projects against which the policies can be evaluated.
* To satisfy the use case of a registry-wide or organization-specific Security Admin persona, a new user account for the Security Admin can be created and all enterprise-wide acceptance policies can be placed under that account. Security Admins can then run evaluations against these policies on demand, inline with a scan, or on a periodic schedule.
Contributor

So are policies created centrally by a new security admin role, or specified at the user level mentioned earlier? It might turn into one of those project vs. system cases where the project can extend the system-level settings, as is the case for CVE whitelisting.

Contributor Author

Policies can be authored at both levels. The system-wide policies are more like enterprise-wide policies which every image must adhere to, whereas individual projects could have additional policies that their images should comply with. A typical example could be that all Photon images used within Acme E-Commerce Corp must have the latest certified build of Photon. Additionally, images within the Harbor project owned by the Payment Services team must comply with PCI-related checks.

Contributor

It's very different from the current harbor design. For applying to multiple projects, it's put at the system level. For a single project, it's put under the specified project. At present Harbor does not support managing resources at the same place for users with different permissions.

Actually, I like this management view. But it seems not doable based on current Harbor implementation.

Contributor Author

@steven-zou The mechanism I am proposing here is the same as the one applied for CVE whitelists, as @xaleeks has already mentioned. I also went through the documentation here https://goharbor.io/docs/1.10/administration/vulnerability-scanning/configure-system-whitelist/.
Basically, the intent is to have system-wide compliance policies that must be adhered to within the registry, and per-project policies that can override them or add further policies.
The implementation of the Security Admin persona is not related to this proposal; it has been mentioned for completeness. From what I understand after reading the documentation and @xaleeks' comment, the new Security Admin role is not a requirement at the moment.
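
One way to read "system-wide policies plus per-project overrides or additions" is as a simple layered merge, sketched below. The policy names and file paths are hypothetical and borrow from the Photon and PCI examples in this thread. Whether a project should really be allowed to replace a system-wide policy is an open design question that this sketch does not settle.

```python
from typing import Dict


def effective_policies(system: Dict[str, str], project: Dict[str, str]) -> Dict[str, str]:
    """Project entries override system entries with the same name and add new ones."""
    merged = dict(system)    # start from the system-wide set
    merged.update(project)   # project-level entries win on name clashes
    return merged


# Hypothetical policy names mapped to hypothetical Rego bundle paths.
system_policies = {"photon-certified-build": "system/photon.rego"}
payments_project = {"pci-checks": "payments/pci.rego"}

payments_effective = effective_policies(system_policies, payments_project)
```

For the payments project this yields both the system-wide Photon check and the project's own PCI checks.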


* **Note: the Security Admin persona is not within the scope of this proposal and would need to be treated as a separate proposal.**


The sub-sections below detail some of the proposed workflows. Instead of providing a UI mockup, the focus is on the segues through which the user will navigate to accomplish actions related to policy evaluation and reporting.

**Images and icons need to be finalized and are used for illustration purposes only**

#### Accessing Policy Functionality in Harbor

The mockup below shows how the Policy functionality is accessed within Harbor.
Contributor

this would need system admin access fyi, so project admin would not be able to access these

Contributor Author

Got it. That was an illustration. However, we can have a mechanism wherein even project admins are able to author and upload policies. I have not been able to come up with the UX/UI for that at the moment.

Contributor

It's inconsistent with the workflow design shown above.

All the sections under Administration can be accessed only by the users with System Admin role.

Contributor Author

@steven-zou - I placed it within that panel only for the purpose of illustration. Individual project owners who are not system admins can upload their authored policy bundle as mentioned in the new screenshot that I have provided.

![Harbor Policy Upload Screen](../images/opaintegration/Harbor_Policy_Link.png)

#### Policy Upload User Workflow

![Harbor Policy Upload User Workflow](../images/opaintegration/User_Policy_Upload_Diagram.png)
Contributor

so is this where OPAs are injected?

Contributor Author

Correct.


#### Policy Evaluation - Policy Centric Workflow

The diagram below depicts the user workflow for a policy-centric evaluation, i.e. the policy is the reference point against which images are evaluated.

![Harbor Policy Evaluation Policy_Centric](../images/opaintegration/User_Policy_Evaluation_Policy_Centric.png)
Contributor

I think the 'policies' tab should just hold policies, they are executed against in the actual projects, just like vulnerability scanning, ie the configuration of the scanners vs the actual execution of the scans

Contributor Author

Yes, that is how it would be. Policies tab lists the policies and then the user can select the policies that need to be evaluated against images in the projects where they are an admin.

Contributor

As Rego policies are plain text, supporting an online editor could also be an option.

Contributor Author

Sure. That would actually provide for an integrated experience :). But again it could be taken up in a later iteration IMHO, after implementing the core OPA integration pipeline.



#### Policy Evaluation - Image Centric Workflow

The diagram below depicts the user workflow for an image-centric evaluation, i.e. the image is the reference point against which policies are evaluated.

![Harbor Policy Evaluation Image_Centric](../images/opaintegration/User_Policy_Evaluation_Image_Centric.png)
Contributor

got it, this is akin to the vulnerability scanning experience

Contributor Author

Yep, it is similar to the vulnerability scanning experience.


#### Policy Evaluation Reporting Workflow

![Harbor Policy Evaluation Reporting Workflow](../images/opaintegration/User_Policy_Evaluation_Reports.png)
Contributor

What does the report look like for a single policy? Is this an aggregated summary of all historical evaluations against this policy, across all projects for this particular user? What if I'm logged in as the security admin?

Contributor Author

I have not yet come up with the UI representation of the report, but in terms of content it would be what you said. I was envisioning a simple tabular list enumerating the image, the corresponding project, and a link to the latest evaluation report. Do we need historical results? I will update the content to be more specific once we complete this initial discussion.

Contributor

It seems you're using OPA policy to support some new functions.

Before checking this, I was thinking we could use OPA to do compliance checking in the middleware layer by the Harbor system (not by users) and replace the current checking policies that are implemented in native code.

@xaleeks What do you think?



The next sections detail the low-level architecture, component view, and interaction diagrams for the Harbor Policy Agent.
Contributor

please help review @steven-zou @reasonerjt

Contributor

Can we make OPA itself a configurable service? That way I can leverage my existing OPA to do the work. The user just needs to specify an OPA endpoint on the Harbor side.

I don't think the policy agent needs to listen to the job service webhook, it can directly listen to the harbor webhook which includes the ScanComplete event.

The architecture is not too clear and may need to be refactored.

Actually, the most important thing is before reviewing the implementation design details we need to reach an agreement on the points in the previous sections, especially the personas/user stories.

I'm wondering whether we need to expand OPA to be a policy engine as well as a metric engine, and whether we should provide entry points for users to directly run policy evaluations and check results.

We can continue the offline discussions.

Contributor Author

@steven-zou Yes, the OPA instance being used could be one that already exists in the user's environment, and Harbor can be made to talk to it. In fact, all the PoCs demonstrated so far were built on this assumption. I also think this will result in a cleaner integration: the OPA instance could be upgraded without impacting Harbor, allowing the user to use the latest features. Additionally, it absolves Harbor from managing the configuration of OPA.
However, the results of the policy evaluation would be stored in Harbor using the schema proposed in the Common Vulnerability Evaluation and Reporting framework document. We can discuss this offline to reach a conclusion. We should also discuss which parts of the architecture need to be refactored to make things clearer.


### Harbor Policy Agent Component View

The **Harbor Policy Agent** provides policy evaluation and reporting capabilities within the Harbor ecosystem. A component view of the policy agent is shown below.

![Harbor Policy Agent Component View](../images/opaintegration/Harbor_Policy_Agent.png)
Contributor

please review this architecture @steven-zou @reasonerjt


The core components of the policy evaluation and reporting layer are:
* Policy Agent
* PostgreSQL DB
* Elasticsearch

#### Policy Agent
The **Policy Agent** contains all the components required for processing OPA policies, evaluating them, and persisting the evaluation results to the PostgreSQL DB and the Elasticsearch store. Each layer within the **Policy Agent** has a specific responsibility:
* Vulnerability Data Fetch layer - responsible for loading vulnerability and scan data from a set of data stores. The data stores could be backed by a file system or a database.
* Policy layer - responsible for policy storage, retrieval and evaluation using the OPA framework. This layer is detailed further in the sections below.
* Storage layer - responsible for providing the required storage abstractions over the various data stores for the policy evaluation results and, optionally, any additional data.
* Reporting layer - responsible for exposing a set of REST APIs for querying policy and evaluation data and metrics.
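
The layering above could be expressed as a handful of small Go interfaces. The following is only an illustrative sketch: every type and method name (`DataFetcher`, `PolicyEvaluator`, `ResultStore`, `Agent`, and so on) is hypothetical and not taken from the proposal, the evaluator is a hard-coded stand-in rather than a call into OPA, and the reporting layer is reduced to a single query method on the store.

```go
package main

import (
	"errors"
	"fmt"
)

// ScanData is a hypothetical, simplified view of the scan results for one image.
type ScanData struct {
	ImageID       string
	CriticalCount int
}

// EvaluationResult is a hypothetical record of one policy evaluation.
type EvaluationResult struct {
	PolicyID string
	ImageID  string
	Passed   bool
}

// DataFetcher models the Vulnerability Data Fetch layer.
type DataFetcher interface {
	Fetch(imageID string) (ScanData, error)
}

// PolicyEvaluator models the Policy layer. A real implementation would
// delegate to OPA; the stand-in below does not.
type PolicyEvaluator interface {
	Evaluate(policyID string, data ScanData) (EvaluationResult, error)
}

// ResultStore models the Storage layer, plus the one query the
// Reporting layer needs in this sketch.
type ResultStore interface {
	Save(r EvaluationResult) error
	ByPolicy(policyID string) []EvaluationResult
}

// mapFetcher is an in-memory DataFetcher keyed by image ID.
type mapFetcher map[string]ScanData

func (m mapFetcher) Fetch(imageID string) (ScanData, error) {
	d, ok := m[imageID]
	if !ok {
		return ScanData{}, errors.New("no scan data for image " + imageID)
	}
	return d, nil
}

// noCriticals is a stand-in evaluator: an image passes when it has no
// critical vulnerabilities.
type noCriticals struct{}

func (noCriticals) Evaluate(policyID string, d ScanData) (EvaluationResult, error) {
	return EvaluationResult{PolicyID: policyID, ImageID: d.ImageID, Passed: d.CriticalCount == 0}, nil
}

// memStore is an in-memory ResultStore.
type memStore struct{ results []EvaluationResult }

func (s *memStore) Save(r EvaluationResult) error {
	s.results = append(s.results, r)
	return nil
}

func (s *memStore) ByPolicy(policyID string) []EvaluationResult {
	var out []EvaluationResult
	for _, r := range s.results {
		if r.PolicyID == policyID {
			out = append(out, r)
		}
	}
	return out
}

// Agent wires the layers together: fetch, evaluate, persist.
type Agent struct {
	Fetcher   DataFetcher
	Evaluator PolicyEvaluator
	Store     ResultStore
}

func (a Agent) Run(policyID, imageID string) (EvaluationResult, error) {
	data, err := a.Fetcher.Fetch(imageID)
	if err != nil {
		return EvaluationResult{}, err
	}
	res, err := a.Evaluator.Evaluate(policyID, data)
	if err != nil {
		return EvaluationResult{}, err
	}
	return res, a.Store.Save(res)
}

func main() {
	store := &memStore{}
	agent := Agent{
		Fetcher:   mapFetcher{"img-1": {ImageID: "img-1", CriticalCount: 2}},
		Evaluator: noCriticals{},
		Store:     store,
	}
	res, err := agent.Run("no-criticals", "img-1")
	fmt.Println(res, err, len(store.ByPolicy("no-criticals")))
}
```

Keeping each layer behind an interface would let the PostgreSQL and Elasticsearch stores, or an OPA-backed evaluator, be swapped in without touching the orchestration in `Run`.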

#### PostgreSQL DB
The PostgreSQL DB will store the results of the policy evaluation process in a normalized form that allows ad-hoc querying of the data.

#### Elasticsearch
The Elasticsearch data store will store the policy evaluation results indexed by their text contents, so that full-text search is available over the policy data and the policy evaluation results.

### Harbor Policy Agent Low Level Design

The low-level (interface and struct level) components of the Policy Agent are shown in the diagram below.

![Harbor Policy Agent Low Level Component Diagram](../images/opaintegration/Harbor_Policy_Agent_Detailed_View.png)

### Harbor Policy Agent Policy Upload Workflow

The sequence diagram below depicts the interactions between the various components of the **Harbor Policy Agent** during the policy upload process.

![Harbor Policy Agent Policy Upload Workflow](../images/opaintegration/Harbor_Policy_Workflow_PolicyUpload.png)

### Harbor Policy Agent Policy Evaluation Workflow

Harbor policy evaluation can be triggered in any of the following three ways:
* Inline with an image scan - policy evaluation happens immediately after an image scan completes.
* On demand - the user triggers policy evaluation by specifying a policy ID and an image ID.
* Scheduled - policy evaluation is scheduled to run exactly once or at a periodic interval.

The sequence diagram below depicts the interactions between the various components of the **Harbor Policy Agent** during the policy evaluation process. It also depicts how Harbor interacts with the Policy Agent through the Policy HTTP Endpoint to trigger a policy evaluation.

![Harbor Policy Agent Policy Evaluation Workflow](../images/opaintegration/Harbor_Policy_Evaluation_Workflow.png)
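
If the Policy Agent talks to an OPA server over HTTP, the evaluation step could look roughly like the sketch below. It uses OPA's Data API, which accepts `POST /v1/data/<path>` with a JSON body of the form `{"input": ...}` and answers with `{"result": ...}`. The policy path `harbor/image/allow`, the `critical_count` input field, and the helper names are assumptions made up for this example. A local `httptest` server stands in for OPA so the sketch runs without a real OPA instance.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// queryOPA posts input to <baseURL>/v1/data/<policyPath> and returns the
// boolean decision. OPA omits "result" when the rule is undefined for the
// given input, which is reported here as an error.
func queryOPA(baseURL, policyPath string, input interface{}) (bool, error) {
	body, err := json.Marshal(map[string]interface{}{"input": input})
	if err != nil {
		return false, err
	}
	resp, err := http.Post(baseURL+"/v1/data/"+policyPath, "application/json", bytes.NewReader(body))
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false, fmt.Errorf("OPA returned status %d", resp.StatusCode)
	}
	var out struct {
		Result *bool `json:"result"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return false, err
	}
	if out.Result == nil {
		return false, fmt.Errorf("policy %q is undefined for this input", policyPath)
	}
	return *out.Result, nil
}

// fakeOPA is a test double that mimics the response shape of the Data API.
// It allows an image only when the input reports zero critical vulnerabilities.
func fakeOPA() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var req struct {
			Input struct {
				CriticalCount int `json:"critical_count"`
			} `json:"input"`
		}
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(map[string]bool{"result": req.Input.CriticalCount == 0})
	}))
}

func main() {
	srv := fakeOPA()
	defer srv.Close()

	// The base URL would come from configuration in a real deployment.
	allowed, err := queryOPA(srv.URL, "harbor/image/allow", map[string]int{"critical_count": 3})
	fmt.Println(allowed, err)
}
```

Taking the base URL as a parameter keeps the OPA location configurable, so the same call works against an OPA instance that already exists in the user's environment.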
Contributor

where exactly does the agent reside?

Contributor Author

As shown in the low-level component diagram in the architecture section, the Agent would run in a separate container within the Harbor network.


## Misc Considerations

### Multi-tenancy

The proposed solution

### Deployment

The **Harbor Policy Agent** would be deployed as a service within a container alongside other services that make up the Harbor installation.