Enabling S3 bucket share #848

Merged
merged 61 commits into from
Nov 20, 2023

Conversation

anushka-singh
Contributor

@anushka-singh anushka-singh commented Oct 31, 2023

Feature or Bugfix

  • Feature

Detail

  • We want to enable bucket sharing alongside the access point sharing that already exists in data.all.
  • A user will be able to request shares at the bucket level, and at the folder level with access points.
  • Please NOTE: There is some common code between Access point share managers and processors and S3 Bucket managers and processors. We will send out a separate PR for that refactoring work at a later time.

Relates

Contributors:

Security

Please answer the questions below briefly where applicable, or write N/A. Based on
OWASP 10.

  • Does this PR introduce or modify any input fields or queries - this includes
    fetching data from storage outside the application (e.g. a database, an S3 bucket)?
    • Is the input sanitized?
    • What precautions are you taking before deserializing the data you consume?
    • Is injection prevented by parametrizing queries?
    • Have you ensured no eval or similar functions are used?
  • Does this PR introduce any functionality or component that requires authorization?
    • How have you ensured it respects the existing AuthN/AuthZ mechanisms?
    • Are you logging failed auth attempts?
  • Are you using or adding any cryptographic features?
    • Do you use a standard proven implementations?
    • Are the used keys controlled by the customer? Where are they stored?
  • Are you introducing any new policies/roles/users?
    • Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

noah-paige and others added 30 commits September 15, 2023 08:18
…ata-dot-all#748)

### Feature or Bugfix
<!-- please choose -->
- Feature Enhancement

### Detail
- Adding additional error messages for KMS Key lookup when importing a
new dataset
  - 1 Error message to determine if the KMS Key Alias Exists
- 1 Error message to determine if the PivotRole has permissions to
describe the KMS Key

### Relates
- data-dot-all#712 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
### Feature or Bugfix
<!-- please choose -->
- NA

### Detail
- Get latest code in `main` to `v2m1m0` branch to keep in sync

### Relates
- NA

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

NA
```
- Does this PR introduce or modify any input fields or queries - this includes
fetching data from storage outside the application (e.g. a database, an S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?
```

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
### Feature or Bugfix
<!-- please choose -->
- Enhancement / Bugfix

### Detail
- When creating an environment and specifying default Env IAM Role we
assume it is of the structure `arn:aws:iam::ACCOUNT:role/NAME_SPECIFIED`
- This does not work when there is a service path in the role arn such
as with SSO: `arn:aws:iam::ACCOUNT:role/sso/NAME_SPECIFIED`
- Causes issues when importing an IAM Role for an invited Team in an
environment and/or with dataset sharing


- This PR takes in the full IAM role ARN when importing the IAM role in
order to correctly determine the role name
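
A minimal sketch of why the full ARN is needed: the role name is the last path segment of the ARN resource, not everything after `role/`. The helper name `role_name_from_arn` is hypothetical and is not taken from the data.all code base.

```python
def role_name_from_arn(role_arn: str) -> str:
    """Return the role name from an IAM role ARN, ignoring any path."""
    # ARN layout: arn:partition:iam::account-id:role/optional/path/RoleName
    parts = role_arn.split(":", 5)
    if len(parts) != 6 or not parts[5].startswith("role/"):
        raise ValueError(f"Not an IAM role ARN: {role_arn}")
    # The name is the final segment, so "role/sso/NAME" and "role/NAME" both work.
    return parts[5].rsplit("/", 1)[-1]


plain = role_name_from_arn("arn:aws:iam::123456789012:role/NAME_SPECIFIED")
with_path = role_name_from_arn("arn:aws:iam::123456789012:role/sso/NAME_SPECIFIED")
```

Both calls yield the same role name, which is the behaviour the fix relies on.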




### Relates
- [data-dot-all#695 ](data-dot-all#695)

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
### Feature or Bugfix
<!-- please choose -->
- Enhancement / Bugfix

### Detail
- Ensure the names passed for OpenSearch Domain and OpenSearch
Serverless Collection, Access Policy, Security Policy, and VPC Endpoint
all follow naming conventions required by the service, meaning

    - The name must start with a lowercase letter
    - Must be between 3 and 28 characters
    - Valid characters are a-z (lowercase only), 0-9, and - (hyphen).
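
The three rules above can be expressed as a single regular expression. This is only an illustrative sketch; the function name `is_valid_opensearch_name` is hypothetical and the PR may implement the check differently.

```python
import re

# Starts with a lowercase letter, then 2 to 27 more characters from a-z, 0-9 or "-",
# giving a total length between 3 and 28.
_NAME_PATTERN = re.compile(r"[a-z][a-z0-9-]{2,27}")


def is_valid_opensearch_name(name: str) -> bool:
    """Check a resource name against the naming rules listed above."""
    return _NAME_PATTERN.fullmatch(name) is not None
```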

### Relates
- data-dot-all#540 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: dlpzx <[email protected]>
# Conflicts:
#	deploy/app.py
### Feature or Bugfix
Update

### Detail


### Relates
See data-dot-all#655: 
> In Nov 27, 2023 the Lambda runtime node14 and Python3.7 will be
deprecated!

Checked all lambdas that explicitly set the runtime engine: only cognito
httpheader redirection lambda used node14.
All lambdas use python3.8 and node16 or node18.

For cdk dependencies: upgraded to the newest `aws-cdk-lib` (`v2.99.0`), just
in case python3.7 is hardcoded somewhere inside 2.78.0 (it shouldn't
be)

### Testing
- [x] uploaded the changes to my isengard account
- [x] deployment is green 
- [x] could access app page, userguide page, and userguide from the app
page.

### Security
`N/A` - upgraded to a newer version of node js

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Feature

### Detail
The guiding principle is that:
1. dataset IAM role is the role accessing data
2. pivot role is the role used by the central account to perform SDK
calls in the environment account

In this PR we
- Replace pivot role by dataset role in dataset Lake Formation
registration
- Use the pivot role to trigger the upload files and create folder
features, but use the dataset IAM role to perform the `putObject`
operations. This removes the need for read and `putObject` permissions
on the pivot role
- Redefine pivot role CDK stack to manage S3 buckets (bucket policies)
for only the datasets S3 buckets that have been created or imported in
the environment.
- Implement IAM policy utils to handle the new dynamic policies. We need
to verify that the created policy statements do not exceed the maximum
policy size. In addition, we replace the previous "divide in chunks of 10
statements" approach with a function that divides statements into chunks
based on their size. This optimizes the policy size, which helps
us reduce the number of managed policies attached to the pivot
role. The function can be re-used for other policy "chunking" needs
- We did not implement force update of environments (pivot role nested
stack) with new datasets added because it is already forced in
`backend/dataall/modules/datasets/services/dataset_service.py`
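
The size-based chunking described in the list above could look roughly like the sketch below. It is an assumption-laden illustration, not the PR's code: the function names are hypothetical, the policy size is approximated by the length of the compact JSON document, and the default limit of 6,144 characters is the IAM managed policy size quota.

```python
import json


def policy_size(statements):
    """Approximate policy size as the length of its compact JSON form."""
    document = {"Version": "2012-10-17", "Statement": statements}
    return len(json.dumps(document, separators=(",", ":")))


def split_statements_by_size(statements, max_chars=6144):
    """Greedily pack statements into chunks that each fit within max_chars."""
    chunks, current = [], []
    for statement in statements:
        if policy_size([statement]) > max_chars:
            raise ValueError("A single statement exceeds the maximum policy size")
        # Start a new chunk when adding this statement would overflow the current one.
        if current and policy_size(current + [statement]) > max_chars:
            chunks.append(current)
            current = []
        current.append(statement)
    if current:
        chunks.append(current)
    return chunks
```

Compared with fixed chunks of 10 statements, packing by size fills each managed policy as far as the limit allows, so fewer policies are needed when statements are short.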

### Backwards compatibility Testing

Pre-update setup:
- 1 environment (auto-created pivot role)
- 2 datasets in environment, 1 created, 1 imported: with tables and
folders
- Run profiling jobs in tables

Update with the branch changes:
- [X] CICD pipeline runs successfully
- [X] Click update environment on environment -> successfully updated
policy of pivot role with imported datasets in policy. Reduction of
policies
- [X] Click update datasets --> registration in Lake formation updated
to dataset role
- [X] Update files works
- [X] Create folder works
- [X] Crawler and profiling jobs work
 

### Relates
- data-dot-all#580 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Are you introducing any new policies/roles/users? `Yes`
- Have you used the least-privilege principle? How? `In this PR we
restrict the permissions of the pivot role, a super role that handles
SDK calls in the environment accounts. Instead of granting permissions
to all S3 buckets, we restrict it to data.all handled S3 buckets only`


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
…eation (data-dot-all#781)

### Feature or Bugfix
- Feature
- Bugfix

### Detail


The different alternatives considered are discussed in data-dot-all#556

This PR introduces a new query `listValidEnvironments` that replaces the
query `listEnvironments` for certain operations.
- `listEnvironments`: lists all environments regardless of their
CloudFormation stack status, with many additional details
- `listValidEnvironments`: lists only environments whose CloudFormation
stack is stable and successful, and retrieves only basic info about the
environment.

Operations such as opening a share request or creating a
Dataset/Notebook/etc. require the selection of an environment. The
environment options are now retrieved from `listValidEnvironments`,
ensuring that only valid environments are selectable. Moreover, this
query is lighter and does not need to fetch as many fields
as the original `listEnvironments`, which makes it more
efficient.
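
As a rough illustration of the idea, the sketch below filters environments by stack status and returns only a few fields. Everything specific here is an assumption: the field names (`environmentUri`, `label`, `stackStatus`) and the set of statuses treated as "stable and successful" are hypothetical, and the real query runs against the database rather than an in-memory list.

```python
# Assumption: these two CloudFormation statuses stand in for "stable and successful".
VALID_STACK_STATUSES = {"CREATE_COMPLETE", "UPDATE_COMPLETE"}


def list_valid_environments(environments):
    """Keep environments with a healthy stack and return only basic fields."""
    return [
        {"environmentUri": env["environmentUri"], "label": env["label"]}
        for env in environments
        if env.get("stackStatus") in VALID_STACK_STATUSES
    ]
```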

### Relates
- data-dot-all#556 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
### Feature or Bugfix
<!-- please choose -->
- Feature

### Detail
Allows users to configure a session timeout. Today data.all sets the
refresh token to 30 days by default; with this change it becomes
configurable

### Relates
data-dot-all#421

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

Co-authored-by: Manjula <[email protected]>
### Feature or Bugfix
- Feature
- Bugfix

### Detail
As explained in the [semgrep
docs](https://semgrep.dev/docs/cheat-sheets/python-command-injection/#1b-shelltrue):
"Functions from the subprocess module have the shell argument for
specifying if the command should be executed through the shell. Using
shell=True is dangerous because it propagates current shell settings and
variables. This means that variables, glob patterns, and other special
shell features in the command string are processed before the command is
run, making it much easier for a malicious actor to execute commands.
The subprocess module allows you to start new processes, connect to
their input/output/error pipes, and obtain their return codes. Methods
such as Popen, run, call, check_call, check_output are intended for
running commands provided as an argument ('args'). Allowing user input
in a command that is passed as an argument to one of these methods can
create an opportunity for a command injection vulnerability."

In our case the risk is not exposed as no user input is directly taken
into the subprocess commands. Nevertheless we should strive for the
highest standards on security and this PR works on replacing all the
`shell=True` executions in the data.all
code.

In this PR:
- when possible we have set `shell=False`
- in cases where the command was too complex a `CommandSanitizer`
ensures that the input arguments are strings following the
regex=`[a-zA-Z0-9-_]`
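
A minimal sketch of the two measures in the list above: validate each variable argument against the allowed character set, then pass the command as a list with `shell=False`. The function names are hypothetical and this is not the PR's `CommandSanitizer`; the child process simply echoes its argument so the example can run anywhere.

```python
import re
import subprocess
import sys

# Same character set as the regex quoted above; the hyphen is placed last so it stays literal.
_ALLOWED = re.compile(r"[a-zA-Z0-9_-]+")


def sanitize_args(args):
    """Return the arguments unchanged, or raise if any contains disallowed characters."""
    for arg in args:
        if not isinstance(arg, str) or not _ALLOWED.fullmatch(arg):
            raise ValueError(f"Unsafe command argument: {arg!r}")
    return list(args)


def echo_stack_name(stack_name):
    """Run a child process without a shell, passing the validated name as one argument."""
    (safe_name,) = sanitize_args([stack_name])
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", safe_name],
        shell=False,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With `shell=False` the argument list is handed to the program directly, so characters such as `;` or `$` are never interpreted by a shell; the sanitizer is an additional guard for the cases where a shell is still needed.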

Testing: 
- [X] local testing - deployment of any stack
(`backend/dataall/base/cdkproxy/cdk_cli_wrapper.py`)
- [X] local testing - deployment of cdk pipeline stack
(`backend/dataall/modules/datapipelines/cdk/datapipelines_cdk_pipeline.py`)
- [X] local testing - deployment of codepipeline pipeline stack
(`backend/dataall/modules/datapipelines/cdk/datapipelines_pipeline.py`)
- [ ] AWS testing - deployment of data.all
- [ ] AWS testing - deployment of any stack
(`backend/dataall/base/cdkproxy/cdk_cli_wrapper.py`)
- [ ] AWS testing - deployment of cdk pipeline stack
(`backend/dataall/modules/datapipelines/cdk/datapipelines_cdk_pipeline.py`)
- [ ] AWS testing - deployment of codepipeline pipeline stack
(`backend/dataall/modules/datapipelines/cdk/datapipelines_pipeline.py`)

### Relates
- data-dot-all#738 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
- Is the input sanitized? ---> 🆗 This is exactly what this PR is trying
to do
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
…uester (data-dot-all#793)

### Feature or Bugfix
- Bugfix

### Detail
- Allow submitting a share when you are both an approver and a
requester

### Security

**DOES NOT APPLY**

Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Zilvinas Saltys <[email protected]>
### Feature or Bugfix
- Feature

### Detail
Adding a redirect to the share UI once a share object is created.
Additionally updating the breadcrumb message to more clearly indicate
that a "Draft share request is created" rather than suggesting that the
share has actually been sent to the data owners team.

### Relates
N/A

### Security
N/A


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

Co-authored-by: Zilvinas Saltys <[email protected]>
### Feature or Bugfix
Fix data-dot-all#792: Fix: condition when there are no public subnets

---------
### Feature or Bugfix
- Feature

### Detail
- Removing unused variable in local graphql server pointing to a fixed
AWS region

### Relates
N/A

### Security
N/A


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

Co-authored-by: Zilvinas Saltys <[email protected]>
### Feature or Bugfix

- Feature

### Detail
- For a dataset to make sense, all the tables within it should
have their location pointing to the dataset S3 bucket.
However, a database can contain tables that do not
point to the same bucket, which is perfectly legal in Lake Formation.
Therefore we propose that data.all automatically lists only tables that
have the same S3 bucket location as the dataset. This solves a
problem for Yahoo, where we want to import a database that contains many
tables in different buckets. Additionally, the Catalog UI should also
list only prefiltered tables.
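
The pre-filtering could be sketched as below. The input shape follows the `StorageDescriptor.Location` field that AWS Glue returns for a table; the function name `tables_in_bucket` and the accepted URI schemes are assumptions made for this example only.

```python
from urllib.parse import urlparse


def tables_in_bucket(tables, bucket_name):
    """Keep only the tables whose storage location is inside the given S3 bucket."""
    kept = []
    for table in tables:
        location = table.get("StorageDescriptor", {}).get("Location", "")
        parsed = urlparse(location)
        # For "s3://bucket/prefix" the bucket name is the network location part.
        if parsed.scheme in ("s3", "s3a", "s3n") and parsed.netloc == bucket_name:
            kept.append(table)
    return kept
```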

### Testing
- Tested this in local env. I was able to create and share datasets even
after pre-filtering process takes place.
- Will send separate PR for unit testing. 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

Co-authored-by: Anushka Singh <[email protected]>
### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- Fix method to detect if other share objects exist on the environment
before cleaning up environment-level shared resources (i.e. RAM
invitation and PivotRole permissions)
- Originally, if TeamA in EnvA had 2 approved and successful shares on
DatasetB and TeamA rejected 1 of the pre-existing shares, the method
`other_approved_share_object_exists` returned `False` and the permissions
still needed by the other existing share were deleted
- This also broke the ability to revoke the remaining
share, since the pivotRole no longer had permissions

- Also fixes the removal of dataall QS Group permissions if there are
still existing shares to EnvA

### Security
NA
```
Please answer the questions below briefly where applicable, or write `N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this includes
fetching data from storage outside the application (e.g. a database, an S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?
```

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: dlpzx <[email protected]>
### Feature or Bugfix
- Feature

### Detail

Whenever a share request is created and transitions between states
(approved, revoked, etc.), a notification is created. This notification is
displayed on the bell icon in the UI.

We want a similar notification to be sent to the dataset owner,
requester, etc. via email.

See GitHub issue data-dot-all#734 for more details.

### Relates
- data-dot-all#734


### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)? No
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization? No
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features? No
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users? Yes
- Have you used the least-privilege principle? How? --> **Permission
granted for `ses:SendEmail` to Lambda on resources (SES identity and
configuration set). Also created KMS and SNS resources for the SES setup to handle
email bounces. Used least-privileged, restricted access on both
wherever required.**


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

Co-authored-by: trajopadhye <[email protected]>
### Feature or Bugfix
- Feature

### Detail
- Adding frontend support for all feature flags defined in config.json
with a new util method isFeatureEnabled
- Adding a new flag **preview_data** in the datasets module to control
whether previewing data is allowed
- Adding a new flag **glue_crawler** in the datasets module to control
whether running glue crawler is allowed
- Updating environment features to be hidden or visible based on whether
the module is active. Adding a new util isAnyFeatureModuleEnabled to
check whether to render the entire feature box.

### Relates
N/A

### Security
Not relevant

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Zilvinas Saltys <[email protected]>
### Feature or Bugfix
- Refactoring

### Detail
As a rule of thumb, we encourage customization of `modules` while
changes in `core` should be avoided when possible. `notifications` is a
component initially in core which is only used by `dataset_sharing`. To
facilitate customization of the `notifications` module and also to
clearly see its dependencies we have:

- Moved `notifications` code from core to modules as it is a reusable
component that is not needed by any core component.
- Moved dataset_sharing references inside dataset_sharing module and
left `notifications` independent from any other module (done mostly in
data-dot-all#734, so credits to @TejasRGitHub)
- Added depends_on in the dataset_sharing module to load notifications
if the data_sharing module is imported.
- Modified frontend navigation bar to make it conditional of the
notifications module
- Added migration script to modify the notification type column
- Fix tests from data-dot-all#734, some references on the payload of the
notification tasks were wrong
- Small fixes to SES stack: added account in KMS policy and email_id as
input

### [WIP] Testing
Local testing
- [ ] loading of notifications with datasets enabled
- [ ] ...

AWS testing
- [ ] CICD pipeline succeeds

### Other remarks
Not for this PR, but as a general note, we should clean up deprecated
ECS tasks

### Relates
- data-dot-all#785 
- data-dot-all#734 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

`N/A` just refactoring


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
# Conflicts:
#	deploy/stacks/backend_stack.py
#	deploy/stacks/backend_stage.py
#	deploy/stacks/lambda_api.py
#	deploy/stacks/pipeline.py
#	template_cdk.json
### Feature or Bugfix
- Feature

### Detail
- read KMS keys with an alias prefixed by the environment resource
prefix
- read KMS keys imported in imported datasets
- restrict pivot role policies to the KMS keys created by data.all and
those imported in the imported datasets
- move kms client from data_sharing to base as it is used in
environments and datasets

### Relates
- data-dot-all#580

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

This PR restricts the IAM policies of the pivot role, following the
least privilege permissions principle

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Bugfix

### Detail
- Make `hosted_zone_id` optional, code update

### Relates
- data-dot-all#797 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)? N/A
  - Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you
consume? N/A
  - Is injection prevented by parametrizing queries? N/A
  - Have you ensured no `eval` or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires
authorization? N/A
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
N/A
  - Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? N/A
  - Do you use a standard proven implementations? N/A
- Are the used keys controlled by the customer? Where are they stored?
N/A
- Are you introducing any new policies/roles/users? N/A
  - Have you used the least-privilege principle? How? N/A

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license. YES

### Description

Make `hosted_zone_id` optional and provide `HostedZoneId` and `DNSName`
in CloudFormation Stack Output, so users can create their own [Route53
AliasTarget](https://docs.aws.amazon.com/Route53/latest/APIReference/API_AliasTarget.html).

The following validation checks in
`ecs_patterns.ApplicationLoadBalancedFargateService` were considered:
* `frontend_alternate_domain` and `userguide_alternate_domain` have to
be `None` when the `hosted_zone` is `None`, see checks in
[multiple-target-groups-service-base.ts#L463](https://github.com/aws/aws-cdk/blob/c445b8cc6e20d17e4a536f17262646b291a0fe36/packages/aws-cdk-lib/aws-ecs-patterns/lib/base/network-multiple-target-groups-service-base.ts#L463),
otherwise the error `A Route53 hosted domain zone name is required to configure
the specified domain name` is raised
* for an HTTPS ALB listener, only the `certificate` is ultimately
required, and not the `domainName` or `domainZone`, as per evaluation
logic in
[application-load-balanced-service-base.ts#L509](https://github.com/aws/aws-cdk/blob/c445b8cc6e20d17e4a536f17262646b291a0fe36/packages/aws-cdk-lib/aws-ecs-patterns/lib/base/application-load-balanced-service-base.ts#L509)
### Feature or Bugfix
- Bugfix

### Detail
- Clean up prints and show better exception message when custom_domain
is not provided for SES

### Relates
- v2.1.0

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
# Conflicts:
#	deploy/stacks/backend_stack.py
#	deploy/stacks/backend_stage.py
#	deploy/stacks/lambda_api.py
#	deploy/stacks/pipeline.py
#	template_cdk.json
Contributor

@dlpzx dlpzx left a comment

looks good, only some small clarifications

@dlpzx
Contributor

dlpzx commented Nov 10, 2023

Final remark on the PR: @noah-paige @anmolsgandhi @anushka-singh
At the moment all sharing mechanisms are enabled and there is no configuration for them in config.json, so a customer who wants to restrict a particular way of sharing data in data.all cannot do so. What do you think?

  • Do we see a need to make the allowed sharing mechanisms configurable?
  • And how important is it?

In any case, I think we can move forward with this PR and, if needed, implement the config in another PR.

@dlpzx
Contributor

dlpzx commented Nov 10, 2023

Testing in AWS:

  • CICD from previous deployment succeeds. Migration script runs successfully.

For each of the following checks we test these scenarios:
- Created Dataset
- Imported Dataset with SSE-S3 encryption
- Imported Dataset with KMS encryption
- Imported Dataset from multiple buckets with KMS encryption

  • Before doing anything, we check that pre-existing table and folder shares still work --> checked with Athena and access point calls --> we see the changes in the UI, with the new command to list S3
  • We share the S3 Bucket, and pre-existing table and folder shares still work --> checked with Athena and access point calls
  • Check S3 Bucket share works: we see all share items with status SHARE_SUCCEEDED for the bucket after the share, and downloading files from the bucket with get-object works

From this point, only the imported cases (KMS vs. SSE-S3) are checked:
Check removing S3 Bucket share: get-object gives an AccessDenied error. Share item status is REVOKE_SUCCEEDED.

  • Imported Dataset with SSE-S3 encryption: succeeds but with some error logs (error 1)
  • Imported Dataset with KMS encryption

Check removing folder share - get object gives an AccessDenied error. Share item status is REVOKE_SUCCEEDED

  • Imported Dataset with SSE-S3 encryption: folders are in REVOKE_SUCCEEDED but the share request stays in REVOKE_IN_PROGRESS (details and logs in error 2)
  • Imported Dataset with KMS encryption: not tested yet (it will fail like the previous one)

Check re-adding bucket share only

  • Imported Dataset with SSE-S3 encryption
  • Imported Dataset with KMS encryption

Check re-adding folder share only

  • Imported Dataset with SSE-S3 encryption, not tested yet (it will fail with error 3)
  • Imported Dataset with KMS encryption, not tested yet (it will fail with error 3)

Edge cases. Some not tested yet.
Check add folder + s3 share at the same time

  • Created Dataset with KMS encryption --> checked with a role that did not have previous folder share requests to avoid error 3 (get_object works using s3 and s3 access point)
  • Imported Dataset with SSE-S3 encryption

Check remove folder + s3 share at the same time --- failing error 2

  • Created Dataset with KMS encryption
  • Imported Dataset with SSE-S3 encryption

Check adding one and removing the other at the same time

  • Created Dataset with KMS encryption
  • Imported Dataset with SSE-S3 encryption

Errors found

Error 1 ---> In another issue

This error is just cosmetic and it will be addressed in #614

  • Approving S3 Bucket sharing: the log shows a failure but the sharing works fine (share item ends in SHARE_SUCCEEDED and we can download objects from the S3 bucket): `Failed to get kms key id of alias/SSE-S3 : An error occurred (NotFoundException) when calling the DescribeKey operation: Alias arn:aws:kms:eu-west-1:XXXXXXXX:alias/SSE-S3 is not found.`
  • Revoking S3 Bucket sharing: the log shows a failure but the revoke works fine (share item ends in REVOKE_SUCCEEDED and we cannot download objects from the S3 bucket): `Failed to get kms key id of alias/SSE-S3 : An error occurred (NotFoundException) when calling the DescribeKey operation: Alias arn:aws:kms:eu-west-1:XXXXXXX:alias/SSE-S3 is not found.`

Error 2 ---> 🧑‍🏭 needs fixing

This is a generic issue; it is not due to pre-existing resources.
Imported Dataset with SSE-S3 encryption: folders are in REVOKE_SUCCEEDED but the share request stays in REVOKE_IN_PROGRESS.
Looking at the logs, it fails in `dataall/modules/dataset_sharing/services/data_sharing_service.py:199`, where we pass `existing_shared_buckets=existing_shared_buckets` when initializing `ProcessS3AccessPointShare`, but the class does not declare that variable.

Screenshot 2023-11-13 at 09 25 26

Error 3 ---> 🧑‍🏭 needs fixing

This is a backwards compatibility issue.
It fails when the bucket is encrypted with a KMS key. Looking at the code and the logs, I think the issue is in this call at `dataall/modules/dataset_sharing/services/share_managers/s3_access_point_share_manager.py:184`:

share_manager.add_missing_resources_to_policy_statement(
    kms_key_id,
    kms_target_resources,
    existing_policy["Statement"][1],
    IAM_ACCESS_POINT_ROLE_POLICY
)

We are assuming that the IAM role policy has 2 statements, but in pre-existing IAM roles there is one statement only.
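
One way to avoid the positional assumption is to look the statement up by its actions instead of by index. This is only a minimal sketch and not the fix applied in this PR: `find_statement` and `ensure_kms_statement` are hypothetical helper names, the ARNs are placeholders, and the policy shape (a list with one `s3:*` statement and optionally one `kms:*` statement) is assumed from the snippets in this thread.

```python
from typing import Optional


def find_statement(policy: dict, action_prefix: str) -> Optional[dict]:
    """Return the first statement whose actions all start with action_prefix, or None."""
    for statement in policy.get("Statement", []):
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # IAM also accepts a single string here
            actions = [actions]
        if actions and all(action.startswith(action_prefix) for action in actions):
            return statement
    return None


def ensure_kms_statement(policy: dict, kms_resources: list) -> dict:
    """Add kms_resources to the KMS statement, creating that statement if it is missing."""
    statement = find_statement(policy, "kms:")
    if statement is None:
        # Pre-existing role policies only contain the S3 statement.
        statement = {"Effect": "Allow", "Action": ["kms:*"], "Resource": []}
        policy.setdefault("Statement", []).append(statement)
    resources = statement.get("Resource", [])
    if isinstance(resources, str):
        resources = [resources]
    for arn in kms_resources:
        if arn not in resources:
            resources.append(arn)
    statement["Resource"] = resources
    return policy


# Hypothetical legacy policy with a single statement (placeholder ARNs).
legacy_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:*"], "Resource": ["arn:aws:s3:::example-bucket"]}
    ],
}
key_arn = "arn:aws:kms:eu-west-1:111122223333:key/example-key-id"
updated = ensure_kms_statement(legacy_policy, [key_arn])
```

With a lookup like this, a role created before this PR gets the KMS statement appended instead of raising an `IndexError` on `existing_policy["Statement"][1]`.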

Screenshot 2023-11-13 at 10 02 41

Screenshot 2023-11-13 at 09 47 12

Error 4 ---> In another issue

Sometimes we again hit the issue we had in the past, where an access point takes a long time to be created and the put access point policy call fails. I just want to note it here; it is unrelated to this PR and should be solved in a separate PR.

@dlpzx
Contributor

dlpzx commented Nov 13, 2023

I just realized:

  1. the PR is pointing to v2m1m0. Changing the base branch results in some additional git-ing that we need to do
  2. we need to update the documentation in the userguide. How much bandwidth do you have for these? I don't think it will be much

@anushka-singh
Contributor Author

> Final remark on the PR: @noah-paige @anmolsgandhi @anushka-singh At the moment all sharing mechanisms are enabled and there is no configuration for them in config.json, so a customer who wants to restrict a particular way of sharing data in data.all cannot do so. What do you think?
>
>   • Do we see a need to make the allowed sharing mechanisms configurable?
>   • And how important is it?
>
> In any case, I think we can move forward with this PR and, if needed, implement the config in another PR.

Regarding this comment:
We do have a config for disabling the folder sharing mechanism:
`"file_uploads": false, "file_actions": false`
Setting these to false disables access point sharing.
There is currently no provision for disabling S3 bucket sharing, though.
I think we don't need the option at the moment, since the idea was that every dataset can be shared either fully, with a bucket share, or folder-wise, with access points. If we identify a need to restrict access in the future, we can simply add a config in config.json for it.

@anushka-singh
Contributor Author

> I just realized:
>
>   1. the PR is pointing to v2m1m0. Changing the base branch results in some additional git-ing that we need to do
>   2. we need to update the documentation in the userguide. How much bandwidth do you have for these? I don't think it will be much

  1. Yes, that will be needed. I might need your help with it.
  2. I would like to update the documentation too. Do we have an estimate of how long it will take? I can then try to fit it into sprint planning.

@anushka-singh
Contributor Author

anushka-singh commented Nov 15, 2023

Errors 2 and 3 found in @dlpzx's "Testing in AWS" report above have been fixed in the next revision.

# Conflicts:
#	backend/dataall/modules/datasets/api/table/resolvers.py
#	backend/migrations/versions/4f3c1d84a628_modify_notifications_column_types.py
#	deploy/stacks/backend_stack.py
#	deploy/stacks/ses_stack.py
#	frontend/src/modules/Environments/views/EnvironmentCreateForm.js
#	template_cdk.json
@dlpzx dlpzx changed the base branch from v2m1m0 to main November 16, 2023 07:29
@dlpzx
Contributor

dlpzx commented Nov 16, 2023

Testing in AWS 2:

  • CICD from previous deployment succeeds. Migration script runs successfully.

Before doing anything, we check that pre-existing folder shares still work --> checked with access point calls --> we see the changes in the UI, with the new command to list S3

  • Created Dataset
  • Imported Dataset with SSE-S3 encryption
  • Imported Dataset with KMS encryption

We share the S3 Bucket, and pre-existing folder shares still work --> checked with access point calls

  • Created Dataset
  • Imported Dataset with SSE-S3 encryption
  • Imported Dataset with KMS encryption
    Note: it actually solves issue with KMS keys access

Check S3 Bucket share works: we see all share items with status SHARE_SUCCEEDED for the bucket after the share, and downloading files from the bucket with get-object works

  • Created Dataset
  • Imported Dataset with SSE-S3 encryption
  • Imported Dataset with KMS encryption
    Note: verified IAM policy in AWS Console

Check removing S3 Bucket share: get-object gives an AccessDenied error. Share item status is REVOKE_SUCCEEDED. Access points work as they did initially (with a small bug in KMS)

  • Created Dataset
  • Imported Dataset with SSE-S3 encryption
  • Imported Dataset with KMS encryption

Check removing folder share - get object gives an AccessDenied error. Share item status is REVOKE_SUCCEEDED

  • Created Dataset 🔴 Error 1
  • Imported Dataset with SSE-S3 encryption
  • Imported Dataset with KMS encryption 🔴 Error 1

Check re-adding bucket share only

  • Created Dataset (tested after fixes)
  • Imported Dataset with SSE-S3 encryption
  • Imported Dataset with KMS encryption (tested after fixes)

Check re-adding folder share only

  • Created Dataset (tested after fixes)
  • Imported Dataset with SSE-S3 encryption --> 🔴 works, but with error 2 (additional permissions on a `None` KMS key id)
  • Imported Dataset with KMS encryption (tested after fixes)

---------------------- New shares
Edge cases
Check add folder + s3 share at the same time

  • Created Dataset
  • Imported Dataset with SSE-S3 encryption
  • Imported Dataset with KMS encryption

Check remove folder + s3 share at the same time

  • Created Dataset
  • Imported Dataset with SSE-S3 encryption 🔴 Error 3
  • Imported Dataset with KMS encryption

Check adding one and removing the other at the same time
---> it is not possible by design in the share workflow

Errors found

Error 1

It is the same error as before, now in the revoke process: in pre-existing IAM roles the second statement of the IAM policy does not exist. The error is in `dataall/modules/dataset_sharing/services/share_managers/s3_access_point_share_manager.py:424`:

        if existing_policy:
            s3_target_resources = [
                f"arn:aws:s3:::{dataset.S3BucketName}",
                f"arn:aws:s3:::{dataset.S3BucketName}/*",
                f"arn:aws:s3:{dataset.region}:{dataset.AwsAccountId}:accesspoint/{access_point_name}",
                f"arn:aws:s3:{dataset.region}:{dataset.AwsAccountId}:accesspoint/{access_point_name}/*"
            ]
            ShareManagerUtils.remove_resource_from_statement(
                existing_policy["Statement"][0],
                s3_target_resources
            )
            if kms_key_id:
                kms_target_resources = [
                    f"arn:aws:kms:{dataset.region}:{dataset.AwsAccountId}:key/{kms_key_id}"
                ]
                ShareManagerUtils.remove_resource_from_statement(
                    existing_policy["Statement"][1],
                    kms_target_resources
                )
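
For the revoke path, one option is to remove the target resources from whichever statements exist rather than indexing `Statement[0]` and `Statement[1]`. The sketch below is an illustration under assumptions, not the code merged in this PR: `remove_resources` is a hypothetical helper, the ARNs are placeholders, and `Statement` is assumed to be a list.

```python
def remove_resources(policy: dict, target_resources: list) -> dict:
    """Remove target_resources from every statement and drop statements left empty."""
    remaining = []
    for statement in policy.get("Statement", []):
        resources = statement.get("Resource", [])
        if isinstance(resources, str):  # IAM also accepts a single string here
            resources = [resources]
        kept = [arn for arn in resources if arn not in target_resources]
        if kept:
            remaining.append({**statement, "Resource": kept})
    return {**policy, "Statement": remaining}


# Hypothetical pre-existing policy with only the S3 statement.
legacy_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:*"],
            "Resource": [
                "arn:aws:s3:::example-bucket",
                "arn:aws:s3:::example-bucket/*",
                "arn:aws:s3:::other-bucket",
            ],
        }
    ],
}
revoked = remove_resources(
    legacy_policy,
    [
        "arn:aws:s3:::example-bucket",
        "arn:aws:s3:::example-bucket/*",
        "arn:aws:kms:eu-west-1:111122223333:key/example-key-id",
    ],
)
```

If the returned `Statement` list ends up empty, the caller would presumably delete the inline policy instead of writing it back.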

Error 2

The functionality works for sharing and revoking: for sharing, the extra resource "arn:aws:kms:eu-west-1:XXXXXXXXX:key/None" has no effect; for the revoke, the access point does not exist, so there is no issue.

The issue is in `dataall/modules/dataset_sharing/services/share_managers/s3_bucket_share_manager.py:149`.

In this block (first-time policy creation) we do not check for the kms_key_id:

logger.info(
    f'{IAM_S3BUCKET_ROLE_POLICY} does not exists for IAM role {self.target_requester_IAMRoleName}, creating...'
)
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                f"arn:aws:s3:::{self.bucket_name}",
                f"arn:aws:s3:::{self.bucket_name}/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kms:*"
            ],
            "Resource": [
                f"arn:aws:kms:{self.bucket_region}:{self.source_account_id}:key/{kms_key_id}"
            ]
        }
    ]
}
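
A possible shape for the fix is to add the KMS statement only when a key id is actually present. This is a minimal sketch under assumptions, not the change that was merged: `build_bucket_share_policy` and its parameters are hypothetical names, and the account id and key id in the example are placeholders.

```python
from typing import Optional


def build_bucket_share_policy(
    bucket_name: str, region: str, account_id: str, kms_key_id: Optional[str]
) -> dict:
    """Build the first-time role policy; the KMS statement is added only if a key id exists."""
    statements = [
        {
            "Effect": "Allow",
            "Action": ["s3:*"],
            "Resource": [
                f"arn:aws:s3:::{bucket_name}",
                f"arn:aws:s3:::{bucket_name}/*",
            ],
        }
    ]
    if kms_key_id:
        # SSE-S3 buckets have no customer key, so this statement is skipped for them.
        statements.append(
            {
                "Effect": "Allow",
                "Action": ["kms:*"],
                "Resource": [f"arn:aws:kms:{region}:{account_id}:key/{kms_key_id}"],
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


sse_s3_policy = build_bucket_share_policy("imported-dataset", "eu-west-1", "111122223333", None)
kms_policy = build_bucket_share_policy(
    "imported-dataset", "eu-west-1", "111122223333", "example-key-id"
)
```

For an SSE-S3 bucket this yields a single S3 statement, so no `key/None` resource is written to the role policy.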

Error 3

The biggest issue is that, after the revoke, the following policies still contain resources that they should not contain.

This is the dataall-targetDatasetS3Bucket-AccessControlPolicy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "kms:*"
            ],
            "Resource": [
                "arn:aws:kms:eu-west-1:XXXXXXXXX:key/None",
                "arn:aws:s3:::imported-dataset",
                "arn:aws:s3:::imported-dataset/*"
            ]
        }
    ]
}
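
A quick consistency check can flag a statement like the one above, where S3 ARNs sit under `kms:*` actions. The helper below is a hypothetical testing aid, not part of data.all; it assumes `Statement` is a list and compares the service segment of each resource ARN with the service prefixes of the statement's actions. The account id is a placeholder.

```python
def mismatched_resources(policy: dict) -> list:
    """Return resource ARNs whose service does not match any action's service prefix."""
    mismatches = []
    for statement in policy.get("Statement", []):
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        resources = statement.get("Resource", [])
        if isinstance(resources, str):
            resources = [resources]
        services = {action.split(":", 1)[0] for action in actions}
        if "*" in services:
            continue  # a wildcard action matches every service
        for arn in resources:
            # ARN layout: arn:partition:service:region:account-id:resource
            parts = arn.split(":")
            if len(parts) > 2 and parts[2] not in services:
                mismatches.append(arn)
    return mismatches


# The bucket-share policy reported in Error 3, with a placeholder account id.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["kms:*"],
            "Resource": [
                "arn:aws:kms:eu-west-1:111122223333:key/None",
                "arn:aws:s3:::imported-dataset",
                "arn:aws:s3:::imported-dataset/*",
            ],
        }
    ],
}
leftovers = mismatched_resources(policy)
```

Here `leftovers` contains the two S3 ARNs, which is the residue described in this error.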

And the targetDatasetAccessControlPolicy still contains:

"arn:aws:s3:::imported-dataset",
"arn:aws:s3:::imported-dataset/*",
"arn:aws:s3:eu-west-1:XXXXXXXXXXX:accesspoint/XXXXXX-research",
"arn:aws:s3:eu-west-1:XXXXXXXXXXX:accesspoint/XXXXXX-research/*"

@dlpzx
Contributor

dlpzx commented Nov 16, 2023

Completed some of the tests above that were blocked by error 1 and re-tested with the latest fixes @anushka-singh

Backwards compatibility test
Error 1 test:

  • Created Dataset -- Error 1 fixed
  • Imported Dataset with KMS encryption

Error 2 test:

  • Imported Dataset with SSE-S3 encryption - Approve S3 bucket share - the policy does not contain a statement with a `key/None` KMS resource

Error 3 test:

  • Imported Dataset with SSE-S3 encryption - Revoke S3 Bucket share and access points - it successfully removes items in IAM policy

Contributor

@dlpzx dlpzx left a comment

Looks good! All tests have passed. Great job @anushka-singh

@dlpzx dlpzx merged commit a9bc139 into data-dot-all:main Nov 20, 2023
8 checks passed
anushka-singh pushed a commit to anushka-singh/aws-dataall that referenced this pull request Jun 20, 2024
* Bigdata867 3 (data-dot-all#24)

* Bucket Policy E.1: Modify sharing task routing to trigger a s3 bucket sharing

* Bucket Policy E.1: Modify sharing task routing to trigger a s3 bucket sharing

* Bucket Policy E.1: Modify sharing task routing to trigger a s3 bucket sharing

* Bucket Policy BIGDATA 867: Implement revoke share in data_sharing_service

* Bucket Policy BIGDATA 867: Implement revoke share in data_sharing_service

* trajopadhye- BIGDATA-756 -> Added Tests for Task D and E

* trajopadhye - BIGDATA-756 Corrected file data_sharing_service.py to address revokedStateSM for revoked items

* trajopadhye- BIGDATA-756 - Slight correction in comments

* trajopadhye- BIGDATA-756 Correction on Share Status for revoke share tests

* Addresed changes from the review of PR

* [BIGDATA-625] Implement bucket share processor (data-dot-all#21)

* Implement bucket share processor

* Fix Revoke UI sharetype

* BIGDATA-612 - push source from SD container to CodeCommit.  Initial Makefile and SD yaml configuration.

* Remove synth

* Add force push

* Add default cdk.context.json

* Add param for branchname

* Comments.

* Fix email address

* Add instance specific cdk.context.json

* BIGDATA-612 - truncate the cfn encryption policy prefix so that together with branch name, it will fit within 32 char limit.

* Update screwdriver.yaml

* Change nodejs version in screwdriver Makefile to supported version 16 (data-dot-all#89) (data-dot-all#90)

* Change screwdriver node version to 16

* Remove all non-environment setup steps for testing

* Skip getting AWS credentials for testing

* Fixing npm install version

* Remove extra npm install

* Restore all prior functions.

* Remove AmplifyContext customizations, no longer needed. (data-dot-all#92)

* Change nodejs version in screwdriver Makefile to supported version 16 (data-dot-all#89)

* Change screwdriver node version to 16

* Remove all non-environment setup steps for testing

* Skip getting AWS credentials for testing

* Fixing npm install version

* Remove extra npm install

* Restore all prior functions.

* Remove AmplifyContext customizations, no longer needed. (data-dot-all#91)

* Fix screwdriver yaml for new EMR template step. (data-dot-all#116)

* Bigdata 1397 mvp 3 stagingdeploy 20231129 (data-dot-all#178)

* BIGDATA-1211 - Release notes initial commit

* Mvp3 deploy 20231129 - S3 Bucket share + KMS explosion fix - MERGE FROM OPENSOURCE (data-dot-all#176)

* Enabling S3 bucket share  (data-dot-all#848)

- Feature

- We want to enable bucket sharing along with access point share which
already exists in data all right now.
- A user will be able to request shares at bucket level and at the
folder level with access points.
- Please NOTE: There is some common code between Access point share
managers and processors and S3 Bucket managers and processors. We will
send out a separate PR for that refactoring work at a later time.

- data-dot-all#284
- data-dot-all#823
-
https://github.com/awslabs/aws-dataall/pull/846/files#diff-c1f522a1f50d8bcf7b6e5b2e586e40a8de784caa80345f4e05a6329ae2a372d0

- Contents of this PR have been contributed by @anushka-singh,
@blitzmohit, @rbernotas, @TejasRGitHub

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Kms explosion fix (data-dot-all#882)

- Bugfix

- DataAll currently creates one SID per role in the KMS policy attached
to a bucket, using the role ID as the SID name.
- We want to collapse these SIDs into a single SID.
- Access point shares and bucket shares will have different SIDs in the
KMS policy.
- Use the role ARN instead of the role ID.
- NOTE: A KMS policy that was created previously remains unchanged. Its
SID will be the user ID and not the KMS decrypt SID created in this PR.
Future shares are not affected.
- NOTE: This is to be merged after the bucket share PR is merged.

- Tested this on local dev environment and KMS policy now has 1
statement with kms decrypt and using SID of KMS decrypt.
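
A minimal Python sketch of the SID collapse described above, assuming the role ARNs are listed as principals of one shared statement. The function name `add_role_to_kms_policy`, the Sid value, and the `Resource` of `*` are illustrative assumptions and are not taken from this PR:

```python
# Hypothetical Sid for the collapsed statement. The PR text only says that
# access point shares and bucket shares use different Sids.
BUCKET_SHARE_SID = "KMSDecryptForBucketShare"


def add_role_to_kms_policy(policy: dict, role_arn: str, sid: str = BUCKET_SHARE_SID) -> dict:
    """Return a copy of `policy` with `role_arn` listed under one shared statement."""
    # Shallow-copy each statement so the caller's policy is left untouched.
    statements = [dict(s) for s in policy.get("Statement", [])]
    for statement in statements:
        if statement.get("Sid") == sid:
            principals = statement.get("Principal", {}).get("AWS", [])
            if isinstance(principals, str):
                # IAM allows a single principal to be written as a bare string.
                principals = [principals]
            if role_arn not in principals:
                principals = principals + [role_arn]
            statement["Principal"] = {"AWS": principals}
            break
    else:
        # No shared statement yet: create it once; later roles are appended to it.
        statements.append({
            "Sid": sid,
            "Effect": "Allow",
            "Principal": {"AWS": [role_arn]},
            "Action": "kms:Decrypt",
            "Resource": "*",
        })
    return {**policy, "Statement": statements}
```

Calling the function once per consumer role keeps the policy at one statement for the given Sid, instead of one statement per role.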

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Updated Release Notes 20231201

* Format changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* [BIGDATA-1391] - Fix for cannot see all cognito groups when inviting teams (data-dot-all#177)

* trajopadhye | BIGDATA-1391 - Fix for incomplete groups list fetched for invite org and env

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Bigdata 1397 mvp 3 stagingdeploy 20231129 1 (data-dot-all#180)

* BIGDATA-1211 - Release notes initial commit

* Mvp3 deploy 20231129 - S3 Bucket share + KMS explosion fix - MERGE FROM OPENSOURCE (data-dot-all#176)

* Enabling S3 bucket share  (data-dot-all#848)

- Feature

- We want to enable bucket sharing alongside the access point sharing
that already exists in data.all.
- A user will be able to request shares at the bucket level, and at the
folder level through access points.
- NOTE: Access point share managers and processors share some code with
the S3 bucket managers and processors. We will send a separate PR for
that refactoring later.

- data-dot-all#284
- data-dot-all#823
- https://github.com/awslabs/aws-dataall/pull/846/files#diff-c1f522a1f50d8bcf7b6e5b2e586e40a8de784caa80345f4e05a6329ae2a372d0

- Contents of this PR have been contributed by @anushka-singh,
@blitzmohit, @rbernotas, @TejasRGitHub

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Kms explosion fix (data-dot-all#882)

- Bugfix

- DataAll currently creates one SID per role in the KMS policy attached
to a bucket, using the role ID as the SID name.
- We want to collapse these SIDs into a single SID.
- Access point shares and bucket shares will have different SIDs in the
KMS policy.
- Use the role ARN instead of the role ID.
- NOTE: A KMS policy that was created previously remains unchanged. Its
SID will be the user ID and not the KMS decrypt SID created in this PR.
Future shares are not affected.
- NOTE: This is to be merged after the bucket share PR is merged.

- Tested this on local dev environment and KMS policy now has 1
statement with kms decrypt and using SID of KMS decrypt.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Updated Release Notes 20231201

* Format changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* [BIGDATA-1391] - Fix for cannot see all cognito groups when inviting teams (data-dot-all#177)

* trajopadhye | BIGDATA-1391 - Fix for incomplete groups list fetched for invite org and env

* Bugfix

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Bugfix (data-dot-all#181)

* Bugfix

* Bugfix

* [Data 409] Athenz Certs Domain and User Pool Domain Changes (data-dot-all#221) (data-dot-all#222)

* trajopadhye | DATA-409- Code changes for Athenz certs domain and user pool domain

* [Data-413] GA stagingdeploy 20231228 - Fix for email notifications with Athenz.  Auto-create Pivot Role (data-dot-all#224)

* trajopadhye | DATA-412 - Added Athenz configs and Ports in AWS Worker lambda and enabling Auto Create Pivot Role

* DATA-416 - Fix while migrating from manual pivot role to auto created  (data-dot-all#230) (data-dot-all#233)

* trajopadhye | DATA-416 - Fix for environment updates when using auto pivot role. Changing the way KMS keys are specified in env role

* [Data 447] ga stagingdeploy 20240116 (data-dot-all#244)

* [Data-446] Fix for consumption role not showing up

* [Data 415] Dataset import fix for circular dependency error + local dev setup fixes  (data-dot-all#243)

* DATA-428 - Local env fixes

* Data 448 ga stagingdeploy 20240117 (data-dot-all#246)

* trajopadhye | DATA-440 - Adding else if to sync glue tables in RDS

* Data 461 ga deploy 20240125 (data-dot-all#258)

* DATA-404 - Add git fetch --all to the CodeCommit repo sync

* DATA-420 - Switch from Cognito to Okta on Prod (data-dot-all#254)

DATA-420 - Switch from Cognito to Okta on Prod

* DATA-455: Shares stuck in progress when AWS does not have root access on KMS key (data-dot-all#256)

* Update release notes

* Update release notes

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>

* Data 466 ga stagingdeploy 20240126 (data-dot-all#263)

* trajopadhye | DATA-456 - Removing Lake Formation SLR (data-dot-all#260)

* Data-405-Adding max 30 sec delay

* Synching Release notes from Staging to y-branch-2-0 (data-dot-all#262)

* [Data 484] stagingdeploy 20240206 (data-dot-all#275)

* fix: adding cdk synth for checkov scans (data-dot-all#264)

* [DATA-452] - Adding Dataset description in shares view (data-dot-all#273)

* Added Release note for DATA-481, DATA-452, DATA-480

* Syncing Release notes (data-dot-all#274)

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>

* [Data 607] staging deploy email notification fix (data-dot-all#302)

* Data:604: Add local level false positive management for PSECBUG - 73521 (data-dot-all#299)

* DATA-600 - Fix for share link not present in email notifications

* Merging changes needed for DATA-509 - Updating custom confidentiality values

* DATA - 586 - Adding confidentiality values for custom confidentiality

* Lower casing as suggested here- DATA-375

---------

Co-authored-by: Tejas Rajopadhye <[email protected]>

* Updating release notes for staging deploy (data-dot-all#301)

---------

Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>

* [Data 611] Disable topics dropdown (data-dot-all#304)

* Disabling topics dropdown (data-dot-all#303)

* [Data 619] Stagingdeploy env permission fix  (data-dot-all#307)

* Data:604: Add local level false positive management for PSECBUG - 73521 (data-dot-all#299)

* Data:604: Add local level false positive management for PSECBUG - 73521 (data-dot-all#300)

* Email notification fix + confidentiality levels config  (data-dot-all#298)

* DATA-600 - Fix for share link not present in email notifications

* Merging changes needed for DATA-509 - Updating custom confidentiality values

* Adding confidentiality values for custom confidentiality

* Adding confidentiality configs to config.json.PROD

* Lower casing as suggested here- DATA-375

---------

Co-authored-by: Tejas Rajopadhye <[email protected]>

* Updating release notes for staging deploy (data-dot-all#301)

* Disabling topics dropdown (data-dot-all#303)

* DATA-619 - Fix permission for GET_ORGANIZATION when users are in _data teams (data-dot-all#306)

* Cherry pick for issue with GET_ORG permission after 2.3 release

---------

Co-authored-by: Noah Paige <[email protected]>

---------

Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: Noah Paige <[email protected]>

* [Data 631] Staging deploy  (data-dot-all#310)

* [Data 629] worksheet fix for GET_ENVIRONMENT permission (data-dot-all#309)

* Data690 stagingdeploy 20240425 (data-dot-all#319)

* DATA-680 - Update node repo to 18.x in Makefile.sd

* Data674: Adding auto approval for confidentiality levels (data-dot-all#317)

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Ensuring Secret Confidentiality Type (Yahoo Confidential and Yahoo Highly Confidential) are never auto-approved

* Use boolean true instead of string

* Update config

* Update release notes

* Update release notes

* Update release notes

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>

* Update makefile (data-dot-all#320)

* Data690 stagingdeploy 20240425 2 (data-dot-all#321)

* Update makefile

* Reverting nodejs 16 upgrade

* Reverting nodejs 16 upgrade

* Data690 stagingdeploy 20240425 3 (data-dot-all#323)

* DATA-680 - Update node repo to 18.x in Makefile.sd

* Data674: Adding auto approval for confidentiality levels (data-dot-all#317)

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Ensuring Secret Confidentiality Type (Yahoo Confidential and Yahoo Highly Confidential) are never auto-approved

* Use boolean true instead of string

* Update config

* Bugfix (data-dot-all#322)

* Reverting nodejs 16 upgrade

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>

* Data690 stagingdeploy 20240425 4 (data-dot-all#325)

* DATA-680 - Update node repo to 18.x in Makefile.sd

* Data674: Adding auto approval for confidentiality levels (data-dot-all#317)

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Ensuring Secret Confidentiality Type (Yahoo Confidential and Yahoo Highly Confidential) are never auto-approved

* Use boolean true instead of string

* Update config

* Bugfix (data-dot-all#322)

* Blocking autoApproval edit on backend (data-dot-all#324)

* Blocking autoApproval edit on backend

* Lint fix

* Reverting nodejs 18 upgrade

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>

* Data690 stagingdeploy 20240425 5 (data-dot-all#329)

* DATA-680 - Update node repo to 18.x in Makefile.sd

* Data674: Adding auto approval for confidentiality levels (data-dot-all#317)

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Ensuring Secret Confidentiality Type (Yahoo Confidential and Yahoo Highly Confidential) are never auto-approved

* Use boolean true instead of string

* Update config

* Bugfix (data-dot-all#322)

* Blocking autoApproval edit on backend (data-dot-all#324)

* Blocking autoApproval edit on backend

* Lint fix

* DATA-680 - Switch node to version 17 in the Screwdriver makefile (data-dot-all#326)

* bugfix (data-dot-all#328)

* Remove nodejs upgrade

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>

* bugfix (data-dot-all#331)

* Data743 stagingdeploy (data-dot-all#351)

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Data743: Update verifier task schedule to run nightly (data-dot-all#350)

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Data743 stagingdeploy (data-dot-all#353)

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* [Data 767] staging deploy (data-dot-all#358)

* Bugfix: timeout error when listing Consumption Roles (data-dot-all#1303)

- Bugfix

- Because GraphQL resolvers are 'lazy', the ShareRequest modal window
simply does not request the managedPolicy property, so there is no
timeout.
- Managed policies are fetched only when a consumption role is selected
from the dropdown.

- data-dot-all#1288
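
The lazy-resolver behaviour this fix relies on can be sketched in plain Python. This is a toy model and not data.all's resolver code: `FIELD_RESOLVERS`, `resolve`, `list_roles`, and the stand-in `fetch_managed_policies` are hypothetical names. It only shows that a field's resolver runs when that field is requested:

```python
from typing import Callable, Dict, Iterable, List

# Records which roles triggered the slow lookup, so the effect is observable.
slow_calls: List[str] = []


def fetch_managed_policies(role: dict) -> List[str]:
    """Stand-in for the slow per-role policy lookup (hypothetical)."""
    slow_calls.append(role["name"])
    return [f"policy-for-{role['name']}"]


# One resolver per field, as in a GraphQL schema.
FIELD_RESOLVERS: Dict[str, Callable[[dict], object]] = {
    "name": lambda role: role["name"],
    "managedPolicy": fetch_managed_policies,
}


def resolve(role: dict, requested_fields: Iterable[str]) -> dict:
    """Run only the resolvers for the fields named in the query."""
    return {field: FIELD_RESOLVERS[field](role) for field in requested_fields}


def list_roles(roles: List[dict], requested_fields: Iterable[str]) -> List[dict]:
    """Resolve the same field selection for every role in the list."""
    fields = list(requested_fields)
    return [resolve(role, fields) for role in roles]
```

Listing roles with only `name` never calls the slow lookup. Resolving `managedPolicy` for the single selected role calls it exactly once.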

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

* Updated Release notes

---------

Co-authored-by: Sofia Sazonova <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>

---------

Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>

* data712

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

* Restore yarn file

* Restore yarn file

* Update config

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: trajopadhye <[email protected]>
Co-authored-by: Mohit Arora <[email protected]>
Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Raj Chopde <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>
anushka-singh pushed a commit to anushka-singh/aws-dataall that referenced this pull request Jun 20, 2024
* Bigdata867 3 (data-dot-all#24)

* Bucket Policy E.1: Modify sharing task routing to trigger S3 bucket sharing

* Bucket Policy E.1: Modify sharing task routing to trigger S3 bucket sharing

* Bucket Policy E.1: Modify sharing task routing to trigger S3 bucket sharing

* Bucket Policy BIGDATA 867: Implement revoke share in data_sharing_service

* Bucket Policy BIGDATA 867: Implement revoke share in data_sharing_service

* trajopadhye- BIGDATA-756 -> Added Tests for Task D and E

* trajopadhye - BIGDATA-756 Corrected file data_sharing_service.py to address revokedStateSM for revoked items

* trajopadhye- BIGDATA-756 - Slight correction in comments

* trajopadhye- BIGDATA-756 Correction on Share Status for revoke share tests

* Addressed changes from the PR review

* [BIGDATA-625] Implement bucket share processor (data-dot-all#21)

* Implement bucket share processor

* Fix Revoke UI sharetype

* BIGDATA-612 - push source from SD container to CodeCommit.  Initial Makefile and SD yaml configuration.

* Remove synth

* Add force push

* Add default cdk.context.json

* Add param for branchname

* Comments.

* Fix email address

* Add instance specific cdk.context.json

* BIGDATA-612 - truncate the cfn encryption policy prefix so that together with branch name, it will fit within 32 char limit.

* Update screwdriver.yaml

* Change nodejs version in screwdriver Makefile to supported version 16 (data-dot-all#89) (data-dot-all#90)

* Change screwdriver node version to 16

* Remove all non-environment setup steps for testing

* Skip getting AWS credentials for testing

* Fixing npm install version

* Remove extra npm install

* Restore all prior functions.

* Remove AmplifyContext customizations, no longer needed. (data-dot-all#92)

* Change nodejs version in screwdriver Makefile to supported version 16 (data-dot-all#89)

* Change screwdriver node version to 16

* Remove all non-environment setup steps for testing

* Skip getting AWS credentials for testing

* Fixing npm install version

* Remove extra npm install

* Restore all prior functions.

* Remove AmplifyContext customizations, no longer needed. (data-dot-all#91)

* Fix screwdriver yaml for new EMR template step. (data-dot-all#116)

* Bigdata 1397 mvp 3 stagingdeploy 20231129 (data-dot-all#178)

* BIGDATA-1211 - Release notes initial commit

* Mvp3 deploy 20231129 - S3 Bucket share + KMS explosion fix - MERGE FROM OPENSOURCE (data-dot-all#176)

* Enabling S3 bucket share  (data-dot-all#848)

- Feature

- We want to enable bucket sharing alongside the access point sharing
that already exists in data.all.
- A user will be able to request shares at the bucket level, and at the
folder level through access points.
- NOTE: Access point share managers and processors share some code with
the S3 bucket managers and processors. We will send a separate PR for
that refactoring later.

- data-dot-all#284
- data-dot-all#823
- https://github.com/awslabs/aws-dataall/pull/846/files#diff-c1f522a1f50d8bcf7b6e5b2e586e40a8de784caa80345f4e05a6329ae2a372d0

- Contents of this PR have been contributed by @anushka-singh,
@blitzmohit, @rbernotas, @TejasRGitHub

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Kms explosion fix (data-dot-all#882)

- Bugfix

- DataAll currently creates one SID per role in the KMS policy attached
to a bucket, using the role ID as the SID name.
- We want to collapse these SIDs into a single SID.
- Access point shares and bucket shares will have different SIDs in the
KMS policy.
- Use the role ARN instead of the role ID.
- NOTE: A KMS policy that was created previously remains unchanged. Its
SID will be the user ID and not the KMS decrypt SID created in this PR.
Future shares are not affected.
- NOTE: This is to be merged after the bucket share PR is merged.

- Tested this on local dev environment and KMS policy now has 1
statement with kms decrypt and using SID of KMS decrypt.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Updated Release Notes 20231201

* Format changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* [BIGDATA-1391] - Fix for cannot see all cognito groups when inviting teams (data-dot-all#177)

* trajopadhye | BIGDATA-1391 - Fix for incomplete groups list fetched for invite org and env

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Bigdata 1397 mvp 3 stagingdeploy 20231129 1 (data-dot-all#180)

* BIGDATA-1211 - Release notes initial commit

* Mvp3 deploy 20231129 - S3 Bucket share + KMS explosion fix - MERGE FROM OPENSOURCE (data-dot-all#176)

* Enabling S3 bucket share  (data-dot-all#848)

- Feature

- We want to enable bucket sharing alongside the access point sharing that
already exists in data.all.
- A user will be able to request shares at bucket level and at the
folder level with access points.
- Please NOTE: There is some common code between Access point share
managers and processors and S3 Bucket managers and processors. We will
send out a separate PR for that refactoring work at a later time.

- data-dot-all#284
- data-dot-all#823
-
https://github.com/awslabs/aws-dataall/pull/846/files#diff-c1f522a1f50d8bcf7b6e5b2e586e40a8de784caa80345f4e05a6329ae2a372d0

- Contents of this PR have been contributed by @anushka-singh,
@blitzmohit, @rbernotas, @TejasRGitHub

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Kms explosion fix (data-dot-all#882)

- Bugfix

- data.all currently creates one statement (SID) per role in the KMS policy
attached to a bucket, using the role ID as the SID name.
- We want to collapse these per-role SIDs into a single SID.
- Access point shares and bucket shares will have different SIDs in the KMS policy.
- Use the role ARN instead of the role ID.
- NOTE: If a KMS policy was created previously, it remains unchanged: its
SID will be the user ID and not the KMS decrypt SID created in this PR.
Future shares are not affected.
- NOTE: This is to be merged after the bucket share PR is merged.

- Tested on a local dev environment: the KMS policy now has 1
statement with KMS decrypt, using the KMS decrypt SID.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Updated Release Notes 20231201

* Format changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* [BIGDATA-1391] - Fix for cannot see all cognito groups when inviting teams (data-dot-all#177)

* trajopadhye | BIGDATA-1391 - Fix for incomplete groups list fetched for invite org and env

* Bugfix

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Bugfix (data-dot-all#181)

* Bugfix

* Bugfix

* [Data 409] Athenz Certs Domain and User Pool Domain Changes (data-dot-all#221) (data-dot-all#222)

* trajopadhye | DATA-409- Code changes for Athenz certs domain and user pool domain

* [Data-413] GA stagingdeploy 20231228 - Fix for email notifications with Athenz.  Auto-create Pivot Role (data-dot-all#224)

* trajopadhye | DATA-412 - Added Athenz configs and Ports in AWS Worker lambda and enabling Auto Create Pivot Role

* DATA-416 - Fix while migrating from manual pivot role to auto created  (data-dot-all#230) (data-dot-all#233)

* trajopadhye | DATA-416 - Fix for environment updates when using auto pivot role. Changing the way KMS keys are specified in env role

* [Data 447] ga stagingdeploy 20240116 (data-dot-all#244)

* [Data-446] Fix for consumption role not showing up

* [Data 415] Dataset import fix for circular dependency error + local dev setup fixes  (data-dot-all#243)

* DATA-428 - Local env fixes

* Data 448 ga stagingdeploy 20240117 (data-dot-all#246)

* trajopadhye | DATA-440 - Adding else-if to sync Glue tables in RDS

* Data 461 ga deploy 20240125 (data-dot-all#258)

* DATA-404 - Add git fetch --all to the CodeCommit repo sync

* DATA-420 - Switch from Cognito to Okta on Prod (data-dot-all#254)

DATA-420 - Switch from Cognito to Okta on Prod

* DATA-455: Shares stuck in progress when AWS does not have root access on KMS key (data-dot-all#256)

* Update release notes

* Update release notes

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>

* Data 466 ga stagingdeploy 20240126 (data-dot-all#263)

* trajopadhye | DATA-456 - Removing Lake Formation SLR (data-dot-all#260)

* Data-405-Adding max 30 sec delay

* Synching Release notes from Staging to y-branch-2-0 (data-dot-all#262)

* [Data 484] stagingdeploy 20240206 (data-dot-all#275)

* fix: adding cdk synth for checkov scans (data-dot-all#264)

* [DATA-452] - Adding Dataset description in shares view (data-dot-all#273)

* Added Release note for DATA-481, DATA-452, DATA-480

* Syncing Release notes (data-dot-all#274)

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>

* [Data 607] staging deploy email notification fix (data-dot-all#302)

* Data:604: Add local level false positive management for PSECBUG - 73521 (data-dot-all#299)

* DATA-600 - Fix for share link not present in email notifications

* Merging changes needed for DATA-509 - Updating custom confidentiality values

* DATA - 586 - Adding confidentiality values for custom confidentiality

* Lower casing as suggested here- DATA-375

---------

Co-authored-by: Tejas Rajopadhye <[email protected]>

* Updating release notes for staging deploy (data-dot-all#301)

---------

Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>

* [Data 611] Disable topics dropdown (data-dot-all#304)

* Disabling topics dropdown (data-dot-all#303)

* [Data 619] Stagingdeploy env permission fix  (data-dot-all#307)

* Data:604: Add local level false positive management for PSECBUG - 73521 (data-dot-all#299)

* Data:604: Add local level false positive management for PSECBUG - 73521 (data-dot-all#300)

* Email notification fix + confidentiality levels config  (data-dot-all#298)

* DATA-600 - Fix for share link not present in email notifications

* Merging changes needed for DATA-509 - Updating custom confidentiality values

* Adding confidentiality values for custom confidentiality

* Adding confidentiality configs to config.json.PROD

* Lower casing as suggested here- DATA-375

---------

Co-authored-by: Tejas Rajopadhye <[email protected]>

* Updating release notes for staging deploy (data-dot-all#301)

* Disabling topics dropdown (data-dot-all#303)

* DATA-619 - Fix permission for GET_ORGANIZATION when users are in _data teams (data-dot-all#306)

* Cherry pick for issue with GET_ORG permission after 2.3 release

---------

Co-authored-by: Noah Paige <[email protected]>

---------

Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: Noah Paige <[email protected]>

* [Data 631] Staging deploy  (data-dot-all#310)

* [Data 629] worksheet fix for GET_ENVIRONMENT permission (data-dot-all#309)

* Data690 stagingdeploy 20240425 (data-dot-all#319)

* DATA-680 - Update node repo to 18.x in Makefile.sd

* Data674: Adding auto approval for confidentiality levels (data-dot-all#317)

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Ensuring Secret Confidentiality Type (Yahoo Confidential and Yahoo Highly Confidential) are never auto-approved

* Use boolean true instead of string

* Update config

* Update release notes

* Update release notes

* Update release notes

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>

* Update makefile (data-dot-all#320)

* Data690 stagingdeploy 20240425 2 (data-dot-all#321)

* Update makefile

* Reverting nodejs 16 upgrade

* Reverting nodejs 16 upgrade

* Data690 stagingdeploy 20240425 3 (data-dot-all#323)

* DATA-680 - Update node repo to 18.x in Makefile.sd

* Data674: Adding auto approval for confidentiality levels (data-dot-all#317)

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Ensuring Secret Confidentiality Type (Yahoo Confidential and Yahoo Highly Confidential) are never auto-approved

* Use boolean true instead of string

* Update config

* Bugfix (data-dot-all#322)

* Reverting nodejs 16 upgrade

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>

* Data690 stagingdeploy 20240425 4 (data-dot-all#325)

* DATA-680 - Update node repo to 18.x in Makefile.sd

* Data674: Adding auto approval for confidentiality levels (data-dot-all#317)

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Ensuring Secret Confidentiality Type (Yahoo Confidential and Yahoo Highly Confidential) are never auto-approved

* Use boolean true instead of string

* Update config

* Bugfix (data-dot-all#322)

* Blocking autoApproval edit on backend (data-dot-all#324)

* Blocking autoApproval edit on backend

* Lint fix

* Reverting nodejs 18 upgrade

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>

* Data690 stagingdeploy 20240425 5 (data-dot-all#329)

* DATA-680 - Update node repo to 18.x in Makefile.sd

* Data674: Adding auto approval for confidentiality levels (data-dot-all#317)

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Ensuring Secret Confidentiality Type (Yahoo Confidential and Yahoo Highly Confidential) are never auto-approved

* Use boolean true instead of string

* Update config

* Bugfix (data-dot-all#322)

* Blocking autoApproval edit on backend (data-dot-all#324)

* Blocking autoApproval edit on backend

* Lint fix

* DATA-680 - Switch node to version 17 in the Screwdriver makefile (data-dot-all#326)

* bugfix (data-dot-all#328)

* Remove nodejs upgrade

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>

* bugfix (data-dot-all#331)

* Data743 stagingdeploy (data-dot-all#351)

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Data743: Update verifier task schedule to run nightly (data-dot-all#350)

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Data743 stagingdeploy (data-dot-all#353)

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* [Data 767] staging deploy (data-dot-all#358)

* Bugfix: timeout error when listing Consumption Roles (data-dot-all#1303)

- Bugfix

- Because GraphQL resolvers are lazy, the ShareRequest modal window simply
does not request the managedPolicy property, so the query no longer times out.
- Managed policies are fetched only when a consumption role is selected from
the dropdown.
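
The idea behind this fix can be illustrated with a small, library-free sketch. It is not the data.all resolver code: `FIELD_RESOLVERS`, `execute`, and the stand-in policy lookup are hypothetical names used only to show that a field resolver runs only when its field is selected in the query.

```python
from typing import Any, Callable, Dict, List

calls: List[str] = []  # records which field resolvers actually ran


def resolve_name(role: Dict[str, Any]) -> str:
    calls.append("name")
    return role["name"]


def resolve_managed_policy(role: Dict[str, Any]) -> List[str]:
    # Stand-in for the slow per-role lookup; the real resolver is not shown here.
    calls.append("managedPolicy")
    return [f"policy-for-{role['name']}"]


# One resolver per field, as in a typical GraphQL server (hypothetical mapping).
FIELD_RESOLVERS: Dict[str, Callable[[Dict[str, Any]], Any]] = {
    "name": resolve_name,
    "managedPolicy": resolve_managed_policy,
}


def execute(role: Dict[str, Any], selected_fields: List[str]) -> Dict[str, Any]:
    """Resolve only the fields the query selected; unselected resolvers never run."""
    return {field: FIELD_RESOLVERS[field](role) for field in selected_fields}


roles = [{"name": "role-1"}, {"name": "role-2"}]

# Listing query for the modal: the expensive field is simply not selected.
listing = [execute(role, ["name"]) for role in roles]

# Detail query once a single role is picked from the dropdown.
detail = execute(roles[0], ["name", "managedPolicy"])
```

In this sketch the slow lookup runs once, for the selected role, rather than once per role in the list.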

- data-dot-all#1288

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

* Updated Release notes

---------

Co-authored-by: Sofia Sazonova <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>

---------

Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>

* data712

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

* Restore yarn file

* Restore yarn file

* Update config

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: trajopadhye <[email protected]>
Co-authored-by: Mohit Arora <[email protected]>
Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Raj Chopde <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>
anushka-singh pushed a commit to anushka-singh/aws-dataall that referenced this pull request Jun 20, 2024
* Bigdata867 3 (data-dot-all#24)

* Bucket Policy E.1: Modify sharing task routing to trigger S3 bucket sharing

* Bucket Policy E.1: Modify sharing task routing to trigger S3 bucket sharing

* Bucket Policy E.1: Modify sharing task routing to trigger S3 bucket sharing

* Bucket Policy BIGDATA 867: Implement revoke share in data_sharing_service

* Bucket Policy BIGDATA 867: Implement revoke share in data_sharing_service

* trajopadhye- BIGDATA-756 -> Added Tests for Task D and E

* trajopadhye - BIGDATA-756 Corrected file data_sharing_service.py to address revokedStateSM for revoked items

* trajopadhye- BIGDATA-756 - Slight correction in comments

* trajopadhye- BIGDATA-756 Correction on Share Status for revoke share tests

* Addressed changes from the PR review

* [BIGDATA-625] Implement bucket share processor (data-dot-all#21)

* Implement bucket share processor

* Fix Revoke UI sharetype

* BIGDATA-612 - push source from SD container to CodeCommit.  Initial Makefile and SD yaml configuration.

* Remove synth

* Add force push

* Add default cdk.context.json

* Add param for branchname

* Comments.

* Fix email address

* Add instance specific cdk.context.json

* BIGDATA-612 - truncate the cfn encryption policy prefix so that together with branch name, it will fit within 32 char limit.

* Update screwdriver.yaml

* Change nodejs version in screwdriver Makefile to supported version 16 (data-dot-all#89) (data-dot-all#90)

* Change screwdriver node version to 16

* Remove all non-environment setup steps for testing

* Skip getting AWS credentials for testing

* Fixing npm install version

* Remove extra npm install

* Restore all prior functions.

* Remove AmplifyContext customizations, no longer needed. (data-dot-all#92)

* Change nodejs version in screwdriver Makefile to supported version 16 (data-dot-all#89)

* Change screwdriver node version to 16

* Remove all non-environment setup steps for testing

* Skip getting AWS credentials for testing

* Fixing npm install version

* Remove extra npm install

* Restore all prior functions.

* Remove AmplifyContext customizations, no longer needed. (data-dot-all#91)

* Fix screwdriver yaml for new EMR template step. (data-dot-all#116)

* Bigdata 1397 mvp 3 stagingdeploy 20231129 (data-dot-all#178)

* BIGDATA-1211 - Release notes initial commit

* Mvp3 deploy 20231129 - S3 Bucket share + KMS explosion fix - MERGE FROM OPENSOURCE (data-dot-all#176)

* Enabling S3 bucket share  (data-dot-all#848)

- Feature

- We want to enable bucket sharing alongside the access point sharing that
already exists in data.all.
- A user will be able to request shares at bucket level and at the
folder level with access points.
- Please NOTE: There is some common code between Access point share
managers and processors and S3 Bucket managers and processors. We will
send out a separate PR for that refactoring work at a later time.

- data-dot-all#284
- data-dot-all#823
-
https://github.com/awslabs/aws-dataall/pull/846/files#diff-c1f522a1f50d8bcf7b6e5b2e586e40a8de784caa80345f4e05a6329ae2a372d0

- Contents of this PR have been contributed by @anushka-singh,
@blitzmohit, @rbernotas, @TejasRGitHub

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Kms explosion fix (data-dot-all#882)

- Bugfix

- data.all currently creates one statement (SID) per role in the KMS policy
attached to a bucket, using the role ID as the SID name.
- We want to collapse these per-role SIDs into a single SID.
- Access point shares and bucket shares will have different SIDs in the KMS policy.
- Use the role ARN instead of the role ID.
- NOTE: If a KMS policy was created previously, it remains unchanged: its
SID will be the user ID and not the KMS decrypt SID created in this PR.
Future shares are not affected.
- NOTE: This is to be merged after the bucket share PR is merged.

- Tested on a local dev environment: the KMS policy now has 1
statement with KMS decrypt, using the KMS decrypt SID.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Updated Release Notes 20231201

* Format changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* [BIGDATA-1391] - Fix for cannot see all cognito groups when inviting teams (data-dot-all#177)

* trajopadhye | BIGDATA-1391 - Fix for incomplete groups list fetched for invite org and env

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Bigdata 1397 mvp 3 stagingdeploy 20231129 1 (data-dot-all#180)

* BIGDATA-1211 - Release notes initial commit

* Mvp3 deploy 20231129 - S3 Bucket share + KMS explosion fix - MERGE FROM OPENSOURCE (data-dot-all#176)

* Enabling S3 bucket share  (data-dot-all#848)

- Feature

- We want to enable bucket sharing alongside access point sharing, which
already exists in data.all.
- A user will be able to request shares at bucket level and at the
folder level with access points.
- Please NOTE: There is some common code between Access point share
managers and processors and S3 Bucket managers and processors. We will
send out a separate PR for that refactoring work at a later time.

- data-dot-all#284
- data-dot-all#823
- https://github.com/awslabs/aws-dataall/pull/846/files#diff-c1f522a1f50d8bcf7b6e5b2e586e40a8de784caa80345f4e05a6329ae2a372d0

- Contents of this PR have been contributed by @anushka-singh,
@blitzmohit, @rbernotas, @TejasRGitHub

Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
  includes fetching data from storage outside the application (e.g. a
  database, an S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you
    consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
  authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Kms explosion fix (data-dot-all#882)

- Bugfix

- DataAll currently creates one SID per role in the KMS policy attached
to a bucket with RoleID as the SID name.
- We want to collapse these SIDs into one SID.
- Access point and Bucket share will have different SIDs in KMS policy.
- Use role ARN instead of role ID.
- NOTE: if KMS policy was previously created, it will remain the same.
SID will be the user ID and not the KMS decrypt SID created in this PR.
It will not impact any future shares though.
- NOTE: This is to be merged after bucket share PR is merged.

- Tested this in a local dev environment: the KMS policy now has a single
statement granting kms decrypt, identified by the KMS decrypt SID.
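
The collapse described above can be illustrated with a minimal sketch. This is not the code from the PR: the function name, the SID value `KMSDecryptShare`, and the statement shape are assumptions chosen for illustration. It only shows the idea of keeping one decrypt statement per SID and listing role ARNs in it, instead of adding one statement per role.

```python
# Hypothetical SID for the single collapsed statement; the name used by
# data.all may differ, and access point and bucket shares would each pass
# their own SID.
DECRYPT_SID = "KMSDecryptShare"


def add_role_to_decrypt_statement(policy: dict, role_arn: str, sid: str = DECRYPT_SID) -> dict:
    """Return a copy of `policy` in which `role_arn` is listed as a principal
    of one shared kms:Decrypt statement identified by `sid`."""
    # Shallow-copy each statement so the caller's policy is left untouched.
    statements = [dict(stmt) for stmt in policy.get("Statement", [])]
    for stmt in statements:
        if stmt.get("Sid") == sid:
            principals = stmt["Principal"]["AWS"]
            # IAM allows a single principal to be stored as a plain string.
            if isinstance(principals, str):
                principals = [principals]
            if role_arn not in principals:
                principals = principals + [role_arn]
            stmt["Principal"] = {"AWS": sorted(principals)}
            break
    else:
        # No statement with this SID yet: create it once.
        statements.append(
            {
                "Sid": sid,
                "Effect": "Allow",
                "Principal": {"AWS": [role_arn]},
                "Action": "kms:Decrypt",
                "Resource": "*",
            }
        )
    return {**policy, "Statement": statements}
```

Statements created before this change (with their older SIDs) are left as they are by this sketch, which matches the note above that previously created policies remain the same.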

Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
  includes fetching data from storage outside the application (e.g. a
  database, an S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you
    consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
  authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Updated Release Notes 20231201

* Format changes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* [BIGDATA-1391] - Fix for cannot see all cognito groups when inviting teams (data-dot-all#177)

* trajopadhye | BIGDATA-1391 - Fix for incomplete groups list fetched for invite org and env

* Bugfix

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: trajopadhye <[email protected]>

* Bugfix (data-dot-all#181)

* Bugfix

* Bugfix

* [Data 409] Athenz Certs Domain and User Pool Domain Changes (data-dot-all#221) (data-dot-all#222)

* trajopadhye | DATA-409- Code changes for Athenz certs domain and user pool domain

* [Data-413] GA stagingdeploy 20231228 - Fix for email notifications with Athenz.  Auto-create Pivot Role (data-dot-all#224)

* trajopadhye | DATA-412 - Added Athenz configs and Ports in AWS Worker lambda and enabling Auto Create Pivot Role

* DATA-416 - Fix while migrating from manual pivot role to auto created  (data-dot-all#230) (data-dot-all#233)

* trajopadhye | DATA-416 - Fix for environment updates when using auto pivot role. Changing the way KMS keys are specified in env role

* [Data 447] ga stagingdeploy 20240116 (data-dot-all#244)

* [Data-446] Fix for consumption role not showing up

* [Data 415] Dataset import fix for circular dependency error + local dev setup fixes  (data-dot-all#243)

* DATA-428 - Local env fixes

* Data 448 ga stagingdeploy 20240117 (data-dot-all#246)

* trajopadhye | DATA-440 - Adding else if to sync glue tables in RDS

* Data 461 ga deploy 20240125 (data-dot-all#258)

* DATA-404 - Add git fetch --all to the CodeCommit repo sync

* DATA-420 - Switch from Cognito to Okta on Prod (data-dot-all#254)

DATA-420 - Switch from Cognito to Okta on Prod

* DATA-455: Shares stuck in progress when AWS does not have root access on KMS key (data-dot-all#256)

* Update release notes

* Update release notes

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>

* Data 466 ga stagingdeploy 20240126 (data-dot-all#263)

* trajopadhye | DATA-456 - Removing Lake Formation SLR (data-dot-all#260)

* Data-405-Adding max 30 sec delay

* Synching Release notes from Staging to y-branch-2-0 (data-dot-all#262)

* [Data 484] stagingdeploy 20240206 (data-dot-all#275)

* fix: adding cdk synth for checkov scans (data-dot-all#264)

* [DATA-452] - Adding Dataset description in shares view (data-dot-all#273)

* Added Release note for DATA-481, DATA-452, DATA-480

* Syncing Release notes (data-dot-all#274)

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>

* [Data 607] staging deploy email notification fix (data-dot-all#302)

* Data:604: Add local level false positive management for PSECBUG - 73521 (data-dot-all#299)

* DATA-600 - Fix for share link not present in email notifications

* Merging changes needed for DATA-509 - Updating custom confidentiality values

* DATA - 586 - Adding confidentiality values for custom confidentiality

* Lower casing as suggested here- DATA-375

---------

Co-authored-by: Tejas Rajopadhye <[email protected]>

* Updating release notes for staging deploy (data-dot-all#301)

---------

Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>

* [Data 611] Disable topics dropdown (data-dot-all#304)

* Disabling topics dropdown (data-dot-all#303)

* [Data 619] Stagingdeploy env permission fix  (data-dot-all#307)

* Data:604: Add local level false positive management for PSECBUG - 73521 (data-dot-all#299)

* Data:604: Add local level false positive management for PSECBUG - 73521 (data-dot-all#300)

* Email notification fix + confidentiality levels config  (data-dot-all#298)

* DATA-600 - Fix for share link not present in email notifications

* Merging changes needed for DATA-509 - Updating custom confidentiality values

* Adding confidentiality values for custom confidentiality

* Adding confidentiality configs to config.json.PROD

* Lower casing as suggested here- DATA-375

---------

Co-authored-by: Tejas Rajopadhye <[email protected]>

* Updating release notes for staging deploy (data-dot-all#301)

* Disabling topics dropdown (data-dot-all#303)

* DATA-619 - Fix permission for GET_ORGANIZATION when users are in _data teams (data-dot-all#306)

* Cherry pick for issue with GET_ORG permission after 2.3 release

---------

Co-authored-by: Noah Paige <[email protected]>

---------

Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: Noah Paige <[email protected]>

* [Data 631] Staging deploy  (data-dot-all#310)

* [Data 629] worksheet fix for GET_ENVIRONMENT permission (data-dot-all#309)

* Data690 stagingdeploy 20240425 (data-dot-all#319)

* DATA-680 - Update node repo to 18.x in Makefile.sd

* Data674: Adding auto approval for confidentiality levels (data-dot-all#317)

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Ensuring Secret Confidentiality Type (Yahoo Confidential and Yahoo Highly Confidential) are never auto-approved

* Use boolean true instead of string

* Update config

* Update release notes

* Update release notes

* Update release notes

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>

* Update makefile (data-dot-all#320)

* Data690 stagingdeploy 20240425 2 (data-dot-all#321)

* Update makefile

* Reverting nodejs 16 upgrade

* Reverting nodejs 16 upgrade

* Data690 stagingdeploy 20240425 3 (data-dot-all#323)

* DATA-680 - Update node repo to 18.x in Makefile.sd

* Data674: Adding auto approval for confidentiality levels (data-dot-all#317)

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Ensuring Secret Confidentiality Type (Yahoo Confidential and Yahoo Highly Confidential) are never auto-approved

* Use boolean true instead of string

* Update config

* Bugfix (data-dot-all#322)

* Reverting nodejs 16 upgrade

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>

* Data690 stagingdeploy 20240425 4 (data-dot-all#325)

* DATA-680 - Update node repo to 18.x in Makefile.sd

* Data674: Adding auto approval for confidentiality levels (data-dot-all#317)

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Ensuring Secret Confidentiality Type (Yahoo Confidential and Yahoo Highly Confidential) are never auto-approved

* Use boolean true instead of string

* Update config

* Bugfix (data-dot-all#322)

* Blocking autoApproval edit on backend (data-dot-all#324)

* Blocking autoApproval edit on backend

* Lint fix

* Reverting nodejs 18 upgrade

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>

* Data690 stagingdeploy 20240425 5 (data-dot-all#329)

* DATA-680 - Update node repo to 18.x in Makefile.sd

* Data674: Adding auto approval for confidentiality levels (data-dot-all#317)

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Data674: Adding auto approval for confidentiality levels

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Lint fixes

* Ensuring Secret Confidentiality Type (Yahoo Confidential and Yahoo Highly Confidential) are never auto-approved

* Use boolean true instead of string

* Update config

* Bugfix (data-dot-all#322)

* Blocking autoApproval edit on backend (data-dot-all#324)

* Blocking autoApproval edit on backend

* Lint fix

* DATA-680 - Switch node to version 17 in the Screwdriver makefile (data-dot-all#326)

* bugfix (data-dot-all#328)

* Remove nodejs upgrade

---------

Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>

* bugfix (data-dot-all#331)

* Data743 stagingdeploy (data-dot-all#351)

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Data743: Update verifier task schedule to run nightly (data-dot-all#350)

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Data743 stagingdeploy (data-dot-all#353)

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* Update verifier task schedule to run nightly

* [Data 767] staging deploy (data-dot-all#358)

* Bugfix: timeout error when listing Consumption Roles (data-dot-all#1303)

- Bugfix

- Because GraphQL resolvers are 'lazy', the ShareRequest modal window
simply doesn't request the managedPolicy property, which avoids the timeout.
- Managed policies are fetched only when a consumption role is selected
from the dropdown.
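
As an illustration of the 'lazy' behaviour this fix relies on, here is a minimal sketch without any GraphQL library. All names (`fetch_managed_policies`, `RESOLVERS`, `resolve`, `CALLS`) are hypothetical and are not taken from the data.all codebase. The point is only that a field's resolver runs when that field is part of the selection, so leaving the field out of the list query skips the slow lookup.

```python
from typing import Callable, Dict, Iterable, List

# Records which roles triggered the expensive lookup (for demonstration only).
CALLS: List[str] = []


def fetch_managed_policies(role: dict) -> List[str]:
    """Stand-in for a slow per-role IAM lookup (hypothetical)."""
    CALLS.append(role["name"])
    return [f"policy-for-{role['name']}"]


# One resolver per field, as in a GraphQL schema.
RESOLVERS: Dict[str, Callable[[dict], object]] = {
    "name": lambda role: role["name"],
    "managedPolicies": fetch_managed_policies,
}


def resolve(role: dict, selected_fields: Iterable[str]) -> dict:
    """Run only the resolvers for the fields the client selected."""
    return {field: RESOLVERS[field](role) for field in selected_fields}


roles = [{"name": "role-a"}, {"name": "role-b"}]

# Listing roles for the dropdown: managedPolicies is not selected,
# so the slow lookup never runs.
listing = [resolve(role, ["name"]) for role in roles]

# After the user picks one role: fetch its policies for that role only.
detail = resolve(roles[0], ["name", "managedPolicies"])
```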

- data-dot-all#1288

Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
  includes fetching data from storage outside the application (e.g. a
  database, an S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you
    consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
  authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

* Updated Release notes

---------

Co-authored-by: Sofia Sazonova <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>

---------

Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>

* data712

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

* Restore yarn file

* Restore yarn file

* Update config

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: Persistent emails

* Data712: update import

* Data712: update import

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: trajopadhye <[email protected]>
Co-authored-by: Mohit Arora <[email protected]>
Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Raj Chopde <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>
9 participants