v2.1.0 features (#840)
### Feature or Bugfix
- Feature
- Bugfix
- Refactoring

### Detail

#### Features
* Limit pivot role S3 permissions by @dlpzx in #780
* Limit pivot role KMS permissions by @dlpzx in #830
* Add configurable session timeout to IDP by @manjulaK in #786
* Allow to submit a share when you are both an approver and a requester by @zsaltys in #793
* Redirect upon creating a share request by @zsaltys in #799
* Handle Pre-filtering of tables by @anushka-singh in #811
* Email Notification on Share Workflow - Issue - 734 by @TejasRGitHub in #818
* Refactor notifications from core to modules by @dlpzx in #822
* Add frontend and backend feature flags by @zsaltys in #817
* Make hosted_zone_id optional by @lorchda in #812

#### Fixes
* Add Additional Error Messages for KMS Key lookup on imported dataset by @noah-paige in #748
* Handle Environment Import of IAM service roles by @noah-paige in #749
* Build Compliant Names for Opensearch Resources by @noah-paige in #750
* Update Lambda runtime by @nikpodsh in #782
* Ensure valid environments for share request and other objects creation by @dlpzx in #781
* Fix shell true semgrep by @dlpzx in #760
* Add condition when there are no public subnets by @lorchda in #794
* Remove unused variable by @zsaltys in #815
* Check other share exists before clean up by @noah-paige in #769


### Relates
- v2.1.0 minor release

## New Contributors
* @manjulaK made their first contribution in #786
* @zsaltys made their first contribution in #793
* @anushka-singh made their first contribution in #811
* @TejasRGitHub made their first contribution in #818

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP Top 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes fetching data from storage outside the application (e.g. a
database, an S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use standard, proven implementations?
  - Are the keys used controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: trajopadhye <[email protected]>
11 people authored Nov 8, 2023
1 parent b60dab8 commit f917a7a
Showing 150 changed files with 2,443 additions and 952 deletions.
1 change: 1 addition & 0 deletions .github/workflows/bandit.yml
@@ -8,6 +8,7 @@ on:
pull_request:
branches:
- main
- v2m*

permissions:
contents: read
1 change: 1 addition & 0 deletions .github/workflows/cdk-nag.yml
@@ -12,6 +12,7 @@ on:
- "deploy/**"
branches:
- main
- v2m*

permissions:
contents: read
1 change: 1 addition & 0 deletions .github/workflows/coverage.yml
@@ -7,6 +7,7 @@ on:
- main
- release/*
- main-v2
- v2m*

jobs:
run-tests:
1 change: 1 addition & 0 deletions .github/workflows/eslint.yml
@@ -10,6 +10,7 @@ on:
- main
- release/*
- main-v2
- v2m*

permissions:
contents: read
1 change: 1 addition & 0 deletions .github/workflows/lint.yml
@@ -7,6 +7,7 @@ on:
- main
- release/*
- main-v2
- v2m*

jobs:
lint:
1 change: 1 addition & 0 deletions .github/workflows/npm-audit.yml
@@ -10,6 +10,7 @@ on:
- main
- release/*
- main-v2
- v2m*

permissions:
contents: read
1 change: 1 addition & 0 deletions .github/workflows/semgrep.yml
@@ -10,6 +10,7 @@ on:
- main
- release/*
- main-v2
- v2m*

permissions:
contents: read
1 change: 1 addition & 0 deletions .github/workflows/static-checking.yml
@@ -7,6 +7,7 @@ on:
- main
- release/*
- main-v2
- v2m*

jobs:
Check:
2 changes: 1 addition & 1 deletion .github/workflows/validate-db-schema.yml
@@ -7,10 +7,10 @@ on:
 - main
 - release/*
 - main-v2
+- v2m*

 env:
 envname: local
-schema_name: validation

 jobs:
 run-tests:
54 changes: 54 additions & 0 deletions backend/dataall/base/aws/cognito.py
@@ -0,0 +1,54 @@
import os
import logging
import boto3

from dataall.base.utils.IdentityProvider import IdentityProvider

log = logging.getLogger(__name__)


class Cognito(IdentityProvider):

    def __init__(self):
        self.client = boto3.client('cognito-idp', region_name=os.getenv('AWS_REGION', 'eu-west-1'))

    def get_user_emailids_from_group(self, groupName):
        try:
            envname = os.getenv('envname', 'local')
            parameter_path = f'/dataall/{envname}/cognito/userpool'
            ssm = boto3.client('ssm', region_name=os.getenv('AWS_REGION', 'eu-west-1'))
            user_pool_id = ssm.get_parameter(Name=parameter_path)['Parameter']['Value']
            cognito_user_list = self.client.list_users_in_group(UserPoolId=user_pool_id, GroupName=groupName)["Users"]
            group_email_ids = []
            attributes = []
            # Make a flat list
            [attributes.extend(x['Attributes']) for x in cognito_user_list]
            # Extract all the email-ids
            group_email_ids.extend([x['Value'] for x in attributes if x['Name'] == 'email'])

        except Exception as e:
            envname = os.getenv('envname', 'local')
            if envname in ['local', 'dkrcompose']:
                log.error('Local development environment does not support Cognito')
                return ['[email protected]']
            log.error(
                f'Failed to get email ids for Cognito group {groupName} due to {e}'
            )
            raise e
        else:
            return group_email_ids

    @staticmethod
    def list_cognito_groups(envname: str, region: str):
        try:
            parameter_path = f'/dataall/{envname}/cognito/userpool'
            ssm = boto3.client('ssm', region_name=region)
            user_pool_id = ssm.get_parameter(Name=parameter_path)['Parameter']['Value']
            cognito = boto3.client('cognito-idp', region_name=region)
            groups = cognito.list_groups(UserPoolId=user_pool_id)['Groups']
        except Exception as e:
            log.error(
                f'Failed to list groups of user pool {user_pool_id} due to {e}'
            )
        else:
            return groups
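
The attribute-flattening step in `get_user_emailids_from_group` can be illustrated without AWS access. The sketch below is not part of the commit: `extract_emails` is a hypothetical helper, and the sample data only mimics the shape of the `Users` list that `list_users_in_group` returns (each user carries an `Attributes` list of `Name`/`Value` pairs).

```python
from typing import Dict, List


def extract_emails(users: List[Dict]) -> List[str]:
    """Collect the 'email' attribute of every user in a Cognito-style user list."""
    return [
        attribute['Value']
        for user in users
        for attribute in user.get('Attributes', [])
        if attribute['Name'] == 'email'
    ]


# Hypothetical sample shaped like the "Users" field of a list_users_in_group response.
sample_users = [
    {'Username': 'alice', 'Attributes': [{'Name': 'sub', 'Value': '111'},
                                         {'Name': 'email', 'Value': 'alice@example.com'}]},
    {'Username': 'bob', 'Attributes': [{'Name': 'email', 'Value': 'bob@example.com'}]},
    {'Username': 'svc', 'Attributes': [{'Name': 'sub', 'Value': '333'}]},  # no email set
]

print(extract_emails(sample_users))
```

A single nested comprehension avoids building a list purely for its side effects. Note that `list_users_in_group` is a paginated API, so a group that spans more than one page would also need `NextToken` handling, which this sketch leaves out.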
16 changes: 16 additions & 0 deletions backend/dataall/base/aws/iam.py
@@ -28,6 +28,22 @@ def get_role(account_id: str, role_arn: str, role=None):
        else:
            return response["Role"]

    @staticmethod
    def get_role_arn_by_name(account_id: str, role_name: str, role=None):
        log.info(f"Getting IAM role name= {role_name}")
        try:
            iamcli = IAM.client(account_id=account_id, role=role)
            response = iamcli.get_role(
                RoleName=role_name
            )
        except Exception as e:
            log.error(
                f'Failed to get role {role_name} due to: {e}'
            )
            return None
        else:
            return response["Role"]["Arn"]

    @staticmethod
    def update_role_policy(
        account_id: str,
59 changes: 59 additions & 0 deletions backend/dataall/base/aws/ses.py
@@ -0,0 +1,59 @@
import logging
import os
import boto3

log = logging.getLogger(__name__)


class Ses:

    def __init__(self, fromEmailId: str = None):
        self.fromEmailId = fromEmailId
        self.client = boto3.client('sesv2', region_name=os.getenv('AWS_REGION', 'eu-west-1'))

    @staticmethod
    def get_ses_client():
        # Create SES client
        fromEmailId = os.getenv('email_sender_id', 'none')
        if fromEmailId != 'none':
            return Ses(fromEmailId)
        else:
            raise Exception('email_sender_id environment variable is not set')

    def send_email(self, toList, message, subject):
        # Get the SES client
        # Send the email
        try:
            ses_client = self.client
            destination_dict = {
                'ToAddresses': toList,
            }
            body_dict = {
                'Text': {
                    'Data': message,
                    'Charset': 'UTF-8'
                }
            }
            subject_dict = {
                'Data': subject,
                'Charset': 'UTF-8'
            }
            message_dict = {
                'Subject': subject_dict,
                'Body': body_dict
            }

            return ses_client.send_email(
                FromEmailAddress=self.fromEmailId,
                Destination=destination_dict,
                Content={
                    'Simple': message_dict
                }
            )
        except Exception as e:
            envname = os.getenv('envname', 'local')
            if envname in ['local', 'dkrcompose']:
                log.error('Local development environment does not support SES notifications')
                return True
            log.error(f'Error while sending email {e})')
            raise e
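
The request that `Ses.send_email` assembles from four intermediate dictionaries can also be built in one place and checked without calling AWS. This is a sketch only: `build_simple_email_request` is a hypothetical helper that is not part of the commit, and the addresses are placeholders.

```python
from typing import Dict, List


def build_simple_email_request(sender: str, recipients: List[str], subject: str, message: str) -> Dict:
    """Return keyword arguments for a SESv2 send_email call with a plain-text body."""
    if not recipients:
        raise ValueError('at least one recipient is required')
    return {
        'FromEmailAddress': sender,
        'Destination': {'ToAddresses': list(recipients)},
        'Content': {
            'Simple': {
                'Subject': {'Data': subject, 'Charset': 'UTF-8'},
                'Body': {'Text': {'Data': message, 'Charset': 'UTF-8'}},
            }
        },
    }


request = build_simple_email_request(
    sender='noreply@example.com',      # placeholder address
    recipients=['team@example.com'],   # placeholder address
    subject='Share request submitted',
    message='A new share request is waiting for approval.',
)
print(request['Content']['Simple']['Subject']['Data'])
# With real credentials this could be sent as: boto3.client('sesv2').send_email(**request)
```

Keeping payload construction separate from the network call makes the structure unit-testable, which may be useful given that the local and `dkrcompose` environments cannot reach SES.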
56 changes: 41 additions & 15 deletions backend/dataall/base/cdkproxy/cdk_cli_wrapper.py
@@ -18,6 +18,7 @@
from dataall.base.aws.sts import SessionHelper
from dataall.base.db import Engine
from dataall.base.utils.alarm_service import AlarmService
from dataall.base.utils.shell_utils import CommandSanitizer

logger = logging.getLogger('cdksass')

@@ -44,12 +45,14 @@ def aws_configure(profile_name='default'):
     print('..............................................')
     print(' Running configure ')
     print('..............................................')
-    print(f"AWS_CONTAINER_CREDENTIALS_RELATIVE_URI: {os.getenv('AWS_CONTAINER_CREDENTIALS_RELATIVE_URI')}")
-    cmd = ['curl', '169.254.170.2$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI']
-    process = subprocess.run(' '.join(cmd), text=True, shell=True, encoding='utf-8', capture_output=True) # nosec # nosemgrep
+    AWS_CONTAINER_CREDENTIALS_RELATIVE_URI = os.getenv('AWS_CONTAINER_CREDENTIALS_RELATIVE_URI')
+    cmd = ['curl', f'169.254.170.2{AWS_CONTAINER_CREDENTIALS_RELATIVE_URI}']
+    process = subprocess.run(cmd, text=True, shell=False, encoding='utf-8', capture_output=True)
     creds = None
     if process.returncode == 0:
         creds = ast.literal_eval(process.stdout)
+    else:
+        logger.error(f'Failed clean curl credentials due to {str(process.stderr)}')

     return creds
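
The `aws_configure` hunk above swaps a joined shell string for an argument list with `shell=False`. Without a shell, `$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI` is no longer expanded, which is why the new code reads the variable with `os.getenv` and interpolates it in Python. The following stand-alone sketch (not from the commit) shows the effect: a value containing shell metacharacters is delivered as one literal argument.

```python
import subprocess
import sys

# A value that would be dangerous if a shell interpreted it.
untrusted = 'hello; echo INJECTED'

# shell=False (the default): each list element is passed as one literal argument,
# so the ';' is never treated as a command separator.
result = subprocess.run(
    [sys.executable, '-c', 'import sys; print(sys.argv[1])', untrusted],
    text=True,
    capture_output=True,
    shell=False,
)
print(result.stdout.strip())
```

The child process simply echoes the string it received, semicolon included; no second command is run.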

@@ -130,6 +133,16 @@ def deploy_cdk_stack(engine: Engine, stackid: str, app_path: str = None, path: s
app_path = app_path or './app.py'

logger.info(f'app_path: {app_path}')
input_args = [
stack.name,
stack.accountid,
stack.region,
stack.stack,
stack.targetUri
]

CommandSanitizer(input_args)

cmd = [
'' '. ~/.nvm/nvm.sh &&',
'cdk',
@@ -157,9 +170,11 @@ def deploy_cdk_stack(engine: Engine, stackid: str, app_path: str = None, path: s
f'"{sys.executable} {app_path}"',
'--verbose',
]

logger.info(f"Running command : \n {' '.join(cmd)}")

# This command is too complex to be executed as a list of commands. We need to run it with shell=True
# However, the input arguments have to be sanitized with the CommandSanitizer

process = subprocess.run( # nosemgrep
' '.join(cmd), # nosemgrep
text=True, # nosemgrep
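
The `deploy_cdk_stack` hunks keep `shell=True` and rely on `CommandSanitizer` to vet the stack fields first. The implementation of `CommandSanitizer` is not shown in this excerpt, so the following is only a guess at the general idea: an allow-list check over each argument. The function name `validate_shell_args` and the permitted character set are assumptions made for illustration.

```python
import re
from typing import Iterable

# Assumption: only these characters are considered safe in a stack argument.
_ALLOWED = re.compile(r'[A-Za-z0-9_\-.:/]+')


def validate_shell_args(args: Iterable[str]) -> None:
    """Raise ValueError if any argument contains characters outside the allow-list."""
    for arg in args:
        if not isinstance(arg, str) or not _ALLOWED.fullmatch(arg):
            raise ValueError(f'unsafe command argument: {arg!r}')


validate_shell_args(['my-stack', '123456789012', 'eu-west-1'])  # passes silently

try:
    validate_shell_args(['env; rm -rf /'])
except ValueError as error:
    print(error)
```

An allow-list fails closed: anything not explicitly permitted is rejected. The standard library's `shlex.quote` is an alternative when arguments must be embedded in a shell string as-is.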
@@ -208,17 +223,28 @@ def describe_stack(stack, engine: Engine = None, stackid: str = None):


 def cdk_installed():
-    cmd = ['. ~/.nvm/nvm.sh && cdk', '--version']
-    logger.info(f"Running command {' '.join(cmd)}")
-
-    subprocess.run( # nosemgrep
-        cmd, # nosemgrep
-        text=True, # nosemgrep
-        shell=True, # nosec # nosemgrep
-        encoding='utf-8', # nosemgrep
-        stdout=subprocess.PIPE, # nosemgrep
-        stderr=subprocess.PIPE, # nosemgrep
-        cwd=os.path.dirname(__file__), # nosemgrep
+    cmd1 = ['.', '~/.nvm/nvm.sh']
+    logger.info(f"Running command {' '.join(cmd1)}")
+    subprocess.run(
+        cmd1,
+        text=True,
+        shell=False,
+        encoding='utf-8',
+        stdout=subprocess.PIPE,
+        stderr=subprocess.PIPE,
+        cwd=os.path.dirname(__file__),
+    )
+
+    cmd2 = ['cdk', '--version']
+    logger.info(f"Running command {' '.join(cmd2)}")
+    subprocess.run(
+        cmd2,
+        text=True,
+        shell=False,
+        encoding='utf-8',
+        stdout=subprocess.PIPE,
+        stderr=subprocess.PIPE,
+        cwd=os.path.dirname(__file__),
     )
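
One caveat on the `cdk_installed` change: `.` (source) is a shell builtin rather than an executable, and every `subprocess.run` call starts its own process, so the first call cannot load nvm for the second one. If the goal is only to confirm that a CLI is reachable without a shell, a possible alternative is sketched below. `tool_version` is a hypothetical helper, not code from the commit, and it assumes the tool is already on `PATH`.

```python
import shutil
import subprocess
import sys
from typing import Optional


def tool_version(executable: str) -> Optional[str]:
    """Return the output of `<executable> --version`, or None if it cannot be run."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, '--version'],
        text=True,
        capture_output=True,
        shell=False,
    )
    if result.returncode != 0:
        return None
    # Some tools print their version to stderr, so fall back to it.
    return (result.stdout or result.stderr).strip()


print(tool_version('no-such-tool-hopefully') is None)
print(tool_version(sys.executable) is not None)
```

In the data.all container the call would be `tool_version('cdk')`; the Python interpreter is used here only so the sketch runs anywhere.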


2 changes: 1 addition & 1 deletion backend/dataall/base/cdkproxy/requirements.txt
@@ -1,4 +1,4 @@
-aws-cdk-lib==2.83.1
+aws-cdk-lib==2.99.0
 boto3==1.24.85
 boto3-stubs==1.24.85
 botocore==1.27.85
5 changes: 2 additions & 3 deletions backend/dataall/base/db/connection.py
@@ -105,7 +105,6 @@ def drop_schema_if_exists(engine, envname):


 def get_engine(envname=ENVNAME):
-    schema = os.getenv('schema_name', envname)
     if envname not in ['local', 'pytest', 'dkrcompose']:
         param_store = Parameter()
         credential_arn = param_store.get_parameter(env=envname, path='aurora/dbcreds')
@@ -120,7 +119,7 @@ def get_engine(envname=ENVNAME):
             'db': database,
             'user': user,
             'pwd': pwd,
-            'schema': schema,
+            'schema': envname,
         }
     else:
         hostname = 'db' if envname == 'dkrcompose' else 'localhost'
@@ -129,7 +128,7 @@ def get_engine(envname=ENVNAME):
             'db': 'dataall',
             'user': 'postgres',
             'pwd': 'docker',
-            'schema': schema,
+            'schema': envname,
         }
     return Engine(DbConfig(**db_params))

8 changes: 8 additions & 0 deletions backend/dataall/base/utils/IdentityProvider.py
@@ -0,0 +1,8 @@
import abc


class IdentityProvider:

    @abc.abstractmethod
    def get_user_emailids_from_group(self, groupName):
        raise NotImplementedError
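
As written, `IdentityProvider` does not derive from `abc.ABC`, so `@abc.abstractmethod` is not enforced at instantiation time; a subclass that forgets the override only fails when the method is called. The sketch below shows the difference that an `abc.ABC` base makes. The class names are hypothetical and this is a possible variation, not what the commit does.

```python
import abc


class IdentityProviderBase(abc.ABC):
    """Hypothetical variant of IdentityProvider that derives from abc.ABC."""

    @abc.abstractmethod
    def get_user_emailids_from_group(self, groupName):
        raise NotImplementedError


class IncompleteProvider(IdentityProviderBase):
    pass  # does not override the abstract method


class StaticProvider(IdentityProviderBase):
    def get_user_emailids_from_group(self, groupName):
        return [f'{groupName}@example.com']


try:
    IncompleteProvider()
except TypeError as error:
    print(f'rejected: {error}')

print(StaticProvider().get_user_emailids_from_group('admins'))
```

With the ABC base, a provider that misses the method is rejected as soon as it is constructed rather than at the first notification attempt.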