Adding AWS HealthOmics as a Module in "Play" tools #954
Conversation
Syncs modularization main branch with main (as there is a bug fix that is useful to have) --- By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: dlpzx <[email protected]> Co-authored-by: wolanlu <[email protected]>
This is a draft PR for showing purposes. There are still some minor issues that need to be addressed.

### Feature or Bugfix
- Refactoring

### Detail
This PR contains the following changes:

1. Modularization + refactoring of notebooks. There are new modules that will play a major role in the future refactoring:
* Core = the code needed for the application to operate correctly
* Common = common code for all modules
* Modules = the plugins/features that can be inserted into the system (at the moment only notebooks)

The other part related to modularization is the creation of environment parameters. Environment parameters will replace all hardcoded parameters of the environment configuration. There is a new file, config.json, that holds the application configuration. All existing parameters will be migrated via a DB migration in AWS.

2. Extracting permissions and request context (optional for the modularization). Engine, user, and user groups had been passed as parameters of the request context. This forced passing a lot of parameters to methods that didn't even need them. This information should be scoped to the request session. There is a new way to retrieve it using `RequestContext` (illustrated in the sketch after this description), and a new way to run permission checks that requires fewer parameters and makes the code cleaner. The old way was marked as deprecated.

3. Restructuring of the code (optional for the modularization). Since the modularization will touch every place in the API code, it is a good opportunity to set a new code structure. There is a small reorganization in the notebooks module to address:
* allocating resources before validating parameters
* unclear class responsibilities
* mixed layers

The new structure is:
- resolvers = validate and pass requests to the service layer
- service layer = business logic
- repositories = database logic (all queries should be placed here)
- aws = wrapper clients on top of boto3
- cdk = all logic related to creating stacks, and code for ECS
- tasks = code that will be executed in AWS Lambda (short-lived tasks)

All names can be changed.

### Relates
[data-dot-all#295](data-dot-all#295)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

---------

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: dlpzx <[email protected]>
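As an illustration of the request-context pattern described in point 2, here is a minimal sketch; the class and function names are hypothetical and do not match the actual data.all API.

```python
# Hypothetical sketch of the request-context pattern; names are
# illustrative, not the actual data.all API.
import threading
from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class RequestContext:
    db_engine: Any          # e.g. a SQLAlchemy engine
    username: str
    groups: List[str] = field(default_factory=list)

_request_storage = threading.local()

def set_context(context: RequestContext) -> None:
    # Called once per request, e.g. in the GraphQL entry point.
    _request_storage.context = context

def get_context() -> RequestContext:
    return _request_storage.context

# Service code can now read session-scoped data instead of receiving
# engine/username/groups as explicit parameters on every call.
def list_notebooks() -> None:
    context = get_context()
    print(f"Listing notebooks for {context.username} in groups {context.groups}")

set_context(RequestContext(db_engine=None, username="alice", groups=["scientists"]))
list_notebooks()
```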
### Feature or Bugfix
- Refactoring

### Detail
For the development dockerfile of the frontend, instead of starting from the Amazon Linux image and installing everything manually, it is a lot simpler to start from the node image (also pulled from ECR for security), which has the whole node-related environment pre-installed, and focus the dockerfile on project-related configuration. This PR does not introduce any change in logic; it only updates the development dockerfile and adds a `.dockerignore` file for the frontend to ignore the `node_modules` and `build` folders.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Refactoring

### Detail
This PR does two things:
- Makes the styling consistent across the project
- Removes dead code (unused variables, functions, and imports)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Refactoring

### Detail
My recent PR data-dot-all#394 has a lot of changes, but all of them touch styles only and not the code itself, so the blame history for all the changed files is lost (it will always blame me). This PR adds the file `.git-blame-ignore-revs`, which contains the id of the styling commit, and the command that you need to run once to fix the blame history for you. GitHub also understands this file and will show the history correctly. This PR is based on [this article](https://www.stefanjudis.com/today-i-learned/how-to-exclude-commits-from-git-blame/) and was suggested by Raul. The file works for the whole repository; in the future one only needs to add the ids of styling commits and the history will be preserved correctly.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Refactoring (Modularization)

### Relates
- Related issues data-dot-all#295 and data-dot-all#412

### Short Summary
First part of the migration of `Dataset` (`DatasetTableColumn`). TL;DR :)

### Long Summary
Datasets are huge. It's one of the central modules, spread everywhere across the application. Migrating the entire Dataset piece would be a very difficult task and, more importantly, even more difficult to review. Therefore, I decided to break this work down into "small" steps to make it more convenient to review.

Dataset's API consists of the following items:
* `Dataset`
* `DatasetTable`
* `DatasetTableColumn`
* `DatasetLocation`
* `DatasetProfiling`

This PR only creates the `Dataset` module and migrates `DatasetTableColumn` (and some pieces related to it). Why? Because the plan was to migrate it, see what issues came up along the way, and address them here. The refactoring of `DatasetTableColumn` itself will be in another PR.

The issues:
1) Glossaries
2) Feed
3) Long-running tasks for datasets
4) Redshift

Glossaries rely on a GraphQL UNION of different types (including datasets). I created an abstraction for glossary registration (a sketch follows below). There was an idea to change the frontend instead, but that would require a lot of work.

Feed: same as glossaries, solved in a similar way. For the feed, changing the frontend API is more feasible, but I wanted it to be consistent with glossaries.

Long-running tasks for datasets: they were migrated into the tasks folder and don't require dedicated loading of their code (at least for now). But there are two concerns: 1) the deployment uses direct module folder references to run them (e.g. `dataall.modules.datasets....`), so when a module is deactivated, these tasks should not be deployed either. I left a TODO to address this in the future (when we migrate all modules), but we should bear in mind that it might lead to inconsistencies. 2) There is a reference to `redshift` from the long-running tasks; this should be addressed in the `redshift` module.

Redshift: it has some references to `datasets`, so there will be either dependencies among modules or a small amount of code duplication (if `redshift` doesn't rely heavily on `datasets`). This will be addressed in the `redshift` module.

Other changes:
- Fixed and improved some tests
- Extracted the glue handler code related to `DatasetTableColumn`
- Renamed the import mode from tasks to handlers for the async lambda
- A few hacks that will go away with the next refactoring :)

Next steps:
- [Part 2](nikpodsh#1) in preview :)
- Extract the rest of the dataset functionality (perhaps in a few steps)
- Refactor the extractor modules the same way as notebooks
- Extract tests to follow the same structure

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
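The glossary registration abstraction could look roughly like this minimal sketch; all names here are hypothetical, and the real data.all interfaces differ.

```python
# Minimal sketch of a glossary registration abstraction; names are
# hypothetical and do not match the real data.all code.
from dataclasses import dataclass
from typing import Dict, Optional, Type

@dataclass
class GlossaryDefinition:
    """Describes how one object type (e.g. Dataset) plugs into glossaries."""
    target_type: str   # name used in the GraphQL UNION
    model: Type        # ORM model that glossary terms can link to

class GlossaryRegistry:
    """Modules register their types; core code resolves the UNION from here."""
    _definitions: Dict[str, GlossaryDefinition] = {}

    @classmethod
    def register(cls, definition: GlossaryDefinition) -> None:
        cls._definitions[definition.target_type] = definition

    @classmethod
    def find(cls, target_type: str) -> Optional[GlossaryDefinition]:
        return cls._definitions.get(target_type)

# A module registers itself at load time, so the core never imports it:
class Dataset:  # stand-in for the real ORM model
    pass

GlossaryRegistry.register(GlossaryDefinition(target_type="Dataset", model=Dataset))
```

The same registry pattern then covers the Feed case, which keeps the two consistent as the PR describes.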
### Feature or Bugfix
- Refactoring

### Detail
Refactoring of DatasetProfilingRun

### Relates
- data-dot-all#295 and data-dot-all#412

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Refactoring

### Detail
Refactoring of the third part of datasets: `DatasetStorageLocation`.
Introduced the Indexers: this code was migrated from `indexers.py` and put into modules.
Removed unused alarms (which didn't call the actual alarm code).
Introduced `DatasetShareService`, but it seems it will be migrated to the share module. All `DatasetXXXServices` will be split into Services (business logic) and Repositories (DAO layer) in future parts.

### Relates
- data-dot-all#412 and data-dot-all#295

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Refactoring

### Detail
Refactoring of DatasetTable: got rid of establishing an ElasticSearch connection for every request and created a lazy way to establish the connection (sketched below).

### Relates
data-dot-all#412 and data-dot-all#295

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

---------

Co-authored-by: dbalintx <[email protected]>
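A lazy connection along those lines might look like this sketch; the opensearch-py client library and the `ES_ENDPOINT` environment variable are assumptions for illustration.

```python
# Sketch of a lazily-established search connection: the client is created
# on first use, not per request. The opensearch-py dependency and the
# ES_ENDPOINT variable are assumptions, not data.all's actual wiring.
import os

class LazyEsConnection:
    def __init__(self) -> None:
        self._client = None

    @property
    def client(self):
        if self._client is None:
            # Import and connect only when search is actually needed.
            from opensearchpy import OpenSearch
            self._client = OpenSearch(hosts=[os.environ["ES_ENDPOINT"]])
        return self._client

connection = LazyEsConnection()

def search(index: str, query: dict):
    # Requests that never touch search never pay the connection cost.
    return connection.client.search(index=index, body=query)
```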
### Feature or Bugfix
- Refactoring

### Detail
Refactoring of the Dataset entity and the code related to it.
Refactoring for Votes.
Introduced DataPolicy (used the same way ServicePolicy was used).
Extracted dataset-related permissions.
Used the new `has_tenant_permission` instead of `has_tenant_perm`, which avoids passing unused parameters.

### Relates
data-dot-all#412 and data-dot-all#295

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Refactoring

### Detail
Creation of the mechanism to import dependencies between modules. It should make it easier to extract code shared between modules: one module can now refer to another module in order to import it. Deleted the `common` folder, since it has been replaced by this mechanism. Added checks and logging for modules that are unintentionally imported. A hypothetical sketch of such a loader follows.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
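Conceptually, the dependency resolution might work like the sketch below. The dependency table and loader are hypothetical; only the `dataall.modules.<name>` path convention comes from the PRs above.

```python
# Hypothetical sketch of resolving dependencies between modules before
# importing them; not the actual data.all implementation.
import importlib
import logging

log = logging.getLogger(__name__)

# A module can declare that it needs another module's code (illustrative).
MODULE_DEPENDENCIES = {
    "datasets": ["catalog"],
    "catalog": [],
    "worksheets": [],
}

def load_module(name: str, loaded: set) -> None:
    if name in loaded:
        return
    for dependency in MODULE_DEPENDENCIES.get(name, []):
        log.info("Module %s depends on %s, loading it first", name, dependency)
        load_module(dependency, loaded)
    # Log every import so unintentional module loads are visible.
    log.info("Importing module %s", name)
    importlib.import_module(f"dataall.modules.{name}")
    loaded.add(name)
```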
Modularization of Worksheets
- Moved Worksheet-related code to its own new module
- Merged the AthenaQueryResult object into Worksheets
- Moved Worksheet-related permissions to the new module
- Removed worksheet-sharing related (unused) code from the entire repo

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
…ata-dot-all#463) Merge of main (v1.5.2) -> modularization main By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: dlpzx <[email protected]> Co-authored-by: wolanlu <[email protected]> Co-authored-by: Amr Saber <[email protected]> Co-authored-by: Noah Paige <[email protected]> Co-authored-by: kukushking <[email protected]> Co-authored-by: Dariusz Osiennik <[email protected]> Co-authored-by: Dennis Goldner <[email protected]> Co-authored-by: Abdulrahman Kaitoua <[email protected]> Co-authored-by: akaitoua-sa <[email protected]> Co-authored-by: Gezim Musliaj <[email protected]> Co-authored-by: Rick Bernotas <[email protected]> Co-authored-by: David Mutune Kimengu <[email protected]>
### Feature or Bugfix
- Refactoring

### Detail
First part of the modularization of ML Studio:
- Created `SageMakerStudioService` and moved the `resolvers` business logic into the service
- Redefined `resolvers` and added request validators
- Redefined `db` operations and added them to `SageMakerRepository`
- Created `sagemaker_studio_client` with SageMaker Studio SDK calls
- Created `ec2_client` with EC2 SDK calls
- Added the CloudFormation stack in the `cdk` package of the module
- Added the CloudFormation stack as an environment extension in the `cdk` package of the module
- Modified the environment creation and edit API calls and views to read `mlStudiosEnabled` from the environment parameters table
- Fixed tests and added migration scripts

Additional:
- Standardized the naming of functions and objects to "sagemaker_studio_user" to avoid legacy "Notebook" references
- Removed the unused API definition for `getSagemakerStudioUserApps`
- Cleaned up unused frontend views and API methods
- Split the migration scripts for environments, worksheets and mlstudio
- Linting of the Worksheets module
- prettier on the frontend

### Relates
data-dot-all#448

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: wolanlu <[email protected]> Co-authored-by: Amr Saber <[email protected]> Co-authored-by: Noah Paige <[email protected]> Co-authored-by: kukushking <[email protected]> Co-authored-by: Dariusz Osiennik <[email protected]> Co-authored-by: Dennis Goldner <[email protected]> Co-authored-by: Nikita Podshivalov <[email protected]> Co-authored-by: Abdulrahman Kaitoua <[email protected]> Co-authored-by: akaitoua-sa <[email protected]> Co-authored-by: Gezim Musliaj <[email protected]> Co-authored-by: Rick Bernotas <[email protected]> Co-authored-by: David Mutune Kimengu <[email protected]> Co-authored-by: nikpodsh <[email protected]> Co-authored-by: dbalintx <[email protected]>
### Feature or Bugfix
- Feature

### Detail
- For ML Studio and Environments, the original procedure of synthesizing the template and testing the resulting JSON has been replaced by the `cdk assertions` library recommended in the CDK [documentation](https://docs.aws.amazon.com/cdk/v2/guide/testing.html#testing_getting_started).
- In the Environment cdk testing, the `extent` method of the `EnvironmentStackExtension` subclasses has been mocked. Now it only tests the `EnvironmentSetup` stack as if no extensions were registered.
- In the MLStudio cdk testing, this PR adds a test for the `SageMakerDomainExtension`, mocking the environment stack. It tests the `SageMakerDomainExtension` as a standalone.

Open questions:
1) The rest of the cdk stacks (notebooks, pipelines, datasets...) are tested using the old method of printing the JSON template. Should I go ahead and migrate them to the `cdk assertions` library? If so, I thought of doing it for notebooks only and syncing on the testing for datasets and pipelines with @dbalintx and @nikpodsh.
2) With the `cdk assertions` library we can test resource properties in more depth. I added some tests on the environment stack, but we could add more asserts in all stacks. I will add more tests on the MLStudio stack, and I made a note in the GitHub project to review this once modularization is complete. An example of this testing style follows.

### Relates
- data-dot-all#295

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
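For reference, a test in the `cdk assertions` style looks like this. The assertions API is the real aws-cdk-lib v2 one, but `build_mlstudio_stack` is a hypothetical placeholder standing in for the module's actual stack construction.

```python
# Example of the `cdk assertions` testing style from aws-cdk-lib v2.
import aws_cdk as cdk
from aws_cdk import assertions

def test_sagemaker_domain_created():
    app = cdk.App()
    stack = build_mlstudio_stack(app)  # hypothetical stack factory
    template = assertions.Template.from_stack(stack)

    # Assert on resource counts and properties instead of diffing raw JSON.
    template.resource_count_is("AWS::SageMaker::Domain", 1)
    template.has_resource_properties(
        "AWS::SageMaker::Domain",
        {"AuthMode": "IAM"},
    )
```

Compared with diffing a printed template, these assertions survive unrelated changes to the stack and document exactly which properties the test cares about.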
…that are inactive (data-dot-all#522) Enabled the feature to turn off tests whenever a module is inactive.
Modularization of the sharing.

### Detail
Migrated the sharing part (including tasks).
Removed unused methods for datasets.
There are a few issues that need to be addressed before merging this PR, but it can be reviewed, since we don't expect major changes here.

### Related
data-dot-all#295

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dlpzx <[email protected]> Co-authored-by: kukushking <[email protected]> Co-authored-by: Dariusz Osiennik <[email protected]> Co-authored-by: Noah Paige <[email protected]> Co-authored-by: Dennis Goldner <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Abdulrahman Kaitoua <[email protected]> Co-authored-by: akaitoua-sa <[email protected]> Co-authored-by: Gezim Musliaj <[email protected]> Co-authored-by: Rick Bernotas <[email protected]> Co-authored-by: David Mutune Kimengu <[email protected]> Co-authored-by: dbalintx <[email protected]>
Modularization of data pipelines

Changes:
- Relevant files moved from dataall/Objects/api and from dataall/db to the newly created module
- Relevant permissions extracted to the newly created module and used with the new decorators
- Functions interacting with the DB moved to a repository
- Extracted and moved Datapipelines-related code from the core CDK files to the new module: dataall/cdkproxy/cdk_cli_wrapper.py, dataall/cdkproxy/stacks/pipeline.py, dataall/cdkproxy/cdkpipeline/cdk_pipeline.py, service policies
- Extracted and moved Datapipelines-related code from the core AWS handlers to the new module: dataall/aws/handlers/stepfunction.py, dataall/aws/handlers/codecommit.py, dataall/aws/handlers/codepipeline.py, dataall/aws/handlers/glue.py
- Added the module interface for the module
- Updated unit tests

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Refactoring

### Detail
Refactoring of the dashboards

### Relates
data-dot-all#509

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

- [x] TODO: There was a huge merge conflict. I am testing the changes after the merge.

---------

Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Abdulrahman Kaitoua <[email protected]> Co-authored-by: akaitoua-sa <[email protected]> Co-authored-by: dlpzx <[email protected]> Co-authored-by: Gezim Musliaj <[email protected]> Co-authored-by: kukushking <[email protected]> Co-authored-by: Rick Bernotas <[email protected]> Co-authored-by: David Mutune Kimengu <[email protected]> Co-authored-by: dbalintx <[email protected]>
Removing all Redshift-related code from the backend, frontend and docs. Cleaned up the dataall/core folder structure.

### Feature or Bugfix
- Refactoring

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
Added missed permission checks. This PR depends on data-dot-all#537 and should be reviewed after the dashboard modularization.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Abdulrahman Kaitoua <[email protected]> Co-authored-by: akaitoua-sa <[email protected]> Co-authored-by: dlpzx <[email protected]> Co-authored-by: Gezim Musliaj <[email protected]> Co-authored-by: kukushking <[email protected]> Co-authored-by: Rick Bernotas <[email protected]> Co-authored-by: David Mutune Kimengu <[email protected]> Co-authored-by: dbalintx <[email protected]>
…o modularization/omics-module
First and foremost - this is a great contribution from @ironspur5 and team! All of your work and collaboration is super appreciated, and I think this is an amazing starting point for health science applications and data consumption patterns in data.all :) I left some comments throughout the PR - most are small changes or touch-ups, but let me know if there are any questions or disagreements. Additionally, some high-level notes I took:
Enhancement Ideas (can be picked up after this PR)
Tests I performed:
…zation/omics-module # Conflicts: # config.json # frontend/yarn.lock
…cs and worksheets
A final suggestion was left on the omics_workflow_fetcher task. And one last thought - should we restrict omics runs to follow a naming pattern using the environment resource prefix?
If we do not implement a naming convention like the above, we may want to remove the resource prefix restriction on the env group role.
I am not 100% certain, but I also don't think the self.env_resource_prefix is actually doing anything. Run ARNs don't include any naming/prefix in their format, just the Run ID. I think I put it in there to match what the other modules looked like. But the tag key condition to prevent unauthorized run access is really what I intended.
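For context, a tag-key condition of that kind could be sketched as follows. The `team` tag key and the action list are assumptions for illustration, not the exact policy the module generates.

```python
# Sketch of an IAM statement restricting omics run access by resource tag.
# The "team" tag key and the actions shown are assumptions; the actual
# policy generated by data.all may differ.
from aws_cdk import aws_iam as iam

def omics_run_access_statement(team_tag_value: str) -> iam.PolicyStatement:
    return iam.PolicyStatement(
        actions=["omics:GetRun", "omics:CancelRun", "omics:DeleteRun"],
        # Run ARNs contain only the Run ID, so a name-prefix resource
        # pattern cannot narrow access; the tag condition does the work.
        resources=["*"],
        conditions={"StringEquals": {"aws:ResourceTag/team": team_tag_value}},
    )
```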
Thanks again for all your hard work here @ironspur5 @dlpzx + team.
Final changes look good - approving!
Great work on this initiative team - the collaboration and coordination across the teams, all focused on delivering the best possible experience for our customers, has been awesome. Let's continue this collaboration for other initiatives as well.
Thank you for all your support in getting this added, team!!
Feature or Bugfix
- Feature
Background
Currently, data.all has integrations to AWS services such as SageMaker, Athena, and QuickSight through the “Play” modules/tools Notebooks, Worksheets, and Dashboards, respectively. This is valuable and convenient for end users who want to process and share their data but may not want to or know how to use these services in the AWS Console.
Researchers, scientists, and bioinformaticians often fall into this end user category. One such AWS service that is popular amongst the research community is AWS HealthOmics. HealthOmics helps users process and generate insights from genomics and other biological data stored in the cloud. Raw genomic data can be sent to a HealthOmics managed workflow (aka Ready2Run workflow) that can perform various tasks such as quality control, read alignment, transcript assembly, and gene expression quantification. The output can then be stored in a manageable format for querying/visualizing/sharing.
AWS HealthOmics Integration
This feature contains both modularized backend and frontend changes for adding HealthOmics as a “Play” module/tool to data.all. It specifically adds the capability to view and instantiate HealthOmics Ready2Run workflows as runs that can output and save omic data as data.all Datasets.
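At the AWS API level, viewing the available Ready2Run workflows boils down to a call like the one below. This is a plain boto3 sketch, not the data.all service-layer code, and the region is only an example.

```python
# Plain boto3 sketch of listing Ready2Run workflows; data.all wraps
# calls like this in its own client layer.
import boto3

omics = boto3.client("omics", region_name="us-east-1")  # example region

# type="READY2RUN" filters to the AWS-managed workflows.
paginator = omics.get_paginator("list_workflows")
for page in paginator.paginate(type="READY2RUN"):
    for workflow in page["items"]:
        print(workflow["id"], workflow["name"], workflow["status"])
```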
Consumption Patterns
Considerations
User Journey
This example user journey depicts an end-to-end process from viewing available HealthOmics Ready2Run workflows to instantiating a run and viewing its output in a data.all Worksheet.
* Workflow ID: Immutable ID of the Ready2Run workflow
* Run Name: Customizable name of the run the user will submit
* Environment: data.all Environment AWS Account where the HealthOmics run will be executed (NOTE: the Environment must be in an AWS Region supported by HealthOmics, e.g. N. Virginia or London)
* Region: Pre-populated from the Environment; the immutable Region where the run will execute
* Owners: data.all group who owns the run
* Select S3 Output Destination: data.all Dataset where the output omics data will reside (NOTE: please create this prior to kicking off a run)
* Run Parameters: JSON parameter input in the format expected by the Ready2Run workflow. It will be pre-populated with the correct fields, and users paste their data into the appropriate fields - for example, the raw input data in S3 that will be processed in the run. (NOTE: the input data does not have to be in a data.all Dataset, as long as it is accessible. For example, raw genomic data may be hosted publicly on the AWS Registry of Open Data, and the S3 URI can be provided in a field here.) An API-level sketch of submitting such a run follows this list.
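Submitting the form above roughly corresponds to a boto3 `start_run` call like this. Every value below is a placeholder; data.all's backend starts the run with the module's own role, naming, and tagging.

```python
# Rough boto3 equivalent of submitting the run-creation form; all
# values are placeholders, not data.all's actual configuration.
import boto3

omics = boto3.client("omics", region_name="us-east-1")

response = omics.start_run(
    workflowId="1234567",                # Workflow ID (immutable, placeholder)
    workflowType="READY2RUN",
    name="my-first-omics-run",           # Run Name
    roleArn="arn:aws:iam::111122223333:role/omics-run-role",   # placeholder role
    outputUri="s3://my-dataall-dataset-bucket/omics-output/",  # S3 Output Destination
    parameters={                         # Run Parameters expected by the workflow
        "input_fastq": "s3://example-open-data-bucket/sample.fastq.gz"
    },
)
print(response["id"], response["status"])
```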
History:
Data Consumption:
Relates
Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on OWASP 10.
- …fetching data from storage outside the application (e.g. a database, an S3 bucket)? Yes
- Is `eval` or similar functions used? N/A

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.