Shreya's Notes InnerEye (Issue 148)

shreyasingh1 edited this page Mar 11, 2021 · 5 revisions

1/31/21 - Met with Alisha and Jacopo to discuss Alisha's progress with the issue this past fall

Based on her progress, Jacopo and I were able to plan some immediate next steps:

  • We need to gain access to a JHU Azure account
  • We need to run through all the above setup steps and be able to follow the sample classification and sample segmentation tutorials
  • Then, we will manually upload the raw benchmarking data to Azure and attempt to run the Lung Segmentation or Glaucoma Classification models on that data.
  • Note that this will involve converting from .tif format (our data) to NIFTI format (what InnerEye seems compatible with). We could use an approach like the one linked here.
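
The .tif to NIFTI conversion step could look roughly like the sketch below. It is not the approach from the linked page. It assumes the `tifffile` and `nibabel` packages are installed, that each .tif holds a single 3D stack ordered (z, y, x), and that a simple diagonal affine is acceptable. The function names and the output naming scheme are hypothetical.

```python
from pathlib import Path


def nifti_path_for(tif_path, out_dir):
    """Hypothetical naming scheme: data/vol.tif -> <out_dir>/vol.nii.gz."""
    return Path(out_dir) / (Path(tif_path).stem + ".nii.gz")


def tif_to_nifti(tif_path, out_dir, voxel_size=(1.0, 1.0, 1.0)):
    """Convert one 3D .tif stack to a compressed NIFTI file and return its path."""
    # Third-party packages (assumed installed), imported here so the
    # path helper above stays usable without them.
    import numpy as np
    import nibabel as nib
    import tifffile

    volume = tifffile.imread(str(tif_path))
    if volume.ndim != 3:
        raise ValueError(f"expected a 3D stack, got shape {volume.shape}")

    # Assumption: the stack is ordered (z, y, x); NIFTI expects (x, y, z).
    volume = np.transpose(volume, (2, 1, 0))

    # Assumption: a diagonal affine built from the voxel size is enough;
    # real data may need a proper origin and orientation.
    affine = np.diag([*voxel_size, 1.0])

    out_path = nifti_path_for(tif_path, out_dir)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    nib.save(nib.Nifti1Image(volume, affine), str(out_path))
    return out_path
```

Whether InnerEye accepts the resulting orientation and data type would still need to be checked against one of its sample datasets.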

For our longer term goals:

  • We want to be able to use Azure computation credits without storing the data on Azure. We need to write a script that can port data over from AWS and feed it into InnerEye.
  • If segmentation/classification performance on the brainlit benchmarking data is not great because it is a difficult transfer learning task, we can test InnerEye on other potential datasets. This could include brain1/2/3 and data that isn't MouseLight data.
  • Lastly, if manually uploading the benchmarking data and/or porting the data from AWS does not seem feasible, we can shift gears and work on making the CloudVolume package compatible with Azure, as detailed by this issue. We could then use CloudVolume to preprocess and upload the MouseLight data directly to Azure and run InnerEye from there.

2/9/21 - Pro/Con list for 3 data methods

Method 1: Copying/mirroring data from S3 to Azure using rclone

Pros:

  • Suggested by WeiWei Yang from MSR - it seems she got something similar to work successfully
  • Wouldn't require moving the data from S3

Cons:

  • Data needs to be converted from .tif to NIFTI
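
As a rough sketch of Method 1, rclone could be driven from a small Python wrapper. This assumes rclone is installed and that two remotes have already been set up with `rclone config`. The remote names and paths in the usage lines are placeholders, not our real bucket or container. `rclone copy` is used rather than `rclone sync` so that nothing at the destination is deleted.

```python
import subprocess


def build_rclone_copy(src_remote, src_path, dst_remote, dst_path, dry_run=True):
    """Build an `rclone copy` command from one configured remote to another."""
    cmd = [
        "rclone", "copy",
        f"{src_remote}:{src_path}",
        f"{dst_remote}:{dst_path}",
        "--progress",
    ]
    if dry_run:
        # Only report what would be transferred; copy nothing.
        cmd.append("--dry-run")
    return cmd


def mirror(src_remote, src_path, dst_remote, dst_path, dry_run=True):
    """Run the copy; raises CalledProcessError if rclone exits non-zero."""
    cmd = build_rclone_copy(src_remote, src_path, dst_remote, dst_path, dry_run)
    return subprocess.run(cmd, check=True)


# Placeholder remote names and paths, for illustration only.
print(" ".join(build_rclone_copy("s3remote", "my-bucket/benchmarking",
                                 "azureremote", "my-container/benchmarking")))
```

Starting with the dry run makes it easy to confirm what would be transferred before any Azure storage is used.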

Method 2: Running scripts on Azure that reference the data in S3

Pros:

  • Wouldn't require moving the data from S3

Cons:

  • It's unclear what steps to take to achieve this
  • Data needs to be converted from .tif to NIFTI
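
One possible starting point for Method 2 is to have the script running on Azure read each file from S3 into memory instead of copying the dataset. The sketch below assumes `boto3` and `tifffile` are installed on the Azure machine and that AWS credentials are available to boto3 (for example through environment variables). The helper names are hypothetical, and this does not address how InnerEye itself would consume the data.

```python
import io
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key.tif' into ('bucket', 'some/key.tif')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def read_tif_from_s3(uri):
    """Download one .tif object into memory and return it as an array."""
    # Third-party packages (assumed installed), imported lazily.
    import boto3
    import tifffile

    bucket, key = split_s3_uri(uri)
    response = boto3.client("s3").get_object(Bucket=bucket, Key=key)
    data = response["Body"].read()
    # tifffile can read from a binary file-like object.
    return tifffile.imread(io.BytesIO(data))
```

Reading whole objects into memory is only practical for files that fit in RAM; larger volumes would need chunked access.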

Method 3: Directly uploading data to Azure

Pros:

  • InnerEye models are more likely to work if data is stored directly on Azure

Cons:

  • Need to PR Azure-compatibility into CloudVolume based on this issue
  • Data needs to be converted from .tif to NIFTI

3/9/21 - Helpful Links for Working with Azure

How to create a Linux VM in Azure and run local scripts on it with limited overhead. HERE

How to create and run experiments in the Azure ML environment. More seamless but with more overhead. HERE

Using AzureML Compute Clusters through local script. HERE