Setting up an AWS instance with the CBW AMI

Robyn Wright edited this page Jul 7, 2024 · 6 revisions

This page has instructions for setting up an Amazon Web Services instance using the Amazon Machine Image created for the Canadian Bioinformatics Workshops (CBW) in 2024. You can see more information on these workshops on the CBW general pages for the Beginner and Advanced Microbiome Analysis workshops as well as the CBW tutorial pages for the Beginner and Advanced Microbiome Analysis workshops.

To start this, you will need an AWS account.

Please note before starting that using the AWS instances costs money. You can see the associated costs with different types of instances here (note that we are suggesting you use the t2.xlarge instance for this tutorial).

It is therefore really important that you Stop or Terminate the instance when you aren't using it.

Launching an instance

Now, we will launch the image (these instructions are modified from here):

  1. Go to the EC2 console
  2. Choose EC2 Dashboard and then click "Launch Instance".
  3. Under Names and tags, for Name, enter a name for your instance, e.g. "CBW-2024-AMI" (note that it doesn't matter what this is as long as you can identify it)
  4. In Application and OS Images (Amazon Machine Image), search "CBW_MIC_240503". Click on "Select" when this comes up. (If you are in the ICG workshop, search for and choose "CBW-ICG-2024-AMI").
  5. In the instance type select "t2.xlarge" - this is the type of instance that we used in the workshop and has 4 vCPU and 16GB memory.
  6. Under Key pair (login), click on "Create new key pair".
  7. Add a name for your key pair (I went with "cbw-2024-ami").
  8. Keep the type as RSA and choose .pem or .ppk depending on where you plan to use this - typically, Mac users will use .pem and Windows users will use .ppk.
  9. Click "Create key pair". This should automatically download the created key pair file - move it to the location on your computer where you want to keep it.
  10. If you will use Mac/Linux, open the Terminal, navigate to the location of your key pair file and enter the following command: chmod 400 cbw-2024-ami.pem (note: if you gave your key file a different name, you'll need to modify it in the command)
  11. Now you can choose your security group (see the instructions below on how to create one). If you just choose to create one and leave the options as default, this will allow access from anywhere.
  12. Now go to Configure storage. You will need to add enough storage to hold the CourseData. For BMB, you should be fine with the default 100GB. As this storage costs money, I recommend first launching an instance for BMB with lower storage, and then launching a different one with more storage for AMB. Keep gp3 for both of these.
  13. Now click "Launch instance".
  14. Choose View all instances to check the status of your instance. It may take a few minutes for the instance to launch. Please get in touch if you are unsure.
  15. When the instance has launched, the status will change from "Pending" to "Running" and the Status check will change to "2/2 checks passed". This may take a few minutes and you may need to refresh the page. Now you can log in to the instance (see below).
  16. If you want to use RStudio, see the section "Setting up a security group for RStudio server" below - you can do this while your instance starts up.
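
Step 10 restricts the key file so that only you can read it; ssh will typically refuse to use a private key that other users can read. The following is a minimal sketch of that step with a check of the result. The function name secure_key is hypothetical and not part of the workshop materials.

```shell
# Restrict a private key to owner-read-only and show the resulting mode.
# secure_key is a hypothetical helper name; pass it the path to your own key.
secure_key() {
    key_file="$1"
    if [ ! -f "$key_file" ]; then
        echo "key file not found: $key_file" >&2
        return 1
    fi
    chmod 400 "$key_file" || return 1
    # The first 10 characters of `ls -l` are the file type and permission bits.
    ls -l "$key_file" | cut -c1-10
}

# Example (assumes the key name used in this tutorial):
#   secure_key cbw-2024-ami.pem
```

If the command prints anything other than -r--------, the permissions were not applied as intended and ssh may reject the key.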

Creating a security group

  1. Open the EC2 console.
  2. In the navigation pane, choose Network & Security > Security Groups.
  3. Choose "Create security group" (top right orange button).
  4. Enter a name for the security group. Amazon recommends that you add the region that you will use to this name. For example, I have been using the US East region and named my group the following: "RW_SG_useast1-cbw_2024-ami". Also add a description to this, e.g. "allow me to access with SSH"
  5. Select the default VPC.
  6. You now have to add some inbound rules so that you will be able to access an instance that you create with this security group. There is guidance on the suggested rules for different use cases here, but I used the following:
    1. Choose RDP from the drop down list. In source, choose "My IP". Click add rule.
    2. Choose SSH from the drop down list. In source click "My IP" - this should automatically add your IP address. Click "Create security group".

Setting up a security group for RStudio server

If you want to use RStudio Server, you will need to allow access on some ports. First, look in "Security" within your instance details and find out what the name of your security group is (it will probably be something like "launch-wizard-1" unless you set up your own security group). Then, on the left-hand panel, go to "Network & Security" > "Security Groups".

  1. Find this security group in Security Groups. You can keep the outbound rules as they are, but we'll need to modify the inbound rules, so click on "Inbound rules" > "Edit inbound rules".
  2. Click "Add rule" at the bottom, and add four rules with these settings:
    1. Custom TCP, Port range 8000-9000, Source Anywhere-IPv4.
    2. Custom TCP, Port range 8787, Source Anywhere-IPv4.
    3. HTTP, Port range 80, Source Anywhere-IPv4.
    4. HTTPS, Port range 443, Source Anywhere-IPv4.

Click "Save rules" at the bottom of the page.
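
For reference, the four inbound rules could also be expressed as AWS CLI calls. This is only a sketch under assumptions: it presumes the AWS CLI is installed and configured, which this page does not cover, and the security group ID shown is a made-up placeholder. The function only prints the commands and sends nothing to AWS. "Anywhere-IPv4" corresponds to the CIDR range 0.0.0.0/0.

```shell
# Print the AWS CLI calls that would add the four inbound rules listed above.
# print_rstudio_rules is a hypothetical helper; it prints commands only.
print_rstudio_rules() {
    sg_id="$1"
    # 8000-9000 and 8787 are the Custom TCP rules; 80 is HTTP and 443 is HTTPS.
    for port in 8000-9000 8787 80 443; do
        echo "aws ec2 authorize-security-group-ingress --group-id $sg_id --protocol tcp --port $port --cidr 0.0.0.0/0"
    done
}

# sg-0123456789abcdef0 is a placeholder, not a real security group.
print_rstudio_rules sg-0123456789abcdef0
```

Printing the commands first makes it easy to review them before running anything, since each rule opens a port to any IPv4 address.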

Logging into the instance

Assuming that you have followed the above instructions to launch an instance, now you can log into it. Note that you'll need to check in the Instances page to see that it is ready! This section is modified from the Connect to your Linux instance AWS instructions. We will use the ssh instructions here, but if you wish to do this a different way, please follow the AWS instructions at the link.

Mac

  1. Open a new terminal window.
  2. Log in with the following information:
ssh -i /path/key-pair-name.pem ubuntu@instance-public-dns-name

Your /path/key-pair-name.pem should be the location of your key file and instance-public-dns-name can be found under Public IPv4 DNS in the Instance details. For me, this command looked like this:

ssh -i cbw-2024-ami.pem [email protected]
  3. You may need to type "yes" to say that you want to continue connecting, and then you should see that you have been logged into the instance.
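
If you log in often, an entry in ~/.ssh/config can save retyping the full command, much like the saved session used with PuTTY on Windows. The sketch below prints such an entry; the alias cbw-ami and the function name ssh_config_entry are hypothetical, and the host name and key path are the same placeholders used in the command above.

```shell
# Print an ~/.ssh/config entry for the instance.
# ssh_config_entry is a hypothetical helper; it only prints text.
ssh_config_entry() {
    alias_name="$1"
    host="$2"
    key_file="$3"
    printf 'Host %s\n' "$alias_name"
    printf '    HostName %s\n' "$host"
    printf '    User ubuntu\n'
    printf '    IdentityFile %s\n' "$key_file"
}

# Replace the placeholders with your own Public IPv4 DNS and key location.
ssh_config_entry cbw-ami instance-public-dns-name /path/key-pair-name.pem
```

After appending the output to ~/.ssh/config, you should be able to connect with ssh cbw-ami. Note that the Public IPv4 DNS usually changes when an instance is stopped and started again, so the HostName line would need updating in that case.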

Windows

For Windows, we’ll be logging in with PuTTY. Open PuTTY, go to the “Session” category and fill in the “Host Name (or IP address)” field with:

ubuntu@instance-public-dns-name

instance-public-dns-name can be found under Public IPv4 DNS in the Instance details. For me, this looked like [email protected].

Now, in the “Connection” category, click the + next to “SSH”. Click the + next to “Auth” and then click “Credentials”. In the “Private key file for authentication” field, click browse and find your .ppk file.

Now, click on the “Session” category again. In the “Saved sessions” field, type “Amazon node” and click Save.

Now that PuTTY is set up, all you have to do is start PuTTY and double-click on “Amazon node” to log in. Now you should see something like (base) ubuntu@ip-10-0-1-199:~$ to the left of your cursor. Great, you’re connected!

Stopping or terminating the instance

  1. View your active instances.
  2. Select the instance that you have been using and click "Instance state".
  3. Choose "Stop instance" or "Terminate instance" and click "Stop" or "Terminate", respectively.

When you choose to stop the instance, you can restart this at a later time. Note that you will still pay for any storage used, but you won't pay for the use of the instance itself. If you terminate the instance, you will need to go through the same steps as above to make a new instance if you want to come back to this again.
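
The same two actions exist in the AWS CLI as stop-instances and terminate-instances. The sketch below assumes the AWS CLI is installed and configured, which this page does not cover; it only prints the command so that you can check it first. The function name instance_state_cmd and the instance ID are hypothetical placeholders.

```shell
# Print the AWS CLI call for stopping or terminating an instance.
# instance_state_cmd is a hypothetical helper; it prints the command only.
instance_state_cmd() {
    action="$1"
    instance_id="$2"
    case "$action" in
        stop)      echo "aws ec2 stop-instances --instance-ids $instance_id" ;;
        terminate) echo "aws ec2 terminate-instances --instance-ids $instance_id" ;;
        *)         echo "usage: instance_state_cmd stop|terminate INSTANCE_ID" >&2
                   return 1 ;;
    esac
}

# i-0123456789abcdef0 is a placeholder; use the Instance ID from the console.
instance_state_cmd stop i-0123456789abcdef0
```

Keeping the two actions behind explicit words makes it harder to terminate an instance by accident, which matters because a terminated instance cannot be restarted.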

Download all data needed

BMB:

rm CourseData
rm workspace
mkdir CourseData
mkdir workspace
cd CourseData
wget https://hpc4health.ca/cbw/2024/MIC/BMB.tar.gz --no-check-certificate
tar -xvf BMB.tar.gz
rm BMB.tar.gz
mv BMB_data/* .
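
Because the download and extraction need free space on the instance's storage, it may be worth checking what is available before running the commands above. This is a minimal sketch: has_free_gb is a hypothetical helper, and the threshold you pass is your own estimate, since this page does not state the size of the data.

```shell
# Succeed if the filesystem holding the current directory has at least
# the requested number of GB free. has_free_gb is a hypothetical helper.
has_free_gb() {
    needed_gb="$1"
    # df -Pk prints one line per filesystem in 1024-byte blocks;
    # the fourth column of the second line is the available space.
    avail_kb="$(df -Pk . | awk 'NR==2 {print $4}')"
    [ "$avail_kb" -ge $((needed_gb * 1024 * 1024)) ]
}

# 1 is an arbitrary example threshold in GB, not a measured requirement.
if has_free_gb 1; then
    echo "enough free space"
else
    echo "not enough free space"
fi
```

Running df -h on its own gives the same information in a human-readable form.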

Now you should be able to start working through the workshop materials, starting with Beginner Module 1.
