reaper: Add aws-nuke integration #39

Merged
merged 2 commits into from
Jul 31, 2024

Contributor

@darkowlzz darkowlzz commented May 21, 2024

The existing aws provider in reaper uses the Resource Groups Tagging API, which can't list every resource type we need to find and delete. In addition, the way resource status is reported back to the CLI makes it difficult to delete resources reliably. aws-nuke handles both problems better. It is written in Go and can be imported and embedded in reaper to provide consistent and coherent test-infra resource management tooling. However, the default CLI implementation of aws-nuke is hard to integrate with reaper, because most of the necessary scan and delete code is implemented in the CLI package itself. To work around this, some parts of aws-nuke are copied into an internal package with minimal changes and extended in a separate file. Since aws-nuke is MIT licensed, the minor modification is explicitly noted in the copied file.

Package tools/reaper/awsnukemod contains the copied file, the modifications and extensions to it, and the relevant tests and docs. The reaper main package uses it to implement the aws-nuke provider, which will replace the existing aws provider in the future unless a good use case for keeping it emerges.

To keep usage simple, the aws-nuke provider uses the aws CLI to infer certain details, such as the AWS account ID of the configured IAM principal, so the aws CLI is still required when using the aws-nuke provider. The only required flag it adds is awsregions, which tells aws-nuke which AWS regions to scan and nuke.
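As a rough illustration of the paragraph above, the account ID lookup and the awsregions parsing could look like the following. This is a minimal sketch and not the code in this PR: the function names are hypothetical, it assumes the lookup goes through `aws sts get-caller-identity`, and prepending the "global" pseudo-region simply mirrors the regions list in the aws-nuke configuration shown in this description.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os/exec"
	"strings"
)

// callerIdentity mirrors the JSON printed by `aws sts get-caller-identity`.
type callerIdentity struct {
	UserID  string `json:"UserId"`
	Account string `json:"Account"`
	Arn     string `json:"Arn"`
}

// parseAccountID extracts the account ID from get-caller-identity JSON output.
func parseAccountID(out []byte) (string, error) {
	var id callerIdentity
	if err := json.Unmarshal(out, &id); err != nil {
		return "", fmt.Errorf("decoding caller identity: %w", err)
	}
	if id.Account == "" {
		return "", fmt.Errorf("caller identity has no account ID")
	}
	return id.Account, nil
}

// accountIDFromCLI (hypothetical name) shells out to the aws CLI, which must
// be installed and configured for the IAM principal in use.
func accountIDFromCLI() (string, error) {
	out, err := exec.Command("aws", "sts", "get-caller-identity", "--output", "json").Output()
	if err != nil {
		return "", fmt.Errorf("running aws CLI: %w", err)
	}
	return parseAccountID(out)
}

// parseRegions (hypothetical name) splits an -awsregions value such as
// "us-east-1,us-east-2". Adding "global" first is an assumption based on the
// regions list in the aws-nuke configuration.
func parseRegions(flagValue string) []string {
	regions := []string{"global"}
	for _, r := range strings.Split(flagValue, ",") {
		if r = strings.TrimSpace(r); r != "" {
			regions = append(regions, r)
		}
	}
	return regions
}

func main() {
	// Sample output is used here so the sketch runs without AWS credentials;
	// a real caller would use accountIDFromCLI instead.
	sample := []byte(`{"UserId":"AIDAEXAMPLE","Account":"111111111","Arn":"arn:aws:iam::111111111:user/example"}`)
	id, err := parseAccountID(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("account:", id)
	fmt.Println("regions:", parseRegions("us-east-1,us-east-2"))
}
```

Keeping the JSON parsing separate from the CLI call makes the lookup testable without credentials.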

Example usage:

$ go run ./ -provider aws-nuke -awsregions "us-east-1,us-east-2" -tags 'environment=dev' -retention-period=1m
Total resources found: 40
IAMOpenIDConnectProvider: arn:aws:iam::111111111:oidc-provider/oidc.eks.us-east-2.amazonaws.com/id/67BEB20F2D6450F29184D2DBF121D3DD
IAMRole: blue-eks-node-group-20240521195851999200000003
IAMRole: flux-test-32617-cluster-20240521195851995900000001
IAMRole: green-eks-node-group-20240521195851995900000002
IAMRolePolicyAttachment: blue-eks-node-group-20240521195851999200000003 -> AmazonEKS_CNI_Policy
IAMRolePolicyAttachment: blue-eks-node-group-20240521195851999200000003 -> AmazonEC2ContainerRegistryReadOnly
IAMRolePolicyAttachment: blue-eks-node-group-20240521195851999200000003 -> AmazonEKSWorkerNodePolicy
IAMRolePolicyAttachment: flux-test-32617-cluster-20240521195851995900000001 -> AmazonEKSClusterPolicy
IAMRolePolicyAttachment: flux-test-32617-cluster-20240521195851995900000001 -> AmazonEKSVPCResourceController
IAMRolePolicyAttachment: green-eks-node-group-20240521195851995900000002 -> AmazonEKS_CNI_Policy
IAMRolePolicyAttachment: green-eks-node-group-20240521195851995900000002 -> AmazonEC2ContainerRegistryReadOnly
IAMRolePolicyAttachment: green-eks-node-group-20240521195851995900000002 -> AmazonEKSWorkerNodePolicy
ECRRepository: Repository: flux-test-32617-cross-reg
EC2Address: 18.221.149.82
EKSCluster: flux-test-32617
EC2NATGateway: nat-070a88d8813296dee
EC2Volume: vol-0cc120f59f8886804
EC2Volume: vol-088650a2dd6dc123f
EC2LaunchTemplate: blue-20240521201317739700000011
EC2LaunchTemplate: green-20240521201317753200000013
EC2Instance: i-092d996c4df1c86ca
EC2Instance: i-067498c5492952c39
EC2VPC: vpc-0af857e63edece700
EC2Subnet: subnet-0813ee0185ff9cbaf
EC2Subnet: subnet-094f686659e50a3b5
EC2Subnet: subnet-0bf8ffd7ef6366b3d
EC2Subnet: subnet-0fd6e9e61cdb9811b
EC2Subnet: subnet-03515e6c44987bba5
EC2Subnet: subnet-02c0f74a70074e710
EC2InternetGateway: igw-041f4e41ec951efa2
EC2RouteTable: rtb-02bdc6b2aafde75d0
EC2RouteTable: rtb-0ddbbc0cf94f9f9b7
EC2SecurityGroup: sg-03ce36ee65f9e9623
EC2SecurityGroup: sg-0c4aaf206aa08091b
EC2SecurityGroup: sg-0fed414d03a16d565
ECRRepository: Repository: test-app-flux-test-32617
ECRRepository: Repository: flux-test-32617
EKSNodegroups: flux-test-32617:blue-20240521201320076900000015
EKSNodegroups: flux-test-32617:green-20240521201320078200000017
EC2InternetGatewayAttachment: igw-041f4e41ec951efa2 -> vpc-0af857e63edece700
2024/05/22 01:50:10 resources found but not deleted
exit status 1

The above output is the result of translating the resources observed by aws-nuke into reaper's resource data type. For a more detailed listing, useful for debugging, use the JSON output:

$ go run ./ -provider aws-nuke -awsregions "us-east-1,us-east-2" -tags 'environment=dev' -retention-period=1m -ojson
[
  {
    "name": "arn:aws:iam::111111111:oidc-provider/oidc.eks.us-east-2.amazonaws.com/id/67BEB20F2D6450F29184D2DBF121D3DD",
    "type": "IAMOpenIDConnectProvider",
    "location": "global",
    "tags": {
      "Arn": "arn:aws:iam::111111111:oidc-provider/oidc.eks.us-east-2.amazonaws.com/id/67BEB20F2D6450F29184D2DBF121D3DD",
      "tag:Name": "flux-test-32617-eks-irsa",
      "tag:createdat": "x2024-05-21_19h31m47s",
      "tag:createdby": "darkowlzz",
      "tag:environment": "dev",
      "tag:test": "true"
    },
    "resourceGroup": ""
  },
  {
    "name": "blue-eks-node-group-20240521195851999200000003",
    "type": "IAMRole",
    "location": "global",
    "tags": {
      "CreateDate": "2024-05-21T19:58:52Z",
      "LastUsedDate": "2024-05-21T20:16:06Z",
      "Name": "blue-eks-node-group-20240521195851999200000003",
      "Path": "/",
      "tag:createdat": "x2024-05-21_19h31m47s",
      "tag:createdby": "darkowlzz",
      "tag:environment": "dev",
      "tag:test": "true"
    },
    "resourceGroup": ""
  },
...

The tags field shows all the tags and other properties of each resource. It is a direct conversion of the properties aws-nuke observed rather than an accurate tag list; correcting that is not a concern for our needs here.
The resourceGroup is empty because, unlike Azure and GCP, AWS resources don't have groups; they are grouped using tags.
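The conversion described above can be sketched roughly as follows. The JSON keys come from the output shown in this description; the Go type names, the `scannedItem` stand-in for an aws-nuke scan result, and the `toResource` helper are hypothetical and only illustrate the idea of copying every observed property into the tags map.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resource approximates reaper's provider-neutral resource type. The JSON
// keys match the -ojson output; the Go field names are assumptions.
type resource struct {
	Name          string            `json:"name"`
	Type          string            `json:"type"`
	Location      string            `json:"location"`
	Tags          map[string]string `json:"tags"`
	ResourceGroup string            `json:"resourceGroup"`
}

// scannedItem is a hypothetical stand-in for what an aws-nuke scan yields.
type scannedItem struct {
	Region     string
	Type       string
	Name       string
	Properties map[string]string
}

// toResource copies all observed properties into Tags as-is, which is why
// non-tag properties such as "Path" show up next to "tag:..." keys.
func toResource(it scannedItem) resource {
	tags := make(map[string]string, len(it.Properties))
	for k, v := range it.Properties {
		tags[k] = v
	}
	// ResourceGroup stays empty: AWS resources are grouped by tags instead.
	return resource{Name: it.Name, Type: it.Type, Location: it.Region, Tags: tags}
}

func main() {
	item := scannedItem{
		Region: "global",
		Type:   "IAMRole",
		Name:   "blue-eks-node-group-20240521195851999200000003",
		Properties: map[string]string{
			"Path":            "/",
			"tag:environment": "dev",
			"tag:test":        "true",
		},
	}
	out, err := json.MarshalIndent([]resource{toResource(item)}, "", "  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```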

aws-nuke.go contains getAWSNukeConfig() which returns a Go version of the following aws-nuke configuration for the given account ID and regions:

regions:
- "global"
- "us-east-2"
- "us-east-1"

account-blocklist:
- "999999999999"

accounts:
  "111111111":
    resource-types:
      targets:
      - EC2VPC
      - EC2SecurityGroup
      - EC2LaunchTemplate
      - EC2RouteTable
      - EC2NetworkInterface
      - ECRRepository
      - EC2Volume
      - EKSNodegroups
      - EC2Subnet
      - KMSAlias
      - KMSKey
      - AutoScalingGroup
      - EC2Address
      - EKSCluster
      - EC2InternetGatewayAttachment
      - EC2InternetGateway
      - EC2Instance
      - EC2NATGateway
      - CloudWatchLogsLogGroup
      - IAMRole
      - IAMRolePolicy
      - IAMRolePolicyAttachment
      - IAMPolicy
      - IAMOpenIDConnectProvider

    filters:
      EC2VPC: &inverttesttag
        - property: "tag:test"
          value: "true"
          invert: true
      EC2SecurityGroup: *inverttesttag
      EC2LaunchTemplate: *inverttesttag
      EC2RouteTable: *inverttesttag
      EC2NetworkInterface: *inverttesttag
      ECRRepository: *inverttesttag
      EC2Volume: *inverttesttag
      EKSNodegroups: *inverttesttag
      EC2Subnet: *inverttesttag
      KMSAlias: *inverttesttag
      KMSKey: *inverttesttag
      AutoScalingGroup: *inverttesttag
      EC2Address: *inverttesttag
      EKSCluster: *inverttesttag
      EC2InternetGatewayAttachment:
        - property: "tag:igw:test"
          value: "true"
          invert: true
      EC2InternetGateway: *inverttesttag
      EC2Instance: *inverttesttag
      EC2NATGateway: *inverttesttag
      CloudWatchLogsLogGroup: *inverttesttag
      IAMRole: *inverttesttag
      IAMRolePolicy: *inverttesttag
      IAMRolePolicyAttachment:
        - property: "tag:role:test"
          value: "true"
          invert: true
      IAMPolicy: *inverttesttag
      IAMOpenIDConnectProvider: *inverttesttag

If needed, this configuration can be used to run aws-nuke directly for testing and debugging in the future, as the upstream aws-nuke implementation changes.
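For readers unfamiliar with inverted filters, here is a small Go sketch of how the filters section above could be built and what it means. It is not the actual getAWSNukeConfig() code: the `filter` type is a local stand-in rather than the aws-nuke config type, and `buildFilters` and `excluded` are hypothetical helpers. The model assumed here is that a resource matching a filter is excluded from deletion, and that invert flips the match, so only resources tagged test=true remain deletion candidates.

```go
package main

import "fmt"

// filter is a local stand-in for one aws-nuke filter entry (assumption: the
// real config type carries more fields than shown here).
type filter struct {
	Property string
	Value    string
	Invert   bool
}

// tagPropertyOverrides lists the resource types in the YAML above whose test
// tag is exposed under a prefixed property name.
var tagPropertyOverrides = map[string]string{
	"EC2InternetGatewayAttachment": "tag:igw:test",
	"IAMRolePolicyAttachment":      "tag:role:test",
}

// buildFilters returns one inverted test-tag filter per resource type,
// playing the role of the YAML anchor in the config above.
func buildFilters(resourceTypes []string) map[string][]filter {
	filters := make(map[string][]filter, len(resourceTypes))
	for _, rt := range resourceTypes {
		prop := "tag:test"
		if p, ok := tagPropertyOverrides[rt]; ok {
			prop = p
		}
		filters[rt] = []filter{{Property: prop, Value: "true", Invert: true}}
	}
	return filters
}

// excluded reports whether a resource is filtered out, i.e. protected from
// deletion. With Invert set, anything NOT carrying the test tag is excluded.
func excluded(fs []filter, props map[string]string) bool {
	for _, f := range fs {
		match := props[f.Property] == f.Value
		if f.Invert {
			match = !match
		}
		if match {
			return true
		}
	}
	return false
}

func main() {
	fs := buildFilters([]string{"EC2VPC", "IAMRolePolicyAttachment"})
	tagged := map[string]string{"tag:test": "true"}
	untagged := map[string]string{"tag:Name": "shared-vpc"}
	fmt.Println("tagged VPC excluded:", excluded(fs["EC2VPC"], tagged))
	fmt.Println("untagged VPC excluded:", excluded(fs["EC2VPC"], untagged))
}
```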

The delete operation runs the aws-nuke delete code, printing aws-nuke delete output:

$ go run ./ -provider aws-nuke -awsregions "us-east-1,us-east-2" -tags 'environment=dev' -retention-period=1m -delete
Total resources found: 40
IAMOpenIDConnectProvider: arn:aws:iam::111111111:oidc-provider/oidc.eks.us-east-2.amazonaws.com/id/67BEB20F2D6450F29184D2DBF121D3DD
IAMRole: blue-eks-node-group-20240521195851999200000003
IAMRole: flux-test-32617-cluster-20240521195851995900000001
IAMRole: green-eks-node-group-20240521195851995900000002
IAMRolePolicyAttachment: blue-eks-node-group-20240521195851999200000003 -> AmazonEKS_CNI_Policy
IAMRolePolicyAttachment: blue-eks-node-group-20240521195851999200000003 -> AmazonEC2ContainerRegistryReadOnly
...
2024/05/22 02:18:13 Deleting resources...
global - IAMOpenIDConnectProvider - arn:aws:iam::111111111:oidc-provider/oidc.eks.us-east-2.amazonaws.com/id/67BEB20F2D6450F29184D2DBF121D3DD - [Arn: "arn:aws:iam::111111111:oidc-provider/oidc.eks.us-east-2.amazonaws.com/id/67BEB20F2D6450F29184D2DBF121D3DD", tag:Name: "flux-test-32617-eks-irsa", tag:createdat: "x2024-05-21_19h31m47s", tag:createdby: "darkowlzz", tag:environment: "dev", tag:test: "true"] - triggered remove
global - IAMRole - blue-eks-node-group-20240521195851999200000003 - [CreateDate: "2024-05-21T19:58:52Z", LastUsedDate: "2024-05-21T20:17:54Z", Name: "blue-eks-node-group-20240521195851999200000003", Path: "/", tag:createdat: "x2024-05-21_19h31m47s", tag:createdby: "darkowlzz", tag:environment: "dev", tag:test: "true"] - failed
global - IAMRole - flux-test-32617-cluster-20240521195851995900000001 - [CreateDate: "2024-05-21T19:58:52Z", LastUsedDate: "2024-05-21T20:30:31Z", Name: "flux-test-32617-cluster-20240521195851995900000001", Path: "/", tag:createdat: "x2024-05-21_19h31m47s", tag:createdby: "darkowlzz", tag:environment: "dev", tag:test: "true"] - failed
global - IAMRole - green-eks-node-group-20240521195851995900000002 - [CreateDate: "2024-05-21T19:58:52Z", LastUsedDate: "2024-05-21T20:18:43Z", Name: "green-eks-node-group-20240521195851995900000002", Path: "/", tag:createdat: "x2024-05-21_19h31m47s", tag:createdby: "darkowlzz", tag:environment: "dev", tag:test: "true"] - failed
global - IAMRolePolicyAttachment - blue-eks-node-group-20240521195851999200000003 -> AmazonEKS_CNI_Policy - [PolicyArn: "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy", PolicyName: "AmazonEKS_CNI_Policy", RoleCreateDate: "2024-05-21T19:58:52Z", RoleLastUsed: "2024-05-21T20:17:54Z", RoleName: "blue-eks-node-group-20240521195851999200000003", RolePath: "/", tag:role:createdat: "x2024-05-21_19h31m47s", tag:role:createdby: "darkowlzz", tag:role:environment: "dev", tag:role:test: "true"] - triggered remove
...
Removal requested: 21 waiting, 19 failed, 67 skipped, 0 finished
...
us-east-2 - EKSCluster - flux-test-32617 - [CreatedAt: "2024-05-21T19:59:28Z", tag:createdat: "x2024-05-21_19h31m47s", tag:createdby: "darkowlzz", tag:environment: "dev", tag:terraform-aws-modules: "eks", tag:test: "true"] - waiting

Removal requested: 1 waiting, 0 failed, 67 skipped, 39 finished

us-east-2 - EKSCluster - flux-test-32617 - [CreatedAt: "2024-05-21T19:59:28Z", tag:createdat: "x2024-05-21_19h31m47s", tag:createdby: "darkowlzz", tag:environment: "dev", tag:terraform-aws-modules: "eks", tag:test: "true"] - removed

Removal requested: 0 waiting, 0 failed, 67 skipped, 40 finished

Nuke complete: 0 failed, 67 skipped, 40 finished.

README changes documenting the granular permissions this needs will be added separately, after more testing and evaluation.

Part of fluxcd/flux2#4619

@swade1987
Member

@darkowlzz you may want to check out https://github.com/ekristen/aws-nuke

@ekristen

Thanks @swade1987! @darkowlzz I've started a break away fork after getting lackluster response from the original maintainers on updates/changes/fixes and a general reluctance to bring anyone else on board to maintain. I rewrote the entire tool as a core library called libnuke and then rewrote aws-nuke. I've also written them for azure and gcp now.

All this to say that embedding it into go should be a lot more simple now. The library is fairly well documented. You can look at the command structure in aws-nuke to see how to call the library and import the resources. I hate the use of internal in open source tools, so you'll run into no issues there.

If you find any resources not supporting tags, let me know I'll get them added or feel free to open a PR.

Technically libnuke is in 0.x I have a few more "changes" to make that could be semi-breaking but as long as you are pinning deps you'll be fine. aws-nuke is in 3.x-beta, that's only because I have a few more deprecation warnings to put into place as I've been standardizing things, it's been working very well.

P.S. nice use of YAML anchor, it should still work in my version, if you find that it doesn't let me know.

@stefanprodan
Member

@darkowlzz let's try to move this to @ekristen's project. In case things break, we have someone to reach out to 🤗

@darkowlzz
Contributor Author

darkowlzz commented May 22, 2024

Thanks @swade1987 and @ekristen. Using libnuke for reaper's backend would also be great for consistency across the providers. I have already spent a few days figuring out how to modify and extend aws-nuke to integrate it here. I can spend some more time to see what needs to change to integrate the fork. I'll hold off on merging this and will get back after more research and testing.

To integrate aws-nuke in reaper, it is divided into multiple
components by separating the scan and delete operations. In order to
achieve this, some of the aws-nuke code is copied, modified and
extended.

The resource data scanned by aws-nuke is converted to reaper's native
resource type so that it is listed consistently with the other reaper
providers and also supports JSON output.

Applying the retention period/age filter to aws-nuke resources also
requires post-processing the resources after scanning and before
deleting. aws-nuke doesn't provide an option for custom filters, so the
scanned resources are passed through a custom filter, implemented as an
extension of the Nuke type, which understands the custom createdat
timestamp format and filters the resources accordingly.

Signed-off-by: Sunny <[email protected]>
@ekristen

I should rev the go mod version on aws nuke so you can properly include the v3 resources.

@ekristen

Question, is there a reason you don't just use a dedicated account and run the aws-nuke tool directly vs including the library and re-implementing some of the logic? If the library + resources makes more sense, I'm happy to make additional tweaks that might make things easier. Let me know. Happy to see you including it! :)

@darkowlzz
Contributor Author

darkowlzz commented Jun 20, 2024

@ekristen

> I should rev the go mod version on aws nuke so you can properly include the v3 resources.

That would be nice to have. I thought of opening an issue asking if this is intentional.

> Question, is there a reason you don't just use a dedicated account and run the aws-nuke tool directly vs including the library and re-implementing some of the logic?

We may be able to do that in the future but at present, we have a single account with some free credits and I'm not sure if those free credits can be used with sub-accounts as well. Since I don't have root access to the account to try creating subaccounts and we wanted to start using the credits for testing soon, I have been trying to make this work within the same account such that it runs in a limited scope, not interfering with anything else in the account.
We have similar constraints in our GCP account, where we only have access to a single project, and we do the same there: rather than nuking everything, we run the cleanup within a limited scope, which also leaves room for manual testing and development. My plan was to do the same for AWS initially and look into dedicated test subaccounts later.
This is also reflected in this cleanup tool we have that tries to consolidate the work for different cloud providers within our constraints. I wasn't aware of aws-nuke or libnuke before last month. I would like to use libnuke as the back-end for GCP and Azure cleanup too so that we don't have to use their CLI and have consistent tooling for all the clouds. With most of the changes here, it would be easy to add them.

> If the library + resources makes more sense, I'm happy to make additional tweaks that might make things easier. Let me know. Happy to see you including it! :)

I'm still doing some testing and I plan to discuss some changes upstream that would make things easier. Just wanted to get this working before starting any discussion.
Thanks again.

@ekristen

I have azure and gcp nuke variants too.

@darkowlzz
Contributor Author

> I have azure and gcp nuke variants too.

@ekristen Yes, I've seen those 🙂 . Haven't looked into the details yet. But excited to use them in the future.

libnuke is more actively developed compared to upstream aws-nuke, and
also has support for GCP and Azure, which could be a nice backend for
reaper in the future, all based on libnuke.

A new internal package libnukemod has been added with modifications and
extensions to libnuke to fit reaper's needs. It contains aws.go for
all the aws-nuke related helpers. This can be further extended in the
future to add GCP and Azure helpers.

Signed-off-by: Sunny <[email protected]>
@darkowlzz darkowlzz merged commit c8e430d into main Jul 31, 2024
2 checks passed
@darkowlzz darkowlzz deleted the reaper-aws-nuke branch July 31, 2024 13:46
4 participants