Skip to content

Latest commit

 

History

History
138 lines (117 loc) · 4.44 KB

README.md

File metadata and controls

138 lines (117 loc) · 4.44 KB

Build Status

RDS Alarms

Creates warning and critical alarms for RDS instances on Amazon CloudWatch. For more details, check out the blog post.

💥 Battle-tested at Grofers

Requirements

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt19471460522000",
            "Effect": "Allow",
            "Action": [
                "cloudwatch:DeleteAlarms",
                "cloudwatch:DescribeAlarms",
                "cloudwatch:PutMetricAlarm"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Sid": "Stmt1947940274000",
            "Effect": "Allow",
            "Action": [
                "rds:DescribeDBInstances"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}

Installation

To install, just run

$ ansible-galaxy install grofers.rds-alarms

Role Variables

  • rds_alarms_region - AWS region (Required)
  • rds_alarms_common_action_list - List of ARN of topics in AWS SNS
  • rds_alarms_period - Time (in seconds) between metric evaluations
  • rds_alarms_evaluation_periods - The number of times in which the metric is evaluated before final calculation
  • rds_alarms_common_action_list - Always include this alarm actions
  • rds_alarms_warning_threshold - Threshold for warning (default - 75%)
  • rds_alarms_critical_threshold - Threshold for warning (default - 90%)
  • rds_alarms_warning_cpu_credits_threshold - Threshold for CPU Credits (default - 30)
  • rds_alarms_critical_cpu_credits_threshold - Threshold for CPU Credits (default - 15)
  • rds_alarms_db_instances - Dict with the following format:
rds_alarms_db_instances:
  <rds-instance-identifier>:
    warning_db_connections_threshold: 100
    critical_db_connections_threshold: 200
    warning_burst_balance_threshold: 100
    critical_burst_balance_threshold: 200
    alarm_action_list: ["arn:aws:sns:us-east-1:9783248248:MYALARM"]
    critical_threshold: 90 # Optional
    warning_threshold: 75 # Optional
    credit_warning_threshold: 30 # Optional
    credit_critical_threshold: 15 # Optional
    replica_lag_threshold: 1800 # Required Only for replicas. Units seconds

Conventions

The format of name of alarms created in Amazon CloudWatch is: rds-<instance_name>-<metric_name>-<alert_type>.

For example, warning alarm for CPU for an instance with identifier my-rds-instance will be created as rds-my-rds-instance-cpu-warning.

Example Playbook

This playbook will create alarms for my-rds-instance-identifier with the default thresholds. Where as for alarms created for my-replica-rds-instance-identifier will be created with warning threshold of 80% and critical threshold will be the default value(90%). If the instance is a replica then an alarm will also be created for the replica lag. For every t2 instance, alarms are also created for remaining CPU credits.

- hosts: localhost
  connection: local
  vars:
    rds_alarms_common_action_list:
      - "arn:aws:sns:us-east-1:9783248248:ALARMS"
    rds_alarms_period: 60
    rds_alarms_evaluation_periods: 2
    rds_alarms_region: us-east-1
    rds_alarms_warning_threshold: 70
    rds_alarms_critical_threshold: 90
    rds_alarms_warning_cpu_credits_threshold: 60
    rds_alarms_critical_cpu_credits_threshold: 30
    rds_alarms_db_instances:
      my-rds-instance-identifier: # this will use the defaults
        warning_db_connections_threshold: 100
        critical_db_connections_threshold: 200
        alarm_action_list: ["arn:aws:sns:us-east-1:9783248248:MYALARM"]
      my-replica-rds-instance-identifier:
        warning_threshold: 80
        warning_db_connections_threshold: 100
        critical_db_connections_threshold: 200
        alarm_action_list: ["arn:aws:sns:us-east-1:9783248248:MYALARM"]
        credit_warning_threshold: 20
        credit_critical_threshold: 10
        replica_lag_threshold: 1800
  roles:
    - rds-alarms

Limitations

You need to create multiple playbooks for different regions.

License

MIT License