Stop Idle RDS Databases: EventBridge + Lambda vs a Dedicated Tool

Development and staging databases are typically used 8 to 10 hours a day, 5 days a week. Yet in most organizations, they run continuously: nights, weekends, and holidays, without a single connection.

The fix is well-known: automate stop and start operations on a schedule. The question is how. The most common DIY pattern reaches for Amazon EventBridge and AWS Lambda. It works, but it introduces a category of operational overhead that teams often underestimate.

The Real Cost of Idle RDS Instances

The math is unforgiving. A database used 8 hours a day, 5 days a week is idle for roughly 76% of the time: nights (16h/day × 5 days = 80h) plus full weekends (48h) add up to 128 of the 168 hours in a week. Every dollar of instance time billed during those hours is waste.

Here is what that means in practice (eu-west-1, Single-AZ, Mon-Fri 9 AM to 6 PM schedule):

| Instance | Hourly cost | 24/7 monthly | With schedule | Monthly saving |
|---|---|---|---|---|
| db.t3.micro | $0.020/h | $14/mo | $4/mo | ~$10/mo |
| db.t3.medium | $0.068/h | $49/mo | $13/mo | ~$36/mo |
| db.m5.large | $0.192/h | $138/mo | $37/mo | ~$101/mo |
| db.r6g.xlarge | $0.480/h | $346/mo | $93/mo | ~$253/mo |

Based on approximate on-demand pricing in eu-west-1. Schedule: Mon-Fri, 9 AM to 6 PM (45h/week active). Actual savings vary by region and engine.

Five dev/staging databases averaging a db.t3.medium ($0.068/h) cost around $245/month running 24/7. On a business-hours schedule, that drops to $65/month, saving $180/month or over $2,100/year.
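The arithmetic behind these figures is easy to reproduce. The sketch below assumes a 30-day (720-hour) month and counts instance hours only; storage and backup charges continue while an instance is stopped, so they are left out. The function names are illustrative.

```python
HOURS_PER_WEEK = 168
HOURS_PER_MONTH = 720  # assumption: 30-day month


def idle_share(active_hours_per_week):
    """Fraction of the week during which the instance is not in use."""
    return 1 - active_hours_per_week / HOURS_PER_WEEK


def monthly_cost(hourly_rate, active_hours_per_week=HOURS_PER_WEEK):
    """Instance-hour cost for one month at the given weekly running time."""
    running_hours = HOURS_PER_MONTH * active_hours_per_week / HOURS_PER_WEEK
    return hourly_rate * running_hours


# Five db.t3.medium instances at the approximate rate used in this article
always_on = 5 * monthly_cost(0.068)
scheduled = 5 * monthly_cost(0.068, active_hours_per_week=45)

print(f"Idle share at 40h/week: {idle_share(40):.0%}")
print(f"24/7: ${always_on:.0f}/mo, scheduled: ${scheduled:.0f}/mo")
print(f"Saving: ${always_on - scheduled:.0f}/mo")
```

Plugging in your own hourly rates and schedule gives a quick estimate before you build anything.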

Automating RDS Scheduling with EventBridge + Lambda

The standard DIY setup uses two EventBridge scheduled rules and a single Lambda function.

  1. Two EventBridge rules with cron expressions: one triggering at the stop time (e.g., 7 PM weekdays), one at the start time (e.g., 8 AM weekdays).
  2. Each rule invokes the Lambda with a different payload: action "stop" or action "start".
  3. The Lambda calls StopDBInstance or StartDBInstance via the AWS SDK, filtering instances by tag to avoid touching production.
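The first two steps can be sketched with boto3 as follows. This is an illustration rather than a complete deployment: the rule names are hypothetical, the clients are passed in (for example `boto3.client("events")` and `boto3.client("lambda")`), and the hours are expressed in UTC because EventBridge scheduled rules evaluate cron expressions in UTC.

```python
import json


def build_schedule(function_arn, stop_hour_utc=19, start_hour_utc=8):
    """Return the stop and start rule definitions as plain dicts."""
    rules = []
    for action, hour in (("stop", stop_hour_utc), ("start", start_hour_utc)):
        rules.append({
            "Name": f"rds-{action}-weekdays",  # hypothetical rule name
            # Fields: minute hour day-of-month month day-of-week year
            "ScheduleExpression": f"cron(0 {hour} ? * MON-FRI *)",
            "Target": {
                "Id": f"rds-{action}",
                "Arn": function_arn,
                # Payload the Lambda receives as its event
                "Input": json.dumps({"action": action}),
            },
        })
    return rules


def apply_schedule(events_client, lambda_client, function_arn):
    """Create or update both rules and let them invoke the function."""
    for rule in build_schedule(function_arn):
        response = events_client.put_rule(
            Name=rule["Name"],
            ScheduleExpression=rule["ScheduleExpression"],
            State="ENABLED",
        )
        events_client.put_targets(Rule=rule["Name"], Targets=[rule["Target"]])
        # Fails with a conflict error if this StatementId already exists,
        # so a real deployment would handle re-runs explicitly.
        lambda_client.add_permission(
            FunctionName=function_arn,
            StatementId=f"{rule['Name']}-invoke",
            Action="lambda:InvokeFunction",
            Principal="events.amazonaws.com",
            SourceArn=response["RuleArn"],
        )
```

In practice most teams would declare the same resources in CloudFormation or Terraform; the calls above simply show what those templates create.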

Here is a minimal implementation in Python that covers the basic case:

```python
import boto3
from botocore.exceptions import ClientError

# Created once per execution environment and reused across invocations
rds = boto3.client("rds")

PRODUCTION_VALUES = {"production", "prod", "prd", "live"}


def handler(event, context):
    action = event.get("action", "stop")  # "stop" or "start"
    if action not in ("stop", "start"):
        raise ValueError(f"Unknown action: {action!r}")

    # Paginate so that accounts with many instances are fully covered
    paginator = rds.get_paginator("describe_db_instances")
    for page in paginator.paginate():
        for instance in page["DBInstances"]:
            db_id = instance["DBInstanceIdentifier"]
            status = instance["DBInstanceStatus"]
            tags = {t["Key"]: t["Value"] for t in instance.get("TagList", [])}

            # Skip production instances
            env = tags.get("Environment", "").lower()
            if env in PRODUCTION_VALUES:
                print(f"Skipping {db_id} (production tag)")
                continue

            # Only manage explicitly tagged instances
            if tags.get("AutoStop") != "true":
                continue

            try:
                # Both calls are asynchronous: they request the state change
                if action == "stop" and status == "available":
                    rds.stop_db_instance(DBInstanceIdentifier=db_id)
                    print(f"Stopping {db_id}")
                elif action == "start" and status == "stopped":
                    rds.start_db_instance(DBInstanceIdentifier=db_id)
                    print(f"Starting {db_id}")
            except ClientError as e:
                # Log and continue so one failure does not block the rest
                print(f"Error on {db_id}: {e}")
```

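This script reads the AutoStop tag to identify managed instances and skips anything tagged as production. It works well for a single AWS account with a fixed schedule. Note that it only sees instances in the region the function runs in, and that Aurora instances cannot be stopped with StopDBInstance: they are stopped at the cluster level with StopDBCluster, so here they would only produce a logged error.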

The Lambda execution role needs the following IAM permissions: rds:DescribeDBInstances, rds:ListTagsForResource, rds:StopDBInstance, and rds:StartDBInstance.
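As a sketch, those permissions could be expressed as the policy document below. The tag condition on the stop and start statement is an extra safeguard that is not part of the list above, and it assumes RDS honors the `aws:ResourceTag` condition key for these actions; check the service authorization reference before relying on it. CloudWatch Logs permissions are not included and are typically added through the AWSLambdaBasicExecutionRole managed policy.

```python
import json


def scheduler_policy():
    """Build an IAM policy document for the scheduler's execution role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadInstancesAndTags",
                "Effect": "Allow",
                "Action": [
                    "rds:DescribeDBInstances",
                    "rds:ListTagsForResource",
                ],
                "Resource": "*",
            },
            {
                "Sid": "StopStartTaggedInstances",
                "Effect": "Allow",
                "Action": ["rds:StopDBInstance", "rds:StartDBInstance"],
                "Resource": "*",
                # Assumption: limits stop/start to instances tagged AutoStop=true
                "Condition": {
                    "StringEquals": {"aws:ResourceTag/AutoStop": "true"}
                },
            },
        ],
    }


print(json.dumps(scheduler_policy(), indent=2))
```

With the condition in place, a bug in the tag filtering code cannot stop an untagged instance, because IAM rejects the call.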

The Limitations of This Approach

1. Someone has to maintain the script

What starts as 50 lines of Python grows. New engineers, new instance types, new regions, new tagging conventions: each change requires a code deployment. Six months in, the Lambda function has become infrastructure that nobody fully owns.

2. Production safety is your responsibility

Accidentally stopping a production database is a significant incident. The safeguards (tag filtering, Multi-AZ detection, account restrictions) have to be written, tested, and kept up to date manually. A tagging convention change silently breaks your protection.
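To give a sense of what those safeguards look like in code, here is a minimal sketch that combines the three checks into one function. It works on the dictionaries returned by describe_db_instances; the function name and the account allow-list are assumptions made for illustration, and treating every Multi-AZ instance as production is a policy choice, not an AWS rule.

```python
PRODUCTION_VALUES = {"production", "prod", "prd", "live"}


def stop_refusal_reason(instance, allowed_accounts):
    """Return why an instance must not be stopped, or None if it is eligible."""
    tags = {t["Key"]: t["Value"] for t in instance.get("TagList", [])}
    # ARN format: arn:aws:rds:<region>:<account-id>:db:<identifier>
    account_id = instance["DBInstanceArn"].split(":")[4]

    if account_id not in allowed_accounts:
        return f"account {account_id} is not on the allow-list"
    if tags.get("Environment", "").strip().lower() in PRODUCTION_VALUES:
        return "tagged as production"
    if instance.get("MultiAZ", False):
        return "Multi-AZ deployment, treated as production"
    if tags.get("AutoStop", "").strip().lower() != "true":
        return "no AutoStop=true tag (opt-in required)"
    return None


staging = {
    "DBInstanceArn": "arn:aws:rds:eu-west-1:111111111111:db:staging-eu",
    "MultiAZ": False,
    "TagList": [
        {"Key": "Environment", "Value": "staging"},
        {"Key": "AutoStop", "Value": "true"},
    ],
}
print(stop_refusal_reason(staging, allowed_accounts={"111111111111"}))
```

Returning a reason instead of a boolean makes the logs explain every skipped instance. Each check is also a line of code that has to follow your tagging conventions as they change.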

3. No visibility without digging into logs

There is no dashboard showing which databases are currently scheduled, which are stopped, or which were skipped last night. Answering "did the stop actually run for staging-eu?" means opening CloudWatch and reading execution logs.
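One partial workaround is to have the Lambda emit a status summary on every run. The sketch below groups describe_db_instances results by managed flag and status; the function names are hypothetical, and the output still lands in CloudWatch unless you forward it somewhere more visible.

```python
from collections import defaultdict


def summarize(instances):
    """Group instance identifiers by (managed flag, current status)."""
    report = defaultdict(list)
    for instance in instances:
        tags = {t["Key"]: t["Value"] for t in instance.get("TagList", [])}
        managed = "managed" if tags.get("AutoStop") == "true" else "unmanaged"
        key = (managed, instance["DBInstanceStatus"])
        report[key].append(instance["DBInstanceIdentifier"])
    return dict(report)


def format_report(report):
    """Render the summary as one line per (managed flag, status) group."""
    lines = []
    for (managed, status), ids in sorted(report.items()):
        lines.append(f"{managed:<10} {status:<12} {', '.join(sorted(ids))}")
    return "\n".join(lines)


sample = [
    {"DBInstanceIdentifier": "staging-eu", "DBInstanceStatus": "stopped",
     "TagList": [{"Key": "AutoStop", "Value": "true"}]},
    {"DBInstanceIdentifier": "orders-prod", "DBInstanceStatus": "available",
     "TagList": []},
]
print(format_report(summarize(sample)))
```

This answers "what is the state right now" but not "what happened last night", which still requires keeping a history of runs.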

4. Runtime overrides require code changes

Deployments run late. A developer needs staging to stay online until 10 PM. Handling this ad hoc means either manually starting the instance and hoping the Lambda does not stop it again, disabling the EventBridge rule, or adding override logic to the function.
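The override logic usually ends up as a tag that the stop action respects. Here is one possible shape, assuming a hypothetical KeepRunningUntil tag that holds an ISO 8601 timestamp with an explicit offset; the tag name and the decision to ignore malformed values are illustrative choices.

```python
from datetime import datetime, timezone

OVERRIDE_TAG = "KeepRunningUntil"  # hypothetical tag name


def override_active(tags, now=None):
    """Return True if the instance should be left running for now."""
    now = now or datetime.now(timezone.utc)
    raw = tags.get(OVERRIDE_TAG)
    if not raw:
        return False
    try:
        until = datetime.fromisoformat(raw)
    except ValueError:
        # Malformed value: ignore the override and follow the schedule
        return False
    if until.tzinfo is None:
        # Treat timestamps without an offset as UTC
        until = until.replace(tzinfo=timezone.utc)
    return now < until


tags = {"KeepRunningUntil": "2025-03-14T22:00:00+00:00"}
evening = datetime(2025, 3, 14, 19, 0, tzinfo=timezone.utc)
print(override_active(tags, now=evening))
```

The stop branch of the handler would call this before stopping an instance. Someone still has to set the tag, remember the expected format, and accept that the instance stays up until the next scheduled stop after the override expires.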

5. AWS restarts stopped instances after 7 days

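AWS automatically starts any RDS instance that has been stopped for seven consecutive days so that it does not fall behind on maintenance updates. A schedule that starts instances every weekday never reaches that limit, but an instance that stays stopped for longer, for example during a holiday shutdown or while the rules are disabled, comes back up on its own. Without extra logic in your Lambda to detect and re-stop those instances, they keep running and billing until someone notices.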
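One way to handle this is to stop thinking in terms of "stop" and "start" events and instead reconcile each instance against the state the schedule expects. The sketch below assumes a Mon-Fri, 9 AM to 6 PM window expressed in UTC and would run from an additional periodic rule such as rate(1 hour); the names are illustrative.

```python
from datetime import datetime, time, timezone

ACTIVE_START = time(9, 0)  # UTC, assumption for this sketch
ACTIVE_END = time(18, 0)   # UTC


def desired_state(now):
    """Return "running" inside the weekday window, otherwise "stopped"."""
    now = now.astimezone(timezone.utc)  # expects a timezone-aware datetime
    is_weekday = now.weekday() < 5      # Monday is 0, Sunday is 6
    in_window = ACTIVE_START <= now.time() < ACTIVE_END
    return "running" if is_weekday and in_window else "stopped"


def reconcile_action(status, now):
    """Return "stop", "start", or None for a managed instance."""
    wanted = desired_state(now)
    if wanted == "stopped" and status == "available":
        return "stop"   # also catches instances AWS started on its own
    if wanted == "running" and status == "stopped":
        return "start"
    return None         # already in the expected state, or in transition


saturday_noon = datetime(2025, 3, 15, 12, 0, tzinfo=timezone.utc)
print(reconcile_action("available", saturday_noon))
```

An instance that AWS starts on its own over a long break is then stopped again within the hour, at the price of one more rule and a function that now encodes the schedule itself.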

DIY vs Dedicated Tool

Here is how the two approaches compare across the dimensions that matter in practice:

| Feature | EventBridge + Lambda | SnoozeDB |
|---|---|---|
| Setup time | Hours to days (IaC, IAM, deploy) | 5 minutes (CloudFormation + UI) |
| Ongoing maintenance | Regular (SDK updates, IAM drift, bugs) | None |
| Visibility | CloudWatch logs only | Dashboard, per-instance status |
| Production protection | Custom tag logic required | Built-in (tags + Multi-AZ detection) |
| Runtime overrides | Disable rule or modify code | One-click pause from dashboard |
| AWS 7-day restart handling | Custom re-stop logic required | Automatic |
| Multi-account | Cross-account IAM setup per account | Native support |
| Direct cost | Lambda + engineering hours | From $0/month (free tier available) |

When Each Approach Makes Sense

EventBridge + Lambda is the right choice when your infrastructure team has the bandwidth to own it, your environment is stable (fixed set of instances, consistent tagging), and you need deep integration with your existing IaC pipelines.

A dedicated tool makes sense when you want the automation to work reliably without ongoing engineering time: when your team is small, when non-engineers need to adjust schedules, or when you are managing multiple AWS accounts.

Either way, the underlying point remains: idle non-production databases should not run 24/7. The cost is real, the fix is straightforward, and the engineering effort required to do it right is consistently underestimated.

Try SnoozeDB on Your Existing Databases

Connect your AWS account in 5 minutes. SnoozeDB discovers your RDS instances automatically, protects production databases, and lets you set schedules without writing a single line of code.

14-day free trial. No credit card required.