Contributor

@mkilp mkilp commented Dec 23, 2025

Hi everyone,

I ran into a semi-big issue while developing my ECS service. My containers are pretty heavy (multiple gigs) and I did a sanity check inside ECR to see what kind of storage I am looking at.

Lo and behold, I noticed there are dozens of images that never get deleted.
This PR adds a lifecycle policy to ECR that expires untagged images, which as far as I can tell are what we can safely remove.

I do foresee a problem with rollbacks, since SST uses the digest to attach the image to the task.

I am happy to take input on the exact lifecycle policy to use. I understand this is a pretty big change since it adds to the bootstrap. I tested it and it's working:

[Screenshot: CleanShot 2025-12-23 at 18 23 39]

My change essentially gives 30 days of rollback time. We could also change the rule to keep at least x untagged images. Note: We can only target untagged images with one rule.
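For illustration, the count-based alternative could look roughly like the sketch below. It builds the policy JSON with ECR's `imageCountMoreThan` count type instead of `sinceImagePushed`. The helper name `keepNewestUntagged` and the value `10` are hypothetical and not part of this PR; since only one rule can target untagged images, this would replace the 30-day rule rather than sit next to it.

```typescript
// Sketch only: `keepNewestUntagged` is a hypothetical helper, not an SST or Pulumi API.
interface CountBasedRule {
  rulePriority: number;
  description: string;
  selection: {
    tagStatus: "untagged";
    countType: "imageCountMoreThan";
    countNumber: number;
  };
  action: { type: "expire" };
}

// Builds a lifecycle policy document that expires untagged images
// once more than `count` of them exist in the repository.
function keepNewestUntagged(count: number): string {
  if (!Number.isInteger(count) || count < 1) {
    throw new Error("count must be a positive integer");
  }
  const rule: CountBasedRule = {
    rulePriority: 1,
    description: `Keep only the newest ${count} untagged images`,
    selection: {
      tagStatus: "untagged",
      countType: "imageCountMoreThan",
      countNumber: count,
    },
    action: { type: "expire" },
  };
  return JSON.stringify({ rules: [rule] });
}

// Example: the resulting string can be passed as the `policy` of a lifecycle policy resource.
const countBasedPolicy = keepNewestUntagged(10);
```

With this variant the rollback window depends on how often images are pushed rather than on a fixed number of days.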

I do believe this is very important since ECR storage is pretty expensive at $0.10 per GB-month.
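As a rough back-of-the-envelope sketch of what that rate means, the snippet below multiplies image count, average size, and the $0.10 per GB-month figure quoted above. The image count and size are made-up numbers for illustration only, not measurements from any real repository.

```typescript
// Rate quoted in this thread; actual billing depends on region and stored (compressed) size.
const ECR_PRICE_PER_GB_MONTH = 0.1;

// Estimates the monthly storage cost in USD for a set of similar-sized images.
function monthlyStorageCost(imageCount: number, avgImageSizeGb: number): number {
  return imageCount * avgImageSizeGb * ECR_PRICE_PER_GB_MONTH;
}

// Hypothetical example: 40 stale images at about 3 GB each.
const estimate = monthlyStorageCost(40, 3);
console.log(`~$${estimate.toFixed(2)} per month`); // prints "~$12.00 per month"
```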

cheers,

marv

Contributor

ekaya97 commented Dec 28, 2025

+1

as a workaround, the following can be applied to sst.config.ts:

async run() {
  new aws.ecr.LifecyclePolicy("sst-asset-lifecycle", {
    repository: "sst-asset", // ECR repository name from SST bootstrap
    policy: JSON.stringify({
      rules: [
        {
          rulePriority: 1,
          description: "Expire untagged images pushed over 30 days ago",
          selection: {
            tagStatus: "untagged",
            countType: "sinceImagePushed",
            countUnit: "days",
            countNumber: 30,
          },
          action: { type: "expire" },
        },
      ],
    }),
  });
},

@vimtor vimtor self-assigned this Jan 10, 2026
Collaborator

vimtor commented Jan 12, 2026

i'm wondering if this should be more explicit. something like:

const vpc = new sst.aws.Vpc("MyVpc");
const cluster = new sst.aws.Cluster("MyCluster", { vpc });

new sst.aws.Service("MyService", {
  cluster,
  image: {
    context: "./app",
    dockerfile: "Dockerfile",
    expiresIn: "30 days"
  }
});

what do you think?
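A minimal sketch of how a value like `expiresIn: "30 days"` might be translated into an ECR lifecycle policy is shown below. `expiresIn` is only the option proposed in this comment, and `lifecyclePolicyFromExpiresIn` is a hypothetical helper name; neither exists in SST today as far as this thread shows. The sketch assumes only the `"<n> days"` form is accepted.

```typescript
// Sketch only: hypothetical helper for the proposed `expiresIn` option.
// Accepts strings such as "30 days" or "1 day" and returns a lifecycle policy document.
function lifecyclePolicyFromExpiresIn(expiresIn: string): string {
  const match = /^(\d+)\s+days?$/.exec(expiresIn.trim());
  if (!match) {
    throw new Error(`Unsupported expiresIn value: "${expiresIn}"`);
  }
  const days = Number(match[1]);
  if (days < 1) {
    throw new Error("expiresIn must be at least 1 day");
  }
  return JSON.stringify({
    rules: [
      {
        rulePriority: 1,
        description: `Expire untagged images pushed over ${days} days ago`,
        selection: {
          tagStatus: "untagged",
          countType: "sinceImagePushed",
          countUnit: "days",
          countNumber: days,
        },
        action: { type: "expire" },
      },
    ],
  });
}

// Example: produces the same rule as the 30-day workaround posted earlier in this thread.
const thirtyDayPolicy = lifecyclePolicyFromExpiresIn("30 days");
```

Rejecting unknown strings up front keeps a typo such as `"30 dys"` from silently producing no policy at all.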

Contributor

ekaya97 commented Jan 12, 2026

@vimtor

yes, this is much better.

with enums like "x days", "on push", etc.

Collaborator

vimtor commented Jan 12, 2026

exactly! @ekaya97

do any of you wanna give it a try?

Contributor Author

mkilp commented Jan 12, 2026

@vimtor I'll take care of it tonight!

Contributor

ekaya97 commented Jan 12, 2026

couple questions:

  1. putting this into the service might give the impression that it applies only to this service, but it applies to the entire ECR repository. it would need to work with tags and make sure custom image tags and existing untagged images are handled properly.
  2. how do you handle modifications? deprecate/delete the old policy? update the old policy?
  3. do you allow for a separate ECR repository for each Cluster and apply the policy at that level? do you add a new ECR component entirely and remove the bootstrapped ECR?
  4. how do you confirm that the policy was applied? they are not instant and may take up to 24h - fire and forget or actually check?

Collaborator

vimtor commented Jan 12, 2026

probably if we do it by service/task we should add a unique tag based on the component's name and create the lifecycle targeting it. maybe there's another way but that's the only one that comes to mind atm
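One way the per-component idea could be sketched: tag each image with a prefix derived from the component name, then emit one rule per component using `tagStatus: "tagged"` and `tagPrefixList`. The `<componentName>-` tagging scheme and the helper `lifecyclePolicyForComponents` are assumptions for illustration, not how SST tags images today. Because an ECR repository holds a single lifecycle policy, the sketch merges all components into one document with unique rule priorities.

```typescript
// Sketch only: hypothetical shape for per-component expiry settings.
interface ComponentExpiry {
  name: string; // component name, e.g. "MyService"
  days: number; // expire images pushed more than this many days ago
}

// Builds one lifecycle policy document containing a rule per component.
// Assumes images are tagged with the prefix "<name>-" (hypothetical scheme).
function lifecyclePolicyForComponents(components: ComponentExpiry[]): string {
  const rules = components.map((component, index) => ({
    rulePriority: index + 1, // priorities must be unique within a policy
    description: `Expire ${component.name} images pushed over ${component.days} days ago`,
    selection: {
      tagStatus: "tagged",
      tagPrefixList: [`${component.name}-`],
      countType: "sinceImagePushed",
      countUnit: "days",
      countNumber: component.days,
    },
    action: { type: "expire" },
  }));
  return JSON.stringify({ rules });
}

// Example: two services sharing the bootstrapped repository.
const mergedPolicy = lifecyclePolicyForComponents([
  { name: "MyService", days: 30 },
  { name: "MyWorker", days: 7 },
]);
```

The trailing `-` in the prefix is there so that `MyService` does not also match tags belonging to a component named `MyServiceB`.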
