Files are stored in the state file multiple times making the state huge #287

sashee opened this issue Jun 28, 2023 · 6 comments

sashee commented Jun 28, 2023

Terraform CLI and Provider Versions

Terraform v1.5.1
on linux_amd64

  • provider registry.terraform.io/hashicorp/aws v5.5.0
  • provider registry.terraform.io/hashicorp/http v3.4.0
  • provider registry.terraform.io/hashicorp/local v2.4.0
  • provider registry.terraform.io/hashicorp/random v3.5.1

Terraform Configuration

provider "aws" {
}

resource "random_id" "id" {
  byte_length = 8
}

data "http" "image" {
  url = "https://unsplash.com/photos/F3rDBnQQbQU/download?force=true&w=1300"
}

resource "local_file" "image" {
  content_base64 = data.http.image.response_body_base64
  filename       = "/tmp/img-${random_id.id.hex}"
}

resource "aws_s3_object" "images" {
  key    = "testimage"
  source = local_file.image.filename
  bucket = aws_s3_bucket.bucket.bucket
  etag   = local_file.image.content_md5
}

resource "aws_s3_bucket" "bucket" {
  force_destroy = true
}

Expected Behavior

The state file is not too big.

Actual Behavior

Downloading a ~400kB image blows up the state file:

$ ls -l
total 4528
-rw-r--r-- 1 sashee sashee     546 Jun 28 09:55 main.tf
-rw-r--r-- 1 sashee sashee 4625325 Jun 28 10:05 terraform.tfstate
-rw-r--r-- 1 sashee sashee     181 Jun 28 10:05 terraform.tfstate.backup

Looking into it, I see that the state file contains the entire contents of the downloaded file multiple times:

[image: screenshot of the state file]

It would be nice if the body were either left out of the state file entirely or at least not included multiple times.

Steps to Reproduce

  1. terraform apply

How much impact is this issue causing?

Medium

Logs

https://gist.github.com/sashee/ee2392c311a64ec0a1f5789b319528f0

Additional Information

What I'm trying to do is dynamically download a binary file (an image) and upload it to an S3 bucket. A way to give the http data source a filename to download to, without exposing the contents at all, would also solve this problem.

Code of Conduct

  • I agree to follow this project's Code of Conduct

sashee added the bug label Jun 28, 2023

@bendbennett
Contributor

Hi @sashee 👋

The contents of the state file reflect the values associated with the resource or data source. This is a fundamental aspect of Terraform and represents a design feature that is used for tracking changes in state and for making the values available for use elsewhere, such as further Terraform configuration. Consequently, not storing the values in state would mean that these values were no longer accessible or available for use.
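
To make the repetition concrete, here is the relevant part of the configuration from the issue, annotated with where the payload appears to be recorded. This is a sketch based on the documented schemas of the http and local providers, not on the reporter's actual state file, so treat the attribute list as an assumption and confirm it with terraform show.

```hcl
data "http" "image" {
  url = "https://unsplash.com/photos/F3rDBnQQbQU/download?force=true&w=1300"

  # Assumption: each of these computed attributes holds a full copy
  # of the response in state:
  #   body                  (deprecated alias of response_body)
  #   response_body
  #   response_body_base64
}

resource "local_file" "image" {
  # content_base64 is an argument of this resource, so the value passed in
  # here is recorded again under local_file.image.
  content_base64 = data.http.image.response_body_base64
  filename       = "/tmp/img-${random_id.id.hex}"
}
```

Under that assumption a single download is written to state at least four times, and the text attributes can take more space than the original bytes once binary data is escaped as a JSON string.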

sashee commented Jun 28, 2023

Hi @bendbennett ,

Do you see a way to at least eliminate the repetition? Maybe an attribute that says which outputs are not needed. In the example above, I use response_body_base64 but neither response_body nor body, so I'd be happy if those two were set to null so that they don't swell the state. Adding something like expose_base64_response_only: true could work.

Alternatively, maybe a separate data source that writes the contents into a file would cover my use case as well. In that case, the state could omit the response body altogether.
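
No such data source is mentioned in this thread, but a similar effect can be approximated by downloading outside the http data source. The following is a minimal sketch, assuming curl is installed on the machine running Terraform and that Terraform 1.4 or later is used (for terraform_data). The resource name image_download and the two locals are hypothetical. It is meant to replace data.http.image and local_file.image from the original configuration, so that only the URL and the path are recorded in state rather than the file contents.

```hcl
locals {
  # Hypothetical locals; the URL and path pattern are taken from the issue.
  image_url  = "https://unsplash.com/photos/F3rDBnQQbQU/download?force=true&w=1300"
  image_path = "/tmp/img-${random_id.id.hex}"
}

# Runs curl during apply. State records the trigger values (URL and path),
# not the downloaded bytes.
resource "terraform_data" "image_download" {
  triggers_replace = [local.image_url, local.image_path]

  provisioner "local-exec" {
    command = "curl -fsSL -o '${local.image_path}' '${local.image_url}'"
  }
}

resource "aws_s3_object" "images" {
  key    = "testimage"
  bucket = aws_s3_bucket.bucket.bucket
  source = local.image_path

  # Make sure the file exists before the upload is attempted.
  depends_on = [terraform_data.image_download]

  lifecycle {
    # Re-upload whenever the download step is replaced.
    replace_triggered_by = [terraform_data.image_download]
  }
}
```

The trade-off is that Terraform no longer tracks the file contents: the download reruns only when the URL or path changes, and the object is re-uploaded only when the download step is replaced.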

@bendbennett
Contributor

Hi @sashee,

Thank you for the suggestion. We will consider your proposal in light of the level of community interest that this issue receives. In terms of the repetition, the body attribute is deprecated and will be removed in the future so this will reduce the size of the state file.
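
For configurations that still read the deprecated body attribute, the same text is available as response_body. A minimal sketch, where the data source name example and the URL are placeholders:

```hcl
data "http" "example" {
  url = "https://example.com/config.json" # placeholder URL
}

output "config" {
  # Before (deprecated): value = data.http.example.body
  value = data.http.example.response_body
}
```

This only changes which attribute the configuration reads. The state keeps storing body until the provider removes it.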

@cunhafinrix

We found the same issue while developing our apps. It would be nice if the size of the state file were reduced.

@et304383

It would be nice if it was at least stored just once.

sfertman commented Jun 4, 2024

In some cases it is a problem even if it's stored once. I have a ~20 MB binary that I download before packaging it with some other files for deployment. The resulting state is ~300 MB, which is completely unmanageable. Perhaps one way to go about it is to store a hash of the body if it exceeds a certain size?

5 participants