-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check for large files in CircleCI #1142
Conversation
2444e83
to
a253df5
Compare
.circleci/config.yml
Outdated
set +e | ||
find . -size +1000000c -not -path './.git/*' | grep . | ||
(( $? == 1 )) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
find
will output matching files. grep .
will match any file, so if find
outputs any files, grep
will return with an exit status of 0. We want the opposite though: we want there to be no matching files. In that case, grep
returns with an exit status of 1. In other words, we want the command as a whole to succeed only if grep
returns 1.
It looks like CircleCI runs the command with -e
, which would cause the command to fail if grep
returned with an exit status of 1. That's why I run set +e
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(( $? == 1))
seems unidiomatic compared to e.g. [[ $? -eq 1]]
.
An alternative to using grep
might be:
find . -size +1000000c -not -path './.git/*' \
-execdir bash -c '(echo "Big file(s) found: $@" && exit 1)' -- {} + ;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've replaced [[ ... ]]
with (( ... ))
. 👍
I feel less sure about -execdir bash ...
. At least to me, it doesn't feel much clearer than grep
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1M
looks (roughly) equivalent to +1000000c
, and may be easier to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was my first choice as well and worked for me locally. You can see +1Mc
in the first commit above. However, I saw this message in CircleCI:
find: Invalid argument `+1Mc' to -size
Would it be helpful to add this as a make target or separate script file? It could then optionally be included in other targets, e.g. |
That sounds reasonable to me. 👍 Would that work OK with your local testing of S3 things? |
Good point. Here's a possible alternative that only checks files tracked by git: ! (git ls-files | xargs du -b | awk '$1 > 1000000 { printf("%s %.2f MB\n", $2, $1/1048576) }' | grep .) Example output when threshold set to 100,000 bytes, and showing final status code: $ ! (git ls-files | xargs du -b | awk '$1 > 100000 { printf("%s %.2f MB\n", $2, $1/1048576) }' | grep .); echo $?
docs/api.yaml 0.52 MB
package-lock.json 0.76 MB
test/integration/api/datasets.js 0.21 MB
test/integration/api/entities.js 0.15 MB
test/integration/api/odata.js 0.10 MB
test/integration/api/submissions.js 0.23 MB
1 |
git ls-files |
xargs du -b |
awk '
$1 > 1000000 {
found=1;
printf("%s %.2f MB\n", $2, $1/1048576)
}
END { exit found }
' |
Interesting! I think it'd be good to get something merged, so I'm going to go ahead and merge what's been approved here. Feel free to follow up with another PR though. |
This PR makes progress on getodk/central#672. It updates CircleCI to check for large files. Any large files will be listed.
What has been done to verify that this works as intended?
CircleCI passes currently, but if I reduce the limit from 1 MB to 100 kB, then it fails. See https://app.circleci.com/pipelines/github/getodk/central-backend/2186/workflows/e2e9689b-4e3a-4c28-80bd-392837ccb9d3/jobs/3046 for an example of that.
Why is this the best possible solution? Were any other approaches considered?
I'll add a few line comments.
Before submitting this PR, please make sure you have:
make test
and confirmed all checks still pass OR confirm CircleCI build passes