File system max cache size exceeded #25105

Open
skuzzle opened this issue Feb 21, 2025 · 1 comment

skuzzle commented Feb 21, 2025

Hi,

I'd like to understand how strictly Trino keeps the actual size of the file system cache within the configured limit. We run Trino in Kubernetes and have mounted an ephemeral volume with a maximum size of 10Gi. We configured Trino to use at most 9G for the file system cache. Still, the pod gets evicted from time to time with a message like:

Usage of EmptyDir volume "fs-cache" exceeds the limit "10Gi"

So it looks like 1Gi of headroom isn't enough?

For a local test, I've reduced the max cache size to 9 MiB (9437184 bytes).
We actually have two connectors and configure them individually like this:

[trino@trino-worker-5888d576bb-kl682 /]$ cat /etc/trino/worker-catalog/hive.properties
connector.name=hive
...
fs.cache.enabled=true
fs.cache.directories=/var/fs-cache/hive
fs.cache.max-sizes=9437184B

and

[trino@trino-worker-5888d576bb-kl682 /]$ cat /etc/trino/worker-catalog/iceberg.properties
connector.name=iceberg
...
fs.cache.enabled=true
fs.cache.directories=/var/fs-cache/iceberg
fs.cache.max-sizes=9437184B
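
If both cache directories sit on the same volume, which the shared /var/fs-cache prefix suggests but the configuration above does not prove, the two limits add up against the volume limit. A minimal Python sketch that sums them follows. The helper names are hypothetical, and the unit parsing assumes binary multiples (1 kB = 1024 B), which is a simplification of Trino's data size syntax:

```python
import re

# Assumed binary multipliers for data size units; adjust if your
# Trino version interprets units differently.
UNIT_FACTORS = {"B": 1, "kB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_data_size(text: str) -> int:
    """Convert a string such as '9437184B' or '9GB' to bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", text)
    if match is None or match.group(2) not in UNIT_FACTORS:
        raise ValueError(f"unsupported data size: {text!r}")
    return int(float(match.group(1)) * UNIT_FACTORS[match.group(2)])


def configured_cache_bytes(properties_text: str) -> int:
    """Sum all entries of fs.cache.max-sizes in one catalog properties file."""
    for line in properties_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "fs.cache.max-sizes":
            return sum(parse_data_size(v) for v in value.split(","))
    return 0


hive = "connector.name=hive\nfs.cache.enabled=true\nfs.cache.max-sizes=9437184B\n"
iceberg = "connector.name=iceberg\nfs.cache.enabled=true\nfs.cache.max-sizes=9437184B\n"
total = configured_cache_bytes(hive) + configured_cache_bytes(iceberg)
print(total)  # 18874368 bytes, i.e. 18 MiB across both catalogs
```

With the test values above, the combined budget is twice the per-catalog limit, so any headroom calculation has to start from the sum rather than from a single catalog's setting.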

After running queries for a while, I can already see that both directories exceed the configured 9 MiB:

[trino@trino-worker-5888d576bb-kl682 /]$ du -sh /var/fs-cache/hive/ /var/fs-cache/iceberg/
13M	/var/fs-cache/hive/
12M	/var/fs-cache/iceberg/
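
Note that `du -sh` reports allocated disk blocks rather than the sum of file lengths, so part of the difference may come from block rounding, especially if the cache stores many small files. The sketch below compares both numbers for a directory. It assumes a POSIX system where `st_blocks` counts 512-byte units, ignores the space taken by directory entries themselves, and uses a function name made up for illustration:

```python
import os


def directory_sizes(root: str) -> tuple[int, int]:
    """Return (apparent_bytes, allocated_bytes) for all files under root."""
    apparent = 0
    allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                info = os.lstat(os.path.join(dirpath, name))
            except FileNotFoundError:
                # The cache may evict files while we are scanning.
                continue
            apparent += info.st_size
            # st_blocks is in 512-byte units on POSIX systems.
            allocated += info.st_blocks * 512
    return apparent, allocated


if __name__ == "__main__":
    for path in ("/var/fs-cache/hive", "/var/fs-cache/iceberg"):
        apparent, allocated = directory_sizes(path)
        print(f"{path}: apparent={apparent} B, allocated={allocated} B")
```

If the apparent size stays near the configured limit while the allocated size is well above it, the overshoot is mostly file system overhead rather than the cache ignoring its limit.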

I've also observed that the sizes go down a bit from time to time. How much headroom should we grant the cache?

wendigo (Contributor) commented Feb 21, 2025

I think you should ask the Alluxio folks.

The only relevant thing I could find in their code is:

  // We assume there will be some overhead using local fs as a page store,
  // i.e., with 1GB space allocated, we
  // expect no more than 1024MB / (1 + LOCAL_OVERHEAD_RATIO) logical data stored
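
Read literally, that comment implies the logical data budget is the allocated space divided by (1 + LOCAL_OVERHEAD_RATIO). A small sketch of that arithmetic follows. The ratio used here is a placeholder rather than the value from the Alluxio source, the function name is hypothetical, and the comment alone does not say how this relates to Trino's `fs.cache.max-sizes`:

```python
def expected_logical_bytes(allocated_bytes: int, overhead_ratio: float) -> int:
    """Logical data expected to fit into allocated_bytes of local page store space."""
    if overhead_ratio < 0:
        raise ValueError("overhead_ratio must be non-negative")
    return int(allocated_bytes / (1 + overhead_ratio))


ONE_GIB = 1024**3
# 0.25 is a hypothetical ratio chosen only to make the arithmetic easy to follow.
print(expected_logical_bytes(ONE_GIB, 0.25))  # 858993459
```

Under this reading, on-disk usage above the logical data size is expected, and the overhead ratio gives a rough idea of how much extra space to plan for.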
