Some files randomly have appended null on read #1586
Comments
How were these files originally created? Was it with blobfuse only, or with some other tool?
Created with blobfuse. It's random; I have no idea how to reproduce this.
There were some issues with blobfuse writing using block-cache, which were resolved in Blobfuse 2.3.2. Were these files created with an earlier version?
Additionally, could you confirm whether you are using file-cache or block-cache for file write/read operations? I noticed that you were using the
@vibhansa-msft The writer is also version 2.3.2. It runs on arm64, if that matters. The reader is on amd64. @ashruti-msft Looks like file-cache. This is the blobfuse2 command from the node:
This issue hasn't been reported before for file-cache. Could you let me know how frequently you encounter this scenario? I'll attempt to reproduce the issue, but since it occurs randomly, that might be tricky. One hint for reproducing it: focus on how many reads/writes it takes before the problem starts to appear. Additionally, if you have the debug logs from the last time you encountered this issue, please share them, as they might help.
It happened again 2 days ago, and now I have trace logs from both the writer and the reader. How can I share them privately?
You can share them via email: [email protected].
@vibhansa-msft I tried, but my mail was rejected:
I tried to inspect the logs. I compared Dec 13, which was the first failure, with Dec 15, which succeeded, and found no difference except the Request-Id and the date in the requests.
I sent a link to your email. If you didn't get it, let me know and I'll upload it to storage.
I received the logs, thanks.
Hi @orgads, please try to find steps that reproduce this issue; without them we won't be able to debug further.
Hi @ashruti-msft, thanks for your response. I don't fully understand.
Hi @orgads, at the 02:00:00 time mark I can see we are downloading the file multiple times, as it's not found in the local cache. I do not see an explicit delete call, but after that time mark I see the behaviour I described later. But like I said, it's tough to debug the issue from our side. It would be best if there were a set of steps that reproduce the issue consistently.
I have 2 Node.js services on Kubernetes: one writes JSON files to Azure blob, and the other reads them.
I use the blob.csi.azure.com driver, which uses blobfuse2 version 2.3.2.
Mount options are `-o allow_other --file-cache-timeout-in-seconds=120`.
The read and write are straightforward:
We read up to 10 files concurrently.
Sometimes some of the files are read with an extra null, although on the storage they don't have it. When I read the same file directly from the storage, or from a pod on another node, it doesn't have the extra null.
Example:
This results in failure on JSON.parse.
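To make the failure mode concrete, here is a minimal sketch (not the services' actual code) of why a single trailing NUL byte breaks `JSON.parse`, and how a reader could report it explicitly. The helper names `countTrailingNulls` and `parseJsonStrict` are hypothetical and exist only for this illustration.

```javascript
// Hypothetical helpers for illustration; not taken from the reporter's services.

// Count how many NUL (0x00) bytes sit at the very end of a buffer.
function countTrailingNulls(buf) {
  let n = 0;
  while (n < buf.length && buf[buf.length - 1 - n] === 0x00) n++;
  return n;
}

// Parse JSON from raw bytes, but fail with an explicit message when the
// content ends in NUL bytes instead of surfacing a generic SyntaxError.
function parseJsonStrict(buf) {
  const nulls = countTrailingNulls(buf);
  if (nulls > 0) {
    throw new Error(`file ends with ${nulls} unexpected NUL byte(s)`);
  }
  return JSON.parse(buf.toString("utf8"));
}

// A well-formed document, and the same document with one NUL appended.
const goodDoc = Buffer.from('{"a":1}');
const badDoc = Buffer.concat([goodDoc, Buffer.from([0x00])]);

try {
  // NUL is not valid JSON whitespace, so plain JSON.parse rejects it.
  JSON.parse(badDoc.toString("utf8"));
} catch (err) {
  console.log("JSON.parse failed:", err.name);
}
```

A check like this only makes the symptom visible in application logs; it does not explain where the extra byte comes from.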
This is not a race between read and write, because the first failing read happened ~30 minutes after the file was written (it is only written once).
After this failure, the same files keep failing, even after several hours (when the blobfuse cache has already been invalidated).
I can see this null also when running `hexdump -C` on the file from the command line, both in the pod and on the k8s node that hosts it.
The only way I found to solve this was to drop the OS cache using `echo 3 > /proc/sys/vm/drop_caches`.
Any hints how to debug this further?