Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Azure get_blob_properties not ideal to check for blob existence #9

Open
lucaong opened this issue Nov 25, 2024 · 2 comments
Open

Azure get_blob_properties not ideal to check for blob existence #9

lucaong opened this issue Nov 25, 2024 · 2 comments

Comments

@lucaong
Copy link

lucaong commented Nov 25, 2024

Currently, one can use get_blob_properties in order to check for existance of a certain blob. Unfortunately, it is not as straightforward as it could be.

Since client.get_blob_propertes(key) always returns a Blob, it's unclear how one can check for existance. The fact that Blob has a present? method would make it seem one can do client.get_blob_propertes(key).present?, but Http always raises an exception if the status code is not a success. Therefore, a 404 response raises an exception and never makes it here.

Currently, one has to rescue errors:

def blob_exists?(key)
  client.get_blob_propertes(key).present?
rescue AzureBlob::Http::FileNotFoundError
  false
end

Ideally, one should not need to use exceptions for flow control. It might be clearer to have a blob_exists? method on AzureBlob::Client, that currently might even rescue the exception like above, but hide such details from the end user.

What do you think?

@JoeDupuis
Copy link
Member

Yup, I think it makes a lot of sense. I thought the existing signature was too verbose, I wrote it that way to keep it close to the existing client for usage in paperclip azure, but adding a method for asserting on blob existence is a great idea.

I'll look into it this week.

@lucaong
Copy link
Author

lucaong commented Nov 26, 2024

I also realized that one could avoid catching the exception if get_blob_properties would internally call Http.new with , raise_on_error: false (which is already the case with get_container_properties). Then, client.get_blob_propertes(key).present? would work to check for existence.

That said, it would be better to "catch" only 404 HTTP errors: calls to get_blob_properties and get_container_properties should still raise on anything that is not a 404, otherwise a 500 (or any other error) would be indistinguishable from "blob/container does not exist".

I feel that the best course of action would be:

  • Create methods called blob_exists? and container_exists?, which would call the API to get properties and return true/false depending on whether the response is a 200 or 404. Other 4xx and 5xx errors should still be raised as exceptions (e.g. if the API is down, or if the container or blob name is invalid)
  • The methods get_blob_properties and get_container_properties could either raise on all HTTP errors, inclusing 404 (as get_blob_properties does currently) or, if necessary for compatibility with Paperclip, catch only 404 responses and return a Blob/Container that returns false on present?. They should still raise other HTTP errors, as those should not be conflated with a missing container or blob.

In case it's useful, I am happy to contribute a PR if a decision on the appropriate plan is made.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants