tar_mirage: feature proposal for a set_streaming #158

Open
hannesm opened this issue Nov 4, 2024 · 0 comments

In our flagship project, opam-mirror, it'd be great to support the following use-case:

  • we download a file to a temporary swap file system (i.e. we write the downloaded chunks to disk)
  • we verify the validity of the checksum
  • if the checksum is ok, we want to move that data over to the tar file system
  • now, the big issue is the API and failure safety: a power failure may happen at any moment, so we'd like the semantics of set, i.e. first write the data, then write the tar header as the last element, so that at (nearly) every moment the tar is consistent and the data that has been written is correct / valid
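
To make the "data first, header last" ordering concrete, here is a minimal in-memory sketch. It is not the tar_mirage implementation: the disk record, write_payload, commit_header and has_entry are hypothetical names, the "header" is a toy string rather than a real tar header, and a zero first byte stands in for "no entry here". The only point is that a crash between the two steps leaves the header block zeroed, so the half-written entry stays invisible.

```ocaml
(* Toy in-memory model of "write data first, header last".
   NOT tar_mirage code; all names below are hypothetical. *)
let block = 512

type disk = { data : Bytes.t; mutable end_of_archive : int }

let create size = { data = Bytes.make size '\000'; end_of_archive = 0 }

(* Round n up to the next multiple of the block size. *)
let pad n = (n + block - 1) / block * block

(* In this toy model a zero first byte means "no header written here". *)
let has_entry d off = Bytes.get d.data off <> '\000'

(* Step 1: copy the payload chunks after the block reserved for the header.
   Returns the number of payload bytes written. *)
let write_payload d chunks =
  let start = d.end_of_archive + block in
  let stop =
    List.fold_left
      (fun off chunk ->
        Bytes.blit_string chunk 0 d.data off (String.length chunk);
        off + String.length chunk)
      start chunks
  in
  stop - start

(* Step 2: write the header last; only now does the entry become visible. *)
let commit_header d ~name ~size =
  let hdr = Printf.sprintf "%s:%d" name size in
  Bytes.blit_string hdr 0 d.data d.end_of_archive (String.length hdr);
  d.end_of_archive <- d.end_of_archive + block + pad size

let () =
  let d = create (8 * block) in
  let size = write_payload d [ "hello "; "world" ] in
  (* A crash at this point leaves the header block zeroed. *)
  assert (not (has_entry d 0));
  commit_header d ~name:"pkg.tar.gz" ~size;
  assert (has_entry d 0)
```

The sketch does no bounds checking and ignores real header layout and checksums; it only shows the ordering.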

Now, the question is what the API should be. Maybe: val set_streaming : t -> key -> size:int -> (bytes -> offset:int -> length:int -> (unit, error) Lwt.t) -> (unit, error) Lwt.t

Or should we rely on Lwt_stream.t? Maybe val set_stream : t -> key -> string Lwt_stream.t -> (unit, error) Lwt.t is a neat way to go?
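
To compare the two shapes, here is a hedged sketch of how a fill-style callback could be adapted into a sequence of chunks. It is synchronous and stdlib-only: Seq.t stands in for Lwt_stream.t, plain result for the Lwt promise, and the error type is just a string. The names fill and chunks_of_fill are hypothetical. Note that the adapter owns the buffer and picks the chunk size, and it copies each chunk out before reusing the buffer.

```ocaml
(* Synchronous stand-in for the proposed callback: fill [length] bytes of the
   buffer starting at [offset]. Hypothetical type, not a tar_mirage API. *)
type fill = Bytes.t -> offset:int -> length:int -> (unit, string) result

(* Turn a fill callback plus a total size into a lazy sequence of chunks.
   The adapter owns [buf] and chooses [chunk_size]; each chunk is copied out
   with Bytes.sub_string, so consumers never see the reused buffer. *)
let chunks_of_fill ~size ~chunk_size (fill : fill) :
    (string, string) result Seq.t =
  if chunk_size <= 0 then invalid_arg "chunks_of_fill: chunk_size";
  let buf = Bytes.create chunk_size in
  let rec from remaining () =
    if remaining <= 0 then Seq.Nil
    else
      let length = min chunk_size remaining in
      match fill buf ~offset:0 ~length with
      | Error e -> Seq.Cons (Error e, Seq.empty) (* stop at first error *)
      | Ok () ->
          Seq.Cons (Ok (Bytes.sub_string buf 0 length), from (remaining - length))
  in
  from size

let () =
  let src = "abcdefghij" and pos = ref 0 in
  let fill buf ~offset ~length =
    Bytes.blit_string src !pos buf offset length;
    pos := !pos + length;
    Ok ()
  in
  let out = List.of_seq (chunks_of_fill ~size:10 ~chunk_size:4 fill) in
  assert (out = [ Ok "abcd"; Ok "efgh"; Ok "ij" ])
```

The sequence is lazy and the callback is stateful, so it should be consumed exactly once.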

It is a bit unclear to me what to do about ownership of buffers/data, and also who is in charge of the size of chunks.

A "size" argument is not really necessary: we'll just keep the write_lock, so there's no other simultaneous writer, and once the stream is finished, we know exactly how much data we wrote ;).
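
A minimal sketch of that idea, under the same simplifications (synchronous, Seq.t instead of Lwt_stream.t, a Buffer.t instead of a block device, no locking). The store record and this set_stream are hypothetical stand-ins: the size is counted while consuming the stream, and the entry is recorded only once the stream is exhausted.

```ocaml
(* Toy stand-ins: Seq.t instead of Lwt_stream.t, Buffer.t instead of a block
   device. None of this is tar_mirage code. *)
type store = { payload : Buffer.t; mutable entries : (string * int) list }

let create () = { payload = Buffer.create 1024; entries = [] }

(* Consume the stream chunk by chunk, counting bytes as they are written.
   The (key, size) entry is recorded only after the last chunk, mirroring
   "header last": no size needs to be known up front. *)
let set_stream t key (chunks : string Seq.t) =
  let written =
    Seq.fold_left
      (fun n chunk ->
        Buffer.add_string t.payload chunk;
        n + String.length chunk)
      0 chunks
  in
  t.entries <- (key, written) :: t.entries;
  written

let () =
  let t = create () in
  let n = set_stream t "archive" (List.to_seq [ "abc"; "defg"; "" ]) in
  assert (n = 7);
  assert (List.assoc "archive" t.entries = 7)
```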

Another question is whether a similar read interface is useful and worth implementing (i.e. val get_stream : t -> key -> (string Lwt_stream.t, error) Lwt.t)? Again, who should control the chunk size, and what should be done with intermediate read errors?
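
One possible answer, sketched under the same assumptions (synchronous, stdlib Seq.t instead of Lwt_stream.t): the caller passes the chunk size, and an intermediate read error is delivered as the last element of the stream. This get_stream reads from a plain string, and the fail_at parameter and the error type are hypothetical, added only to simulate a device error.

```ocaml
(* Hypothetical error type for this sketch only. *)
type error = [ `Read_failed of int ]

(* Yield [value] in chunks of at most [chunk_size] bytes. If [fail_at] is
   given, simulate a read error for the chunk covering that byte offset:
   the error becomes the final element and the stream ends there. *)
let get_stream ?fail_at ~chunk_size (value : string) :
    (string, error) result Seq.t =
  if chunk_size <= 0 then invalid_arg "get_stream: chunk_size";
  let len = String.length value in
  let rec from off () =
    if off >= len then Seq.Nil
    else
      match fail_at with
      | Some bad when bad >= off && bad < off + chunk_size ->
          Seq.Cons (Error (`Read_failed bad), Seq.empty)
      | _ ->
          let n = min chunk_size (len - off) in
          Seq.Cons (Ok (String.sub value off n), from (off + n))
  in
  from 0

let () =
  let chunks = List.of_seq (get_stream ~chunk_size:4 "0123456789") in
  assert (chunks = [ Ok "0123"; Ok "4567"; Ok "89" ]);
  let failed = List.of_seq (get_stream ~fail_at:5 ~chunk_size:4 "0123456789") in
  assert (failed = [ Ok "0123"; Error (`Read_failed 5) ])
```

With this shape a consumer has already seen the earlier chunks when the error arrives, so it must be prepared to discard partial data.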

I guess some more research / looking at other streaming APIs may be beneficial. If someone has nice pointers, please feel free to suggest them.
