[pcloud] OpenWriterAt: Default values for multi threaded upload #8225

Open
gfelbing opened this issue Dec 3, 2024 · 8 comments

@gfelbing
Contributor

gfelbing commented Dec 3, 2024

The associated forum post URL from https://forum.rclone.org

Follow up on #8147

What is your current rclone version (output from rclone version)?

Latest master

What problem are you trying to solve?

As @ncw described it:

> We don't currently have a way for a backend to give a hint as to what chunk size it prefers for OpenWriterAt, only for OpenChunkWriter. It would be easy enough to make one though and I think it is a good idea.

How do you think rclone should be changed to solve that?

As far as I see there are two approaches:

  1. Add a way for backends to give hints for buffer size etc. in OpenWriterAt. (Similar to ChunkWriterInfo for OpenChunkWriter)
  2. Implement OpenChunkWriter in pcloud.

IMHO 1/ is more invasive to the code base but cleaner and consistent with the ChunkWriter. Your thoughts @ncw?
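
For illustration, approach 1 could look roughly like the sketch below. The names and the extra return value are invented for the example and are not the current interface:

	// Sketch only: OpenWriterAt could hand back a hint struct alongside the
	// writer, the way OpenChunkWriter already hands back its ChunkWriterInfo.
	// Zero values would mean "no preference, use the global --multi-thread-* defaults".
	type WriterAtInfo struct {
		ChunkSize   int64 // preferred chunk size in bytes
		Concurrency int   // preferred number of parallel streams
	}

	type OpenWriterAtInfoer interface {
		OpenWriterAt(ctx context.Context, remote string, size int64) (fs.WriterAtCloser, WriterAtInfo, error)
	}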

@ncw
Member

ncw commented Dec 5, 2024

I think option 1 is probably the best, but how to do it neatly...

There are some related problems with multi thread transfers. For example the S3 backend has --s3-upload-cutoff which is used to switch between single part and multi part, but under certain conditions the --multi-thread-cutoff flag is used instead. This is super confusing to users.

The same situation happens with concurrency, and now here with chunk size.

So maybe what we need is an info struct with these three things in it. This could maybe be part of the Features() struct, I'm not sure.

We need to know whether these are set or not (to use the global defaults) so they need to default to an invalid value. Zero is probably ok.
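
For concreteness, a rough sketch of what that struct might look like if it hung off Features(); all names here are invented and zero means "not set, fall back to the global default":

	// Hypothetical multi-thread hints a backend could expose via Features().
	// Zero values mean the backend has no preference.
	type MultiThreadHints struct {
		Cutoff      fs.SizeSuffix // switch point between single part and multi thread upload
		ChunkSize   fs.SizeSuffix // preferred chunk size
		Concurrency int           // preferred number of parallel streams
	}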

I think that would solve your problem and also some other ones which have been annoying the users!

We would need to add a flag to the pcloud backend so it could be independently controlled.

How does that sound?

@gfelbing
Contributor Author

gfelbing commented Dec 7, 2024

Hey @ncw,
I thought a bit about this: the idea to streamline the arguments sounds really good, especially for cases like the S3 upload cutoff you described.
One thing I see differently is the part about the additional options for the backend:

  • that could again confuse users, similar to your example
  • we would need to come up with logic for which setting wins: option default vs global default, option setting vs global setting. That again might be confusing.

What I outlined in PR #8226 is that the backend would only provide its own default for the global option.
That way the process is simpler: the chosen value is the first valid of

  • the option chosen by the user
  • the option suggested by the backend
  • the global default
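
A minimal sketch of that selection logic, assuming zero means "not set" (helper name invented):

	// firstNonZero returns the first value that is actually set: the user's
	// choice, then the backend's suggestion, then the global default.
	func firstNonZero(user, backend, globalDefault fs.SizeSuffix) fs.SizeSuffix {
		if user != 0 {
			return user
		}
		if backend != 0 {
			return backend
		}
		return globalDefault
	}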

What do you think?

@ncw
Member

ncw commented Dec 8, 2024

> That way the process is simpler: the chosen value is the first valid of
>
>   • the option chosen by the user
>   • the option suggested by the backend
>   • the global default

That is a good way of thinking about it - how to explain to the user what is happening. There is some complication in the above though...

  • the option chosen by the user - as set by --multi-thread-chunk-size, --multi-thread-cutoff, --multi-thread-streams
  • the option suggested by the backend - this can also be set by the user as --s3-chunk-size, --s3-upload-cutoff, --s3-upload-concurrency
  • the global default - these are the default values of the --multi-thread* variables

There are two ways we can do multi thread copies

  1. OpenWriterAt
  2. OpenChunkWriter

Currently with OpenChunkWriter the backend specifies a chunk size and that gets used. Due to the way that protocol works, that is probably the only way it can be done. The cutoff and the streams are a bit of a mess though.

Currently with OpenWriterAt the backend has no say at all and the values of the --multi-thread* variables are used.

My concern with the above scheme is that the backend will most likely always specify a value for chunk size, cutoff and concurrency, so I think we need something a bit more complicated:

  • the option chosen by the user - as set by --multi-thread-chunk-size, --multi-thread-cutoff, --multi-thread-streams - if and only if this is changed from the default value
  • the option suggested by the backend - this can also be set by the user as --s3-chunk-size, --s3-upload-cutoff, --s3-upload-concurrency
  • the global default - these are the default values of the --multi-thread* variables

Detecting whether a value has been changed from the default isn't particularly easy with the rclone config system unfortunately - the config system provides either the value the user set or a default. Detecting whether it differs from the default will require a bit of extra work, especially considering the defaults can be set with environment variables.
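
For illustration only, this is the kind of check that would be needed; the config system doesn't offer it today, and environment-variable defaults make it harder than the naive version below:

	// changedFromDefault reports whether the effective value differs from the
	// compiled-in default. Naive sketch: it cannot distinguish a user who
	// explicitly passed the default value from one who didn't set the flag at
	// all, and it ignores defaults overridden via environment variables.
	func changedFromDefault[T comparable](effective, compiledDefault T) bool {
		return effective != compiledDefault
	}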

Another alternative would be:

  • We use the maximum of the value set by the user and the value set by the backend, e.g.
    • max(--multi-thread-chunk-size, --s3-chunk-size)
    • max(--multi-thread-cutoff, --s3-upload-cutoff)
    • max(--multi-thread-streams, --s3-upload-concurrency)

That would be much easier to implement but doesn't let the user lower values easily.
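
In code the merge itself would be trivial (sketch, helper name invented):

	// mergeMax takes the larger of the user/global value and the backend value.
	// Easy to implement, but the user cannot force a value below the backend's.
	func mergeMax(global, backend fs.SizeSuffix) fs.SizeSuffix {
		if backend > global {
			return backend
		}
		return global
	}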

@gfelbing
Contributor Author

gfelbing commented Dec 8, 2024

Agreed, the max option would be the easiest to implement, but not being able to lower values could easily be a really bad thing for users. It would also be quite surprising; I've never seen a CLI act like that.

My take was actually to remove the option from the backend (e.g. --s3-chunk-size) entirely.
That would leave us with three options that can easily be ordered hierarchically:

  • user chosen value
  • suggestion by backend
  • global default

The only catch is that we need to detect whether the option was set or not. As you already said, it's not exactly convenient, but it's possible. We could make it a bit nicer code-wise by abstracting it into a function though.

On the other hand, we could also turn it around by removing the global option and relying solely on the backends to offer an option. Then it would also be easy implementation-wise:

  • add an option with a default suitable for the backend
  • return the values via an info struct on OpenWriterAt

The configuration would then be the first of

  • option set by user
  • default of the option specific for backend

You can hardly go simpler.
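
For illustration, the backend side could be as small as one extra entry in the option list the pcloud backend already registers; the option name and the 16 MiB default below are made up:

	// Illustrative only: an option that could be appended to the pcloud
	// backend's Options slice in fs.Register. Name and default are invented.
	var uploadChunkSizeOption = fs.Option{
		Name:     "upload_chunk_size",
		Help:     "Chunk size used for multi-thread uploads via OpenWriterAt.",
		Default:  fs.SizeSuffix(16 << 20), // 16 MiB, picked only for the example
		Advanced: true,
	}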

@ncw
Member

ncw commented Dec 14, 2024

> Agreed, the max option would be the easiest to implement, but not being able to lower values could easily be a really bad thing for users. It would also be quite surprising; I've never seen a CLI act like that.

Agreed

> My take was actually to remove the option from the backend (e.g. --s3-chunk-size) entirely. That would leave us with three options that can easily be ordered hierarchically:
>
>   • user chosen value
>   • suggestion by backend
>   • global default
>
> The only catch is that we need to detect whether the option was set or not. As you already said, it's not exactly convenient, but it's possible. We could make it a bit nicer code-wise by abstracting it into a function though.

I don't think we can remove the backend options. They've been around for a long time and for most of that time were the only way of controlling multipart copies.

> On the other hand, we could also turn it around by removing the global option and relying solely on the backends to offer an option. Then it would also be easy implementation-wise:
>
>   • add an option with a default suitable for the backend
>   • return the values via an info struct on OpenWriterAt
>
> The configuration would then be the first of
>
>   • option set by user
>   • default of the option specific for backend
>
> You can hardly go simpler.

I like this idea better.

This would effectively make the --multi-thread-* options obsolete. We can't remove them (backwards compatibility), but they could emit a warning.

I think it is better to be able to get the info struct separately from OpenWriterAt though, because by the time we have committed to calling it, it is too late to decide on the cutoff.

I'll have a noodle with some code and see if it gives me any ideas!

@IZSkiSurfer

IZSkiSurfer commented Mar 15, 2025

Is there currently a recommended size for --multi-thread-chunk-size when using pCloud?
With the current version I'm still facing issues with large files if I don't disable OpenWriterAt.

e.g.

2025/03/15 17:06:37 DEBUG : __filename__: multi-thread copy: chunk 3/30 failed: multi-thread copy: failed to write chunk: open file: open new file descriptor: pcloud error: Internal upload error. (5001)
2025/03/15 17:06:37 DEBUG : __filename__: multi-thread copy: chunk 5/30 (268435456-335544320) size 64Mi starting
2025/03/15 17:06:37 DEBUG : __filename__.60633b35.partial: writing chunk 4
2025/03/15 17:06:37 DEBUG : __filename__: multi-thread copy: chunk 5/30 failed: multi-thread copy: failed to write chunk: context canceled
2025/03/15 17:06:37 DEBUG : pacer: low level retry 1/1 (error pcloud error: Internal upload error. (5001))
2025/03/15 17:06:37 DEBUG : pacer: Rate limited, increasing sleep to 40ms
2025/03/15 17:06:37 DEBUG : __filename__: multi-thread copy: chunk 2/30 failed: multi-thread copy: failed to write chunk: open file: open new file descriptor: pcloud error: Internal upload error. (5001)
2025/03/15 17:06:37 DEBUG : pacer: Reducing sleep to 30ms
2025/03/15 17:06:37 DEBUG : pacer: Reducing sleep to 22.5ms
2025/03/15 17:06:37 DEBUG : pacer: Reducing sleep to 16.875ms
2025/03/15 17:06:37 DEBUG : pacer: low level retry 1/1 (error pcloud error: Internal upload error. (5001))
2025/03/15 17:06:37 DEBUG : pacer: Rate limited, increasing sleep to 33.75ms
2025/03/15 17:06:37 DEBUG : __filename__: multi-thread copy: chunk 1/30 failed: multi-thread copy: failed to write chunk: open file: open new file descriptor: pcloud error: Internal upload error. (5001)
2025/03/15 17:06:37 DEBUG : pacer: Reducing sleep to 25.3125ms
2025/03/15 17:06:37 DEBUG : pacer: Reducing sleep to 18.984375ms
2025/03/15 17:06:37 DEBUG : pacer: Reducing sleep to 14.238281ms
2025/03/15 17:06:37 DEBUG : pacer: Reducing sleep to 10.67871ms
2025/03/15 17:06:37 DEBUG : pacer: low level retry 1/1 (error pcloud error: Internal upload error. (5001))
2025/03/15 17:06:37 DEBUG : pacer: Rate limited, increasing sleep to 21.35742ms
2025/03/15 17:06:37 DEBUG : __filename__: multi-thread copy: chunk 4/30 failed: multi-thread copy: failed to write chunk: open file: open new file descriptor: pcloud error: Internal upload error. (5001)
2025/03/15 17:06:37 DEBUG : __filename__: multi-thread copy: cancelling transfer on exit
2025/03/15 17:06:37 DEBUG : __filename__.60633b35.partial: checking file size: try 0/5
2025/03/15 17:06:37 DEBUG : pacer: Reducing sleep to 16.018065ms
2025/03/15 17:06:37 DEBUG : pacer: Reducing sleep to 12.013548ms
2025/03/15 17:06:37 DEBUG : pacer: Reducing sleep to 10ms
2025/03/15 17:06:38 DEBUG : __filename__.60633b35.partial: checking file size: try 1/5
2025/03/15 17:06:40 DEBUG : __filename__.60633b35.partial: checking file size: try 2/5
2025/03/15 17:06:41 DEBUG : __filename__.60633b35.partial: checking file size: try 3/5
2025/03/15 17:06:42 DEBUG : __filename__.60633b35.partial: checking file size: try 4/5
2025/03/15 17:06:43 ERROR : __filename__.60633b35.partial: multi-thread copy: failed to close file before aborting: incorrect size after upload: got 204341248, want 1996672259
2025/03/15 17:06:43 DEBUG : __filename__: Received error: multi-thread copy: failed to write chunk: open file: open new file descriptor: pcloud error: Internal upload error. (5001) - low level retry 4/10
2025/03/15 17:06:43 DEBUG : __filename__: multi-thread copy: disabling buffering because destination uses OpenWriterAt
2025/03/15 17:06:43 DEBUG : __filename__: multi-thread copy: write buffer set to 131072
2025/03/15 17:06:43 DEBUG : __filename__: multi-thread copy: using backend concurrency of 4 instead of --multi-thread-streams 4
2025/03/15 17:06:43 DEBUG : __filename__: Starting multi-thread copy with 30 chunks of size 64Mi with 4 parallel streams
2025/03/15 17:06:43 DEBUG : __filename__: multi-thread copy: chunk 4/30 (201326592-268435456) size 64Mi starting
2025/03/15 17:06:43 DEBUG : __filename__: multi-thread copy: chunk 1/30 (0-67108864) size 64Mi starting
2025/03/15 17:06:43 DEBUG : __filename__: multi-thread copy: chunk 2/30 (67108864-134217728) size 64Mi starting
2025/03/15 17:06:43 DEBUG : __filename__: multi-thread copy: chunk 3/30 (134217728-201326592) size 64Mi starting
2025/03/15 17:06:43 DEBUG : __filename__.60633b35.partial: writing chunk 1
2025/03/15 17:06:43 DEBUG : __filename__.60633b35.partial: writing chunk 0
2025/03/15 17:06:43 DEBUG : __filename__.60633b35.partial: writing chunk 3
2025/03/15 17:06:43 DEBUG : __filename__.60633b35.partial: writing chunk 2

@IZSkiSurfer

I tried the recommended 16Mi, which I found in a pull request (with the default 4 streams).
I checked one file in the trash, which showed 2.0 GB in size on the pCloud website.
But the log showed:

Failed to copy: multi-thread copy: failed to close object after copy: incorrect size after upload: got 919728128, want 2173798452

Would checking the checksum instead of the file size maybe work more reliably here:

if obj.Size() == c.size {

I don't know when pCloud provides the (hopefully correct) hash and/or size, but at least for the size it reports a wrong value for a certain amount of time. Maybe increasing the 5 retries (or the delay between them) would help too.
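
For illustration, this is roughly the kind of check meant here; obj and c come from the snippet above, hash.SHA1 is rclone's SHA-1 hash type, and the srcSHA1 field is invented (the expected hash would have to be plumbed in somehow):

	// Sketch only: accept the upload once the remote object's SHA-1 matches
	// the source, instead of waiting for pCloud's size metadata to catch up.
	remoteSHA1, err := obj.Hash(c.ctx, hash.SHA1)
	if err == nil && remoteSHA1 != "" && remoteSHA1 == c.srcSHA1 {
		return nil // content verified even though the reported size still lags
	}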

@IZSkiSurfer

I fiddled around a little bit... see logs down below.
It looks like it takes about 60 seconds until the size and/or checksum match.
So the 5 (or 6) (re)tries with a 1 second delay will fail.
I don't know whether it depends on the file size, the number of files in a folder, or some other bad magic on the pCloud server side.

I'm not that into the codebase (yet) but couldn't we maybe do

	// get target hash
	outChecksum, err := fileChecksum(c.ctx, client, c.fs.pacer, openResult.FileDescriptor, offset, int64(contentLength))

again after

	// upload buffer with offset if necessary
	if _, err := filePWrite(c.ctx, client, c.fs.pacer, openResult.FileDescriptor, offset, buffer); err != nil {

and check the SHA-1 hash of each individual chunk (assuming that works more reliably and faster?) instead of the final size check? If all chunks were uploaded without an error and each individual chunk's hash matches the source, then the file was most likely uploaded successfully and we could skip the long and unreliable final size check (where a hash check would be better anyway).
What could also be worth a shot is moving/renaming the "*.partial" file on the remote before checking the size; maybe that action triggers a faster refresh of the size/checksum on the pCloud server side.
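
Spelled out, the per-chunk idea would look roughly like the fragment below, built around the two snippets quoted above; the local SHA-1 computation and the SHA1 field on fileChecksum's result are assumptions, not current code:

	// Sketch of the per-chunk check, not the current code.
	localSHA1 := fmt.Sprintf("%x", sha1.Sum(buffer))

	// upload buffer with offset if necessary
	if _, err := filePWrite(c.ctx, client, c.fs.pacer, openResult.FileDescriptor, offset, buffer); err != nil {
		return 0, fmt.Errorf("write chunk: %w", err)
	}

	// re-read the checksum of the range just written and compare it with the
	// source chunk before moving on (assumes the response exposes a SHA1 field)
	chunkSum, err := fileChecksum(c.ctx, client, c.fs.pacer, openResult.FileDescriptor, offset, int64(len(buffer)))
	if err != nil {
		return 0, fmt.Errorf("checksum chunk: %w", err)
	}
	if chunkSum.SHA1 != localSHA1 {
		return 0, fmt.Errorf("chunk checksum mismatch: got %s, want %s", chunkSum.SHA1, localSHA1)
	}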

2025/03/16 00:38:06 DEBUG : __filename__: multi-thread copy: chunk 126/128 (2097152000-2113929216) size 16Mi finished
2025/03/16 00:38:06 DEBUG : __filename__: multi-thread copy: chunk 124/128 (2063597568-2080374784) size 16Mi finished
2025/03/16 00:38:07 DEBUG : __filename__: multi-thread copy: chunk 128/128 (2130706432-2147340288) size 15.863Mi finished
2025/03/16 00:38:08 DEBUG : __filename__: multi-thread copy: chunk 127/128 (2113929216-2130706432) size 16Mi finished
2025/03/16 00:38:08 DEBUG : __filename__.429ed3dc.partial: checking file size: try 1/10
2025/03/16 00:38:15 DEBUG : __filename__.429ed3dc.partial: checking file hash: want b99c858b3d22c04cd7bc90bd36f2f088c326b6b8
2025/03/16 00:38:15 DEBUG : __filename__.429ed3dc.partial: checking file hash: got da39a3ee5e6b4b0d3255bfef95601890afd80709, want b99c858b3d22c04cd7bc90bd36f2f088c326b6b8
2025/03/16 00:38:15 DEBUG : __filename__.429ed3dc.partial: checking file size: got 0, want 2147340288
2025/03/16 00:38:16 DEBUG : __filename__.429ed3dc.partial: checking file size: try 2/10
2025/03/16 00:38:16 DEBUG : __filename__.429ed3dc.partial: checking file hash: want b99c858b3d22c04cd7bc90bd36f2f088c326b6b8
2025/03/16 00:38:16 DEBUG : __filename__.429ed3dc.partial: checking file hash: got da39a3ee5e6b4b0d3255bfef95601890afd80709, want b99c858b3d22c04cd7bc90bd36f2f088c326b6b8
2025/03/16 00:38:16 DEBUG : __filename__.429ed3dc.partial: checking file size: got 0, want 2147340288
2025/03/16 00:38:18 DEBUG : __filename__.429ed3dc.partial: checking file size: try 3/10
2025/03/16 00:38:18 DEBUG : __filename__.429ed3dc.partial: checking file hash: want b99c858b3d22c04cd7bc90bd36f2f088c326b6b8
2025/03/16 00:38:18 DEBUG : __filename__.429ed3dc.partial: checking file hash: got da39a3ee5e6b4b0d3255bfef95601890afd80709, want b99c858b3d22c04cd7bc90bd36f2f088c326b6b8
2025/03/16 00:38:18 DEBUG : __filename__.429ed3dc.partial: checking file size: got 0, want 2147340288
2025/03/16 00:38:22 DEBUG : __filename__.429ed3dc.partial: checking file size: try 4/10
2025/03/16 00:38:22 DEBUG : __filename__.429ed3dc.partial: checking file hash: want b99c858b3d22c04cd7bc90bd36f2f088c326b6b8
2025/03/16 00:38:22 DEBUG : __filename__.429ed3dc.partial: checking file hash: got da39a3ee5e6b4b0d3255bfef95601890afd80709, want b99c858b3d22c04cd7bc90bd36f2f088c326b6b8
2025/03/16 00:38:22 DEBUG : __filename__.429ed3dc.partial: checking file size: got 0, want 2147340288
2025/03/16 00:38:30 DEBUG : __filename__.429ed3dc.partial: checking file size: try 5/10
2025/03/16 00:38:30 DEBUG : __filename__.429ed3dc.partial: checking file hash: want b99c858b3d22c04cd7bc90bd36f2f088c326b6b8
2025/03/16 00:38:30 DEBUG : __filename__.429ed3dc.partial: checking file hash: got da39a3ee5e6b4b0d3255bfef95601890afd80709, want b99c858b3d22c04cd7bc90bd36f2f088c326b6b8
2025/03/16 00:38:30 DEBUG : __filename__.429ed3dc.partial: checking file size: got 0, want 2147340288
2025/03/16 00:38:40 INFO  : 
Transferred:        2.000 GiB / 2.000 GiB, 100%, 309.546 KiB/s, ETA 0s
Transferred:            0 / 1, 0%
Elapsed time:      9m59.9s
Transferring:
 *                            __filename__:100% /2.000Gi, 308.463Ki/s, 0s

2025/03/16 00:38:46 DEBUG : __filename__.429ed3dc.partial: checking file size: try 6/10
2025/03/16 00:38:47 DEBUG : __filename__.429ed3dc.partial: checking file hash: want b99c858b3d22c04cd7bc90bd36f2f088c326b6b8
2025/03/16 00:38:47 DEBUG : __filename__.429ed3dc.partial: checking file hash: got da39a3ee5e6b4b0d3255bfef95601890afd80709, want b99c858b3d22c04cd7bc90bd36f2f088c326b6b8
2025/03/16 00:38:47 DEBUG : __filename__.429ed3dc.partial: checking file size: got 0, want 2147340288
2025/03/16 00:39:19 DEBUG : __filename__.429ed3dc.partial: checking file size: try 7/10
2025/03/16 00:39:19 DEBUG : __filename__.429ed3dc.partial: checking file hash: want b99c858b3d22c04cd7bc90bd36f2f088c326b6b8
2025/03/16 00:39:19 DEBUG : __filename__.429ed3dc.partial: checking file hash: got b99c858b3d22c04cd7bc90bd36f2f088c326b6b8, want b99c858b3d22c04cd7bc90bd36f2f088c326b6b8
2025/03/16 00:39:19 DEBUG : __filename__.429ed3dc.partial: checking file size: got 2147340288, want 2147340288
2025/03/16 00:39:19 DEBUG : __filename__: Finished multi-thread copy with 128 parts of size 16Mi
2025/03/16 00:39:19 DEBUG : __filename__: sha1 = b99c858b3d22c04cd7bc90bd36f2f088c326b6b8 OK
2025/03/16 00:39:19 DEBUG : __filename__.429ed3dc.partial: renamed to: __filename__
2025/03/16 00:39:19 INFO  : __filename__: Multi-thread Copied (new)
2025/03/16 00:39:19 INFO  :
Transferred:        2.000 GiB / 2.000 GiB, 100%, 24.980 KiB/s, ETA 0s
Transferred:            1 / 1, 100%
Elapsed time:     10m38.5s
