
Seq write #218

Open
wants to merge 2 commits into base: next
Conversation

@ashishsangwan2 (Author)

@dang @ffilz @malahal

Can you please review the changesets?

This is inspired by the Linux kernel client changeset:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/net/sunrpc/xprtsock.c?id=8d6f97d698b2b9e71124ba95688a6bf1adad5055

Also, for the current changeset, I haven't yet submitted the Ganesha change for setting the TIRPC conf.
Although the default value of cont_recv_limit is 64, it gets reset to 0 by tirpc_control() once Ganesha starts.
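
For reference, a minimal sketch of what that Ganesha-side change might look like, assuming the patch adds a cont_recv_limit field to the tirpc_pkg_params structure handled by tirpc_control() (the field name and the GET/PUT flow here are illustrative, not the submitted code):

```c
/* Hypothetical Ganesha-side change: override the cont_recv_limit that
 * would otherwise be reset to 0 at startup. Assumes this patch adds a
 * cont_recv_limit field to tirpc_pkg_params. */
#include <rpc/rpc.h>

static tirpc_pkg_params ntirpc_pp;

static void set_tirpc_cont_recv_limit(unsigned int limit)
{
	/* Fetch the current package parameters, change only the new
	 * field, and push the whole structure back. */
	tirpc_control(TIRPC_GET_PARAMETERS, &ntirpc_pp);
	ntirpc_pp.cont_recv_limit = limit;	/* hypothetical new field */
	tirpc_control(TIRPC_PUT_PARAMETERS, &ntirpc_pp);
}
```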

…RPCs

Current scenario:
The xprt socket fd is monitored via the epoll mechanism for the arrival of data on the socket.
Once data arrives, an epoll event is generated and the socket fd is removed from the epoll fd.
A new thread is chosen to monitor the epoll fd while the current thread receives one RPC's worth of data.
Once the data is received, the current thread adds the socket fd back to the epoll fd
so that another thread can start receiving data while the current thread continues executing the RPC.

Changed scenario:
We don't unconditionally add the socket fd back to the epoll fd once the data is read from the socket.
We look at the RPC size: if it's more than 4KB (which can only happen in the case of a write RPC)
and there is still data available to be read on the socket fd, the current thread
keeps doing recv while dispatching the requests asynchronously to svc_request_async().
At most we recv max_cont_recv requests continuously (defaults to 64) in a single thread.
With this patch we are seeing a consistent seq write perf improvement in the range of 15-20%.
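
To make the control flow concrete, here is a rough sketch of the changed recv path (not the actual patch); recv_one_rpc(), rearm_epoll(), and the call shape of svc_request_async() are hypothetical stand-ins for the real svc_vc_recv() internals:

```c
/* Rough sketch of the changed recv loop; helper names are hypothetical. */
#include <sys/ioctl.h>
#include <stdbool.h>
#include <stddef.h>

extern size_t recv_one_rpc(int fd);	/* read one RPC record off the socket */
extern void svc_request_async(int fd);	/* dispatch the request asynchronously */
extern void rearm_epoll(int fd);	/* add the socket fd back to the epoll fd */

static bool more_data_on_socket(int fd)
{
	int avail = 0;

	/* FIONREAD reports how many bytes are already queued on the socket. */
	return ioctl(fd, FIONREAD, &avail) == 0 && avail > 0;
}

static void svc_vc_recv_loop(int fd, unsigned int cont_recv_limit)
{
	unsigned int n = 0;

	do {
		size_t rpc_len = recv_one_rpc(fd);

		svc_request_async(fd);

		/* Keep receiving only for large (write) RPCs while data is
		 * still queued, up to cont_recv_limit requests per thread.
		 * cont_recv_limit == 0 restores the old one-RPC behaviour. */
		if (rpc_len <= 4096 || !more_data_on_socket(fd))
			break;
	} while (++n < cont_recv_limit);

	rearm_epoll(fd);
}
```
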
@dang (Collaborator) left a comment

This raises issues that need to be discussed. The current scheme (one RPC per epoll iteration) is our only method of ensuring fairness between clients. If we continue to read from a single socket, it breaks this fairness guarantee. Can we be sure this will be "fair enough"? It seems a "few" write-heavy clients could starve other clients out.

@dang requested review from ffilz and mattbenjamin, July 20, 2020 13:16
@ashishsangwan2 (Author) commented Jul 20, 2020

> If we continue to read from a single socket, it breaks this fairness guarantee

That is correct. That is why I have added this as a conf param; the old behaviour can be achieved by setting cont_recv_limit to 0. Also, I am looking for suggestions on how to keep this as fair as possible while still retaining the performance improvement.

@mattbenjamin

@dang @ashishsangwan2 nfs-ganesha actually preferred to continue reading on active sockets, but as @dang noted, fairness was a problem in older ganesha. I think the configurable approach here is fine, but possibly the default coefficient of 64 is a bit large?

@ashishsangwan2 (Author) commented Jul 20, 2020

One thing we can do is return the number of executing + pending requests to the caller of work_pool_submit(), i.e. svc_vc_recv(). Based on that, we can decide whether we should continue recv or yield.
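
A sketch of what that could look like, assuming work_pool_submit()'s return value is repurposed to report load (the counter fields and enqueue step are illustrative, not ntirpc's actual internals):

```c
/* Illustrative only: have work_pool_submit() return executing + pending
 * work instead of a plain status, so svc_vc_recv() can decide whether
 * to keep receiving or yield. */
#include <stdatomic.h>

struct work_pool_entry;			/* opaque here */

struct work_pool {
	atomic_int executing;		/* requests currently running */
	atomic_int pending;		/* requests queued, not yet started */
	/* ... (pending would be decremented when a worker picks up) */
};

extern void enqueue(struct work_pool *pool, struct work_pool_entry *work);

int work_pool_submit(struct work_pool *pool, struct work_pool_entry *work)
{
	int pending = atomic_fetch_add(&pool->pending, 1) + 1;

	enqueue(pool, work);		/* hypothetical queuing step */
	return atomic_load(&pool->executing) + pending;
}
```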

@ashishsangwan2 (Author) commented Jul 20, 2020

Thanks @mattbenjamin for the reply. What, in your opinion, would be a fair default value? Also, I would like to know your thoughts on returning the pending-request count from work_pool_submit() and making it a factor in the decision to do continuous recv(): for example, if (pending_requests < x * ioq_thread_max), continue doing recv, where x could be 1 or 2.

@mattbenjamin

default value: maybe 4 or 8? It sounds helpful to consider pending_requests, but I'm not sure about the tuning. Maybe a fractional value would be safest.

@ashishsangwan2 (Author) commented Jul 20, 2020

Here's a quick summary of the performance numbers I got on my setup with different values of cont_recv_limit:

| cont_recv_limit | Seq throughput |
| --- | --- |
| 0 | 740 MB/s |
| 4 | 788 MB/s |
| 8 | 800 MB/s |
| 16 | 820 MB/s |
| 24 | 840 MB/s |
| 1000 | 900 MB/s |

As a next step I will make these changes (see the sketch below):
a) Set the default to 8.
b) Also check the number of pending requests when deciding whether to continue receiving; the threshold will be 50% of ioq_thread_max.
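
Putting (a) and (b) together, the decision could look something like the following sketch (all names illustrative):

```c
/* Sketch of the combined gate: continue receiving only while under
 * both the cont_recv_limit cap (default 8) and a pending-request
 * threshold of 50% of ioq_thread_max. */
#include <stdbool.h>

static bool continue_recv(unsigned int recvd, unsigned int cont_recv_limit,
			  int pending, int ioq_thread_max)
{
	if (cont_recv_limit == 0)	/* 0 restores the old behaviour */
		return false;
	if (recvd >= cont_recv_limit)	/* per-thread cap, default 8 */
		return false;
	/* Back off when the work pool already has plenty queued, so
	 * write-heavy clients can't starve everyone else. */
	return pending < ioq_thread_max / 2;
}
```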

@mattbenjamin

Also of interest, though, is the effect on concurrent clients.

@ashishsangwan2 (Author) commented Jul 20, 2020

> Also of interest, though, is the effect on concurrent clients.

@mattbenjamin
I think that once we take the number of pending requests into account, it will behave like the old behaviour under load scenarios originating from concurrent clients. I will take perf numbers with multiple clients doing parallel writes. If you could suggest any other particular test cases, I can try those.

@dang (Collaborator) commented Jul 21, 2020

Could you have some clients doing writes, and others doing something else, such as listing large directories? That's a nicely gated round-trip workload that could be easily starved.

@ashishsangwan2 (Author)

@dang OK, I am going to measure the directory listing time (in the next week or so) for a directory with 100K entries while 0, 2, 4, and 6 clients do parallel sequential writes, with and without the patch.

@ffilz (Member) commented May 14, 2024

I think we should abandon this one. The submitter has not gotten back to us.

@ffilz added the wontfix label, May 14, 2024
@dang (Collaborator) commented May 15, 2024

Let's leave it for the moment, for future reference when we revisit queuing and whatnot.

@ffilz added the enhancement label and removed the wontfix label, May 15, 2024