Returning large number of rows in SimpleQueryHandler::do_query() is extremely slow #37
Comments
Thank you for reporting. I haven't had a chance to work on pgwire's performance yet. I will do some profiling to find the bottleneck. Contributions are welcome if you are interested in this part.
In #94 I'm adding
Oh, that's really cool.
I was looking at this again. It still seems pretty slow. When executing a query with a lot of results (millions) on Postgres directly, some rows seem to be returned to the client before the query has finished executing (which gives the impression that it's quicker). With pgwire, however, nothing is shown until all results have been returned. Is this the case? My implementation of streaming to the client is identical to the datafusion example. Also, at Greptime, have you done performance testing with large result sets?
Sorry for the late response. I have been super busy these days. At Greptime we haven't covered this part of the Postgres interface. I'm going to check the code again, but IIRC we are using a stream-based API to return results to the client. It seems my datafusion example has room for improvement: for a RecordBatch, we don't need to add all results to the vector first. This might be the reason it blocks before showing results. I will find time to update the example.
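To illustrate the difference between collecting every row into a vector and handing rows over lazily, here is a minimal sketch using only the Rust standard library. It is not pgwire's or datafusion's API: `produce_rows`, `encode_row`, `respond_buffered` and `respond_streaming` are hypothetical names made up for this illustration, and a plain iterator stands in for the async stream a real handler would return.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a query result source: yields `n` single-column
/// i32 rows and counts in `produced` how many rows have been generated so far.
fn produce_rows<'a>(n: i32, produced: &'a Cell<usize>) -> impl Iterator<Item = i32> + 'a {
    (0..n).map(move |v| {
        produced.set(produced.get() + 1);
        v
    })
}

/// Hypothetical text encoding of a single row (assumption, not pgwire's encoder).
fn encode_row(v: i32) -> String {
    v.to_string()
}

/// Buffered variant: every row is generated and encoded into a Vec before
/// the caller can read the first one.
fn respond_buffered(rows: impl Iterator<Item = i32>) -> std::vec::IntoIter<String> {
    rows.map(encode_row).collect::<Vec<_>>().into_iter()
}

/// Streaming variant: rows are encoded one at a time, only when the caller
/// asks for the next item.
fn respond_streaming<'a>(
    rows: impl Iterator<Item = i32> + 'a,
) -> impl Iterator<Item = String> + 'a {
    rows.map(encode_row)
}
```

With the buffered variant, reading the first encoded row forces all rows to be produced; with the streaming variant, only one row has been produced at that point. That would match the behaviour described above, where the client sees nothing until the whole result set is ready.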
I just improved the performance of DataRowEncoder in #165, and it should have twice the throughput in some cases.
Hi there. I've been experimenting with pgwire. Using psql, when returning a large number of rows (even with a single i32 column) from SimpleQueryHandler::do_query(), the results are almost 10x slower than in Postgres.
Version: 0.7.0
OS: macOS (M1)