Slow Table List Retrieval in Many Tables #589
Comments
Hey, thanks for a very detailed issue. Indeed, I anticipated this might be the case (seafowl/src/catalog/metastore.rs, lines 148 to 152 in 5b0462f).
So basically the root of the issue is that DataFusion loads each (in our case Delta) table serially whilst generating the […]. My presumption is that this can be averted by having a bulk table load method on the schema provider (or something else). The only real reason to load each table (since we already have its name) is to find out its table type to display. In the short term we can take a look at fixing the […]
Thank you for your kind and detailed answer. I now understand that we need to request all information to verify the actual state of deleted tables. I was curious about this because SQLite has metadata, so I wondered why it would take a long time.
There's now an accompanying DataFusion issue, apache/datafusion#11865, though I'm not sure about a good long-term solution there. In the meantime, the workaround for […]
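To make the serial-loading point from the first comment concrete, here is a rough sketch. It is not Seafowl or DataFusion code: `load_table` is a hypothetical stand-in for whatever per-table work is needed to learn a table's type, and the thread pool only illustrates how a bulk load could overlap that work instead of paying for each table in turn.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def table_types_serial(
    names: Iterable[str], load_table: Callable[[str], str]
) -> Dict[str, str]:
    """Load tables one after another; total time grows with the table count."""
    return {name: load_table(name) for name in names}


def table_types_bulk(
    names: Iterable[str], load_table: Callable[[str], str], max_workers: int = 16
) -> Dict[str, str]:
    """Load tables concurrently; `max_workers` is an arbitrary illustrative value."""
    names = list(names)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input names.
        types = list(pool.map(load_table, names))
    return dict(zip(names, types))
```

Both functions return the same mapping of table name to table type; only the way the per-table loads are scheduled differs.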
Description
We are currently managing a large number of tables (964) in our database, with expectations for continued growth. We've encountered a significant performance issue when attempting to retrieve the list of tables using the `SELECT * FROM information_schema.tables` query. This operation is taking over 30 seconds to complete, which is causing concerns about efficiency and scalability.
Context
- `DROP TABLE` and `CREATE TABLE` operations do not support `IF NOT EXISTS` or `IF EXISTS` clauses
- […] `SELECT * FROM information_schema.tables` to check table existence
Impact
The slow retrieval of the table list is affecting our ability to efficiently manage and operate on our database schema. This issue may become more severe as we continue to add more tables to the database.
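One client-side way to avoid listing every table just to check existence is to attempt the statement directly and treat a "table not found" failure as a no-op. The sketch below assumes a hypothetical `execute` callable that sends one SQL statement to the database and raises on failure, and a hypothetical `is_missing_error` predicate supplied by the caller, since the exact error returned for a missing table is not stated in this issue.

```python
from typing import Callable


def drop_table_if_exists(
    execute: Callable[[str], object],
    table: str,
    is_missing_error: Callable[[Exception], bool],
) -> bool:
    """Emulate DROP TABLE IF EXISTS without querying information_schema.tables.

    Returns True if the table was dropped, False if it did not exist.
    `table` must be a trusted identifier; no quoting or escaping is done here.
    """
    try:
        execute(f"DROP TABLE {table}")
    except Exception as exc:
        if is_missing_error(exc):
            # The table was already absent, which is the outcome we wanted.
            return False
        # Any other failure is a real error and is passed on to the caller.
        raise
    return True
```

The same pattern could be applied to `CREATE TABLE` by treating an "already exists" error as success. Either way, the cost is one statement per table rather than a scan of the whole catalog.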
Questions for Investigation
- […] `information_schema.tables` query to improve its performance?
- […] `IF NOT EXISTS` and `IF EXISTS` clauses in our `CREATE TABLE` and `DROP TABLE` operations to avoid the need for this query?
Additional Information
Next Steps
We would appreciate insights from the team on:
Any additional context or suggestions for improving our database management approach would be greatly appreciated.