Performance for large databases with 10+ million entries #219
-
Hi @Typ0genius, it's hard to say without knowing more about the schema you are dealing with, but in general I think it is fair to say that using SQLite directly is generally going to be faster than Core Data, which uses SQLite under the hood but has layers of abstraction built on top. Our SQLiteData library also has abstractions built on top of SQLite, but not much beyond some type safety and decoding logic, which is the bare minimum of what you want since SQLite is largely an untyped database and its API is written in C.

There absolutely are people out there using SQLite for databases that have hundreds of millions of rows, where the database file is tens of gigabytes. The usual limiting factor of SQLite's performance is how well your schema is designed, such as making sure you have indices that turn the kinds of queries you write into fast lookups.

As one concrete example of the speed difference between SQLite and SwiftData: by tweaking the settings of SQLite in various ways, someone was able to insert 100 million rows into a SQLite database in 34 seconds (source). And according to this, it can take upwards of 7 seconds to insert 100 thousand rows into SwiftData. That very rough comparison works out to roughly a 200x speed-up of SQLite over SwiftData. This by no means says that SQLite is always 200x faster than SwiftData; such a claim does not even make sense. But I think it is safe to say that SQLite is much faster than SwiftData when it comes to inserting many rows at once.

If you provide some concrete info about your schema (i.e. the models that are persisted to Core Data), as well as the rough number of rows in the database for each model, then I might be able to provide more concrete information.
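To make the "tweaking the settings" part concrete, here is a minimal sketch of that style of bulk insert using GRDB, the SQLite toolkit that SQLiteData builds on. The table name, pragma choices, and data source are illustrative assumptions, not anything from this thread; the essential idea is one transaction around all inserts and one prepared statement reused per row, with durability relaxed for an import that could be redone from scratch.

```swift
import GRDB

// Hypothetical bulk-load tuning, in the spirit of the "100 million rows" article:
// relax durability and enlarge the page cache for a one-off import.
var config = Configuration()
config.prepareDatabase { db in
    try db.execute(sql: "PRAGMA synchronous = OFF")
    try db.execute(sql: "PRAGMA cache_size = -64000")  // roughly a 64 MB page cache
}
let dbQueue = try DatabaseQueue(path: "import.sqlite", configuration: config)

// Placeholder data standing in for whatever is actually being imported.
let values = (0..<1_000_000).map { _ in Double.random(in: 0...1) }

try dbQueue.write { db in
    try db.execute(sql: """
        CREATE TABLE IF NOT EXISTS samples (
            id INTEGER PRIMARY KEY,
            value REAL NOT NULL
        )
        """)
    // `write` wraps everything in a single transaction; the statement is prepared
    // once and re-executed with new arguments, instead of re-parsing SQL per row.
    let insert = try db.makeStatement(sql: "INSERT INTO samples (value) VALUES (?)")
    for value in values {
        try insert.execute(arguments: [value])
    }
}
```

If a bulk load like this also needs indices, it is usually cheaper to create them after the import rather than before, so SQLite builds each index once instead of updating it for every inserted row.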
-
@mbrandonw Thanks for the quick reply. I quickly built a sample project based on your reminders demo, and this is my schema. There may be multiple variations of the SampleTable, but they all reference the DBApp and no other relations are involved. For now, there is no need to join multiple variations of SampleTable. How many rows a SampleTable will actually contain is hard to say, but in the worst case I expect somewhere between 10 and 100 million. The data is provided as DataFrames, and I would also prefer to read them in that format.
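To make that shape concrete, here is a simplified stand-in for the schema (the table and column names are placeholders, not the real ones): one DBApp table and one SampleTable variant referencing it, with an index on the foreign key since every query filters by the owning app, followed by one way such rows could be pulled back out into a TabularData DataFrame.

```swift
import GRDB
import TabularData

let dbQueue = try DatabaseQueue(path: "samples.sqlite")

// Placeholder schema: one dbApps table and one sampleTables variant pointing at it.
try dbQueue.write { db in
    try db.execute(sql: """
        CREATE TABLE IF NOT EXISTS dbApps (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            name TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS sampleTables (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            dbAppID INTEGER NOT NULL REFERENCES dbApps(id),
            timestamp REAL NOT NULL,
            value REAL NOT NULL
        );
        -- Every query filters on the owning app, so index the foreign key.
        CREATE INDEX IF NOT EXISTS sampleTables_on_dbAppID ON sampleTables(dbAppID);
        """)
}

// Reading one app's rows straight into a TabularData DataFrame.
let appID = 1  // placeholder
let frame = try dbQueue.read { db -> DataFrame in
    let timestamps = try Double.fetchAll(
        db,
        sql: "SELECT timestamp FROM sampleTables WHERE dbAppID = ? ORDER BY timestamp",
        arguments: [appID]
    )
    let values = try Double.fetchAll(
        db,
        sql: "SELECT value FROM sampleTables WHERE dbAppID = ? ORDER BY timestamp",
        arguments: [appID]
    )
    var df = DataFrame()
    df.append(column: Column(name: "timestamp", contents: timestamps))
    df.append(column: Column(name: "value", contents: values))
    return df
}
```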
-
Hello,
What level of performance can be expected when working with large databases, for example when inserting and querying millions of entries? I am currently using Core Data, but I'm not satisfied with its performance, which noticeably degrades at this scale; that may be partly due to my own lack of experience, but also due to limitations of the framework itself. Most of the data is structured as one-to-many relationships.
How would you expect SQLiteData to perform in such scenarios by comparison? And as a developer, what should I pay particular attention to when working with SQLiteData? For example, Core Data requires saving the context regularly when inserting large batches; otherwise, memory consumption grows rapidly.
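For reference, the periodic-save pattern just mentioned looks roughly like the following minimal Core Data sketch; the entity name `Sample`, its `value` attribute, and the batch size are assumptions chosen for illustration, not code from the project.

```swift
import CoreData

// Import a large batch into Core Data while keeping memory bounded by saving
// and resetting the context every few thousand objects.
func importSamples(_ values: [Double], into context: NSManagedObjectContext) throws {
    let batchSize = 5_000
    for (index, value) in values.enumerated() {
        let sample = NSEntityDescription.insertNewObject(forEntityName: "Sample", into: context)
        sample.setValue(value, forKey: "value")
        if (index + 1) % batchSize == 0 {
            try context.save()
            context.reset()  // drop the accumulated managed objects from memory
        }
    }
    try context.save()
}
```

Without the periodic `save()` and `reset()`, every inserted object stays registered with the context until the end of the import, which is where the rapid memory growth comes from.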
I would greatly appreciate any insights.