
MWS: Use Promises for sql-tiddler-database.js #8885

Open · Arlen22 opened this issue Jan 8, 2025 · 12 comments

@Arlen22 (Contributor) commented Jan 8, 2025

Can we use Promises for sql-tiddler-database.js? Most modern database adapters do not block the server thread, especially when they talk to the database over a network, since anything more than an in-memory store is inherently asynchronous anyway.

Even if the SQLite adapter we use is synchronous, any attempt to use a different adapter with another SQL engine (MySQL, Postgres, etc.) will probably be asynchronous. Some database adapters don't even offer a sync option and ONLY use Promises.

I was digging around in the code, of course, because that's what I like to do, and when I saw that the database calls were synchronous, it reminded me of all the headaches from when I was trying to implement TiddlyServer to serve multiple wikis: I discovered that the entire TiddlyWiki boot loader was using sync filesystem calls, making the whole server (for all wikis) freeze for the duration of any file system operation. The slower your file system, the more noticeable this becomes.

Since this system is intended for robust multi-user scenarios, there is no way we can afford to use blocking I/O for anything in the server. It will only be a matter of time before some corner case comes back to bite us, and it will absolutely prevent this from being used at scale. Forcing the server to twiddle its thumbs every time it makes a database call rules out modern database scaling and connection pooling, since we can never make more than one connection at a time.

Now that we are implementing a new system for saving and loading tiddlers, it would be absolutely amazing if we could make it async from the start. I've seen talk of using very recent Node features (the native SQLite adapter, if I recall), and async/await has been part of the language since ES2017. I don't really see a reason not to require at least ES2017 for the MWS plugin, given how much the JavaScript world has changed between ES5 and ES2017. I'm sure there are security considerations that would argue for this as well.

I feel like using Promises is super important, not only for performance, but also for the ability to build other systems around it as needed.
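
To make the shape of this concrete, here is a minimal sketch of what a promise-based adapter boundary could look like. The class and method names are hypothetical (not the actual MWS API); the point is that every call site awaits instead of blocking:

```js
// Hypothetical sketch: every database method returns a Promise, so the
// caller awaits it, regardless of which engine sits behind the adapter.
class SqlTiddlerDatabase {
	constructor(engine) {
		this.engine = engine; // any adapter whose methods return Promises
	}
	async getTiddlerFields(title) {
		// With a synchronous SQLite driver this is just a wrapped sync call;
		// with MySQL or Postgres it is a genuine network round trip.
		const rows = await this.engine.all(
			"SELECT field_name, field_value FROM fields WHERE tiddler_title = ?",
			[title]
		);
		if(rows.length === 0) return null;
		return Object.fromEntries(rows.map(r => [r.field_name, r.field_value]));
	}
}
```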

@Arlen22 commented Jan 8, 2025

Node v8 is when async/await was introduced; Promises themselves were already available earlier, with ES2015 support.

I don't remember what minimum Node version we're supporting, but I'm assuming we're trying to support older browsers, so we can't bring that into core code on the browser side.

@Arlen22 commented Jan 9, 2025

I still wish the tiddler file loading code in boot.js had been asynchronous, because it would have allowed me to load wiki folders directly from Dropbox via a cloud connector and then use the tiddlyweb adaptor to save changes back to Dropbox. All of the file system operations had equivalent commands in Dropbox, and had boot.js been using async I/O, it would have been trivial to map them.

I never made that connector because I couldn't see myself continually translating code every time there was an update. Even if the code hardly ever changes, I would never be able to keep up with checking every version to make sure nothing had changed anywhere.

@Arlen22 commented Jan 9, 2025

NodeJS processes are intended to run at most one per core. With TiddlyWiki that is even more the case, as it loads a significant amount of data into memory. The thought of trying to scale that per user is slightly terrifying. Granted, you can't put too much load on a mediocre database server, but you can usually handle a lot more than one connection! And if you have multiple users each working on their own section, there's no reason not to have concurrent connections. The bigger the deployment gets, the more likely users end up waiting for their turn on that one connection.

And don't forget that literally the entire NodeJS process will be waiting on that one database connection. All parsing, tabulating, listing of results, and so forth has to happen while the connection is idle. So only one person's web request can be served at a time, from first byte in to last byte out. Only the actual data transfer gets offloaded. You can buffer bytes, but you can't process them (I should know how backpressure works, but I don't).
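
The blocking behaviour is easy to demonstrate in isolation. In this self-contained sketch, a busy-wait stands in for a synchronous database call; two parallel requests take roughly twice as long as one, because the second cannot even begin until the first finishes:

```js
const http = require("http");

// Stands in for a synchronous database driver call.
function syncDbCall() {
	const end = Date.now() + 2000;
	while (Date.now() < end) { /* the event loop is blocked here */ }
	return "result\n";
}

http.createServer((req, res) => {
	res.end(syncDbCall()); // every other request queues behind this line
}).listen(8080);

// Two parallel `curl localhost:8080` calls finish in ~4s total, not ~2s.
```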

@Arlen22 commented Jan 9, 2025

I took a stab at it. #8888

@pmario (Member) commented Jan 10, 2025

There is a nice blog post about SQLite scalability which you may find interesting: https://blog.wesleyac.com/posts/consider-sqlite

I also found this link very interesting: https://use.expensify.com/blog/scaling-sqlite-to-4m-qps-on-a-single-server -- they describe how to get 4 million reads per second with a single server, much cheaper and more reliable than what they could get from Amazon.


There is some more performance-related info at: https://sqlite.org/fasterthanfs.html -- read the full post.

A 2022 study (alternative link on GitHub) found that SQLite is roughly twice as fast at real-world workloads compared to Btrfs and Ext4 on Linux.

Also the "latency" charts later in the post are interesting, where the SQLite reads are much faster than direct I/O on Windows -- which is still the OS that 80% of our users will use to run TW servers.

Also see section "3. General Findings" in that post.

@pmario commented Jan 10, 2025

I know that Jeremy does not watch videos, but I did find this one very informative. It's mainly about HTMX, but it has a lot of connections to TW's "values".

I think we could replace HTMX with TiddlyWiki and give the exact same talk.

Building the Hundred-Year Web Service with htmx - Alexander Petros

@pmario commented Jan 10, 2025

> Since this system is intended for robust multi-user scenarios, there is no way we can afford to use blocking I/O for anything in the server. ...

That's not necessarily a "no-go" problem. See: Write-Ahead Logging
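
For reference, switching a database to WAL mode is a one-liner; shown here with better-sqlite3, assuming that's the driver in play:

```js
const Database = require("better-sqlite3");
const db = new Database("store.sqlite");
db.pragma("journal_mode = WAL"); // readers no longer block the writer, and vice versa
```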

@Arlen22 commented Jan 10, 2025

I'm not sure if you're arguing for or against asynchronous SQLite calls. The problem I'm pointing out is that the entire NodeJS server thread is blocked while the SQLite call is being made, which means the server can never process more than one web request at a time. So you can forget your 100 concurrent writers, because you'll never be able to have more than one.

  • WAL provides more concurrency as readers do not block writers and a writer does not block readers. Reading and writing can proceed concurrently. - From Write-Ahead Logging

The problem is that the current setup blocks the JavaScript thread while the SQLite call is being made, so only one web request can be processed at a time. In order to benefit from the WAL we still need to either use worker threads and Promises, or spawn multiple processes and forward requests from a parent process.

I'm guessing the simplest way to handle this would be to put SQLite on worker threads using the worker_threads module (stable since Node v12) and use Promises for database calls.
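
Something like the following rough sketch, which assumes better-sqlite3 on the worker side; the query() wrapper and the message shape are made up for illustration:

```js
// worker.js -- one SQLite connection per worker; each message is one query.
const { parentPort, workerData } = require("worker_threads");
const Database = require("better-sqlite3");
const db = new Database(workerData.path);

parentPort.on("message", ({ id, sql, params }) => {
	try {
		parentPort.postMessage({ id, rows: db.prepare(sql).all(params) });
	} catch (err) {
		parentPort.postMessage({ id, error: err.message });
	}
});

// main.js -- wraps the worker in a promise-returning query() function.
const { Worker } = require("worker_threads");
const worker = new Worker("./worker.js", { workerData: { path: "store.sqlite" } });
const pending = new Map();
let nextId = 0;
worker.on("message", ({ id, rows, error }) => {
	const { resolve, reject } = pending.get(id);
	pending.delete(id);
	error ? reject(new Error(error)) : resolve(rows);
});
function query(sql, params = []) {
	return new Promise((resolve, reject) => {
		const id = nextId++;
		pending.set(id, { resolve, reject });
		worker.postMessage({ id, sql, params });
	});
}
```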

But I guess the bigger reason I'm requesting this is that the moment I saw it, I immediately wanted to rewrite it to use Postgres, etc. It's not only about database performance; it's about being able to separate the application and database into separate processes and use whatever database you want.

@Arlen22 commented Jan 10, 2025

One thing I just ran into is transactions. Serving one request at a time vastly simplifies transaction handling. I think each request needs to be assigned a worker thread from a pool, which could be tuned based on server performance; if no threads are available, a request would simply wait to start until a worker is free.
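
The pool itself can be quite small. Here is a minimal sketch of the checkout/checkin logic; all names here are hypothetical:

```js
// Each request checks a worker out, does its work (including any
// transaction), and checks it back in; requests wait when the pool is dry.
class WorkerPool {
	constructor(workers) {
		this.free = [...workers];
		this.waiting = [];
	}
	acquire() {
		if (this.free.length) return Promise.resolve(this.free.pop());
		return new Promise(resolve => this.waiting.push(resolve));
	}
	release(worker) {
		const next = this.waiting.shift();
		next ? next(worker) : this.free.push(worker);
	}
}

async function handleRequest(pool, work) {
	const worker = await pool.acquire();
	try {
		return await work(worker); // the worker runs BEGIN ... COMMIT internally
	} finally {
		pool.release(worker); // always returned, even when work() throws
	}
}
```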

@Arlen22 commented Jan 12, 2025

So, after some more tinkering and poking around, I've come to the conclusion that the entire request chain would have to be gutted and rewritten in order to support async I/O. So much of the code relies on the atomicity of one request happening at a time that changing this requires actually rethinking the structure of the request chain.

Here are some things that the request chain needs to implement.

  • It should use async/await for all DB and FS access, and use fs/promises.
  • It should only interact with the store, admin wiki, and DB through its state variable.
    • This allows us to set a different store for each request, and ultimately to open a different database connection for each request. Usually database connections are held in a pool and handed out as needed so that parallel connections don't overwhelm the server.
    • Assigning a connection to only one request at a time simplifies transaction handling, but then we need to make sure the connection isn't still being used by buggy code after it is returned to the pool. Usually this is done by setting the DB field to null so that it can no longer be accessed by calling code and any attempt results in an error.
    • At the moment, this would probably be done in the sqlengine instance.
  • It should never assume it's looking at the same store instance as a different request.
    • But it should assume it is looking at the same admin wiki as every other request.
  • If it needs serializable isolation, it should open a transaction.
    • It always needs serializable isolation if it is making multiple related requests (like a manual join, or reads then writes).
    • It needs to handle any busy errors that may arise. This means it needs to open the transaction early enough that everything fails in one piece and can be retried. However, file system access can't be rolled back, so requests which access the file system need to use BEGIN IMMEDIATE to protect everything related to the file system access (see the sketch after this list).
  • It needs to make sure that a request which writes to the admin wiki cannot fail.
    • If the request also writes to the DB, this is best accomplished with a BEGIN EXCLUSIVE lock on the database.
    • If the request also writes to the file system, this is best accomplished using sync I/O. Yes, I know, that blocks the entire server. But that's kind of the point.
  • It should catch errors at the top of the request and log them, to prevent the server from crashing. The server request listener which calls the route also needs to wrap it in a try/catch; if an error makes it all the way back to the listener function, the server should return status 500 and log as much information to the console as possible.
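
To illustrate two of the items above, here is a rough sketch of a BEGIN IMMEDIATE retry wrapper and the top-level catch. It assumes a promise-returning db.exec(); every name here is made up:

```js
// Retry wrapper: take the write lock up front so the whole body either
// succeeds or fails as one piece, and retry on SQLITE_BUSY.
async function withImmediateTransaction(db, fn, retries = 5) {
	for (let attempt = 0; ; attempt++) {
		try {
			await db.exec("BEGIN IMMEDIATE");
			const result = await fn(db);
			await db.exec("COMMIT");
			return result;
		} catch (err) {
			await db.exec("ROLLBACK").catch(() => {}); // no-op if BEGIN failed
			if (err.code === "SQLITE_BUSY" && attempt < retries) continue;
			throw err; // not retryable, or out of attempts
		}
	}
}

// Top-level catch: an escaped error becomes a 500, never a crash.
async function requestListener(req, res, route) {
	try {
		await route(req, res);
	} catch (err) {
		console.error(err); // log as much information as possible
		if (!res.headersSent) res.writeHead(500);
		res.end("Internal server error");
	}
}
```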

@Arlen22 commented Jan 12, 2025

I would recommend using an ORM, specifically Prisma. It includes its own bindings to the native C SQLite driver. I'm working on seeing if we can get a WASM version as well.

It takes care of all the relations and types, vastly simplifying database access and enforcing various integrity checks. It supports all the basic stuff, plus multi-field primary keys, unique indexes, relations, upserts, nested updates, generated defaults, etc. It has some limits, but I introspected the existing database setup and it didn't complain about anything.
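
As an illustration of the call sites Prisma generates, here is a hedged sketch assuming a hypothetical Tiddler model with a composite primary key on (bag_name, title) and the tiddler's fields serialized into a string column; this is not the actual MWS schema:

```js
const { PrismaClient } = require("@prisma/client");
const prisma = new PrismaClient();

// Upsert against the composite key; Prisma generates the bag_name_title
// selector from an @@id([bag_name, title]) declaration in the schema.
async function saveTiddler(bagName, title, fieldsJson) {
	return prisma.tiddler.upsert({
		where: { bag_name_title: { bag_name: bagName, title } },
		update: { fields: fieldsJson },
		create: { bag_name: bagName, title, fields: fieldsJson },
	});
}
```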

In short, it makes it harder to write bad code. And if anything is incorrect, you'll either get a compile time or runtime error pointing out exactly what's wrong.

The entire client can be bundled, but installing it as a node module is a breeze. And since we're already headed that direction with MWS, it seems like a logical route.

Also, it is mostly platform agnostic, so if someone wanted to switch to MySQL or Postgres, all they would really need to do is change one line in the schema and regenerate the code. Because SQLite feels like the simplest common subset of database features, I don't think anything else would need to change, but I'm not familiar enough with SQLite to know that for sure. So changing to a different database would literally just be a drop-in replacement.

@Arlen22 commented Jan 15, 2025

After attempting a refactor of the MWS plugin, I think I've come to the conclusion that rewriting this for async SQLite access introduces so many different headaches that it isn't even worth it. Because SQLite is inherently synchronous, trying to do asynchronous work introduces a whole list of things you have to think through to prevent race conditions.

There is a significant amount of code dedicated to managing users and permissions. If someone cares enough to use a different database, it's probably because they don't really have file system access, in which case they're going to need a solution for the admin wiki anyway; or it's because they're going to run a website that functions like some form of Wikipedia, in which case they're absolutely going to rewrite the user management anyway, and that's 95% of the complexity.

That being said, I don't know why anyone is trying to use the WAL: it amounts to the exact same thing I was proposing. Pretty much any attempt to add any kind of concurrency is going to require a complete rewrite. You can't just shoehorn concurrency into an existing code base, because it results in race conditions and deadlocks. Every single line of code you write has to be written with the question of how it relates to the database access.

If the current code has already been written with that in mind, then we're good, and we just need to hand requests off to worker threads which each open their own connection to the database. Obviously requests which access the admin wiki don't get handed off, which satisfies the points I laid out in my previous comment. File system access still needs to happen inside an explicit BEGIN IMMEDIATE transaction.
