OPFS filesystem transparency for AccessHandlePoolVFS? #99

rhashimoto · 2023-07-09T22:16:36Z

rhashimoto
Jul 9, 2023
Maintainer

One difference (of many) between the original OriginPrivateFileSystemVFS I wrote and the newer AccessHandlePoolVFS is filesystem transparency: in the original VFS, the file structure in OPFS is exposed to SQLite as-is. That is, if you open/create the file "myDB.db" in SQLite, that opens/creates a file named "myDB.db" in OPFS with the expected contents, no more and no less. This is a nice feature to have - it's really easy to understand how your data is stored, and you can use the OPFS API directly for import/export.

AccessHandlePoolVFS doesn't have filesystem transparency. It uses randomly generated filenames in a single directory, and it prepends its own metadata header to the data that SQLite reads and writes. The metadata includes a somewhat obscure digest function, and so dealing with all this from outside SQLite is a bit of a hassle. I describe this VFS as implementing a filesystem using OPFS as a device, instead of exposing a filesystem. There are good reasons for all of this, but no filesystem transparency is an unfortunate drawback.

Can this be fixed or mitigated? I think the answer is yes. Here's a sketch:

There's no getting around the issue that filesystem transparency must be violated while SQLite is using AccessHandlePoolVFS. This VFS has to open all the files it uses before SQLite calls are made, which is before it knows what filenames SQLite is going to use. That is because opening an OPFS access handle is asynchronous and any AccessHandlePoolVFS methods that SQLite calls must be synchronous. So during use, in general the SQLite filenames won't necessarily match the OPFS filenames.

But what about before and after SQLite is using the VFS? Is there a way to transform the OPFS file structure so it is effectively transparent when the VFS is inactive but supports everything it needs to when the VFS is active? I think...mostly yes, to a practically useful extent.

Here's what we need to do:

1. Move the VFS metadata out of the OPFS files so the file content is exactly what SQLite reads and writes.
2. Add the capability to scan an OPFS directory recursively for files for the VFS to open.
3. Add the capability to move/rename OPFS files to match the path of their corresponding SQLite file, and move/rename unassociated OPFS files to a special directory for that purpose.
4. Do all of this in a way that is recoverable in case of a crash at any point.

For (1), if the metadata aren't attached to their files they will have to go into another OPFS file or files. Storing all the metadata (which now will each have to also contain the OPFS path) in one special file, say $ROOT/.ahp/metadata, should work as long as each metadata record is a multiple of the sector size, so that any write error when updating one record won't damage adjacent records.
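A minimal sketch of the sector-sized record idea, under stated assumptions: the 4096-byte `SECTOR_SIZE`, the JSON encoding, and the `MetadataRecord` shape are all hypothetical choices for illustration, since nothing in this discussion fixes a record layout. The point is only that every record occupies a whole number of sectors, so a torn write to one record cannot spill into its neighbours.

```typescript
// Assumed sector size for illustration; the discussion does not fix a value.
const SECTOR_SIZE = 4096;

// Hypothetical record shape; the real field layout is not specified here.
interface MetadataRecord {
  opfsPath: string;
  sqlitePath: string; // empty string when unassociated
  flags: number;
}

// Round a byte length up to the next multiple of the sector size (at least one sector).
function paddedLength(byteLength: number, sectorSize: number = SECTOR_SIZE): number {
  return Math.max(1, Math.ceil(byteLength / sectorSize)) * sectorSize;
}

// Encode a record as JSON in a zero-padded buffer whose size is a multiple of the sector size.
// Assumes ASCII-only paths so that each character maps to a single byte.
function encodeRecord(record: MetadataRecord): Uint8Array {
  const json = JSON.stringify(record);
  const slot = new Uint8Array(paddedLength(json.length));
  for (let i = 0; i < json.length; i++) {
    const code = json.charCodeAt(i);
    if (code > 0x7f) throw new Error("only ASCII is supported in this sketch");
    slot[i] = code;
  }
  return slot;
}

// Decode a record by reading up to the first zero padding byte.
function decodeRecord(slot: Uint8Array): MetadataRecord {
  let json = "";
  for (let i = 0; i < slot.length && slot[i] !== 0; i++) {
    json += String.fromCharCode(slot[i]);
  }
  return JSON.parse(json) as MetadataRecord;
}
```

A real implementation would also store the nonce and digest mentioned later in the thread; they are left out here to keep the padding logic visible.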

For (2), the VFS already scans all the files in a specific directory (call it $ROOT) to acquire its access handles so changing that to a recursive scan shouldn't be difficult. We would likely add a special directory, e.g. $ROOT/.ahp/, exempt from the scan, to store VFS-specific files such as metadata and as yet unassociated filesystem files (OPFS files that aren't being used as SQLite files). Unlike the current scan, which assumes unrecognized files aren't associated with any SQLite file, the new scan would create a metadata record matching a new file with its path.

For (3), it seems straightforward but is a little tricky. First of all, although some browsers do support a FileSystemHandle.move() method (Chrome, Safari) to move/rename a file, that is not yet in the OPFS spec so the fallback is copy-and-delete (Update 7/11/23: All major browsers seem to have a working move() method). Second, as previously noted it will be common for SQLite filenames and OPFS filenames not to match. Under particularly convoluted sequences of file operations, it is possible for a SQLite filename "A" to map to OPFS filename "B" and SQLite filename "B" to map to OPFS filename "A". A naive renaming implementation could accidentally overwrite one of these files. On reflection, it will be much simpler if an OPFS file outside $ROOT/.ahp can only be associated to a SQLite file with the same path. That avoids the weird cases completely, and probably won't have any negative impact under typical usage.

For (4), hmm, that sounds like a job for...a database. It's turtles all the way down! Actually, while a special OPFS file written in sector-size chunks with careful flush() calls could be made to work, IndexedDB should do fine here if needed to ensure that an incomplete transformation can be detected and completed.

So I think there are some interesting problems here and some important details to get right, likely including some I have missed, but nothing seems insurmountable. Applications would probably do the transparency transformation (along with SQLite temporary file removal) both at startup and at shutdown - shutdown is really when you want it to happen but web apps aren't necessarily exited cleanly.

I don't have a need for filesystem transparency myself so I have no immediate plans to implement this. Anyone interested is welcome to give it a try.

rhashimoto · 2023-07-11T00:07:45Z

rhashimoto
Jul 11, 2023
Maintainer Author

Here's a visualization of a sample OPFS tree:

$ROOT/
├─ .ahp/
│  ├─ opaque/
│  │  ├─ az3k9p (example random name)
│  │  ├─ fp8rqn (example random name)
│  │  ├─ ...
│  ├─ metadata
├─ fileA.db (everything not under .ahp/ is transparent)
├─ fileB.db
├─ subdir/
│  ├─ fileC.db (arbitrary hierarchy works)

Files not under the special .ahp/ directory can only be associated with SQLite files with the same path (this is the transparent area). So if SQLite opens /fileA.db, that will always associate with $ROOT/fileA.db if it exists. Files under $ROOT/.ahp/opaque/ can be associated with any file. So if SQLite opens /fileA.db-journal and that path does not exist in OPFS under $ROOT/ then one of the randomly named files under $ROOT/.ahp/opaque/ that is not already associated with another file will be selected.
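The association rule in the paragraph above can be sketched over an in-memory model. This is only an illustration: the `Pool` shape and the `associate` function are hypothetical names, and paths are written relative to $ROOT.

```typescript
// Hypothetical in-memory model of the pool; all names are illustrative.
interface Pool {
  transparent: Set<string>;          // OPFS paths outside .ahp/, e.g. "/fileA.db"
  unassociatedOpaque: string[];      // OPFS paths under .ahp/opaque/ with no SQLite file
  associations: Map<string, string>; // SQLite path -> OPFS path
}

// Pick the OPFS file that backs a SQLite path.
function associate(pool: Pool, sqlitePath: string): string {
  const existing = pool.associations.get(sqlitePath);
  if (existing !== undefined) return existing;

  let opfsPath: string;
  if (pool.transparent.has(sqlitePath)) {
    // A transparent file can only ever back the SQLite file with the same path.
    opfsPath = sqlitePath;
  } else {
    // Otherwise any unassociated opaque file will do.
    const next = pool.unassociatedOpaque.pop();
    if (next === undefined) throw new Error("pool capacity exhausted");
    opfsPath = next;
  }
  pool.associations.set(sqlitePath, opfsPath);
  return opfsPath;
}
```

With the sample tree above, `/fileA.db` would resolve to itself, while `/fileA.db-journal` would take one of the randomly named opaque files.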

The metadata file contains a record for each file outside $ROOT/.ahp plus all the files under $ROOT/.ahp/opaque - these are all the OPFS files that can be associated with a SQLite file. Each metadata record contains:

- OPFS path
- SQLite path, if the OPFS file is currently associated with a SQLite file
- SQLite creation flags
- A nonce to be used for the transparency transform
- Digest to validate the record

When a VFS attaches its OPFS filesystem, it does the following things:

1. Fixes the metadata file and cleans up incomplete transparency transactions if necessary.
2. Scans the OPFS filesystem outside $ROOT/.ahp for files with no metadata record. For each one, add a new metadata record with the same OPFS and SQLite path and default creation flags.
3. Scans the OPFS filesystem under $ROOT/.ahp/opaque for files with no metadata record. For each one, add a new metadata record with the OPFS path and no SQLite path.
4. If there are metadata records for files that no longer exist, clear (ignore) those records.
5. From here things are the same as AccessHandlePoolVFS, where the pool includes both the transparent and opaque files.
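A rough sketch of the scanning steps above, assuming the directory walk has already produced a flat list of $ROOT-relative paths. The `Rec` shape, `reconcile`, and `DEFAULT_FLAGS` are hypothetical names; the repair of the metadata file itself is not modelled.

```typescript
// Hypothetical metadata record; paths are relative to $ROOT.
interface Rec {
  opfsPath: string;
  sqlitePath: string | null; // null when the OPFS file is unassociated
  flags: number;
}

const OPAQUE_PREFIX = "/.ahp/opaque/";
const DEFAULT_FLAGS = 0; // placeholder; a real VFS would use suitable SQLITE_OPEN_* flags

// Reconcile existing metadata records with the files actually found in OPFS.
function reconcile(records: Rec[], filesOnDisk: string[]): Rec[] {
  const onDisk = new Set(filesOnDisk);

  // Drop records whose file no longer exists.
  const kept = records.filter((r) => onDisk.has(r.opfsPath));
  const known = new Set(kept.map((r) => r.opfsPath));

  for (const path of filesOnDisk) {
    if (known.has(path)) continue;
    const isOpaque = path.startsWith(OPAQUE_PREFIX);
    // Skip VFS-private files such as /.ahp/metadata.
    if (path.startsWith("/.ahp/") && !isOpaque) continue;
    // New transparent files are associated with their own path; new opaque files are unassociated.
    kept.push({ opfsPath: path, sqlitePath: isOpaque ? null : path, flags: DEFAULT_FLAGS });
  }
  return kept;
}
```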

To transform the OPFS filesystem for transparency, we need to make two types of changes:

1. Any associated non-temporary file under $ROOT/.ahp/opaque/ needs to be moved so its OPFS path matches its SQLite path, and its metadata needs to be updated.
2. Any unassociated file outside $ROOT/.ahp/ needs to be moved under $ROOT/.ahp/opaque/, renamed to its nonce, and its metadata needs to be updated.

The tricky part is being able to recover if any error occurs during these changes. I think something like this will work for each change (basically journalling a metadata transaction):

1. Save the metadata record offset and data to IndexedDB.
2. Perform the move/rename.
3. Update the metadata record (including a new nonce).
4. Delete the saved metadata from IndexedDB.

Then on restart the VFS should check IndexedDB for saved metadata and finish any outstanding steps.
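To make the recovery argument concrete, here is a small in-memory simulation of that journalled change. Everything in it is a stand-in: `State.files` plays the role of OPFS, `State.journal` the IndexedDB entry, and `crashAfterStep` is a hypothetical knob that stops the procedure early to imitate a crash. The saved record carries the nonce, so the move target can be recomputed on restart.

```typescript
interface Rec { opfsPath: string; sqlitePath: string | null; nonce: string; }

interface State {
  files: Map<string, string>;                    // OPFS path -> contents (stand-in for OPFS)
  metadata: Rec[];                               // stand-in for the metadata file
  journal: { index: number; saved: Rec } | null; // stand-in for the IndexedDB entry
}

// The destination is fully determined by the saved record, which makes replay safe.
function targetPath(rec: Rec): string {
  return rec.sqlitePath !== null ? rec.sqlitePath : "/.ahp/opaque/" + rec.nonce;
}

let nonceCounter = 0;
// Illustrative only; a real nonce would be random.
function newNonce(): string { return "n" + (++nonceCounter); }

function applyChange(state: State, index: number, crashAfterStep: number = Infinity): void {
  if (state.journal !== null && state.journal.index !== index) {
    throw new Error("finish the outstanding change first");
  }
  // Save the record before touching anything (or reuse the saved copy on replay).
  const saved = state.journal !== null ? state.journal.saved : { ...state.metadata[index] };
  state.journal = { index, saved };
  if (crashAfterStep <= 1) return;

  // Perform the move/rename, unless an earlier attempt already did it.
  const from = saved.opfsPath;
  const to = targetPath(saved);
  const data = state.files.get(from);
  if (data !== undefined && from !== to) {
    state.files.delete(from);
    state.files.set(to, data);
  }
  if (crashAfterStep <= 2) return;

  // Update the metadata record, including a new nonce.
  state.metadata[index] = { ...saved, opfsPath: to, nonce: newNonce() };
  if (crashAfterStep <= 3) return;

  // Delete the saved copy.
  state.journal = null;
}

// On restart, finish whatever change was in flight.
function recover(state: State): void {
  if (state.journal !== null) applyChange(state, state.journal.index);
}
```

Replaying from the saved copy rather than from the possibly half-updated metadata record is what keeps each step safe to repeat.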

0 replies

sgbeal · 2023-07-15T21:34:54Z

sgbeal
Jul 15, 2023

My current thinking (subject to change) for the new opfs-sahpool VFS is that, rather than adding the complications for bona fide transparency, offer helper routines which can export and import files from/to the "mangled" form used by the VFS. sqlite3_[de]serialize() can hypothetically already be used for that, but i've not yet tested them with that VFS.

4 replies

rhashimoto Jul 15, 2023
Maintainer Author

Obviously that's up to you, but I don't think implementing transparency is that complicated if the sketch here holds up. Maybe you're scared off by In******B, but (1) it would be used as a key-value database containing one record (I estimate ~40 LOC), and (2) you could replace it with an OPFS file if you really wanted to.

Questions are welcome here if anything is unclear or seems shaky (arguments over your project design probably should go in your forum), or you know how to reach me if you want to set up something more interactive/conversational.

sgbeal Jul 15, 2023

FWIW, i agree entirely that IDB is the closest adjacent/applicable turtle in the stack, but i'm not familiar enough with it to have a gut feeling for whether it could fill the role so will defer to your judgement on that.

Now that i think about it, we actually have a precedent for opaque db storage: the local/sessionStorage VFS stuffs each db page into one ASCII-encoded record in the storage object. Like with the pool VFS, we can't simply stuff dbs in there or pull them out, but we have to go through routines like sqlite3_[de]serialize().

i'll need to poke around on it after some sleep and see what direction it goes.

BTW: my guesstimate for the pool 3-4x performance increase over the older VFS was "perhaps as much as 2x," with "30-40%" of that being the elimination of the cross-thread communication and some smaller factor being the elimination of acquiring/relinquishing SAHs. Perhaps my old timings of cross-thread communication were fundamentally flawed or just way too conservative.

rhashimoto Jul 15, 2023
Maintainer Author

I'm reading the sqlite3_serialize/sqlite3_deserialize docs for the first time, and as I understand things they are not complementary operations despite the naming. sqlite3_serialize goes from persistent storage to memory, but sqlite3_deserialize doesn't go from memory to persistent storage, instead it uses an in-memory serialization as storage. Maybe I'm wrong because I don't get the use case for this.

Even if you can use these functions to round-trip the database file, they seem to be limited to available memory. You won't be able to use this mechanism for databases larger than that, unless I'm mistaken.

sgbeal Jul 16, 2023

> Maybe I'm wrong because I don't get the use case for this.

Apologies, i was working from memory while writing that. You're correct that deserialize doesn't write to storage but in the context of the wasm bits we've added a routine for dumping byte arrays into storage using the VFS responsible for that storage. It's essentially a data copy op which uses VFS I/O:

https://sqlite.org/src/info?name=65e4c58924b862d9&ln=1353-1466

It doesn't work with the kvvfs (local/sessionStorage) because that VFS is oddly specific about only accepting writes of a certain size, but it's worked on the other VFSes so far.

However... i'm now not certain that would suffice for the pool vfs because it would not flag the file metadata as "this is a persistent database," so it would get nuked by the vfs. Okay, i'll need to rethink this.

> Even if you can use these functions to round-trip the database file, they seem to be limited to available memory.

That's a recurring problem in the JS bits: byte arrays are of course limited to memory. Perhaps it's possible to stream, e.g. an upload of a db, into storage but i've not yet explored it.

sgbeal · 2023-07-18T07:25:44Z

sgbeal
Jul 18, 2023

The past day i've been considering, "on paper," (as opposed to "in
code") options for implementing filesystem transparency and the
following points are currently giving me varying degrees of stomach
ache:

Issue 1: Arbitrarily large init workloads

When we traverse the FS at startup we have to filter out any file
which is not for SQLite. The first heuristic is simply whether the
size is an even multiple of 512 bytes. If it's not, it can't be a
db. The second is to read the file's header and check if it starts
with "SQLite format 3". If it does, we can assume it's a db. Journal
files are always named X.ext-journal, and it seems safe to assume
that anything named *-journal belongs to sqlite3 (in particular if a
file matching the * prefix is found in the same dir).
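As a rough illustration of these heuristics, the sketch below checks the size and the 16-byte header string "SQLite format 3" followed by a NUL byte, which is how a SQLite database file begins. The function names are hypothetical, the caller is assumed to have already read the first bytes of the file, and the journal check is the stricter variant that requires a sibling file matching the prefix.

```typescript
// The SQLite database header starts with this 16-byte string, including the NUL terminator.
const SQLITE_MAGIC = "SQLite format 3\u0000";

// Heuristic: size is a multiple of 512 and the header matches.
function looksLikeSqliteDb(fileSize: number, header: Uint8Array): boolean {
  if (fileSize % 512 !== 0) return false;
  if (header.length < SQLITE_MAGIC.length) return false;
  for (let i = 0; i < SQLITE_MAGIC.length; i++) {
    if (header[i] !== SQLITE_MAGIC.charCodeAt(i)) return false;
  }
  return true;
}

// Heuristic: name ends in "-journal" and a file matching the prefix sits in the same directory.
function looksLikeJournal(name: string, siblings: Set<string>): boolean {
  const suffix = "-journal";
  if (!name.endsWith(suffix)) return false;
  return siblings.has(name.slice(0, -suffix.length));
}
```

The size check is cheap, but the header check means opening and reading every candidate file, which is the cost this issue is concerned with.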

Fundamentally that's not a problem at all. In the VFS we only need to
concern ourselves with dbs and journals, but "there's always someone
trying to ice-skate uphill" and we'll eventually see support requests
like:

> My game is spending 90 seconds trying to initialize sqlite3.

and it will turn out that that game includes a few thousand asset
files (sprites, sound clips, whatever) which are all exactly some
multiple of 512 bytes in size and we're opening every one of them to
see if it's a database.

(It's only a matter of time before developers start using OPFS as local
cache for large applications, and we fully expect sqlite to have a
role in the next generation of large webapps.)

This problem does not bug me just yet because we "could" later,
without breaking compatibility, add the ability to provide include
and/or exclude lists to the VFS init to limit the amount of work we
do in finding suitable files. Adding such support up front would be
annoying and overkill, though, because "it's not a problem until it's
a problem."

Stomach ache level: low

Issue 2: Pre-existing files getting transformed to opaque ones

This may well be a purely academic problem, not a real one, but here
it is...

Let's say we open up /foo.db-journal then, at some point, sqlite
deletes it. When the library later reuses that same name for the next
transaction, it will no longer be a "pre-existing" file and will
become an opaque file managed by the VFS. For a journal that's
actually not even a tiny bit of a problem, but it is for a database.
If, for whatever reason, a pre-existing db file gets deleted by the
VFS (noting that the DELETE_ON_CLOSE flag is apparently not supported
by sqlite3_open_v2()), then any future instance created by the VFS with
the same name would not fall into the "pre-existing" category. While
that's very possibly also not a catastrophe, the idea of files
changing status from "pre-existing" to "originating from the VFS" is
triggering me slightly (but only slightly).

Stomach ache level: the high end of low/the low end of medium

Issue 3: Holding directory handles

From the top post:

> Any associated non-temporary file under $ROOT/.ahp/opaque/ needs
> to be moved so its OPFS path matches its SQLite path, and its
> metadata needs to be updated.
>
> Any unassociated file outside $ROOT/.ahp/ needs to be moved under
> $ROOT/.ahp/opaque/, renamed to its nonce, and its metadata needs to
> be updated.

In order to be able to move files back and forth, we have to hold the
directory handles of every associated directory (arbitrarily
many). (Ignoring, for now, the case of conflicting names caused by,
e.g., a separate instance of this VFS moving a like-named file into
$ROOT.)

Moving might also require creation of subdirs, and we'd end up leaving
any number of empty subdirs lying around when we move the files back
into opaque storage (we cannot know, without tracking each of them,
whether or not we created each sub/sub/subdir).

Again, maybe this isn't a real problem, but the uphill ice-skater will
eventually show up with his 200 sub/sub/subdirectories and make it a
problem. Perhaps it's not a technical problem to hold 200 such
handles, but it makes me queasy nonetheless.

Stomach ache level: medium

Issue 4: Multiple VFS copies with different dirs

It's possible, in both our impl and the original, to provide a
directory name to the VFS for storing its opaque data. If multiple
copies of the VFS end up getting run in the same origin...

- The opaque storage of each VFS is just that - opaque. The other VFS
  instances do not know that those files belong to another instance
  and may try to use them. We could work around this by, e.g., using
  some well-defined dotfile in each VFS's main dir, or use a
  well-defined subdir name, and skip scanning those dirs.
  (The top post already addresses this, but it's here again for the sake
  of keeping it in mind.)
- The transparent files, e.g. /foo.db, are a whole other can of
  worms. Two VFS instances will want to have access to that but only
  the first VFS can get access to it. Since we have no control over
  the order the VFSes are initialized (only one instance per page in
  the case of our current API), it's indeterminate which instance will
  get which files.

This last point is what's bugging me the most. On the surface, it
appears to be insurmountable.

Stomach ache level: high

That said...

Perhaps i'm either over- or under-thinking this.

4 replies

rhashimoto Jul 18, 2023
Maintainer Author

Let me first clarify that I don't think that the root directory managed by the VFS should be the OPFS root if the application will ever use OPFS for anything else. It should be some subdirectory that is understood to be dedicated to database files.

> When we traverse the FS at startup we have to filter out any file which is not for SQLite.

I don't think this is necessary. If there is a file there you don't have metadata for, you can just add it. If SQLite is directed to open the file, you'll find out then whether it is a valid database file or not.

In general, users should not be putting non-database files under the pool root. It shouldn't be a problem if they do unless they add so many files it exhausts some operating system file handle limit, but it's going to slow down attaching the VFS. This is why the pool root should not be the OPFS root.

> Let's say we open up /foo.db-journal then, at some point, sqlite deletes it. When the library later reuses that same name for the next transaction, it will no longer be a "pre-existing" file and will become an opaque file managed by the VFS. For a journal that's actually not even a tiny bit of a problem, but it is for a database.

I don't see the problem here. First of all, how does SQLite delete a database file? There might be a way to do it; I just don't know of any.

Second, even if it does happen, if a transparent file is deleted and re-created in the same VFS session, the association should go to the same transparent file, not an opaque file. So for example, say there is a transparent journal file in $ROOT/foo.db-journal. On start up SQLite opens it, applies the journal, and deletes it. On the next write transaction SQLite creates the journal file, so the VFS checks to see if there is a matching transparent path, and there is, so that association is made (otherwise the association goes to one of the opaque files). If the VFS session ends with that file deleted, the transparency transformation will then convert the transparent file into an opaque file. If the VFS session ends with that file not deleted, the transparent file remains transparent.

> In order to be able to move files back and forth, we have to hold the directory handles of every associated directory (arbitrarily many).

I think maybe you're thinking that the transparency conversion takes place continuously as SQLite operates with the VFS. That would be an interesting idea, assuming you can move a file while holding an open access handle (which I'm not sure you can), but that is not my proposal here. The idea is to perform the conversion moves outside the scope of a VFS session, before and/or after the VFS holds all the access handles. So directory handles can always be created and discarded as needed.

> The opaque storage of each VFS is just that - opaque. The other VFS instances do not know that those files belong to another instance and may try to use them. We could work around this by, e.g., using some well-defined dotfile in each VFS's main dir, or use a well-defined subdir name, and skip scanning those dirs.
>
> Since we have no control over the order the VFSes are initialized (only one instance per page in the case of our current API), it's indeterminate which instance will get which files.

Because the access handle pool approach doesn't support multiple connections, attaching more than one VFS isn't going to work. This is why the VFS root directory shouldn't be the OPFS root, in general, and especially if you want to make connections to different databases.

If you're asking how to ensure that multiple VFS instances don't try to attach the same pool, there are at least a couple easy ways to guarantee exclusive access. Maybe the easiest is to always acquire an access handle on the metadata file first. An alternative way is to acquire an exclusive Web Lock with the pool root directory name.

sgbeal Jul 18, 2023

> In general, users should not be putting non-database files under the pool root.

Agreed. i missed the concept that $ROOT was specific to the VFS. That changes things.

> I don't see the problem here. First of all, how does SQLite delete a database file? There might be a way to do it; I just don't know of any.

My initial concern was DELETE_ON_CLOSE but have since confirmed with Richard that DELETE_ON_CLOSE is never used for databases (and of course can't sanely be used on them because they can be opened by multiple processes).

> Second, even if it does happen, if a transparent file is deleted and re-created in the same VFS session, the association should go to the same transparent file, not an opaque file.

Ah, right, if there's enough metadata to permit that. i was thinking in terms of the class-internal data, where that name mapping is currently removed when a file is deleted, combined with what i now believe is a bug in my port where i overwrite the first byte of the name with a NUL when setAssociatedPath() is passed an empty string (similar to how DOS used to "delete" files).

> Maybe the easiest is to always acquire an access handle on the metadata file first. An alternative way is to acquire an exclusive Web Lock with the pool root directory name.

Either of those sound good.

Speaking of... in case you haven't seen it yet, the Chrome OPFS folks are proposing a new locking mechanism for OPFS and are looking for feedback:

https://github.com/whatwg/fs/blob/main/proposals/MultipleReadersWriters.md

Thank you once again for your explanations and insights! i was told today that the 3.43 release will tentatively be in September, so there's lots of time to work through this and still get it into that release.

rhashimoto Jul 18, 2023
Maintainer Author

> > Second, even if it does happen, if a transparent file is deleted and re-created in the same VFS session, the association should go to the same transparent file, not an opaque file.
>
> Ah, right, if there's enough metadata to permit that.

There should be. You're right that the internal representation connecting SQLite paths, OPFS paths, and access handles will probably have to change somewhat. Off the top of my head (so bear that in mind), there could be a FileEntry class to represent each OPFS file that would provide:

metadata file offset
metadata fields
- OPFS path
- SQLite path
- SQLite creation flags
- transform nonce
access handle
setAssociation(sqlitePath, flags)
clearAssociation()
isAssociated()
isTransparent()
isTemporary()

For efficient look up you would also need some associative containers on the VFS instance:

A map to look up a FileEntry by OPFS path.
A map to look up a FileEntry by SQLite path.
A map to look up a FileEntry by fileId (the identifier SQLite passes to VFS methods).
A set for unassociated opaque FileEntrys.

Then to implement the VFS methods:

xOpen
- Look up FileEntry by OPFS path if it exists, otherwise take one from the set of unassociated opaque FileEntrys.
- if the FileEntry is unassociated and create flag is set, call setAssociation().
- Add FileEntry to SQLite path map.
- Add FileEntry to the fileId map.
xClose
- Look up FileEntry by fileId.
- If creation flags include delete-on-close, call xDelete.
- Remove FileEntry from fileId map.
xDelete
- Look up FileEntry by SQLite path.
- Call clearAssociation().
- If OPFS file is opaque return it to the set of unassociated opaque FileEntrys.
- Remove FileEntry from SQLite path map.
xAccess
- Look up FileEntry by SQLite path.
xRead, xWrite, xTruncate, xSync, xFileSize
- Look up FileEntry by fileId.
- Use the access handle (or you could make these operations methods on FileEntry).

Does that sound workable?
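A compact sketch of that outline, restricted to the bookkeeping (no access handles, no metadata persistence). The class and method names follow the lists above, but the `PoolIndex` container is a hypothetical name, and looking up by SQLite path before falling back to the OPFS path is an assumption made here so that already-associated opaque files are found.

```typescript
const OPEN_CREATE = 0x04;        // SQLITE_OPEN_CREATE
const OPEN_DELETEONCLOSE = 0x08; // SQLITE_OPEN_DELETEONCLOSE

class FileEntry {
  readonly opfsPath: string;
  readonly transparent: boolean;
  sqlitePath: string | null = null;
  flags = 0;

  constructor(opfsPath: string, transparent: boolean) {
    this.opfsPath = opfsPath;
    this.transparent = transparent;
  }
  setAssociation(sqlitePath: string, flags: number): void {
    this.sqlitePath = sqlitePath;
    this.flags = flags;
  }
  clearAssociation(): void { this.sqlitePath = null; this.flags = 0; }
  isAssociated(): boolean { return this.sqlitePath !== null; }
  isTransparent(): boolean { return this.transparent; }
}

class PoolIndex {
  byOpfsPath = new Map<string, FileEntry>();
  bySqlitePath = new Map<string, FileEntry>();
  byFileId = new Map<number, FileEntry>();
  freeOpaque = new Set<FileEntry>();

  // Register an entry found during the attach scan.
  add(entry: FileEntry): void {
    this.byOpfsPath.set(entry.opfsPath, entry);
    if (entry.sqlitePath !== null) this.bySqlitePath.set(entry.sqlitePath, entry);
    else if (!entry.isTransparent()) this.freeOpaque.add(entry);
  }

  // Returns false when the file does not exist (and was not created) or the pool is full.
  xOpen(fileId: number, sqlitePath: string, flags: number): boolean {
    let entry = this.bySqlitePath.get(sqlitePath);
    if (entry === undefined) {
      if ((flags & OPEN_CREATE) === 0) return false;
      const sameName = this.byOpfsPath.get(sqlitePath);
      if (sameName !== undefined && sameName.isTransparent()) {
        entry = sameName; // re-created transparent file keeps its OPFS file
      } else {
        const first = this.freeOpaque.values().next();
        if (first.done) return false; // no unassociated opaque file left
        entry = first.value;
        this.freeOpaque.delete(entry);
      }
      entry.setAssociation(sqlitePath, flags);
      this.bySqlitePath.set(sqlitePath, entry);
    }
    this.byFileId.set(fileId, entry);
    return true;
  }

  xDelete(sqlitePath: string): void {
    const entry = this.bySqlitePath.get(sqlitePath);
    if (entry === undefined) return;
    entry.clearAssociation();
    if (!entry.isTransparent()) this.freeOpaque.add(entry);
    this.bySqlitePath.delete(sqlitePath);
  }

  xClose(fileId: number): void {
    const entry = this.byFileId.get(fileId);
    if (entry === undefined) return;
    if ((entry.flags & OPEN_DELETEONCLOSE) !== 0 && entry.sqlitePath !== null) {
      this.xDelete(entry.sqlitePath);
    }
    this.byFileId.delete(fileId);
  }
}
```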

sgbeal Jul 18, 2023

i am coincidentally currently in the process of refactoring towards supporting just the sort of VFS options you mention above, and having your list to cross-check with will be very helpful. i hope to have that in place by tomorrow or Thursday.

rhashimoto · 2023-07-19T12:03:49Z

rhashimoto
Jul 19, 2023
Maintainer Author

Here's an attempt at pseudo-code for the transparency transform (and temporary file removal):

Prerequisites: metadata file access handle (no other access handles)

if no metadata copy exists:
  make atomic and durable metadata copy

truncate metadata file

for each record in metadata copy:
  # a valid record has an OPFS path and a correct digest.
  if record is not valid: continue

  if record has an SQLite path:
    if record file type is not temporary:
      if record OPFS path is not transparent:
        set new OPFS path to match SQLite path
        if file exists at old OPFS path:
          # move from opaque to transparent (may require directory creation)
          move file from old OPFS path to new OPFS path
        elif file exists at new OPFS path:
          no move needed
        else: # file missing
          # This error case is a problem. If we just continue and lose the file
          # then we have reduced capacity. If we create a file with the old opaque
          # name and crash, the new empty file will be considered valid which
          # is unacceptable. If we create a file with a new opaque name and crash,
          # we have increased capacity. This should be so rare I'm going to
          # go with just continuing - applications can check capacity and adjust.
          log error
          continue?

      else: # record OPFS path is transparent
        no move needed

    else: # record file type is temporary
      # remove unneeded temporary file
      assert OPFS path is not transparent
      unset SQLite path

  else: # record has no SQLite path:
    if record OPFS path is not transparent:
      no move needed
    else: # record OPFS path is transparent
      set new OPFS path to nonce
      if file exists at old OPFS path:
        # move from transparent to opaque
        move file from old OPFS path to new OPFS path
      elif file exists at new OPFS path:
        no move needed
      else: # file missing
        log error
        create file at new OPFS path

  append possibly updated metadata record to metadata file

# finished iterating records
flush metadata file
delete metadata copy atomically

This procedure needs to be idempotent, including if any prior invocations are interrupted. That is why this code is effectively journaling the metadata file changes, and why there is a nonce in the metadata record.

On VFS start, this should be run before scanning for new or deleted transparent files.

In normal operation, files should not go missing. The mechanism where this can occur is if a user deletes a file after a crash during a transparency transform that moved that file and before the next transparency transform fixes it. This should be exceedingly rare and can probably be considered mis-usage. Applications can check capacity on start and adjust if necessary.
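For anyone who wants to experiment with the logic, here is one possible rendering of the pseudo-code above over an in-memory file map. It is a sketch under assumptions, not a tested implementation: the `Rec` fields (including the `valid` flag standing in for the digest check) and the function names are hypothetical, and the metadata copy/journal handling is left out so only the per-record decisions remain.

```typescript
interface Rec {
  opfsPath: string;
  sqlitePath: string | null;
  temporary: boolean;
  nonce: string;
  valid: boolean; // stands in for "has an OPFS path and a correct digest"
}

const OPAQUE_DIR = "/.ahp/opaque/";

function isTransparentPath(path: string): boolean {
  return !path.startsWith("/.ahp/");
}

function moveFile(files: Map<string, string>, from: string, to: string): void {
  const data = files.get(from);
  if (data === undefined) throw new Error("missing source: " + from);
  files.delete(from);
  files.set(to, data);
}

// Returns the rewritten record list; `files` is mutated to reflect the moves.
function transform(records: Rec[], files: Map<string, string>, errors: string[] = []): Rec[] {
  const out: Rec[] = [];
  for (const original of records) {
    if (!original.valid) continue;
    const rec: Rec = { ...original };

    if (rec.sqlitePath !== null && !rec.temporary) {
      if (!isTransparentPath(rec.opfsPath)) {
        // Associated opaque file: move it to its SQLite path.
        const target = rec.sqlitePath;
        if (files.has(rec.opfsPath)) {
          moveFile(files, rec.opfsPath, target);
        } else if (!files.has(target)) {
          errors.push("missing file for " + target);
          continue; // drop the record, accepting reduced capacity
        }
        rec.opfsPath = target;
      }
    } else if (rec.sqlitePath !== null) {
      // Temporary file: just release the association.
      rec.sqlitePath = null;
    } else if (isTransparentPath(rec.opfsPath)) {
      // Unassociated transparent file: move it into opaque storage under its nonce.
      const target = OPAQUE_DIR + rec.nonce;
      if (files.has(rec.opfsPath)) {
        moveFile(files, rec.opfsPath, target);
      } else if (!files.has(target)) {
        errors.push("missing file for " + rec.opfsPath);
        files.set(target, ""); // create an empty replacement
      }
      rec.opfsPath = target;
    }
    out.push(rec);
  }
  return out;
}
```

Running `transform` a second time from the same original records, after some or all of the moves have already happened, takes the "no move needed" branches and yields the same record list, which is the idempotency property described above.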

1 reply

rhashimoto Jul 20, 2023
Maintainer Author

My thinking was to use IndexedDB for the metadata copy/journal, because it is already atomic and durable. OPFS could be used instead - it is durable, but some extra care is needed to make it atomic.

The major browsers support FileSystemHandle.move() although this is not yet officially in the spec, and while not all moves are guaranteed to be atomic, renaming a file in the same OPFS directory should be. Atomic copy could then be as simple as:

1. Copy metadata to metadata-journal.pending.
2. Flush metadata-journal.pending.
3. Move metadata-journal.pending to metadata-journal.
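The reasoning behind those three steps can be shown with a toy in-memory directory. This is only a simulation: the `Store` map stands in for an OPFS directory, the flush is a no-op here, the rename is assumed to be atomic as discussed above, and `crashAfterStep` is a hypothetical knob for imitating an interruption.

```typescript
type Store = Map<string, string>; // file name -> contents; stand-in for an OPFS directory

const METADATA = "metadata";
const JOURNAL = "metadata-journal";
const PENDING = "metadata-journal.pending";

function atomicCopy(dir: Store, crashAfterStep: number = Infinity): void {
  const data = dir.get(METADATA);
  if (data === undefined) throw new Error("no metadata file");

  // 1. Copy. On a real filesystem an interruption here can leave a partial file.
  dir.set(PENDING, data);
  if (crashAfterStep <= 1) return;

  // 2. Flush. Nothing to do in memory; with OPFS this would flush the access handle.
  if (crashAfterStep <= 2) return;

  // 3. Rename within the same directory, assumed to be atomic.
  dir.delete(PENDING);
  dir.set(JOURNAL, data);
}

// On start, a leftover pending file is never trusted; only a fully renamed journal counts.
function journalOnStart(dir: Store): string | undefined {
  dir.delete(PENDING);
  return dir.get(JOURNAL);
}
```

Because readers only ever look at the final name, they see either no journal or a complete one, never a partial copy.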

MDN claims that FileSystemHandle.move() is only supported on Firefox. WPT already tests for it with mixed results (and right now it looks like something is wrong with the Safari testing platform), but here's a test page that works on Chrome and Safari as well.

AntonOfTheWoods · 2023-07-23T16:05:07Z

AntonOfTheWoods
Jul 23, 2023

Sorry to rudely butt in without anything useful to contribute, but can you confirm my understanding that this work is needed to be able to import/open an existing sqlite .db file with the AHP VFS?

11 replies

sgbeal Jul 29, 2023

My apologies in advance for the terse response but my computer died last night and i'm limited to a tablet until tonight at the earliest...

> What would be the best way to do that before it's officially released?

We always recommend that people build their own sqlite (wasm or otherwise) from the current trunk (https://sqlite.org/src - see ext/wasm/README.md for build details), regardless of releases (which are made primarily because many people outright refuse to use anything except prepackaged releases (noting that we essentially consider the trunk to be a perpetually-rolling release)).

> I'm guessing this is to do with caching that can't work with OriginPrivateFileSystemVFS? In any case, a cold start for AccessHandlePoolVFS is still in the low hundreds of ms, which is only slightly slower than doing the same directly on a .db file with the C client. Absolutely smokin!

Those two VFSs are from this project (wa-sqlite), so Roy's the one to talk to about that. Our impl of Roy's AHP VFS isn't materially different, though (but does not produce compatible db files), so performance should be close to identical. (They "could" be made compatible but that would hinder any changes and experimentation by either project.)

AntonOfTheWoods Jul 29, 2023

Awesome, thanks. I initially tried and failed to compile so came here to ask but tried again with different instructions and managed to get trunk compiled. I have it working now and just need to understand why the importDb method is refusing my .db files!

sgbeal Jul 29, 2023

(Reposting in the proper thread. Don't know how i keep not doing that.)

> I have it working now and just need to understand why the importDb method is refusing my .db files!

You might have found a bug. i've got a replacement computer but am still getting it all set up. i'll look into that tonight at the latest.

rhashimoto Jul 29, 2023
Maintainer Author

Support questions for the Official SQLite WASM library (which I abbreviate as OSW) belong in the SQLite forum, and are off-topic here. Although @sgbeal sometimes posts here, and I there, the projects are not affiliated and anything specifically about one or the other should go in the proper place. This will help people looking for similar information find it, and those people might end up helping you as well.

> As an aside, with the large tables I have (400k+ rows), the difference in select speed when I join a couple of them together is monstrous - 5s+ with the OriginPrivateFileSystemVFS VFS and 100-300ms with AccessHandlePoolVFS.

That is quite a difference, higher than the difference I have seen in my own benchmarking but I haven't worked with tables that large, nor on JOINs for that matter. It's hard to say what exactly explains it without diving into the details, including things like verifying relative cache and page sizes for your measurements. Neither OriginPrivateFileSystemVFS nor AccessHandlePoolVFS do any of their own caching, so given the same SQLite configuration their cache usage should be the same and wouldn't explain any of the performance gap. Note that I'm not updating OriginPrivateFileSystemVFS any more so it will fall behind if there are any improvements in the future, and while I don't recall anything specific it's possible this may already have happened.

sgbeal Jul 29, 2023

@AntonOfTheWoods i can't reproduce the importDb failure. If you'll write me off-list we can try to work out what you're seeing: stephan/at/sqlite/org.
