rsc: Create tables for tracking Blobs #1490
Conversation
Approved. I suspect some of the on delete restricts should most likely be cascade instead.
.to_col(BlobStore::Id)
// All blobs for a given store must be manually
// deleted before the blob store may be deleted.
.on_delete(ForeignKeyAction::Restrict),
What are your thoughts on restrict vs cascade? What about specifically for blob store?
In the comments I gave my core justification. But in general cascade is pretty dangerous.
In this case, if you deleted a BlobStore, it would cascade to the Blobs, which would cascade to the OutputFiles, which would cascade to the Jobs.
To me that really isn't acceptable, especially since we have a clear workflow for deleting all of those things:
- Job eviction deletes a Job which cascades to its OutputFiles
- Blob eviction later sees those blobs with no references, deletes them properly via the backing blob store API, and then deletes the entry in the blob table (see the sketch below)
- We can then manually remove the decommissioned BlobStore
If we make it cascade we will be guaranteeing that blobs get orphaned in the remote store, without really gaining anything in the API.
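To make the ordering concrete, here is a minimal sketch of that eviction step, assuming hypothetical trait and helper names rather than the real rsc code: the backing object goes away first, and the blob row is only removed afterwards.

```rust
// Hedged sketch of the blob-eviction ordering described above.
// The traits and helpers are hypothetical stand-ins, not the rsc API.
struct BlobRow {
    id: i64,
    key: String,
}

trait BackingStore {
    fn delete_object(&self, key: &str) -> Result<(), String>;
}

trait MetadataDb {
    fn unreferenced_blobs(&self) -> Vec<BlobRow>;
    fn delete_blob_row(&self, id: i64) -> Result<(), String>;
}

fn evict_unreferenced_blobs(db: &dyn MetadataDb, store: &dyn BackingStore) -> Result<(), String> {
    for blob in db.unreferenced_blobs() {
        // Delete the backing object first: if this step fails, the DB row
        // survives and the next eviction pass can retry, so nothing is
        // silently orphaned in the remote store.
        store.delete_object(&blob.key)?;
        db.delete_blob_row(blob.id)?;
    }
    Ok(())
}
```

A cascade straight from BlobStore through Blob would skip the backing-store call entirely, which is the orphaning concern above.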
.to_col(Blob::Id)
// A blob may not be deleted if there exists a
// job that depends on it.
.on_delete(ForeignKeyAction::Restrict),
These in particular seem like they might be too strong? Sometimes a blob will go missing from storage, and if that happens we want to delete the blob from the database, but also delete any jobs that depend on it. I think deleting stores is often not needed, so I'm less concerned with that, but deleting blobs seems more important and common.
If a blob goes missing from storage you should just delete the whole job. Then you can either delete the blob manually or let blob eviction clean up the entry.
In fact, I claim you should always delete the job first. That way you stop the bleeding with clients; deleting the blob first makes the bleeding worse for a short amount of time.
I agree you should delete the job, but there may be many jobs you need to delete, not just one.
If this is a case we see happening a lot, we should write code to handle the cleanup (we can use rsc_tool or the HTTP API to delete all jobs that depend on blob x) instead of having the database work against the guarantees we want.
Cascade wouldn't be as bad if we had some way to ensure we clean up the remote stores when a blob is deleted. I'm not sure how we'd do that, but if you have any ideas I'm open to hearing them.
The way I've been doing it in the local shared cache is to scan both ways, because you shouldn't trust either side. One scan checks for non-existent blobs that are referenced in the DB; the other checks for blobs that exist but are orphaned. This appears to be a required step because you can't atomically delete the blob on one side and the row in the database: one must be deleted before the other, so at least one of the scan directions is required. In practice I've found that both are helpful.
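As an illustration only, the two scan directions over plain key sets might look like this (the real scans would stream from the DB and the backing store rather than materialize full sets):

```rust
use std::collections::HashSet;

// Sketch of the two consistency scans described above, over plain key sets.
fn find_out_of_sync(
    db_keys: &HashSet<String>,
    store_keys: &HashSet<String>,
) -> (Vec<String>, Vec<String>) {
    // Scan 1: keys referenced in the DB whose backing object no longer exists.
    let missing_from_store = db_keys.difference(store_keys).cloned().collect();
    // Scan 2: objects present in the store that no DB row references (orphans).
    let orphaned_in_store = store_keys.difference(db_keys).cloned().collect();
    (missing_from_store, orphaned_in_store)
}
```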
Side note: we have been using on delete cascade; if you want to change that, let's change it everywhere you think it makes sense while we're changing the DB.
I'm fine with it either way. Using cascade only makes the delete easier, it doesn't actually buy us any kind of assurance, so I'm quite happy to keep restrict over cascade. But I think the assurances restrict buys us are weak (though certainly greater than the absence of assurance we get from cascade).
I will warn against thinking that this protects you from the blob store and the database falling out of sync; I don't think it accomplishes that. Distributed systems are hard.
Yeah ok, I think I've landed on liking restrict with the following reasoning:
- It corrects for some mistakes
- We still need at least one direction of out-of-sync checks, if not both, but that's OK (it's not a plus for either side)
- We can always do the cascade deletes manually if we need to
.to_col(Blob::Id)
// A blob may not be deleted if there exists an
// output file that depends on it.
.on_delete(ForeignKeyAction::Restrict),
Ditto; for instance, when we delete a job, we cascade the deletion to the output file.
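For contrast with the Restrict cases above, the job-to-output-file link presumably carries Cascade. A rough sketch, where the constraint name and identifiers are assumptions rather than the PR's actual code:

```rust
use sea_orm_migration::prelude::*;

// Sketch of the Cascade direction referred to above: deleting a Job also
// deletes its OutputFiles. Identifier names are assumptions.
fn output_file_job_fk() -> ForeignKeyCreateStatement {
    ForeignKey::create()
        .name("fk-output_file-job_id")
        .from_tbl(OutputFile::Table)
        .from_col(OutputFile::JobId)
        .to_tbl(Job::Table)
        .to_col(Job::Id)
        // Deleting a Job removes its OutputFile rows; the Blob rows they
        // pointed at stay behind for blob eviction to clean up.
        .on_delete(ForeignKeyAction::Cascade)
        .to_owned()
}
```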
rust/rsc/src/rsc/add_job.rs
Outdated
stdout_id: Set(payload.stdout_id),
stderr_id: Set(payload.stderr_id),
minor nit: Both here and above in the DB, I think "stdfoo_blob_id" or "stdfoo_blob_key" or something like that would be better. Just seeing "stdout_id" doesn't really tell me what's happening here.
rust/rsc/src/rsc/main.rs
Outdated
"stdout_id": blob_id, | ||
"stderr_id": blob_id, |
oh neat upshot!
Note: Due to sqlite3 limitations we had to rewrite migration history to pass our tests. After initial deployment this becomes much more difficult. We are researching whether we can avoid sqlite3 in our test flow.
Creates the tables and relationships required for tracking uploaded blobs. Minimally updates routes and tests so they will pass. Updating the routes to work as expected will come in a future PR.
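For orientation, a blob table plus the Restrict foreign key discussed above might be created in a SeaORM migration roughly like this; the column set, types, and names are assumptions, not the PR's actual schema.

```rust
use sea_orm_migration::prelude::*;

// Hedged sketch of one of the new tables; not the PR's actual schema.
async fn create_blob_table(manager: &SchemaManager<'_>) -> Result<(), DbErr> {
    manager
        .create_table(
            Table::create()
                .table(Blob::Table)
                .if_not_exists()
                .col(ColumnDef::new(Blob::Id).uuid().not_null().primary_key())
                .col(ColumnDef::new(Blob::Key).string().not_null())
                .col(ColumnDef::new(Blob::BlobStoreId).uuid().not_null())
                .foreign_key(
                    ForeignKey::create()
                        .from_tbl(Blob::Table)
                        .from_col(Blob::BlobStoreId)
                        .to_tbl(BlobStore::Table)
                        .to_col(BlobStore::Id)
                        // Matches the Restrict choice settled on in the
                        // conversation above.
                        .on_delete(ForeignKeyAction::Restrict),
                )
                .to_owned(),
        )
        .await
}
```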