
In Proofreading, Load Oversegmentation, Perform Merges Eagerly in Frontend #7654

Merged
merged 200 commits from magic-mapping into master
Jul 31, 2024

Conversation


@fm3 fm3 commented Feb 27, 2024

This PR adds support for applying an HDF5 mapping locally in the frontend (instead of having the back-end apply it). This enables the user to do interactive proofreading where (a) super-voxels can easily be identified by hovering, (b) merge operations can be done without a round-trip to the back-end, and (c) split operations can be done without reloading all buckets (although a round-trip is still necessary to find out which edges to delete).

The HDF5 mapping is only applied locally when the annotation has an editable mapping or is about to have one (i.e., the user is in proofreading mode). When this condition changes, the layer is reloaded automatically to swap the responsibility of mapping the segmentation between client and server.
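
The condition described above can be sketched roughly as follows (all names here are illustrative assumptions, not the actual webKnossos code):

```typescript
// Illustrative sketch only; type and function names are assumptions,
// not the actual webKnossos implementation.
type MappingLocation = "frontend" | "backend";

interface AnnotationState {
  hasEditableMapping: boolean; // annotation already has an editable mapping
  isProofreadingToolActive: boolean; // ...or is about to get one
}

// The HDF5 mapping is applied locally only while proofreading;
// otherwise the back-end applies it when serving buckets.
function getMappingLocation(state: AnnotationState): MappingLocation {
  return state.hasEditableMapping || state.isProofreadingToolActive
    ? "frontend"
    : "backend";
}

// When the location changes, the segmentation layer is reloaded so that
// the responsibility for mapping swaps between client and server.
function shouldReloadLayer(
  prev: MappingLocation,
  next: MappingLocation,
): boolean {
  return prev !== next;
}
```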

URL of deployed dev instance (used for testing):

Steps to test:

  • create a new annotation for a dataset with a segmentation layer that has an HDF5 mapping
  • enable the proofreading tool
  • hovering over the active segment should highlight super voxels
  • merge operations should feel very fast
  • split operations should also feel quick (no bucket reloading should be done)

TODOs:

  • Missing:

    • The frontend mapping functionality (formerly used for JSON mappings only) doesn't support uint64, but 64-bit IDs are needed for larger datasets.
      • implement cuckoo hashing for that
      • properly test with a 64 bit dataset
        • rendering and merger mode
        • proofreading with a 64 bit agglomerate mapping
          • fix bigint to proto serialization packing
          • ...
    • handleProofreadCutNeighbors and handleSkeletonProofreadingAction were not yet adapted to work with frontend mappings. Look at handleProofreadMergeOrMinCut for implementation guidance.
      • handleProofreadCutNeighbors
      • handleSkeletonProofreadingAction
    • The wkstore_adapter currently hardcodes shouldUseDataStore to true, because the backend was not yet adapted to return unmapped segment IDs from the tracing store if an editable mapping is active. If that is changed, this part of the code can be reset and brushed volume data is loaded from the tracing store correctly again. /cc @fm3
      • this now works by always requesting from the datastore when the HDF5 mapping should be applied locally. If the segmentation data was changed (so that the tracing store would need to be queried), the mapping is applied by the back-end (proofreading is disabled in that case)
    • There are quite a few TypeScript errors due to the existence of number and bigint mappings. In some places I had success using generics (T extends Map<number, number> | Map<bigint, bigint>) to let TypeScript understand that the two won't be mixed, but I'm not sure whether it's possible to get it to understand that for all of the code 🤔
    • skeleton-based proofreading
    • reload segmentation layer feature does not work correctly
    • the semantics of segment.id has to be fixed (mapped vs unmapped); see discussion
  • Bugs:

    • Reloading after save does not properly initialize the mapping
    • sometimes, hovering super-voxels in the data viewports doesn't work as expected
    • sometimes, the segment ID apparently is not an integer, and jsConvertCellIdToRGBA then fails
    • Mesh loading after proofreading actions currently triggers errors, because the getDataValue method used to determine the agglomerate IDs after the action uses the mapping from before. Should be fixed thanks to Improve segment proofreading in 3D viewport #7742
    • sometimes, when splitting, the new supervoxels are shown in the data viewports and then the mapping is reset so that the old supervoxels are shown
    • sometimes, splitting [and/or?] merging produces corrupt mappings
    • [ ] double-check that the mapping texture is correctly attached/detached when switching between modes (doesn't seem important to me; on the contrary, unloading and re-uploading the texture when switching back and forth seems wasteful)
    • cuckoo table's capacity cannot be used up to 90%
      • fix suboptimal hash function for cuckootables with a single numeric key
      • fix iteration threshold constant
      • fix recursive rehashing
    • toggling segmentation layer with "3" won't get you back to proofreading mode (annoying when no proofreading action has been done yet)
    • merging/splitting produces corrupt annotations? The tracing store route needed sorted IDs (for the datastore route this is irrelevant)
  • Performance: I did not rigorously evaluate the performance yet, but browsing through the data feels laggier than before. See In Proofreading, Load Oversegmentation, Perform Merges Eagerly in Frontend #7654 (comment).

    • A quick profiling session showed that 10% of the time is spent in the buckets' getValueSet methods, with invocations of getValueSets taking up to 250 ms depending on the number of cache misses (note: union'ing the sets afterwards is quick in comparison, usually 5 ms); see comments below
      • Writing the mapping texture is also quick in comparison ~5-10 ms (with cuckoo, updates are below 1ms typically)
    • The mapping lookup in the shader uses a binary search approach. For mapping sizes of ~50-100k entries, this results in up to 18 texture lookups to locate the key plus another lookup for the value. I certainly hope that using cuckoo hashing will improve performance.
    • avoid roundtrip for new agglomerate ids before calling refreshAffectedMeshes
    • the mapping could be incrementally written thanks to the new cuckoo table which should help with performance
    • offload value set computation to web worker (didn't help much) and do it as soon as buckets arrive to distribute the workload (this is only done now when the value sets are really needed)
    • [ ] maintain cached value union in data cube didn't help (tested in c7bb508)
    • improve diffing of mappings
      • avoid redundant work with caching
      • doublecheck that diffing in saga doesn't do redundant work
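
The number/bigint typing issue mentioned in the TODOs above can be sketched as follows (illustrative code, not the exact helpers from this PR): a generic constrained to either key type lets TypeScript guarantee that number and bigint mappings aren't mixed within one call.

```typescript
// Sketch of the generics approach; names are assumptions, not PR code.
type NumberMapping = Map<number, number>;
type BigintMapping = Map<bigint, bigint>;

// K is either number or bigint for the whole call, so a bigint id can
// never be looked up in a number-keyed mapping (and vice versa).
function applyMapping<K extends number | bigint>(
  mapping: Map<K, K>,
  ids: K[],
): K[] {
  // IDs without an entry map to themselves.
  return ids.map((id) => mapping.get(id) ?? id);
}

const small: NumberMapping = new Map([[2, 1]]);
applyMapping(small, [1, 2, 3]); // → [1, 1, 3]

// 64-bit IDs exceed Number.MAX_SAFE_INTEGER, hence the bigint variant.
const large: BigintMapping = new Map([[2n ** 40n, 7n]]);
applyMapping(large, [2n ** 40n]); // → [7n]
```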

Issues:


(Please delete unneeded items, merge only when none are left open)

@fm3 fm3 changed the title Hackathon: Magic Mapping In Proofreading, Load Oversegmentation, Perform Merges Eagerly in Frontend Mar 12, 2024
@daniel-wer daniel-wer mentioned this pull request Mar 19, 2024
3 tasks

daniel-wer commented Apr 8, 2024

  • Current state:
    • Mappings are applied in the frontend. The mapping information is bulk-requested from the server once a mapping is enabled. If buckets change, the mapping information for the new segment IDs is requested from the server while old segment IDs are evicted from the maintained frontend mapping (currently throttled to 1 s). The frontend mapping is updated when performing segment-based proofreading actions, making a server round-trip obsolete in the merge case. In the split case, the server is still needed to perform the min cut and to assign segments to one of the two resulting agglomerates.
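
A minimal sketch of the eager merge described above (greatly simplified and with illustrative names; the PR actually keeps the mapping in a cuckoo table that is uploaded to the GPU as a texture):

```typescript
// Merging agglomerate b into a rewrites all supervoxel entries that
// point at b, with no server round-trip needed.
function mergeAgglomerates(
  mapping: Map<number, number>, // supervoxel ID -> agglomerate ID
  a: number,
  b: number,
): void {
  for (const [supervoxelId, agglomerateId] of mapping) {
    if (agglomerateId === b) mapping.set(supervoxelId, a);
  }
}

const m = new Map([
  [10, 1],
  [11, 1],
  [20, 2],
]);
mergeAgglomerates(m, 1, 2);
// supervoxel 20 now belongs to agglomerate 1
```

Splitting is different: the server must still compute the min cut and decide which supervoxels end up in which of the two resulting agglomerates, so only the resulting mapping update (not the bucket data) goes through the frontend.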

moved todo-items to PR description


philippotto commented Jul 23, 2024

Thank you for the review and the testing report 🙏

  • If an annotation has an editable mapping, the supervoxels within the active mesh are always highlighted when hovering in the 3d view, even if the proofreading tool is not active. (I thought you fixed that, but I'm still able to reproduce)

This is the only thing I still have to look into. As this shouldn't be a big deal (impact-wise), it would be cool if you could already give the branch another test?

  • Using the "Split from all neighbors" functionality in the l4dense_motta_et_al_dev dataset for the node at 2873, 4539, 1777 a crash was triggered. Seems to happen for other locations, too.

For the record: I changed it so that these errors become soft errors (the user can retry then). There are two causes for this issue: (1) a race condition where the mapped ID is not yet available (should happen very rarely; let's re-evaluate in production), and (2) the dev dataset you used is cropped while the agglomerate file isn't. For that reason, the back-end serves neighbors that the front-end cannot look up. This should be a dev-only issue and therefore not critical; ideally, we would crop the agglomerate file, too.

  • When switching between the proofreading tool and other tools the mapping location is switched from frontend to backend and vice versa. This works well, but takes a while on my system (backend -> frontend). I see segmentation data during that time, but the view is essentially frozen for 5-10s. Maybe a notification similar to the one when activating a mapping could be shown during the transition period? Or is this instant on your system?

I added a simple "Reloading segmentation layer..." message that only stays for one second (every time the layer is reloaded). Could you check again whether this feels okay now? For me, the tool-switching issue is now more pleasant, but it's also faster on my system. Hiding the message only once the (entire? or most of the?) mapping has been loaded to the GPU is probably a bit trickier (especially because it's hard for me to reproduce).

@philippotto

If an annotation has an editable mapping, the supervoxels within the active mesh are always highlighted when hovering in the 3d view, even if the proofreading tool is not active. (I thought you fixed that, but I'm still able to reproduce)

Luckily it was easy to fix 🎉

@daniel-wer

Awesome, code LGTM and no issues at all during retesting, except for one possible performance issue.

I tested how fast one could go merging segments and the performance was a bit underwhelming, although merging happens entirely in the frontend. See this gif:

[gif: merging_is_slow]

with this console output from two different merge actions that were comparably slow:

[screenshot: console timings]

Can you reproduce this and if so do you think that could be sped up?


@daniel-wer daniel-wer left a comment


LGTM :shipit:

@philippotto philippotto enabled auto-merge (squash) July 31, 2024 09:55
@philippotto philippotto merged commit ca5f2aa into master Jul 31, 2024
2 checks passed
@philippotto philippotto deleted the magic-mapping branch July 31, 2024 09:57