[DMS-75] Store black height in type #26

Open: 1 commit to be merged into base branch sereja/specialize-persistentorderedset


Sereja313 (Member) commented Oct 15, 2024

Profiling branch: https://github.com/serokell/canister-profiling/tree/sereja/set-profiling
Baseline: https://github.com/serokell/motoko-base/tree/sereja/specialize-persistentorderedset (9ebd08f)

Collection benchmarks

| | binary_size | generate | max mem | batch_get 50 | batch_put 50 | batch_remove 50 | upgrade |
|:--|--:|--:|--:|--:|--:|--:|--:|
| trieset+100 | 211_008 | 574_022 | 47_652 | 131_218 | 288_429 | 268_499 | 352_101 |
| persistentset_baseline+100 | 196_855 | 196_106 | 37_784 | 49_615 | 128_635 | 130_668 | 210_758 |
| persistentset+100 | 198_472 | 202_891 | 40_856 | 49_622 | 133_081 | 145_795 | 165_064 |
| trieset+1000 | 211_008 | 7_374_045 | 633_440 | 162_806 | 383_594 | 375_264 | 731_650 |
| persistentset_baseline+1000 | 196_855 | 2_804_787 | 532_992 | 66_718 | 172_961 | 177_698 | 302_292 |
| persistentset+1000 | 198_472 | 2_901_910 | 577_624 | 66_725 | 179_379 | 200_287 | 251_326 |
| trieset+10000 | 211_008 | 105_695_670 | 682_792 | 192_931 | 457_923 | 462_594 | 3_986_772 |
| persistentset_baseline+10000 | 196_855 | 44_279_117 | 360_508 | 82_339 | 214_958 | 227_419 | 374_743 |
| persistentset+10000 | 198_472 | 45_938_541 | 400_508 | 82_346 | 221_497 | 262_370 | 331_581 |
| trieset+100000 | 211_008 | 1_234_038_235 | 6_826_516 | 222_247 | 560_440 | 549_813 | 178_864_274 |
| persistentset_baseline+100000 | 196_855 | 534_152_351 | 3_600_508 | 96_540 | 260_077 | 273_659 | 486_063 |
| persistentset+100000 | 198_472 | 553_354_610 | 4_000_508 | 96_547 | 267_902 | 316_477 | 419_377 |
| trieset+1000000 | 211_008 | 13_990_048_548 | 68_228_312 | 252_211 | 650_405 | 642_099 | 1_782_978_890 |
| persistentset_baseline+1000000 | 196_855 | 6_241_486_671 | 36_000_508 | 114_567 | 305_638 | 327_353 | 551_652 |
| persistentset+1000000 | 198_472 | 6_458_837_421 | 40_000_544 | 114_574 | 314_842 | 382_569 | 505_973 |

set API

| | size | intersect | union | diff | equals | isSubset |
|:--|--:|--:|--:|--:|--:|--:|
| trieset+100 | 100 | 352_496 | 411_306 | 350_935 | 201_896 | 201_456 |
| persistentset_baseline+100 | 100 | 157_677 | 170_099 | 219_483 | 156_755 | 157_147 |
| persistentset+100 | 100 | 137_470 | 137_822 | 163_025 | 157_191 | 156_752 |
| trieset+1000 | 1000 | 731_650 | 1_079_906 | 912_629 | 2_589_090 | 4_023_673 |
| persistentset_baseline+1000 | 1000 | 339_879 | 516_679 | 946_277 | 1_851_541 | 1_851_538 |
| persistentset+1000 | 1000 | 257_226 | 255_676 | 359_488 | 1_851_541 | 1_851_538 |
| trieset+10000 | 10000 | 3_986_854 | 21_412_306 | 5_984_106 | 46_174_710 | 31_885_381 |
| persistentset_baseline+10000 | 10000 | 450_437 | 1_056_800 | 2_154_884 | 28_585_151 | 28_585_107 |
| persistentset+10000 | 10000 | 366_799 | 388_139 | 565_939 | 29_005_151 | 29_005_107 |
| trieset+100000 | 100000 | 178_863_894 | 209_889_623 | 199_028_396 | 521_399_350 | 521_399_346 |
| persistentset_baseline+100000 | 100000 | 589_221 | 1_922_651 | 3_769_853 | 317_013_739 | 317_013_654 |
| persistentset+100000 | 100000 | 464_937 | 530_657 | 757_591 | 321_213_698 | 321_213_531 |
| trieset+1000000 | 1000000 | 1_782_977_198 | 2_092_850_787 | 1_984_818_266 | 5_813_335_155 | 5_813_335_151 |
| persistentset_baseline+1000000 | 1000000 | 675_690 | 3_032_485 | 6_064_414 | 3_473_642_146 | 3_473_641_487 |
| persistentset+1000000 | 1000000 | 571_282 | 683_965 | 950_978 | 3_515_640_465 | 3_515_640_175 |

new set API

| | size | foldLeft | foldRight | mapfilter | map |
|:--|--:|--:|--:|--:|--:|
| persistentset_baseline | 100 | 85_458 | 86_734 | 158_197 | 315_166 |
| persistentset | 100 | 85_458 | 86_734 | 161_478 | 324_783 |
| persistentset_baseline | 1000 | 821_920 | 833_624 | 3_050_847 | 4_468_816 |
| persistentset | 1000 | 821_961 | 834_019 | 3_153_082 | 4_607_544 |
| persistentset_baseline | 10000 | 15_241_830 | 15_354_246 | 38_010_778 | 65_203_581 |
| persistentset | 10000 | 15_661_830 | 15_774_205 | 39_261_257 | 67_405_966 |
| persistentset_baseline | 100000 | 152_314_883 | 153_422_513 | 450_944_413 | 789_670_737 |
| persistentset | 100000 | 156_514_924 | 157_622_882 | 465_500_521 | 815_720_516 |
| persistentset_baseline | 1000000 | 1_523_084_378 | 1_534_107_566 | 5_197_944_386 | 9_265_810_970 |
| persistentset | 1000000 | 1_565_084_255 | 1_576_108_427 | 5_363_388_523 | 9_566_516_387 |

Store the black height in the type to avoid recalculating it for set
operations.
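
A minimal Python sketch of the idea, not the Motoko code from this PR. It assumes an empty tree has black height 0 and that a constructor can derive a node's black height from its left child; `Node`, `mk`, `stored_bh`, and `walked_bh` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

RED, BLACK = "R", "B"


@dataclass(frozen=True)
class Node:
    color: str
    left: Optional["Node"]
    value: int
    right: Optional["Node"]
    bh: int  # cached black height of this subtree (hypothetical field)


def stored_bh(t: Optional[Node]) -> int:
    """O(1): read the cached black height (assumed 0 for the empty tree)."""
    return 0 if t is None else t.bh


def mk(color: str, left: Optional[Node], value: int, right: Optional[Node]) -> Node:
    """Build a node and cache its black height.

    Assumes both children already have the same black height,
    which the red-black invariant guarantees for a valid tree.
    """
    bh = stored_bh(left) + (1 if color == BLACK else 0)
    return Node(color, left, value, right, bh)


def walked_bh(t: Optional[Node]) -> int:
    """O(log n): recompute the black height by walking the left spine."""
    h = 0
    while t is not None:
        if t.color == BLACK:
            h += 1
        t = t.left
    return h


# A small valid tree: black root with two black children.
tree = mk(BLACK, mk(BLACK, None, 1, None), 2, mk(BLACK, None, 3, None))
assert stored_bh(tree) == walked_bh(tree) == 2
```

The trade-off in this sketch is the same one the benchmarks show: every node carries one extra field, while any operation that needs the height reads it instead of walking the tree.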

GoPavel commented Oct 17, 2024

I see that all operations except intersect, union, and diff become ~3-4% slower. However, the speedup of intersect, union, and diff is significant.

I am wondering if we can precalculate the black depth only there (in intersect, union, and diff).
Or maybe we can make functions like join take and return a pair (blackDepth, tree) instead of just a tree.
WDYT?
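
A rough Python sketch of what the pair-based shape could look like, under the assumption that the height travels next to the tree instead of inside each node. The constructors `empty`, `red`, and `black` are hypothetical and only show the data flow, not a full join.

```python
# A tree is None (empty) or a tuple (color, left, value, right).
# A "sized" tree is a pair (black_depth, tree); nodes store no height.


def empty():
    """The empty tree, with an assumed black depth of 0."""
    return (0, None)


def red(left, value, right):
    """Combine two sized trees under a red node; the depth is unchanged."""
    (lh, lt), (rh, rt) = left, right
    assert lh == rh, "children must have equal black depth"
    return (lh, ("R", lt, value, rt))


def black(left, value, right):
    """Combine two sized trees under a black node; the depth grows by one."""
    (lh, lt), (rh, rt) = left, right
    assert lh == rh, "children must have equal black depth"
    return (lh + 1, ("B", lt, value, rt))


root = black(red(empty(), 1, empty()), 2, red(empty(), 3, empty()))
depth, tree = root
assert depth == 1
```

In this shape every internal function that changes a node's color or structure also has to return the adjusted depth, so the bookkeeping moves from the data type into the code.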

GoPavel added the experiment label Oct 21, 2024
Sereja313 mentioned this pull request Oct 22, 2024

Sereja313 (Member, Author) commented

I tried calculating bh in these functions in #31, but it's not easy to track how the height has to change: it depends on the color of the parent/child node and therefore requires a lot of case analysis in the internal functions, which leads to a significant slowdown. As for precalculating bh in just these functions, that would require going through all the paths of both trees, even the paths we don't actually need. I looked for examples of something like this, but there doesn't seem to be a better way than just storing bh in the type; this is also suggested in Nipkow's book.

GoPavel commented Oct 22, 2024

I think this table makes it clearer:

| | size | intersect | union | diff | equals | isSubset |
|:--|--:|--:|--:|--:|--:|--:|
| persistentset_baseline+100 | 100 | 157_677 | 170_099 | 219_483 | 156_755 | 157_147 |
| persistentset_26+100 | 100 | 137_470 | 137_822 | 163_025 | 157_191 | 156_752 |
| persistentset_31+100 | 100 | 186_143 | 193_149 | 234_366 | 156_755 | 156_752 |
| persistentset_baseline+1000 | 1000 | 339_879 | 516_679 | 946_277 | 1_851_541 | 1_851_538 |
| persistentset_26+1000 | 1000 | 257_226 | 255_676 | 359_488 | 1_851_541 | 1_851_538 |
| persistentset_31+1000 | 1000 | 432_457 | 429_026 | 720_916 | 1_851_541 | 2_565_361 |
| persistentset_baseline+10000 | 10000 | 450_437 | 1_056_800 | 2_154_884 | 28_585_151 | 28_585_107 |
| persistentset_26+10000 | 10000 | 366_799 | 388_139 | 565_939 | 29_005_151 | 29_005_107 |
| persistentset_31+10000 | 10000 | 676_919 | 691_488 | 1_452_936 | 28_584_115 | 28_584_235 |
| persistentset_baseline+100000 | 100000 | 589_221 | 1_922_651 | 3_769_853 | 317_013_739 | 317_013_654 |
| persistentset_26+100000 | 100000 | 464_937 | 530_657 | 757_591 | 321_213_698 | 321_213_531 |
| persistentset_31+100000 | 100000 | 889_597 | 961_645 | 2_178_331 | 317_012_457 | 317_012_495 |
| persistentset_baseline+1000000 | 1000000 | 675_690 | 3_032_485 | 6_064_414 | 3_473_642_146 | 3_473_641_487 |
| persistentset_26+1000000 | 1000000 | 571_282 | 683_965 | 950_978 | 3_515_640_465 | 3_515_640_175 |
| persistentset_31+1000000 | 1000000 | 1_137_064 | 1_267_131 | 3_149_146 | 3_473_639_306 | 3_473_638_647 |
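
To read the table as relative changes, a small Python helper like the following can be used. It is only a sketch: the numbers are copied from the size 1000000 rows above, and `COSTS` and `relative_change` are hypothetical names.

```python
# Benchmark numbers copied from the size = 1_000_000 rows of the table.
COSTS = {
    "baseline": {"intersect": 675_690, "union": 3_032_485, "diff": 6_064_414,
                 "equals": 3_473_642_146, "isSubset": 3_473_641_487},
    "26": {"intersect": 571_282, "union": 683_965, "diff": 950_978,
           "equals": 3_515_640_465, "isSubset": 3_515_640_175},
    "31": {"intersect": 1_137_064, "union": 1_267_131, "diff": 3_149_146,
           "equals": 3_473_639_306, "isSubset": 3_473_638_647},
}


def relative_change(variant: str, op: str) -> float:
    """Percentage change of `variant` against the baseline (negative = cheaper)."""
    base = COSTS["baseline"][op]
    return (COSTS[variant][op] - base) / base * 100.0


for variant in ("26", "31"):
    for op in COSTS["baseline"]:
        print(f"#{variant} {op}: {relative_change(variant, op):+.1f}%")
```

At this size the helper shows #26 cheaper than the baseline on intersect, union, and diff and slightly more expensive on equals and isSubset, while #31 is more expensive on intersect.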

GoPavel added the experiment:do-not-merge label (Unsuccessful optimization experiment) and removed the experiment label Oct 31, 2024