📦 Crateification #112

Draft. Wants to merge 54 commits into base: main

Conversation

oeb25 (Collaborator) commented Oct 18, 2023

This is an experiment to see how splitting core into multiple crates will affect compile times. Whether or not we want this structure is very TBD.

Currently contains the commits of #111.

flowchart TB
  alice --> stdx
  alice --> stract_config
  alice --> stract_llm
  collector --> schema
  collector --> simhash
  collector --> stdx
  collector --> stract_config
  crawler --> distributed
  crawler --> hyperloglog
  crawler --> kv
  crawler --> sonic
  crawler --> stdx
  crawler --> stract_config
  crawler --> warc
  crawler --> webgraph
  crawler --> webpage
  entity_index --> imager
  entity_index --> kv
  entity_index --> stdx
  entity_index --> tokenizer
  imager --> distributed
  imager --> kv
  imager --> stdx
  mapreduce --> distributed
  mapreduce --> sonic
  naive_bayes --> stdx
  schema --> stdx
  schema --> tokenizer
  simhash --> tokenizer
  spell --> schema
  spell --> stdx
  spell --> stract_query
  stract_cli --> stract_config
  stract_cli --> stract_core
  stract_cli --> webgraph
  stract_config --> distributed
  stract_core --> alice
  stract_core --> collector
  stract_core --> crawler
  stract_core --> distributed
  stract_core --> entity_index
  stract_core --> executor
  stract_core --> hyperloglog
  stract_core --> imager
  stract_core --> kuchiki
  stract_core --> kv
  stract_core --> mapreduce
  stract_core --> naive_bayes
  stract_core --> optics
  stract_core --> schema
  stract_core --> simhash
  stract_core --> sonic
  stract_core --> spell
  stract_core --> stdx
  stract_core --> stract_config
  stract_core --> stract_llm
  stract_core --> stract_query
  stract_core --> tokenizer
  stract_core --> warc
  stract_core --> webgraph
  stract_core --> webpage
  stract_llm --> stdx
  stract_query --> optics
  stract_query --> schema
  stract_query --> stdx
  tokenizer --> stdx
  webgraph --> executor
  webgraph --> hyperloglog
  webgraph --> kv
  webgraph --> stdx
  webpage --> kuchiki
  webpage --> naive_bayes
  webpage --> schema
  webpage --> simhash
  webpage --> stdx
  webpage --> tokenizer
  webpage --> webgraph

@oeb25 oeb25 marked this pull request as draft October 18, 2023 21:28
@oeb25 oeb25 force-pushed the crateification branch 2 times, most recently from f9a4194 to 076d756 Compare October 19, 2023 10:51

oeb25 commented Oct 19, 2023

Current cargo check-incremental improvements:

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| update-deps | 2.414 ± 0.020 | 2.389 | 2.454 | 1.56 ± 0.03 |
| crateification | 1.543 ± 0.021 | 1.506 | 1.587 | 1.00 |

And for cargo build-incremental:

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| update-deps | 8.001 ± 0.030 | 7.971 | 8.058 | 1.20 ± 0.02 |
| crateification | 6.655 ± 0.135 | 6.389 | 6.825 | 1.00 |


oeb25 commented Oct 27, 2023

Current cargo build-incremental:

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| main | 7.504 ± 0.049 | 7.439 | 7.586 | 1.30 ± 0.02 |
| crateification | 5.771 ± 0.069 | 5.699 | 5.936 | 1.00 |

Previously, Index was passed, but only its InvertedIndex was ever used.

Some parts of query are left behind in core, since they depend heavily on
other mods such as ranking.

Previously there was a blanket impl for Doc, but since Doc is now
defined in another crate, this is no longer possible due to coherence.

To work around this, we require that implementors of AsRankingWebsite
also impl Doc, which is fairly straightforward given that they already
have to impl AsRankingWebsite.

Not as convenient, but it works!
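
A minimal sketch of the workaround described above, assuming the requirement is expressed as a supertrait bound. Only the names `Doc` and `AsRankingWebsite` come from the commit message; the methods `score` and `ranking_score`, the `Page` type, and the `other_crate` module (a stand-in for the real crate boundary) are hypothetical. Inside a single file the blanket impl would still compile, so it is shown only as a comment.

```rust
// Stand-in for the crate that now owns `Doc` (hypothetical shape).
mod other_crate {
    pub trait Doc {
        fn score(&self) -> f64;
    }
}

use other_crate::Doc;

// Assumption: the supertrait bound is what forces implementors to also impl `Doc`.
trait AsRankingWebsite: Doc {
    fn ranking_score(&self) -> f64;
}

// With `Doc` in a genuinely separate crate, a blanket impl like the one below
// is rejected by the orphan rule, because `T` is not a local type:
//
// impl<T: AsRankingWebsite> Doc for T {
//     fn score(&self) -> f64 { self.ranking_score() }
// }

// Hypothetical implementor.
struct Page {
    rank: f64,
}

impl AsRankingWebsite for Page {
    fn ranking_score(&self) -> f64 {
        self.rank
    }
}

// The manual impl each implementor now has to write; it simply delegates.
impl Doc for Page {
    fn score(&self) -> f64 {
        self.ranking_score()
    }
}

// Generic code can still call `Doc` methods through the supertrait bound.
fn total_score<T: AsRankingWebsite>(docs: &[T]) -> f64 {
    docs.iter().map(|d| d.score()).sum()
}

fn main() {
    let pages = [Page { rank: 1.5 }, Page { rank: 2.0 }];
    println!("total score: {}", total_score(&pages));
}
```

The cost is one small delegating impl per implementor, in exchange for keeping `Doc` in its own crate.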