Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add some rust developer docs! #3150

Open
ctb opened this issue May 10, 2024 · 10 comments
Open

add some rust developer docs! #3150

ctb opened this issue May 10, 2024 · 10 comments

Comments

@ctb
Copy link
Contributor

ctb commented May 10, 2024

how to run the tests right: #3138 (comment)

@ctb
Copy link
Contributor Author

ctb commented Jun 8, 2024

syntax for specifying a specific branch when simultaneously working on sourmash-rs and external modules (e.g. a plugin that uses the rs interface) - from #3193 / sourmash-bio/sourmash_plugin_branchwater#348

sourmash = { git = "https://github.com/sourmash-bio/sourmash.git", branch = "debug_rust_gather", features = ["branchwater"] }

And each time you push to the new sourmash branch, you'll need to run:

cargo update -p sourmash

in the dependent repository to get it to recognize the updated dependency.

@ctb
Copy link
Contributor Author

ctb commented Jun 8, 2024

this let me get tracing output:

diff --git a/src/core/Cargo.toml b/src/core/Cargo.toml
index 91b76d62..7dfd9ffa 100644
--- a/src/core/Cargo.toml
+++ b/src/core/Cargo.toml
@@ -33,6 +33,7 @@ cfg-if = "1.0"
 counter = "0.5.7"
 csv = "1.3.0"
 enum_dispatch = "0.3.13"
+env_logger = "0.11.3"
 finch = { version = "0.6.0", optional = true }
 fixedbitset = "0.4.0"
 getrandom = { version = "0.2", features = ["js"] }
diff --git a/src/core/src/index/revindex/mod.rs b/src/core/src/index/revindex/mod.rs
index 0e3e443b..4b3c5bde 100644
--- a/src/core/src/index/revindex/mod.rs
+++ b/src/core/src/index/revindex/mod.rs
@@ -12,6 +12,7 @@ use getset::{Getters, Setters};
 use nohash_hasher::BuildNoHashHasher;
 use roaring::RoaringBitmap;
 use serde::{Deserialize, Serialize};
+use env_logger::try_init;
 
 use crate::collection::CollectionSet;
 use crate::encodings::{Color, Colors, Idx};
@@ -591,6 +592,8 @@ mod test {
 
     #[test]
     fn revindex_load_and_gather_2() -> Result<()> {
+        let _ = env_logger::try_init();
+
         let mut basedir = PathBuf::from(env!("CARGO_MANIFEST_DIR"));
         basedir.push("../../tests/test-data/gather/");

command to run w/trace output:

RUST_BACKTRACE=full RUST_LOG=trace cargo test --features branchwater index::revindex::test::revindex_load_and_gather_2

@ctb
Copy link
Contributor Author

ctb commented Jun 10, 2024

also: core release procedure! #2987

@olgabot
Copy link
Collaborator

olgabot commented Jun 13, 2024

Since the tests in sourmash_plugin_branchwater are run with pytest and not cargo, how can you get the Rust backtrace for the branchwater code like #3150 (comment)?

@olgabot
Copy link
Collaborator

olgabot commented Jun 13, 2024

When developing sourmash-bio/sourmash_plugin_branchwater#354, I learned that to get the Rust code to first recompile with maturin develop (and give me compiler error messages!) before running the tests, for example, choosing only the test index_multiple_manifests with verbose mode (-v) and the Python debugger on (--pdb):

maturin develop && pytest -v -k index_multiple_manifests --pdb

@olgabot
Copy link
Collaborator

olgabot commented Jun 21, 2024

Compile documentation for the project and all dependencies with cargo doc --open, which will open an HTML page with all documentation, as described in the Rust Book

@olgabot
Copy link
Collaborator

olgabot commented Jun 28, 2024

To get the rust-analyzer VS Code extension working with an Anaconda installation of Rust, I needed to add to my .vscode/settings.json:

    "rust-analyzer.server.extraEnv": {
        "CARGO": "/Users/olgabot/anaconda3/envs/branchwater-dev/bin/cargo",
        "RUSTC": "/Users/olgabot/anaconda3/envs/branchwater-dev/bin/rustc",
    },

Hoping to save others some headache in the future!

@ctb
Copy link
Contributor Author

ctb commented Aug 20, 2024

Over in #3249, I asked:

how do you feel about adding anyhow to sourmash-rs core?

and on slack, luiz responded:

re anyhow: I usually follow this advice
https://github.com/dtolnay/anyhow?tab=readme-ov-file#comparison-to-thiserror

Use Anyhow if you don't care what error type your functions return, you just want it to be easy. This is common in application code. Use thiserror if you are a library that wants to design your own dedicated error type(s) so that on failures the caller gets exactly the information that you choose.

somewhat same with env_logger: it's for applications, not libs. tho if you're looking for it in tests you can init it there (and make it a dev-dependency)


note that thiserror is indeed part of sourmash already :)

(transferring this here because I was keeping #3249 open only because of this comment 😆 )

@ctb
Copy link
Contributor Author

ctb commented Aug 31, 2024

hmm, maturin import hook => automatically runs maturin develop when code changes are present in rust code.

Once the hook is active, any import statement that imports an editable-installed maturin project will be automatically rebuilt if necessary before it is imported.

@olgabot
Copy link
Collaborator

olgabot commented Oct 1, 2024

hmm, maturin import hook => automatically runs maturin develop when code changes are present in rust code.

Once the hook is active, any import statement that imports an editable-installed maturin project will be automatically rebuilt if necessary before it is imported.

Related to this, it seems to me that re-building librocksdb-sys takes up a substantial amount (50%+??) of the build time. Is it possible to only re-build the sourmash part and not the dependencies with maturin develop? Adding --skip-install didn't help.

Building [=======================> ] 261/266: librocksdb-sys(build)  

Her's an example build, which is a bit delayed because of a lock on the build due to rust-analyzer running in my VS Code, but still shows that librocksdb-sys takes forever!

And all that I changed for this build was adding a consuming variable:

        frequencies
            .values_mut()
            .map(| freq| 
                freq.ln()
            );

to

        let _ = frequencies
            .values_mut()
            .map(| freq| 
                freq.ln()
            );

How does that trigger a full rebuild, especially of librocks-db?

Here's a gif of the build, which took 5 minutes from 16:00 UTC to 16:05 UTC: (screencast was resized from 15MB to 10MB to be under the 10MB limit for file uploads to GitHub, sorry if it looks weird)

2024-10-01maturindeveloprebuildseverythingrocksdbtakesforever-ezgif com-resize

Other PRs complain about this behavior (PyO3/maturin#504, rerun-io/rerun#3474), and say that updating maturin to 1.3.1 fixes things, but I'm on 1.7.4... so I'm not sure how to fix things. Any thoughts?

(branchwater)
 ✘  Tue  1 Oct - 15:54  ~/sourmash_plugin_branchwater   origin ☊ olgabot/multisearch-evalue ↑1 1● 
 ec2-user@x86_64-conda-linux-gnu  maturin --version                                     
maturin 1.7.4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants