Understanding the performance difference between CLI and grep crates #1827
-
First off, this is a fantastic tool, and I think it's amazing that you've pulled the core logic into its own crates so others can reuse it in the tools they are making. I'm trying to do just that and am hitting a stumbling block around the performance of using the crate in my own code vs. using the CLI. I'm timing my own code via:

    // Wrap the search term so the pattern matches the whole line.
    let matcher = RegexMatcher::new(format!(".*{}.*", search_term).as_str())?;
    let start = Instant::now();
    let mut result = 0;
    // Count every matching line; returning Ok(true) keeps the search going.
    Searcher::new().search_path(&matcher, i, Bytes(|lnum, line| {
        result += 1;
        Ok(true)
    }))?;
    println!("elapsed={}", start.elapsed().as_secs_f32());

On my machine, that reports Running

For my own tool, I set a bunch of release profile flags in case that was it:

    [profile.release]
    lto = "fat"
    codegen-units = 1
    panic = "abort"

A 3x difference seems pretty large to me, so I'd imagine I am missing something obvious, but I was hoping someone might be able to point me in a direction to investigate.
Replies: 2 comments 8 replies
-
What happens when you run rg with
-
Ah okay, thanks for that. TL;DR: Use `RegexMatcher::new_line_matcher` instead of `RegexMatcher::new`, and that will fix most of the problem.

The issue here is subtle, and I missed the fact that your regex was `.*foo.*` instead of just plain `foo`. You also didn't show the actual ripgrep command you used. (In the future, it's best to include as much detail as you can so that others can reproduce your experiment.) It's possible you're using `.*foo.*` intentionally, but in case you're not, `foo` will also report lines containing `foo` anywhere. The only difference with `.*foo.*` is that `.*foo.*` will match the entire line instead of just `foo`.

Anyway, that leading `.*` is key because it actually inhibits a cla…
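
To make the earlier point about `foo` versus `.*foo.*` concrete, here is a minimal sketch that emulates the two patterns by hand with plain string search, so it runs without the grep crates. It assumes the search term is a literal and that each line is passed in without its terminator. The helper names `literal_span` and `wrapped_span` are hypothetical and are not part of ripgrep or its crates.

```rust
use std::ops::Range;

/// Span a bare literal pattern such as `foo` would report on one line:
/// the leftmost occurrence of the literal, if there is one.
fn literal_span(line: &str, needle: &str) -> Option<Range<usize>> {
    line.find(needle).map(|start| start..start + needle.len())
}

/// Span `.*foo.*` would report on one line. Since `.` matches everything
/// except `\n`, the match stretches over the whole line whenever the
/// literal occurs somewhere in it.
fn wrapped_span(line: &str, needle: &str) -> Option<Range<usize>> {
    line.find(needle).map(|_| 0..line.len())
}

fn main() {
    let haystack = "a foo here\nnothing\nfoo";
    for line in haystack.lines() {
        // Both helpers agree on *whether* a line matches; only the span differs.
        println!(
            "{:?}: literal={:?} wrapped={:?}",
            line,
            literal_span(line, "foo"),
            wrapped_span(line, "foo")
        );
    }
}
```

Both helpers select exactly the same lines, so a tool that only counts or prints matching lines gets the same result from either pattern. The wrapped form only matters when the match span itself is needed.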