[Feature] Streaming set operations via .from_iters() and .to_iter() functions #19

Open
tzcnt opened this issue Jun 7, 2022 · 1 comment


@tzcnt
Collaborator

tzcnt commented Jun 7, 2022

Needs

I'd like to create from_iters() and to_iter() versions of each operation. These would lazily consume each input and lazily produce each member of the result. They would not be as performant as the current exponential-search version for slice-to-slice work, but would allow for a few additional use cases:

  • Reading elements from an external source (perhaps a database query known to be sorted via UNIQUE + ORDER BY) using an asynchronous row iterator instead of needing to construct a Vec first.
  • Checking if the result of an operation (or series of operations) has any members (for example via the Iterator::any() method), without needing to materialize the entire result. For that check, this would be a big performance improvement, since evaluation can stop at the first member produced.
  • Performing a map() or filter() operation on each element of a produced result, and then inserting the result into a HashMap or other data structure, without needing an intermediate Vec.
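
To make the last two use cases concrete, here is a minimal sketch using only the standard library. It is not sdset code: `LazyIntersection`, `has_common_member` and `squares_of_common` are hypothetical names, and the sketch assumes both inputs are strictly increasing. It walks the two inputs with a plain linear merge rather than the exponential search used by the slice version.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;
use std::iter::Peekable;

/// Hypothetical lazy intersection of two strictly increasing iterators.
/// (Illustrative only; not part of sdset.)
pub struct LazyIntersection<A: Iterator, B: Iterator> {
    a: Peekable<A>,
    b: Peekable<B>,
}

impl<A: Iterator, B: Iterator> LazyIntersection<A, B> {
    pub fn new(a: A, b: B) -> Self {
        LazyIntersection { a: a.peekable(), b: b.peekable() }
    }
}

impl<T, A, B> Iterator for LazyIntersection<A, B>
where
    T: Ord,
    A: Iterator<Item = T>,
    B: Iterator<Item = T>,
{
    type Item = T;

    fn next(&mut self) -> Option<T> {
        loop {
            // Compare the two heads; stop as soon as either input is exhausted.
            let ord = match (self.a.peek(), self.b.peek()) {
                (Some(x), Some(y)) => x.cmp(y),
                _ => return None,
            };
            match ord {
                // Advance whichever side is behind.
                Ordering::Less => { self.a.next(); }
                Ordering::Greater => { self.b.next(); }
                // Heads are equal: consume both and yield one copy.
                Ordering::Equal => {
                    self.b.next();
                    return self.a.next();
                }
            }
        }
    }
}

/// Emptiness check: evaluation stops at the first common member.
pub fn has_common_member() -> bool {
    let evens = (0..1_000_000).step_by(2);
    let threes = (0..1_000_000).step_by(3);
    LazyIntersection::new(evens, threes).next().is_some()
}

/// filter() + map() straight into a HashMap, with no intermediate Vec.
pub fn squares_of_common() -> HashMap<i32, i32> {
    LazyIntersection::new(0..10, (0..10).step_by(3))
        .filter(|x| *x != 0)
        .map(|x| (x, x * x))
        .collect()
}
```

In `has_common_member` only the first element of each input is inspected, which is where the performance gain for the "has any members" check would come from.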

Proposed implementation details:

They should be compatible with the existing slices -> slice API. This means introducing three new dataflow paths: slices -> iter, iters -> slice, and iters -> iter.

Proposed Syntax:

let a: SetBuf<i32> = SetBuf::new_unchecked((0..1_000_000).collect());
let b: SetBuf<i32> = SetBuf::new_unchecked((0..1_000_000).collect());

// slices -> slice (current syntax)
let inter: SetBuf<i32> = sdset::duo::OpBuilder::new(&a, &b).intersection().into_set_buf();

// slices -> iter (proposed; the result is a lazy iterator, not a SetBuf)
let inter = sdset::duo::OpBuilder::new(&a, &b).intersection().into_iter();

// iters -> slice (proposed; the inputs are borrowed mutably so they can be advanced)
let inter: SetBuf<i32> = sdset::duo::OpBuilder::from_iters(&mut a.iter(), &mut b.iter()).intersection().into_set_buf();

// iters -> iter (proposed)
let inter = sdset::duo::OpBuilder::from_iters(&mut a.iter(), &mut b.iter()).intersection().into_iter();

To implement this, the existing Union/Intersection/Difference/SymmetricDifference types would need to be extended with an into_iter() function that would lazily consume the input slices. This would then cover the slices -> slice and slices -> iter cases.
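
As a rough illustration of the slices -> iter path, the sketch below lazily merges two sorted, deduplicated slices into their union. `UnionIter` and `union_iter` are hypothetical names, not existing sdset items, and the sketch uses a simple linear merge, so it does not reflect how the existing types search their inputs.

```rust
use std::cmp::Ordering;

/// Hypothetical lazy union over two sorted, deduplicated slices.
/// (Illustrative only; not part of sdset.)
pub struct UnionIter<'a, T> {
    a: &'a [T],
    b: &'a [T],
}

/// Hypothetical constructor; assumes both slices are sorted and deduplicated.
pub fn union_iter<'a, T: Ord>(a: &'a [T], b: &'a [T]) -> UnionIter<'a, T> {
    UnionIter { a, b }
}

impl<'a, T: Ord> Iterator for UnionIter<'a, T> {
    type Item = &'a T;

    fn next(&mut self) -> Option<&'a T> {
        // Copy the slice references so the match does not borrow `self`.
        let (a, b) = (self.a, self.b);
        match (a.split_first(), b.split_first()) {
            (Some((x, rest_a)), Some((y, rest_b))) => match x.cmp(y) {
                Ordering::Less => {
                    self.a = rest_a;
                    Some(x)
                }
                Ordering::Greater => {
                    self.b = rest_b;
                    Some(y)
                }
                // Present in both inputs: advance both, yield once.
                Ordering::Equal => {
                    self.a = rest_a;
                    self.b = rest_b;
                    Some(x)
                }
            },
            // One side is exhausted: drain the other.
            (Some((x, rest_a)), None) => {
                self.a = rest_a;
                Some(x)
            }
            (None, Some((y, rest_b))) => {
                self.b = rest_b;
                Some(y)
            }
            (None, None) => None,
        }
    }
}
```

Each call to next() only shrinks the two borrowed slices, so nothing is allocated until the caller decides to collect.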

New types UnionOfIters/IntersectionOfIters/DifferenceOfIters/SymmetricDifferenceOfIters would need to be created whose a and b input fields are &'a mut dyn Iterator<Item = T>. These would then have the into_set_buf() and into_iter() functions implemented, covering the iters -> slice and iters -> iter cases.
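
A possible shape for one of these types, sketched with the standard library only. The struct name and the `a`/`b` fields follow the proposal; `b_head`, `from_iters` as an associated function on the type, and `into_vec` (standing in for into_set_buf(), which would need sdset) are assumptions made for this sketch. Both inputs are assumed to be strictly increasing.

```rust
use std::cmp::Ordering;

/// Sketch of the proposed `DifferenceOfIters`: yields members of `a` absent from `b`.
/// (Illustrative only; not part of sdset.)
pub struct DifferenceOfIters<'a, T> {
    a: &'a mut dyn Iterator<Item = T>,
    b: &'a mut dyn Iterator<Item = T>,
    // Assumed helper field: buffered head of `b`, since `dyn Iterator` cannot peek.
    b_head: Option<T>,
}

impl<'a, T: Ord> DifferenceOfIters<'a, T> {
    pub fn from_iters(
        a: &'a mut dyn Iterator<Item = T>,
        b: &'a mut dyn Iterator<Item = T>,
    ) -> Self {
        let b_head = b.next();
        DifferenceOfIters { a, b, b_head }
    }

    /// Stand-in for into_set_buf(): drains the lazy result into a Vec.
    pub fn into_vec(self) -> Vec<T> {
        self.collect()
    }
}

impl<'a, T: Ord> Iterator for DifferenceOfIters<'a, T> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        'outer: loop {
            // Take the next candidate from `a`; finish when `a` is exhausted.
            let x = self.a.next()?;
            loop {
                match self.b_head.as_ref().map(|y| y.cmp(&x)) {
                    // `b` is behind the candidate: advance it and compare again.
                    Some(Ordering::Less) => self.b_head = self.b.next(),
                    // Candidate is also in `b`: skip it.
                    Some(Ordering::Equal) => continue 'outer,
                    // `b` is ahead or exhausted: the candidate is in the difference.
                    _ => return Some(x),
                }
            }
        }
    }
}
```

Because the type implements Iterator, into_iter() comes for free through the standard blanket IntoIterator impl. Using dyn fields keeps the type signature short at the cost of dynamic dispatch on every next() call; generic parameters would be the alternative.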

@Kerollmops
Owner

I just sent you an invitation to help me maintain this repository, you can accept it, it should be in your inbox 😃
