Add `zip_clones`, zips an iterator with clones of a value #989

barakugav · 2024-09-01T09:44:39Z

Similar to `iter.zip(repeat_n(val, n))' but does not require knowing n

phimuemue · 2024-09-01T18:28:58Z

Hi there, foremost: Thanks for this.

Honestly, I don't believe that is sufficiently useful to justify an own implementation. Is the intention to avoid the clone call for the last element? My answer then would be: If cloneing is really that expensive, why not always work with the borrowed value? Is the one clone call decisive for run-time to justify the existence of this iterator? And: Why is avoiding the clones is only relevant in zip-contexts?

If my assumptions are correct, I personally would advise against inclusion of this.

Aside: Couldn't the unsafe code be avoided if the struct held a Option<(Iter::Item, Zipped)> instead of two Options?

barakugav · 2024-09-02T06:09:12Z

I think it is useful in many use cases, take for example mine:
I have a multi threaded system with threads communicating using crossbeam channels. When a thread wants to send some non-Copy data to the others it does so by value, therefore moving, so it must clone the value to be able to send it to multiple threads (unknown number). The data packets are few hundreds bytes, but the frequency of sending is high. In addition, many times a thread is connected to a single other thread only, in which case no clone is necessary in the first place. All in all, its ugly to avoid cloning without such iterator, Arc is not ideal because it introduce another pointer indirection and allocation on the heap.
The code becomes much more elegant when using such iterator.

Avoiding the clone is not relevant only in zip contexts, but its the smoother way to clone something n-1 times when you already have a iterator that iterate n times, even if you dont know n.

Thanks for the comment regarding unsafe, ill happily fix it if you are willing to go forward with the change :)
@phimuemue

phimuemue · 2024-09-02T18:22:27Z

Hm... I hope I don't miss the forest for the trees, but I am not convinced.

Maybe @jswrenn or @Philippe-Cholet groks your examples. Could you share a boiled down version of your use case to show why zip_clones is so much better than the alternatives?

barakugav · 2024-09-03T07:15:51Z

Here is a boiled down version of my use case:

#[derive(Clone)]
struct Data {
    values: Vec<f64>,
    metadata_field1: usize,
    metadata_field2: usize,
    // ...
}

struct Component {
    inputs: Vec<Receiver<Data>>,
    outputs: Vec<Sender<Data>>,
}

impl Component {
    fn component_main(&self) {
        let mut sel = Select::new();
        for input in self.inputs.iter() {
            sel.recv(input);
        }

        loop {
            let oper = sel.select();
            let idx = oper.index();
            let data = oper.recv(&self.inputs[idx]).unwrap();

            let result = self.process_data(data);

            for output in self.outputs.iter() {
                output.send(result.clone()).unwrap();
            }
        }
    }

    fn process_data(&self, data: Data) -> Data {
        unimplemented!()
    }
}

Each Component::component_main is run by a different thread.
The components are connected using crossbeam channels, with arbitrary number of inputs/outputs.
Each component waits for input data, compute something, and pass a result data to its outputs.
The Data structs is non-Copy, few hundreds bytes, but is usually sent in high frequencies between the components.

The above implementation use a trivial approach cloning the result for each output.

for output in self.outputs.iter() {
    output.send(result.clone()).unwrap();
}

Given a zip_clones method, the implementation stays clean, but avoid the last clone.

for (output, result) in self.outputs.iter().zip_clones(result) {
    output.send(result).unwrap();
}

Without such a method, you could implement it as follows:

if !self.outputs.is_empty() {
    for output in self.outputs[..self.outputs.len() - 1].iter() {
        output.send(result.clone()).unwrap();
    }
    self.outputs[self.outputs.len() - 1].send(result).unwrap();
}

The above relay on the fact the outputs are stored in a vector, as we must know the number of outputs.
If the only information we have about the outputs is that we can iterate over them, the implementation becomes more messy.

let mut outputs = self.outputs.iter();
let mut cur = outputs.next();
if cur.is_some() {
    let mut next = outputs.next();
    while next.is_some() {
        cur.unwrap().send(result.clone()).unwrap();
        cur = next;
        next = outputs.next();
    }
    cur.unwrap().send(result).unwrap();
}

To emphasise, I want to avoid the last clone because many times a component is connected to only one other component, namely it has a single output, and no allocation should be done at all. Even for cases in which I have few outputs, the data packets are flowing in high frequencies and the last clone is redundant.
Using an Arc can solve the problem, but many components actually do want to consume the packet, either to modify it or steal its internal, so they would clone the inner of Arc anyway. Also it introduce another heap allocation and indirection, which I would love to avoid.

Add zip_clones, zips an iterator with clones of a value

f04a047

Similar to `iter.zip(repeat_n(val, n))' but does not require knowing n

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `zip_clones`, zips an iterator with clones of a value #989

Add `zip_clones`, zips an iterator with clones of a value #989

barakugav commented Sep 1, 2024

phimuemue commented Sep 1, 2024

barakugav commented Sep 2, 2024 •

edited

Loading

phimuemue commented Sep 2, 2024

barakugav commented Sep 3, 2024 •

edited

Loading

Add zip_clones, zips an iterator with clones of a value #989

Are you sure you want to change the base?

Add zip_clones, zips an iterator with clones of a value #989

Conversation

barakugav commented Sep 1, 2024

phimuemue commented Sep 1, 2024

barakugav commented Sep 2, 2024 • edited Loading

phimuemue commented Sep 2, 2024

barakugav commented Sep 3, 2024 • edited Loading

Add `zip_clones`, zips an iterator with clones of a value #989

Add `zip_clones`, zips an iterator with clones of a value #989

barakugav commented Sep 2, 2024 •

edited

Loading

barakugav commented Sep 3, 2024 •

edited

Loading