
Filters seem to slow down as graph size increases #65

Open
alexgcrombie opened this issue Oct 4, 2022 · 9 comments

@alexgcrombie

While porting the petrol pump example from the FRP book, I noticed that running just the lifecycle portion (the Rust equivalent of listing 4.5) took 1-2 s to process a single event in debug mode, reduced to 200 ms in a release build. The code is written to match the Java code as closely in form as I could manage (all of the inputs are modeled as sinks, which then generate the "inputs" interface that the book presents).

In another test application of my own I noticed a 5x slowdown when using filter_option (though from my testing the timing is very similar for a plain filter as well), this time going from ~200 µs to ~1 ms in debug mode, and from ~27 µs to ~200 µs in release.

One thing to note is that this is being built on Linux using the latest release of Rust.
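
For reference, the supporting types in my port look roughly like this (a sketch, not the exact definitions; Animation and its OFF/THROB/SOLID constants are my assumed names here):

use sodium_rust::{Cell, Stream};

#[derive(Clone, PartialEq)]
pub enum ChargingStatus {
    Disconnected,
    Charging,
    Charged,
}

#[derive(Clone)]
pub enum LedName {
    Green1,
}

#[derive(Clone, Copy)]
pub enum Animation {
    Off,
    Throb,
    Solid,
}

pub const OFF: Animation = Animation::Off;
pub const THROB: Animation = Animation::Throb;
pub const SOLID: Animation = Animation::Solid;

#[derive(Clone)]
pub enum UiUpdates {
    LedAnimation { led: LedName, animation: Animation },
}

pub struct ChargerCtx {
    pub state: Cell<ChargingStatus>,
    pub updates: Stream<UiUpdates>,
}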

The code is as follows for the normal case:

pub fn charger_setup( s: &SodiumCtx, charger_events: &Stream<ChargingStatus> ) -> ChargerCtx {
    let state = charger_events.map( | e: &ChargingStatus | -> ChargingStatus { e.clone() } ).hold( ChargingStatus::Disconnected );

    let updates = charger_events.map( | e: &ChargingStatus | -> UiUpdates {
        match e {
            ChargingStatus::Disconnected => UiUpdates::LedAnimation{ led: LedName::Green1, animation: OFF },
            ChargingStatus::Charging => UiUpdates::LedAnimation{ led: LedName::Green1, animation: THROB },
            ChargingStatus::Charged => UiUpdates::LedAnimation{ led: LedName::Green1, animation: SOLID },
        }
    } );

    ChargerCtx {
        state,
        updates,
    }
}

and the filter-enabled code:

pub fn charger_setup_filter_option( s: &SodiumCtx, charger_events: &Stream<ChargingStatus> ) -> ChargerCtx {
    let state = charger_events.map( | e: &ChargingStatus | -> ChargingStatus { e.clone() } ).hold( ChargingStatus::Disconnected );

    let updates1 = Stream::filter_option( &charger_events.snapshot( &state, | evt: &ChargingStatus, state_: &ChargingStatus | -> Option<UiUpdates> {
            if evt != state_ {
                match evt {
                    ChargingStatus::Disconnected => Some( UiUpdates::LedAnimation{ led: LedName::Green1, animation: OFF } ),
                    ChargingStatus::Charging => Some( UiUpdates::LedAnimation{ led: LedName::Green1, animation: THROB } ),
                    ChargingStatus::Charged => Some( UiUpdates::LedAnimation{ led: LedName::Green1, animation: SOLID } ),
                }
            } else {
                None
            }
        }
    ) );

    ChargerCtx {
        state,
        updates: updates1,
    }
}

The above code is the only portion using filters, and it results in the following timings. For the non-filtered case:

  setup: 552us
  send( charging ): 146us
  send( charged ): 132us
  send( disconnected ): 129us

and for the filter using case:

  setup: 1699us
  send( charging ): 910us
  send( charged ): 915us
  send( disconnected ): 963us

The above are the debug timings, but the ratio between the two cases remains similar even in a release build, where the timings are ~23 µs (with no filters) and ~160 µs (with filters).
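
For completeness, these numbers come from a harness along these lines (a sketch; new_stream_sink, stream(), listen and send are the standard sodium-rust entry points, but the measurement code is paraphrased here):

use std::time::Instant;
use sodium_rust::SodiumCtx;

fn main() {
    let ctx = SodiumCtx::new();
    let sink = ctx.new_stream_sink::<ChargingStatus>();

    let t = Instant::now();
    let charger = charger_setup(&ctx, &sink.stream());
    // Keep a listener alive so updates actually propagate.
    let _listener = charger.updates.listen(|_u: &UiUpdates| {});
    println!("setup: {}us", t.elapsed().as_micros());

    for status in [
        ChargingStatus::Charging,
        ChargingStatus::Charged,
        ChargingStatus::Disconnected,
    ] {
        let t = Instant::now();
        sink.send(status);
        println!("send: {}us", t.elapsed().as_micros());
    }
}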

@alexgcrombie
Author

The timings appear to be similar in a Windows build of the lifecycle portion of the petrol pump example.

The main issue is not necessarily the increase in time taken when using filter, but the fact that, to keep a UI responsive, processing a single event should ideally take <1 ms and certainly <25 ms. The code posted above is a pretty trivial example and is already approaching the 1 ms mark (admittedly in debug).

@clinuxrulz
Collaborator

I haven't had time to maintain sodium lately. I did read the code you emailed me, and the performance issue does seem to be on the sodium-rust side.
I suggest forking sodium-rust and profiling it internally. I suspect it may be calling collect_cycles() (the internal garbage collector for sodium objects) too often. We could comment out collect_cycles, or add an early return inside it, to see if that makes a difference to performance.
Another thing I can think of: all the reference counting is atomic at the moment (Arc). Replacing all the Arc<> with Rc<> could help too.
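
For the collect_cycles experiment, something as blunt as this would do (a sketch; the exact signature and location in the gc module may differ):

pub fn collect_cycles(&self) {
    // Temporary experiment: skip cycle collection entirely to see how much
    // of the slowdown it accounts for. Leaks cyclic garbage while enabled.
    if true {
        return;
    }
    // ... original body ...
}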

@alexgcrombie
Author

Thank you for taking the time to review this issue; I will give those suggestions a try. I may need some support once I get stuck (if you are willing), as I am pretty new to Rust.

@alexgcrombie
Author

alexgcrombie commented Oct 10, 2022

collect_cycles is behaving sensibly: in a simple test that still shows this issue, it is only called 3 times.

It turns out this issue is unrelated to the use of filter and is instead driven by the number of nodes (in what appears to be a chain). I created a test that strings together a number of maps in the following way:

#[derive(Clone,Copy)]
enum SomeVal {
    Val1,
    Val2,
    Val3,
}

fn generate_nodes( s: &Stream<SomeVal>, count: u32 ) -> Stream<SomeVal> {
    let new_s = s.map( | v: &SomeVal | -> SomeVal { v.clone() } );
    match count {
        0 => new_s,
        _ => generate_nodes( &new_s, count - 1 ),
    }
}
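
Each row in the table below came from a driver along these lines (a sketch; the actual test code is paraphrased):

use std::time::Instant;
use sodium_rust::SodiumCtx;

fn main() {
    let ctx = SodiumCtx::new();
    for count in 0..=22u32 {
        let sink = ctx.new_stream_sink::<SomeVal>();
        let out = generate_nodes(&sink.stream(), count);
        // Keep a listener alive so the chain actually fires.
        let _listener = out.listen(|_v: &SomeVal| {});
        let t = Instant::now();
        sink.send(SomeVal::Val1);
        println!("{}: {}us", count, t.elapsed().as_micros());
    }
}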

The test results for this show a clearly exponential trend:

Count   Node count   Time to send 1 event (us)
  0         2                 44
  1         3                 61
  2         4                 93
  3         5                145
  4         6                299
  5         7                655
  6         8               1049
  7         9               2300
  8        10               2402
  9        11               8057
 10        12              12567
 11        13              18865
 12        14              37039
 13        15              64412
 14        16             195225
 15        17             383411
 16        18             504266
 17        19             982780
 18        20            2041202
 19        21            3885266
 20        22            7953937
 21        23           15776052
 22        24           32014120

The majority of this time is spent in mark_roots(), and more specifically in resetting the reference-count adjustments of the adjacent nodes:

fn mark_roots(&self) {
    ...
    for root in &old_roots {
        self.reset_ref_count_adj(root);
    }
    ...
}

If each root in old_roots in the above loop is given an index, the time taken for that iteration grows roughly as 2^<Index>. Digging further, the problem seems to be that reset_ref_count_adj() never exits early: visited is cleared again on the way back out of the recursion, so shared subgraphs are re-traced in full on every path that reaches them (the trace below shows each upstream node being visited twice per parent). The tracing added is as follows:

fn mark_roots(&self) {
    ...
    let mut root_num = 0;
    for root in &old_roots {
        let tt1 = Instant::now();
        let depth = self.reset_ref_count_adj(root);
        let tt2 = Instant::now();
        println!( "mark_root: reset_ref: {},{},{}", root_num, depth, (tt2-tt1).as_micros() );
        root_num = root_num + 1;
    }
    ...
}

and

fn reset_ref_count_adj(&self, s: &GcNode) -> i32 {
    println!( "ref_reset: {},{},{}", s.id, s.name, s.data.visited.get() );
    if s.data.visited.get() {
        return 0;
    }
    s.data.visited.set(true);
    s.data.ref_count_adj.set(0);
    let mut depth = 0;
    s.trace(|t| {
        depth = self.reset_ref_count_adj(t) + 1;
    });
    s.data.visited.set(false);
    return depth;
}

When running this with generate_nodes( &sink, 4 ), it produces the following pattern:

ref_reset: 0,Stream::new,false
mark_root: reset_ref: 0,0,2
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
mark_root: reset_ref: 1,1,3
ref_reset: 2,Stream::map,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
mark_root: reset_ref: 2,2,7
ref_reset: 3,Stream::map,false
ref_reset: 2,Stream::map,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 2,Stream::map,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
mark_root: reset_ref: 3,3,16
...
ref_reset: 5,Stream::map,false
ref_reset: 4,Stream::map,false
ref_reset: 3,Stream::map,false
ref_reset: 2,Stream::map,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 2,Stream::map,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 3,Stream::map,false
ref_reset: 2,Stream::map,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 2,Stream::map,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 4,Stream::map,false
ref_reset: 3,Stream::map,false
ref_reset: 2,Stream::map,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 2,Stream::map,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 3,Stream::map,false
ref_reset: 2,Stream::map,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 2,Stream::map,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
ref_reset: 1,Stream::map,false
ref_reset: 0,Stream::new,false
ref_reset: 0,Stream::new,false
mark_root: reset_ref: 5,5,68

From the above it feels like the visited.set(true)/visited.set(false) pair in reset_ref_count_adj() is wrong, but I can't quite convince myself that it is indeed incorrect, or, if it is, what the correct logic should be.

@clinuxrulz
Collaborator

Thanks for digging into that. If I can find time, I will have a closer look. So it must be a garbage collector bug.

@alexgcrombie
Copy link
Author

It does indeed appear to be the case. It would be much appreciated if you could give a quick description of how the GC algorithm is supposed to work. I am planning on looking into it a bit further in parallel, so it would be good to have some guidance on how it is intended to work.

@clinuxrulz
Collaborator

clinuxrulz commented Oct 13, 2022

https://pages.cs.wisc.edu/~cymen/misc/interests/Bacon01Concurrent.pdf

The non-concurrent algorithm in that PDF.

If I remember correctly, ref_count_adj is a temporary negative count (represented as a positive integer) used when detecting cycles. I.e., if ref_count - ref_count_adj == 0, then a cycle is found; otherwise ref_count_adj is reset to zero at the end when no cycle is found.
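
To make that concrete, here is a self-contained toy of the trial-deletion idea (a schematic I'm sketching here, not sodium's actual code; the paper's algorithm also has a scan phase that re-blackens externally referenced subgraphs, which this omits):

// Toy trial deletion: a node is cyclic garbage iff, after trial-deleting
// every edge reachable from a suspected root, all of its references are
// accounted for internally (ref_count == ref_count_adj).
struct Node {
    ref_count: usize,     // real reference count
    ref_count_adj: usize, // trial-deleted references (temporary)
    children: Vec<usize>, // traced edges, by index
}

fn find_cycle_candidates(nodes: &mut [Node], root: usize) -> Vec<usize> {
    // Phase 1: trial-delete every edge reachable from the root.
    let mut visited = vec![false; nodes.len()];
    let mut stack = vec![root];
    while let Some(i) = stack.pop() {
        if visited[i] {
            continue;
        }
        visited[i] = true;
        for c in nodes[i].children.clone() {
            nodes[c].ref_count_adj += 1;
            stack.push(c);
        }
    }
    // Phase 2: nodes whose every reference was trial-deleted are kept
    // alive only by the cycle. (The full algorithm re-checks these by
    // scanning from any node with ref_count > ref_count_adj.)
    let candidates: Vec<usize> = (0..nodes.len())
        .filter(|&i| visited[i] && nodes[i].ref_count == nodes[i].ref_count_adj)
        .collect();
    // Phase 3: undo the adjustments, as reset_ref_count_adj is meant to.
    for n in nodes.iter_mut() {
        n.ref_count_adj = 0;
    }
    candidates
}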

@clinuxrulz
Collaborator

clinuxrulz commented Nov 25, 2022

@alexgcrombie I believe you are right about that visited.set(true) and visited.set(false) pair causing problems in reset_ref_count_adj(). The false needs to be set later, after the loop through old_roots. Also, old_roots (and possibly roots) could probably use a HashSet instead of a Vec to avoid duplicate traces/visits.

@clinuxrulz
Collaborator

clinuxrulz commented Nov 27, 2022

@alexgcrombie I split reset_ref_count_adj into two so that visited is set to false on a second pass of the roots. You can try it out to see if it helps with that exponential growth. It seemed faster when running the regular tests.
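
Roughly this shape (a sketch of the change, not the exact commit; clear_visited is my name for the second pass):

fn mark_roots(&self) {
    ...
    for root in &old_roots {
        // visited stays true here, so shared subgraphs are traced once.
        self.reset_ref_count_adj(root);
    }
    for root in &old_roots {
        // Second pass: clear the flags only after every root is done.
        self.clear_visited(root);
    }
    ...
}

fn reset_ref_count_adj(&self, s: &GcNode) {
    if s.data.visited.get() {
        return;
    }
    s.data.visited.set(true);
    s.data.ref_count_adj.set(0);
    s.trace(|t| {
        self.reset_ref_count_adj(t);
    });
}

fn clear_visited(&self, s: &GcNode) {
    if !s.data.visited.get() {
        return;
    }
    s.data.visited.set(false);
    s.trace(|t| {
        self.clear_visited(t);
    });
}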

Constructing your graph within a transaction helps a lot too; collect_cycles only gets executed outside transactions.
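
For example (a sketch; SodiumCtx::transaction is the standard entry point):

// Building the whole graph inside one transaction means collect_cycles()
// only runs once, when the outermost transaction closes.
let ctx = SodiumCtx::new();
let sink = ctx.new_stream_sink::<SomeVal>();
let out = ctx.transaction(|| generate_nodes(&sink.stream(), 22));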
