[Perf] Alcor Control Agent Performance Profiling #441

xieus · 2020-10-21T22:14:45Z

Request

Set up a performance profiling framework for ACA
Collect latency and throughput metrics for large payloads
Optimize ACA multithreading
Look into narrowing the locking scope to improve performance under high concurrency
Investigate OVS DB batch insertion to improve performance
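
To make the "narrow the locking scope" item concrete, here is a minimal sketch of the idea. It is not ACA code: `NeighborTable`, `fill`, and `run_demo` are hypothetical names used only for illustration. The point it shows is that per-entry preparation is done outside the lock, and only the update of the shared structure sits inside the critical section.

```python
import threading


class NeighborTable:
    """Hypothetical stand-in for a shared internal structure guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def add(self, ip, mac):
        # Per-entry preparation (normalization, validation) runs outside the lock.
        key = ip.strip()
        value = mac.strip().lower()
        # Only the update of the shared dictionary is inside the critical section.
        with self._lock:
            self._entries[key] = value

    def size(self):
        with self._lock:
            return len(self._entries)


def fill(table, start, count):
    # Each caller inserts a disjoint range of made-up neighbor addresses.
    for i in range(start, start + count):
        table.add(f"10.0.{i // 256}.{i % 256}", f"AA:BB:CC:00:00:{i % 256:02X}")


def run_demo(thread_count=4, per_thread=1000):
    table = NeighborTable()
    threads = [
        threading.Thread(target=fill, args=(table, n * per_thread, per_thread))
        for n in range(thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table.size()
```

Whether this helps in ACA depends on how much work currently happens while the lock is held, which is what the profiling should establish.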

xieus · 2020-10-21T22:16:34Z

Linked to umbrella issue #440.

er1cthe0ne · 2020-11-25T00:51:14Z

Per the issue description, I will break the ACA performance profiling task down into two major areas.

ACA handling of large payloads

Framework to use: aca_tests, to create a large payload and send it to ACA
Example payload: 1 port create plus 10, 100, 1,000, 10,000, 100,000 neighbors
Collect latency and throughput metrics
Identify bottlenecks and problem areas (possibly OVS)
Optimize the ACA multithreading model: do we want to limit the maximum number of parallel threads to number of CPUs * 2?
Can we bundle a batch (e.g. 10) of similar neighbors to process in a single call? This may help with the locking of ACA internal structures.
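
As a rough sketch of the last two points combined, the snippet below groups neighbors into batches of 10, processes the batches on a thread pool capped at number of CPUs * 2, and reports latency and throughput for the run. It is an illustration under stated assumptions, not ACA or aca_tests code: `chunked`, `process_batch`, and `run` are hypothetical names, and `process_batch` is a placeholder for the real per-batch work.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def chunked(items, batch_size):
    """Split a list into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def process_batch(batch):
    # Placeholder (assumption): the real work would handle one batch of
    # neighbors in a single call. Here it only reports the batch size.
    return len(batch)


def run(neighbor_count, batch_size=10):
    neighbors = [f"neighbor-{i}" for i in range(neighbor_count)]
    batches = chunked(neighbors, batch_size)
    # Cap the pool at number of CPUs * 2, as asked in the comment above.
    max_workers = (os.cpu_count() or 1) * 2
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        processed = sum(pool.map(process_batch, batches))
    elapsed = time.perf_counter() - start
    return {
        "processed": processed,
        "batches": len(batches),
        "latency_s": elapsed,
        "throughput_per_s": processed / elapsed if elapsed > 0 else float("inf"),
    }
```

Calling `run(n)` for each payload size in the list above (10, 100, 1,000, ...) would give one latency and throughput sample per size, which is the shape of data the profiling task asks for.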

ACA handling of packet-in messages from OVS

Framework to use: cbench (https://github.com/mininet/oflops/tree/master/cbench), with ACA acting as the OpenFlow controller
Use the payload generated by cbench; test latency mode first, then throughput mode
Collect latency and throughput metrics
Identify bottlenecks and problem areas
Once on-demand L3 routing rules are implemented, a VM could quickly create many new connections to a new neighbor, which would generate a large number of packet-in messages for ACA to process. We need to confirm that ACA can handle this.
If an ACA slowdown is observed, consider spinning up more threads to handle multiple packet-in messages in parallel
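
The "more threads for packet-in messages" option could look roughly like the worker-queue sketch below. This is an assumption about one possible design, not how ACA handles packet-in messages today: `handle_packet_in` and `serve` are hypothetical names, and the handler is a placeholder.

```python
import queue
import threading


def handle_packet_in(message):
    # Placeholder (assumption): real handling would parse the packet-in
    # message and act on it. Here it only upper-cases a string.
    return message.upper()


def serve(messages, worker_count=4):
    inbox = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            msg = inbox.get()
            if msg is None:
                # Sentinel: no more work for this worker.
                break
            out = handle_packet_in(msg)
            # Keep the critical section to the shared-list append only.
            with results_lock:
                results.append(out)

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for w in workers:
        w.start()
    for m in messages:
        inbox.put(m)
    # One sentinel per worker so every thread exits.
    for _ in workers:
        inbox.put(None)
    for w in workers:
        w.join()
    return results
```

Results arrive in completion order, not arrival order, so a design like this only fits if packet-in messages can be processed independently of each other.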

Other Notes

Can we use a framework like SeaStar to improve the ACA threading model? https://github.com/futurewei-cloud/chogori-seastar-rd

xieus added the P1 (Priority 1) and perf testing (Performance Testing) labels Oct 21, 2020

xieus added this to the Version 1.0.2020.11.30 milestone Oct 21, 2020

xieus assigned er1cthe0ne Oct 21, 2020

er1cthe0ne added the P0 (Priority 0) label and removed the P1 (Priority 1) label Nov 21, 2020

er1cthe0ne mentioned this issue Dec 10, 2020

[Perf] Improve performance with massive L2 neighbor handling futurewei-cloud/alcor-control-agent#176

Merged
