Skip to content
This repository has been archived by the owner on Feb 25, 2022. It is now read-only.

Use shaded dependency on JCTools instead of copy and paste #18

Open
nitsanw opened this issue Mar 1, 2016 · 7 comments
Open

Use shaded dependency on JCTools instead of copy and paste #18

nitsanw opened this issue Mar 1, 2016 · 7 comments

Comments

@nitsanw
Copy link

nitsanw commented Mar 1, 2016

Currently this library uses some queue implementations lifted in spirit and to some extent code out of JCTools. The 2 projects share a contributor, @akarnokd.
I would recommend this project directly shades JCTools and uses the originals. Versioned dependencies allow for clear release boundaries and well understood state of code while also enabling future improvements and corrections to be fed in easily. This is why everyone does it...

@akarnokd
Copy link
Contributor

akarnokd commented Mar 1, 2016

Rsc is a research effort and is thought as a basis for other reactive libraries which can have their own notion of scheduling and queueing. Therefore, Rsc was designed to be parametrizable by the queue implementation through a Supplier in each relevant operator. We did this because Reactor was mainly using Disruptor as its queue-like backing data structure. Thus, if one chooses to drive Rsc with JCTools queues, he/she is free to do so.

I didn't use JCTools directly because:

  • Rsc isn't really meant to be a standalone library at this point, but we have fluent operators for test and benchmark convenience. I use it in comparison benchmarks because I know exactly hot it is implemented and how it should behave (unlike RxJava 1.x which has still some hidden skeletons everywhere).
  • We don't need more than the 2 queue types we currently have. There are some places which could benefit from a growable, array-backed MPSC but we don't have the means to detect what XpYc queues the user supplies.
  • First wanted to explore the non-queue related optimization possibilities the operator-fusion gives
  • We can afford inlining the AtomicReferenceArray whereas JCTools can't really.
  • Our queues are small and relatively short lived (max up to 1M elements transferred), the peculiar fill-drain pattern it receives doesn't seem to benefit from lookahead and seems to be hurt by the excess padding and more complex class hierarchy.

@nitsanw
Copy link
Author

nitsanw commented Mar 1, 2016

The choice is yours. Let me see if I understand the argument:

  • "Rsc was designed to be parametrizable by the queue implementation through a Supplier in each relevant operator. We did this because Reactor was mainly using Disruptor as its queue-like backing data structure. Thus, if one chooses to drive Rsc with JCTools queues, he/she is free to do so." : we are therefore discussing the default implementation source. In my experience most people stick with the default, so I would urge care in choosing it. That people can throw in other impls make no difference. I'm concerned with sloppy dependencies, so people making a clear dependency is not an issue.
  • "...I use it in comparison benchmarks because I know exactly how it is implemented and how it should behave..." - equally, picking any one version impl would work. This is the same point as "First wanted to explore the non-queue related optimization possibilities the operator-fusion gives".
  • "We don't need more than the 2 queue types we currently have." - neither here nor there IMO. The number of classes is immaterial, if it's good enough to copy it's probably worth depending on. I think this is an argument for least effort, it's easier to copy than to fix the build script. We are all more familiar with copy and paste than the fiddly innards of maven and gradle.
  • "We can afford inlining the AtomicReferenceArray whereas JCTools can't really." - I thought you weren't concerned with queue performance, just wanted a known quantity? In the context of a framework I'd argue the Atomic option is for anti-Unsafe environments, where allot of assumptions go out the window anyhow. Where Unsafe is available using the SpscGrowable will likely give better results.

I understand JCTools may not be a good fit if you want to avoid the paddind etc. But the AtomicSpsc is notionally targeting this exact usecase. Why not collaborate to provide a common solution?
This hits on another obejction to C&P dependencies, they make a bad relationship in that they block the consuming library authors from improving the origin.

@akarnokd
Copy link
Contributor

akarnokd commented Mar 1, 2016

I thought you weren't concerned with queue performance, just wanted a known quantity

With our Spsc, it's more about allocation than queue performance at the moment. SpscAtomicArrayQueue allocates 1 parent class, 2 AtomicLong, 1 AtomicReferenceArray and 1 backing array. By inlining the field updaters, Rsc's Spsc allocates 1 parent class and 1 array. Unfortunately, the linked-array spsc can only save on the 2 AtomicLong instances compared to JCTools' version. This allocation amount is dominant for short-lived reactive sequences.

Why not collaborate to provide a common solution?

I don't think you could encode all the configuration option in a single class: to pad index fields or not, pad elements or not, use lookahead or not, support exact capacity or not. I consider both Rsc and JCTools as a template where one can extract and customize them to his/her liking.

I'll allocate some time and do a comparative benchmark of the queues in our dataflow context.

@nitsanw
Copy link
Author

nitsanw commented Mar 1, 2016

AFU change can be implemented for JCTools, might be worth it. I'm similarly open to extending the AtomicRefArray (you objected last we discussed it, but I actually have no issue with it. If people down cast from Queue or use the other methods directly they are shooting their own foot.)

"..I don't think you could encode all the configuration option in a single class.." This is something I mean to address using bytecode generation.

@akarnokd
Copy link
Contributor

akarnokd commented Mar 1, 2016

I've run a comparison benchmark and got interesting results.

image

with errors:

image

I expected unsafe to be better, but I'm surprised how the growable and unbounded variants do even better in some cases.

Also, Rsc's spsc queue has around 5-30% more overhead; could be the index trashing or the field updater overhead.

image

@akarnokd
Copy link
Contributor

akarnokd commented Mar 1, 2016

It seems the main source of lost throughput was the index trashing:

image

The rsc column is the original plain SpscArrayQueue, the rsc1 had just item padding and the rsc2 has index padding but no item padding.

Comparing against the unsafe version:

image

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

No branches or pull requests

3 participants