Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8274771: Map, FlatMap and OrElse fluent bindings for ObservableValue #675

Closed
wants to merge 27 commits into from

Conversation

hjohn
Copy link
Collaborator

@hjohn hjohn commented Nov 18, 2021

This is an implementation of the proposal in https://bugs.openjdk.java.net/browse/JDK-8274771 that me and Nir Lisker (@nlisker) have been working on. It's a complete implementation including good test coverage.

This was based on #434 but with a smaller API footprint. Compared to the PoC this is lacking public API for subscriptions, and does not include orElseGet or the conditionOn conditional mapping.

Flexible Mappings
Map the contents of a property any way you like with map, or map nested properties with flatMap.

Lazy
The bindings created are lazy, which means they are always invalid when not themselves observed. This allows for easier garbage collection (once the last observer is removed, a chain of bindings will stop observing their parents) and less listener management when dealing with nested properties. Furthermore, this allows inclusion of such bindings in classes such as Node without listeners being created when the binding itself is not used (this would allow for the inclusion of a treeShowingProperty in Node without creating excessive listeners, see this fix I did in an earlier PR: #185)

Null Safe
The map and flatMap methods are skipped, similar to java.util.Optional when the value they would be mapping is null. This makes mapping nested properties with flatMap trivial as the null case does not need to be taken into account in a chain like this: node.sceneProperty().flatMap(Scene::windowProperty).flatMap(Window::showingProperty). Instead a default can be provided with orElse.

Some examples:

void mapProperty() {
  // Standard JavaFX:
  label.textProperty().bind(Bindings.createStringBinding(() -> text.getValueSafe().toUpperCase(), text));

  // Fluent: much more compact, no need to handle null
  label.textProperty().bind(text.map(String::toUpperCase));
}

void calculateCharactersLeft() {
  // Standard JavaFX:
  label.textProperty().bind(text.length().negate().add(100).asString().concat(" characters left"));

  // Fluent: slightly more compact and more clear (no negate needed)
  label.textProperty().bind(text.orElse("").map(v -> 100 - v.length() + " characters left"));
}

void mapNestedValue() {
  // Standard JavaFX:
  label.textProperty().bind(Bindings.createStringBinding(
    () -> employee.get() == null ? ""
        : employee.get().getCompany() == null ? ""
        : employee.get().getCompany().getName(),
    employee
  ));

  // Standard JavaFX + Optional:
  label.textProperty().bind(Bindings.createStringBinding(
      () -> Optinal.ofNullable(employee.get())
          .map(Employee::getCompany)
          .map(Company::getName)
          .orElse(""),
     employee
  ));

  // Fluent: no need to handle nulls everywhere
  label.textProperty().bind(
    employee.map(Employee::getCompany)
            .map(Company::getName)
            .orElse("")
  );
}

void mapNestedProperty() {
  // Standard JavaFX:
  label.textProperty().bind(
    Bindings.when(Bindings.selectBoolean(label.sceneProperty(), "window", "showing"))
      .then("Visible")
      .otherwise("Not Visible")
  );

  // Fluent: type safe
  label.textProperty().bind(label.sceneProperty()
    .flatMap(Scene::windowProperty)
    .flatMap(Window::showingProperty)
    .orElse(false)
    .map(showing -> showing ? "Visible" : "Not Visible")
  );
}

Note that this is based on ideas in ReactFX and my own experiments in https://github.com/hjohn/hs.jfx.eventstream. I've come to the conclusion that this is much better directly integrated into JavaFX, and I'm hoping this proof of concept will be able to move such an effort forward.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (3 reviews required, with at least 1 Reviewer, 2 Authors)
  • Change requires a CSR request to be approved

Issues

  • JDK-8274771: Map, FlatMap and OrElse fluent bindings for ObservableValue
  • JDK-8277456: Map, FlatMap and OrElse fluent bindings for ObservableValue (CSR)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jfx pull/675/head:pull/675
$ git checkout pull/675

Update a local copy of the PR:
$ git checkout pull/675
$ git pull https://git.openjdk.org/jfx pull/675/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 675

View PR using the GUI difftool:
$ git pr show -t 675

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jfx/pull/675.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Nov 18, 2021

👋 Welcome back jhendrikx! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Ready for review label Nov 18, 2021
@mlbridge
Copy link

mlbridge bot commented Nov 18, 2021

@kevinrushforth
Copy link
Member

This will need an API review followed by an implementation review.

/csr
/reviewers 3

@openjdk openjdk bot added the csr Need approved CSR to integrate pull request label Nov 18, 2021
@openjdk
Copy link

openjdk bot commented Nov 18, 2021

@kevinrushforth has indicated that a compatibility and specification (CSR) request is needed for this pull request.
@hjohn please create a CSR request for issue JDK-8274771. This pull request cannot be integrated until the CSR request is approved.

@openjdk
Copy link

openjdk bot commented Nov 18, 2021

@kevinrushforth
The number of required reviews for this PR is now set to 3 (with at least 1 of role reviewers).

@hjohn
Copy link
Collaborator Author

hjohn commented Nov 19, 2021

@kevinrushforth I've created the CSR here: https://bugs.openjdk.java.net/browse/JDK-8277456 -- it is my first CSR, if you any feedback on it I'd appreciate it.

@hjohn
Copy link
Collaborator Author

hjohn commented Dec 15, 2021

I've rebased this PR on the current master to pull in the JUnit 5 upgrade, and I've updated the tests in this PR to use JUnit 5.

@nlisker
Copy link
Collaborator

nlisker commented Dec 30, 2021

I will start my review this week.

Copy link
Collaborator

@nlisker nlisker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the review of the API. I suggested also adding the examples in the POC or similar to the relevant methods.

I have already commented on the implementation while we were discussing it in the mailing list, so that review will be minimal.

I will review the tests soon, but after a quick look they seem fine.

Don't worry about the CSR for now. When all reviewers agree on the API you can copy the final result to the CSR.

Unrelated to the review, will it makes sense in the future to make all bindings lazily register listeners like LazyObjectBinding, perhaps when we introduce Subscription?

@hjohn
Copy link
Collaborator Author

hjohn commented Jan 5, 2022

Unrelated to the review, will it makes sense in the future to make all bindings lazily register listeners like LazyObjectBinding, perhaps when we introduce Subscription?

That would need to be very well tested. There are some noticeable differences in behavior vs the standard bindings:

  1. Standard bindings can more easily be GC'd (accidentally usually, but it could be intentional), take for example:
     textProperty.addListener((obs, old, current) -> System.out.println(current));
     textProperty.concat("A").addListener((obs, old, current) -> System.out.println(current));

These behave very different. The first listener keeps working as you'd expect, while the second one can stop working as soon the GC runs. This is because concat results in an ObservableValue that is weakly bound. Compare this to:

 textProperty.map(x -> x + "A").addListener((obs, old, current) -> System.out.println(current));

This keeps working and will not be GC'd by accident.

  1. Standard bindings will always cache values. This means that when getValue is called, it will just return the cached value instead of calling computeValue. If computeValue is really expensive (unwise since this happens on the FX thread) then this cost is paid each time for Lazy bindings, at least when they're not bound to anything else (unobserved) and you are just using getValue. Furthermore, for a chain of Lazy bindings that is unobserved, this would propagate through the entire chain. As soon as they're observed though, they become non-lazy and values are cached.

In effect, Lazy bindings behave the same as standard bindings when they're observed but their behavior changes when not observed: they never become valid and they stop caching values

This has pros and cons:

Pro: Lazy bindings can be garbage collected when not referenced and not actively being used without the use of weak listeners. See the first example where the binding keeps working when used by println lambda. This is in contrast to traditional bindings which can be garbage collected when unreferenced by user code even if actively used(!!). This is a huge gotcha and one of the biggest reasons to use the lazy model.

Pro: Lazy bindings don't register unnecessary listeners to be aware of changed or invalidated values that are not used by anyone. A good example is the problems we saw about a year ago where Node had created an isShowing property which bounds to its Scene and Window. This looks harmless, until you realize that a listener is created on these properties for each Node in existence. If a Scene has tens of thousands of Nodes then that means that the Scene#windowProperty will have a listener list containing tens of thousands of entries. This list is not only expensive to change (when a node is added or removed) but also expensive to act on (when the scene, window or its showing state changes). And for what? Almost nobody was actually using this property, yet listeners had to be added for each and every Node. In effect with lazy bindings, it is much cheaper now to create properties that create convenient calculated values which will only register listeners or compute their values when in actual use.

Con: lazy bindings never become valid when not observed. This means that getValue will always have to recompute the value as the value is not cached. However, if you register an invalidation listener the binding becomes observed and it will start behaving like a traditional binding sending invalidation events and caching the current value. A small "pro" here could be that this means that intermediate values in a long binding chain don't take up memory (and are prevented from GC), however that goes away as soon as the binding is observed.

In summary: I think lazy bindings are far superior in the experience that they offer, but it does come at a cost that values may need to be recomputed every time when the bindings are unobserved. However, doing substantial work in computeValue is probably unwise anyway as it blocks the FX thread so making lazy binding the default behavior in well behaving code could be of only minor impact.

Comment on lines 150 to 152
* @return an {@link ObservableValue} which provides a mapping of the value
* held by this {@code ObservableValue}, and provides {@code null} when
* this {@code ObservableValue} holds {@code null}, never null
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea to mention that it can hold null.
I slightly prefer to say that the returned ObservableValue holds the result of the mapping rather than holds the mapping. I don't really mind it, but it's the phrasing used in the method description "holds the result of applying a mapping". "The mapping" itself could be mistaken for the mapping Function in my opinion. If you think it's clear, you can change it to that phrasing, it's also fine.

Copy link
Collaborator

@nlisker nlisker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good. I added some minor grammar comments. I think that the API is in a good spot.

Copy link
Collaborator

@nlisker nlisker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API looks good to me. After you get the other reviewers to look at it you can update the CSR.

As for the fluent binding tests:

  • The private fields in ObservableValueFluentBindingsTest can be final.
  • Most tests that I have seen in JavaFX use the assert overloads that include a message that explains what the value should be or what it means if the assertion failed. I don't know how much of a requirement it is. I can help write these if you want.
  • There is a utility class JMemoryBuddy that was added to JavaFX to test for memory leaks. This would have been helpful in the GC tests, but it seems that the class is not suited too well for this case (it requires you to null you own reference etc.). I think the way you did it is fine, but maybe that class should be updated (in a different patch) to be more welcoming for these checks.

I would also add tests of chaining the observables in common use cases. I wrote some myself to test it for map:

ObservableValue<String> observableValue1 = property.map(v -> v + "Z");
ObservableValue<String> observableValue2 = observableValue1.map(v -> v + "X");
            
assertEquals("AZ", observableValue1.getValue());
assertEquals("AZX", observableValue2.getValue());
            
property.set("B");
            
assertEquals("BZ", observableValue1.getValue());
assertEquals("BZX", observableValue2.getValue());

for orElse:

ObservableValue<String> observableValue1 = property.map(v -> v + "Z");
ObservableValue<String> observableValue2 = observableValue1.orElse("K");
            
assertEquals("AZ", observableValue1.getValue());
assertEquals("AZ", observableValue2.getValue());
            
property.set("B");
            
assertEquals("BZ", observableValue1.getValue());
assertEquals("BZ", observableValue2.getValue());

property.set(null);
            
assertNull(observableValue1.getValue());
assertEquals("K", observableValue2.getValue());

You can clean these up and use them or write your own. flatMap should also have one. I can clean mine up and post it if it helps (I did some dirty testing there).

Copy link
Collaborator

@nlisker nlisker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The LazyObjectBindingTest tests look good. I was thinking about adding a test that checks the compyteValue calls when a listener is attached (should increase) and when not attached (should not increase), but I think this behavior is already tested in ObjectBinding itself, so I'll leave it to you.

Rewrote and extended ObservableFluentBindingsTest for more clarity
Use JMemoryBuddy in ReferenceAsserts helper
@hjohn
Copy link
Collaborator Author

hjohn commented Jan 27, 2022

  • Most tests that I have seen in JavaFX use the assert overloads that include a message that explains what the value should be or what it means if the assertion failed. I don't know how much of a requirement it is. I can help write these if you want.

I'm a bit at a loss there, many of the asserts are think are self explanatory already, and a text describing the events that led to this assert would basically describe the same as the path of all @Nested classes + method name. For example:

  • ObservableFluentBindingsTest
    • When_map_Called
      • WithNotNullReturns_ObservableValue_Which
        • When_orElse_CalledReturns_ObservableValue_Which
          • WhenObserved
            • ShouldApplyMapThenOrElseOperation

I've however cleaned up the entire test using different values that more clearly show what is happening (like Left after map becomes Left+map and after a second map it becomes Left+map+map2). For orElse I use Empty now to make it more clear. I also added assertObserved and assertNothingIsObserved methods to get rid of the ugly checking of the values list to see what the test listener was observing.

Please let me know if that helps.

  • There is a utility class JMemoryBuddy that was added to JavaFX to test for memory leaks. This would have been helpful in the GC tests, but it seems that the class is not suited too well for this case (it requires you to null you own reference etc.). I think the way you did it is fine, but maybe that class should be updated (in a different patch) to be more welcoming for these checks.

I've applied it in the helper class I was using.

I would also add tests of chaining the observables in common use cases. I wrote some myself to test it for map:

I added additional nested classes for Map which adds another nesting that applies another Map or an OrElse. Same for FlatMap.

I see the fluent binding test mostly as a test to ensure the basic functionality and to ensure things are getting garbage collected when they should and not when they shouldn't. If we want to test more specific cases without having to worry about GC, perhaps another test class is better suited.

You can clean these up and use them or write your own. flatMap should also have one. I can clean mine up and post it if it helps (I did some dirty testing there).

That's good, I hope you didn't find anything surprising :)

Copy link
Collaborator

@nlisker nlisker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests look good. I'm happy with the coverage and added one comment about a missing case. My own sanity checks also work as I expect.

Will approve when these are addressed as I have already reviewed the API and implementation.

Copy link
Collaborator

@nlisker nlisker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good. I left a few minor optional comments after doing a quick re-review of everything. You can wait until the other reviews with these if you want.

@hjohn
Copy link
Collaborator Author

hjohn commented Jul 6, 2022

Thanks @hjohn for the detailed explanation. However, I'm not convinced that this is the right model. Many applications will use a layered architecture where a short-lived property is bound to a long-lived property:

var label = new Label(); // scoped to the lifetime of a dialog window
var caption = model.captionProperty(); // scoped to the lifetime of the application

label.textProperty().bind(caption);

This works fine, and doesn't require users to think about unbinding the property before it goes out of scope. I would argue that this is a fundamental feature of the JavaFX property system which makes it easy to reason about code.

I'm not entirely in agreement here, but also not entirely disagreeing. I mean, if there was a way to make this work perfectly, then I agree that this "automatic" unsubscribing is a great mechanism. However, because you are relying on the GC the code is actually subtly broken right from the start -- you just don't notice it so easily:

  1. Listeners which are considered to be "gone" (closed the UI) may still be working in the background listening to long lived models, which can trigger all kinds of things. For example, I've seen a UI bound to a long lived model that would store a "last selected" item when the selection changes. When that UI is closed and reopened you can sometimes see this storage happening twice, or three or four times, because the old listeners weren't GC'd yet. Explicit management is required to avoid such problems.

  2. Code that may seem to work fine during a quick testing session may stop working when exercised a bit longer and a GC occurs. This is quite a common thing that JavaFX newcomers encounter. They're playing around using the expression mechanism, code seems to work, but after using a UI for a while, it suddenly stops updating something. They forgot to make a hard reference somewhere.

  3. The issues I highlighted in my earlier post. Why is there a difference in how listeners work when it is a property vs an expression? Why does it work differently when using a binding? Replacing a binding with a listener and vice versa can have surprising consequences.

The GC issues are some of the hardest to debug, as the problem may appear or disappear (depending on the type) at any time during debugging if the GC kicks in. Bug reports from users are going to be even harder to analyze as they're basically going to be describing something that occurs infrequently and without any pattern to it, and you can't reproduce it. You'll have to ask things like the exact GC used, any GC parameters, all things that are normally completely irrelevant but now may have serious repercussions. Conservative GC's may not even be suitable to use in combination with JavaFX (when not using explicit unbinding/unsubscribing). With a regular memory leak I can make a heap dump and see that a piece of UI has 500 copies in memory. While not fun, finding a GC root then is often sufficient to find the offending reference and solve the problem.

Any alternative to this model involves some form of explicit lifetime management (conditionOn, Subscription, etc). However, as we can see with listeners (that need to be manually added and removed), explicit lifetime management is one of the most abundant sources of bugs.

I don't disagree that is a big source of bugs, however the trade off made introduces a new category of bugs that is fleeting and an order of magnitude harder to reproduce and reason about.

Coming back to my first example, what I would like to see is the following code work as expected:

var label = new Label(); // scoped to the lifetime of a dialog window
var caption = model.captionProperty(); // scoped to the lifetime of the application

// I shouldn't have to worry about this code:
label.textProperty().bind(caption.map(String::toUpperCase));

However, not only does this code leak memory, it also makes it far too easy to leak memory. The API looks and feels natural and easy to use, but hides a bug that will not be visible to most users. I would argue that the potential for bugs is on the same order of magnitude as with explicitly added listeners, but much more subtle as it's hidden in a nice and fluent API.

Just to make sure we are on same page, it leaks a bit more memory than it would if it was using concat or a regular property, as the listener stub is leaked + the mapping itself. The label is not leaked. The leak "fixes" itself (in both cases) once caption is updated: the left over listener stub which label added gets removed, which in turn causes an unsubscribe on the mapping, which in turn unsubscribes itself from caption. The mapping is then eligible for GC.

Here's my conclusion: I think the "reachability" model is superior to manual and explicit lifetime management, and I think we should make it work before integrating this PR. As currently proposed, the new map and flatMap APIs have the potential to cause lots of bugs, or force users to use weird constructs to prevent memory leaks.

I think something might be possible. I personally however find that for anything but the most trivial FX apps, explicit management (although not for the case you highlighted above) is preferable to avoid hard to pin down bugs related to untimely GC's. Explicit management could be far simpler than it currently is, with a little bit of help from JavaFX.

An isShowing property on Node in combination with conditionOn (since while is unfortunately a keyword) would make binding/listeners between properties with different lifecycles a lot simpler:

  label.textProperty().bind(caption.conditionOn(label::isShowing));  // the mapping is optional, this already has value

Note that if caption never changes, just binding it regularly will leave stubs behind with the new and old mechanism. Not a big memory leak, but it may add up if the binding is in something like a frequently occurring dialog or progress indicator or something.

@hjohn
Copy link
Collaborator Author

hjohn commented Jul 6, 2022

Let's say we introduce a marker interface that represents a thread-safety guarantee with regards to addListener and removeListener:

interface ConcurrentListenerAccess {}

Now we can use the existing Disposer mechanism to automatically schedule listeners to be removed from ConcurrentListenerAccess implementations when the bound property is eligible for collection.

It's sufficient to implement ConcurrentListenerAccess for LazyObjectBinding to make map and flatMap work the way I'd expect it to work.

You mean by Disposer something like com.sun.prism.impl.Disposer? I see that it has a ReferenceQueue in it, so I'm guessing that's the mechanism you are referring to. How would you want this triggered? I think a thread blocking on the queue might be best.

I think this is a very nice idea to potentially tackle this problem.

@hjohn
Copy link
Collaborator Author

hjohn commented Jul 6, 2022

@hjohn Thanks for your detailed analysis of the bindings GC situation. The conclusion I take from all of this is that the current implementation is fine, but we might consider a follow-up issue to clean up dead listener stubs.

From @mstr2 analysis, it is slightly more than just the stub in the case of the fluent bindings. See illustration below, assuming caption is the only hard reference, the dashed boxes represent things that are eligible for GC, dashed lines represent weak references. If caption is modified, triggering a clean-up, then everything is eligible for GC aside from caption:

image

It might be good to fix this right away.

@mstr2
Copy link
Collaborator

mstr2 commented Jul 7, 2022

I'm not entirely in agreement here, but also not entirely disagreeing. I mean, if there was a way to make this work perfectly, then I agree that this "automatic" unsubscribing is a great mechanism. However, because you are relying on the GC the code is actually subtly broken right from the start -- you just don't notice it so easily:

  1. [...]
  2. [...]
  3. The issues I highlighted in my earlier post. Why is there a difference in how listeners work when it is a property vs an expression? Why does it work differently when using a binding? Replacing a binding with a listener and vice versa can have surprising consequences.

I think a model with the following properties would be more consistent and easier to understand than that we currently have:

  1. An ObservableValue is eligible for collection if it is not reachable from user code, independent of whether it has listeners.
  2. A derived ObservableValue keeps its base value alive, i.e. the base value is reachable through the derived value.

The second property ensures that it will be sufficient to keep a reference to the last ObservableValue in an expressions like caption.map(String::toUpperCase).map(String::toLowerCase() without losing any of the intermediate values.

It is true that manually added listeners may suffer from unexpected collection, but something has to give. It seems like a reasonable trade-off to make listeners more complicated than binding expressions (which also hopefully discourages the use of manually added listeners). I would also argue that it is easier to explain that it's not the listener that's different in properties v. expressions, it's the reachability of the ObservableValue that's different.

Just to make sure we are on same page, it leaks a bit more memory than it would if it was using concat or a regular property, as the listener stub is leaked + the mapping itself. The label is not leaked. The leak "fixes" itself (in both cases) once caption is updated: the left over listener stub which label added gets removed, which in turn causes an unsubscribe on the mapping, which in turn unsubscribes itself from caption. The mapping is then eligible for GC.

Thanks for this clarification, I wasn't aware that the leaked listener is actually removed when the source property is changed. That makes the issue a little bit less concerning, but I'd still prefer a solution that doesn't require the source property to change.

@mstr2
Copy link
Collaborator

mstr2 commented Jul 7, 2022

You mean by Disposer something like com.sun.prism.impl.Disposer? I see that it has a ReferenceQueue in it, so I'm guessing that's the mechanism you are referring to. How would you want this triggered? I think a thread blocking on the queue might be best.

I think this is a very nice idea to potentially tackle this problem.

We already have com.sun.javafx.property.adapter.Disposer, which polls a ReferenceQueue on a separate thread. It's used for JavaBean properties.

@nlisker
Copy link
Collaborator

nlisker commented Jul 7, 2022

I didn't get into the fine details of the GC discussion yet, but I have a few points I would like to share:

  1. @hjohn's example 4, which seems to be a point of some disagreement, is a de facto issue. Even if the technicalities or semantics imply that this is correct behavior, StackOverflow is littered with "why did my listener suddenly stopped working?" questions that are caused by this, and I have fell into this pitfall more than once myself (by starting from something like example 2, then making a slight change that transforms the code into something like example 4).
    I didn't look yet into the possible alternatives discussed here for dealing with this, but I do believe that going with "this behavior is technically correct" is not the best practical decision, provided that changing it will be backwards compatible.
  2. On the subject of Disposer and cleaning references in a background thread, the JDK has Cleaner (there is a nice recent Inside Java post about it), which seems to do what we need, at least abstractly. I didn't look at the code to see if it fits our needs exactly. It also uses a ReferenceQueue with PhantomReferences.
  3. It is clear to me that whatever GC/referencing model we choose to go with eventually (with whatever combination of fluent vs. regular bindings, listeners vs. bindings, expressions vs. properties...), we will have to document what is expected of the user, and we will have to do it very well. Currently, there are some mentions in the docs of listeners holding strong references, and weak references being used in the implementation ("user beware"), but we will need to be a lot more explicit in my opinion. Something like @hjohn's set of examples or some snippets from this discussion would be a good starting point. If we don't want to commit to some detail, we should at least document the current behavior with a note that it might change. It's just too easy to mess this up (even before this PR was considered). I don't mind taking care of such a PR when an implementation is agreed upon.

@hjohn
Copy link
Collaborator Author

hjohn commented Jul 7, 2022

Going a step further, I think the idea of actively disposing dead listeners can help us solve this problem. But there's a catch: we can't just assume that we can safely call removeListener from a disposer thread (nor from the JavaFX application thread, for that matter).

There is no requirement that an ObservableValue implementation be safe to use with regards to concurrent reads and writes. In order to remove a listener on an arbitrary thread, we need to introduce a stronger thread-safety guarantee than we currently have. We can't do that in general, as there are countless ObservableValue implementations that would be broken after this change. But we can do it for mapped bindings (or any built-in binding implementations).

Now we can use the existing Disposer mechanism to automatically schedule listeners to be removed from ConcurrentListenerAccess implementations when the bound property is eligible for collection.

I've done some experimenting using a Cleaner suggested by @nlisker; I changed the private WeakListener class inside of StringPropertyBase and then added synchronized to all its addListener and removeListener methods:

private static class Listener implements InvalidationListener, WeakListener {
    private static final Cleaner CLEANER = Cleaner.create();

    private final WeakReference<StringPropertyBase> wref;
    private final Cleanable cleanable;

    public Listener(StringPropertyBase ref, Observable observable) {
        this.wref = new WeakReference<>(ref);
        
        Cleanable c = () -> observable.removeListener(this);

        this.cleanable = observable instanceof ConcurrentListenerAccess ? CLEANER.register(ref, c::clean) : c;
    }

    @Override
    public void invalidated(Observable observable) {
        StringPropertyBase ref = wref.get();
        if (ref == null) {
            cleanable.clean();
        } else {
            ref.markInvalid();
        }
    }

    @Override
    public boolean wasGarbageCollected() {
        return wref.get() == null;
    }
}

Doing 100.000 throwaway binds on a single property, this mechanism manages to keep up quite well using standard GC settings. Within seconds, all 100.000 stubs have been cleaned up (note: 100.000 is already an insane amount). If I go higher, the cleaning mechanism can't quite keep up, but that's primarily because the operation becomes quite expensive due to the large listener list involved (cleaning doesn't start immediately, and if the list of listeners managed by ExpressionHelper has grown close to a million items, removing items one by one is quite slow).

It's sufficient to implement ConcurrentListenerAccess for LazyObjectBinding to make map and flatMap work the way I'd expect it to work.

From what I can see, we'd need to change all locations where bind is implemented using a weak listener, and add a Cleaner. Those same classes need their add/remove listener methods synchronized (putting synchronized on the static helper methods in ExpressionListener doesn't look like a good idea as it would block listener changes to unrelated properties).

If there is agreement on this approach, I could create a separate MR for this (since it could be implemented and merged before this one) or integrate the changes into this one.

@kevinrushforth
Copy link
Member

If there is agreement on this approach, I could create a separate MR for this (since it could be implemented and merged before this one) or integrate the changes into this one.

I need to read through the recent GC discussion one more time, but I like the approach of using a Cleaner to get rid of stale bindings that would otherwise be released when the base property changes. I think the cleanup could be done as a follow-on fix, in which case it wouldn't block getting this in. The cleanup should also go into JavaFX 19, but that could go in after RDP1 (whereas this feature cannot).

@mstr2
Copy link
Collaborator

mstr2 commented Jul 8, 2022

I was concerned that the current implementation might leave behind a stale listener that is never collected. If that was the case, I would not be in favor of integrating this PR.

But as it turns out, the stale listener is cleared when the source property changes, which doesn't make it as serious an issue as I originally thought. I see no problem with integrating this PR now, and fix the cleanup issue later.

@hjohn
Copy link
Collaborator Author

hjohn commented Jul 8, 2022

Remember this program? :)

// Sample program that shows stub leakage:
public static void main(String[] args) throws InterruptedException {
  StringProperty x = new SimpleStringProperty();

  // Startup
  Thread.sleep(20000);

  for(int i = 0; i < 10000000; i++) {
    new SimpleStringProperty().bind(x);
  }

  // Bindings added
  System.gc();  // 400 MB of uncollectable stubs does not get GC'd
  Thread.sleep(20000);

  // Triggers "stub" clean-up (which fails miserably because of lots of array copying)
  x.set("A");

  System.gc();
  Thread.sleep(20000);
}

image

I modified ExpressionHelper to use a custom Collection class that has O(log N) performance for the only things it uses (add at end, remove a specific listener, iteration). It has the exact same characteristics as ArrayList, that is, it is ordered, it allows duplicates and it maintains the exact positions when duplicates are part of the list, which means the call order of listeners is exactly the same (a LinkedHashMap<Listener, counter> would not have this characteristic). Furthermore, it is very similar to an ArrayList, which means relatively low overhead (there is no Entry wrapper for each item in the "list") and good cache locality when iterating.

Performance looks now like this:

image

The price for this is increased memory use (you can see that having 10.000.000 binding stubs took around 400 MB before, and now it takes about twice that). The extra memory use is only paid when there is more than 1 listener (as per how ExpressionHelper operates).

@kevinrushforth
Copy link
Member

@hjohn If there are no more questions or concerns raised in the next few hours, you can integrate this PR. Please file the follow-up JBS bug to address the cleanup.

@kevinrushforth
Copy link
Member

Additionally, can you file a docs bug to clarify the GC behavior of bindings and mappings as suggested by @nlisker in point 3 of this comment?

@nlisker
Copy link
Collaborator

nlisker commented Jul 8, 2022

Additionally, can you file a docs bug to clarify the GC behavior of bindings and mappings as suggested by @nlisker in point 3 of this comment?

@hjohn If the followup work on this issue is too much and you want to focus on the implementation, you can assign it to me.

@mstr2
Copy link
Collaborator

mstr2 commented Jul 8, 2022

I've done some experimenting using a Cleaner suggested by @nlisker;
[...]
From what I can see, we'd need to change all locations where bind is implemented using a weak listener, and add a Cleaner.

Yes, this must be implemented for all *PropertyBase classes. We can save a few bytes for non-ConcurrentListenerAccess observables with an implementation like this:

private static class Listener implements InvalidationListener, WeakListener {
    private final WeakReference<StringPropertyBase> wref;
    private final Cleaner.Cleanable cleanable;

    public Listener(StringPropertyBase ref, Observable observable) {
        this.wref = new WeakReference<>(ref);

        if (observable instanceof ConcurrentListenerAccess) {
            cleanable = CLEANER.register(ref, this::removeListener);
        } else {
            cleanable = null;
        }
    }

    // Note: dispose() must be called in StringPropertyBase.unbind()
    public void dispose() {
        if (cleanable != null) {
            cleanable.clean();
        } else {
            removeListener();
        }
    }

    private void removeListener() {
        StringPropertyBase ref = wref.get();
        if (ref != null) {
            ref.observable.removeListener(this);
        }
    }

    @Override
    public void invalidated(Observable observable) {
        StringPropertyBase ref = wref.get();
        if (ref == null) {
            dispose();
        } else {
            ref.markInvalid();
        }
    }

    @Override
    public boolean wasGarbageCollected() {
        return wref.get() == null;
    }
}

Those same classes need their add/remove listener methods synchronized (putting synchronized on the static helper methods in ExpressionListener doesn't look like a good idea as it would block listener changes to unrelated properties).

Not sure I'm following here. Do you want to implement this pattern for all property implementations? If we just want to implement it for mapped bindings, only the LazyObjectBinding class needs to be thread-safe.

Note that it's not enough for the addListener and removeListener methods to be synchronized. All reads and writes must be protected, which (in the case of LazyObjectBinding) includes the invalidate and isObserved methods.

But if we do that, the lock tax must be paid on every single access to the ExpressionHelper field (because ExpressionHelper is not thread-safe). Locking any and all usages of ExpressionHelper can have performance implications that we need to evaluate carefully.

However, I think we might get away with an implementation that only requires locking for the addListener/removeListener methods, but (crucially) not for fireValueChangedEvent:

private volatile ConcurrentExpressionHelper<T> helper = null;

@Override
public synchronized void addListener(InvalidationListener listener) {
    helper = ConcurrentExpressionHelper.addListener(helper, this, listener);
}

@Override
public synchronized void removeListener(InvalidationListener listener) {
    helper = ConcurrentExpressionHelper.removeListener(helper, listener);
}

@Override
public synchronized void addListener(ChangeListener<? super T> listener) {
    helper = ConcurrentExpressionHelper.addListener(helper, this, listener);
}

@Override
public synchronized void removeListener(ChangeListener<? super T> listener) {
    helper = ConcurrentExpressionHelper.removeListener(helper, listener);
}

protected void fireValueChangedEvent() {
    ConcurrentExpressionHelper.fireValueChangedEvent(helper);
}

This implementation works by creating immutable specializations of ConcurrentExpressionHelper. When a listener is added, a new ConcurrentExpressionHelper instance is created, which doesn't interfere with an existing instance that is currently in use by ConcurrentExpressionHelper.fireValueChangedEvent. TheConcurrentExpressionHelper.Generic specialization is mutable, but uses the lock-free ConcurrentLinkedQueue to store its listeners.

ConcurrentExpressionHelper avoids locking the (probably most frequently invoked) fireValueChangedEvent method, but sacrifices cache locality when a large number of listeners is added. We should probably benchmark all of these proposals before deciding on a solution.

@hjohn
Copy link
Collaborator Author

hjohn commented Jul 8, 2022

/integrate

@openjdk openjdk bot added the sponsor Ready to sponsor label Jul 8, 2022
@openjdk
Copy link

openjdk bot commented Jul 8, 2022

@hjohn
Your change (at version d66f2ba) is now ready to be sponsored by a Committer.

@kevinrushforth
Copy link
Member

/sponsor

@openjdk
Copy link

openjdk bot commented Jul 8, 2022

Going to push as commit 60c75b8.
Since your change was applied there have been 9 commits pushed to the master branch:

  • 178d898: 8276056: Control.skin.setSkin(Skin) fails to call dispose() on discarded Skin
  • fc6a602: 8289587: IllegalArgumentException: Color.rgb's red parameter (-16776961) expects color values 0-255
  • 4dcd9e0: 8289171: Blank final field 'dialog' may not have been initialized in scene.control.Dialog:521
  • 28b8220: 8279297: Remove Shape::setMode method
  • 77ecfb7: 8289390: Fix warnings: type parameter E is hiding the type E
  • 704baa3: 8289396: Fix warnings: Null pointer access: The variable instance can only be null at this location
  • b9f3907: 8289751: Multiple unit test failures after JDK-8251483
  • 222b2b1: 8251483: TableCell: NPE on modifying item's list
  • b3eca1f: 8283346: Optimize observable ArrayList creation in FXCollections

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jul 8, 2022
@openjdk openjdk bot closed this Jul 8, 2022
@openjdk openjdk bot removed ready Ready to be integrated rfr Ready for review sponsor Ready to sponsor labels Jul 8, 2022
@openjdk
Copy link

openjdk bot commented Jul 8, 2022

@kevinrushforth @hjohn Pushed as commit 60c75b8.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@hjohn
Copy link
Collaborator Author

hjohn commented Jul 8, 2022

Not sure I'm following here. Do you want to implement this pattern for all property implementations? If we just want to implement it for mapped bindings, only the LazyObjectBinding class needs to be thread-safe.

Yes, true, only making it thread-safe for that class should remove a chain of fluent bindings. I think however that it would be good to implement this for as many classes as we can as the stub cleaning is normally only triggered on invalidation/changes (and as I recently discovered, when ExpressionHelper resizes its backing list).

Note that it's not enough for the addListener and removeListener methods to be synchronized. All reads and writes must be protected, which (in the case of LazyObjectBinding) includes the invalidate and isObserved methods.

Yes, true, I only fixed synchronization issues in my experiment and didn't look much further than that yet.

But if we do that, the lock tax must be paid on every single access to the ExpressionHelper field (because ExpressionHelper is not thread-safe). Locking any and all usages of ExpressionHelper can have performance implications that we need to evaluate carefully.

Yeah, we shouldn't do that, it synchronizes all accesses to all property lists everywhere, it would be an easy "fix" as it's in one place, but it doesn't feel right.

However, I think we might get away with an implementation that only requires locking for the addListener/removeListener methods, but (crucially) not for fireValueChangedEvent:

This implementation works by creating immutable specializations of ConcurrentExpressionHelper. When a listener is added, a new ConcurrentExpressionHelper instance is created, which doesn't interfere with an existing instance that is currently in use by ConcurrentExpressionHelper.fireValueChangedEvent. TheConcurrentExpressionHelper.Generic specialization is mutable, but uses the lock-free ConcurrentLinkedQueue to store its listeners.

I see you have been playing with improving ExpressionHelper as well :) I noticed there is a bug in the current one, where a copy of the list is made each time when locked is true, even when the list was already copied for the "current" lock. I ran into this problem right after I fixed the performance issue (it wasn't obvious before as it was already very slow, but basically an extra copy is happening in some circumstances on top of the array copying that is done to remove an entry).

The situation where locked is true doesn't happen that often for normal code, but it happens specifically in the case when a call to invalidated is cleaning up dead stubs...

ConcurrentExpressionHelper avoids locking the (probably most frequently invoked) fireValueChangedEvent method, but sacrifices cache locality when a large number of listeners is added. We should probably benchmark all of these proposals before deciding on a solution.

Yes, I think that's a good idea, I considered using ConcurrentLinkedQueue as well since we don't need a structure with index based access, but I couldn't find one with good O performance for remove(T) that wouldn't subtly change the current semantics. FWIW, here's the quick Collection implementation I whipped up: https://gist.github.com/hjohn/8fee1e5d1a9eacbbb3e021f8a37f582b

And the changes in ExpressionHelper (including different locking): https://gist.github.com/hjohn/f88362ea78adef96f3a54d97e2405076

@chengenzhao
Copy link

@hjohn
Sorry to append message here, but I don't know other places where we could talk about this topic
Is it a good idea if Binding class provide method like this?
Inspired by this PR
69A930A1704FF89F6DE3E7F50785D8C1

@hjohn
Copy link
Collaborator Author

hjohn commented Jan 6, 2023

@hjohn Sorry to append message here, but I don't know other places where we could talk about this topic Is it a good idea if Binding class provide method like this? Inspired by this PR

@chengenzhao The only way to do this right now using the fluent bindings is like this:

Rectangle rect = new Rectangle();

rect.widthProperty()
  .flatMap(x -> rect.heightProperty())
  .map(y -> rect.getWidth() * rect.getHeight())
  .addListener((obs, old, a) -> System.out.println("Area = " + a));

It is not very pretty.

@chengenzhao
Copy link

chengenzhao commented Jan 7, 2023

@hjohn
Is it that what we really need here is a reduce function?
since we already have map(thanks to this PR) then we probably need a reduce function to zip those map products?
e.g.

textProperty.bind(Bindings.reduce(widthProperty, heightProperty, (w, h) -> "Area is "+w.doubleValue()*h.doubleValue()));

my implementation

  @FunctionalInterface
  public interface Callable2<T0, T1, T> {
    T call(T0 t0, T1 t1);
  }

  public static <T0, T1, T> ObjectBinding<T> reduce(ObservableValue<T0> t0, ObservableValue<T1> t1, Callable2<T0, T1, T> callable) {
    return javafx.beans.binding.Bindings.createObjectBinding(() -> callable.call(t0.getValue(), t1.getValue()), t0, t1);
  }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

8 participants