Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide Stream collectors for persistent collections. #88

Open
bowbahdoe opened this issue Sep 20, 2020 · 3 comments
Open

Provide Stream collectors for persistent collections. #88

bowbahdoe opened this issue Sep 20, 2020 · 3 comments

Comments

@bowbahdoe
Copy link

bowbahdoe commented Sep 20, 2020

One of the major differences between pcollections and other persistent collections libraries like vavr is the desire to remain interoperable with the standard java collections framework. However, because persistent collections do not allow for mutation, we miss out on a good chunk of the standard library's methods for working with collections.

As an example, PVector implements java.util.List but does not support any of the mutating methods on that interface. For some there are natural replacements - add, addAll, remove and removeAll are supplanted by plus, plusAll, minus, and minusAll. For others though, there is no natural replacement: sort, retainAll, and clear have no direct analogue.

This contrasts with classes like vavr's Vector which are fairly full featured in terms of the methods they support on account of not needing to maintain this compatibility.

io.vavr.Vector.of(1, -1, 2, 3).filter(x -> x <= 2).sorted();

The analogue with regular java collections pre-java 8 would be

List<Integer> xs = new ArrayList<>();
xs.add(1);
xs.add(-1);
xs.add(2);
xs.add(3);
xs.retainAll(x -> x <= 2);
xs.sort();

But with java 8, we get the streams api, which lets us do those larger functional transformations in the same manner as vavr albeit via explicitly targeting a collection type as the end result..

List.of(1, -1, 2, 3)
  .stream()
  .filter(x -> x <= 2)
  .sorted()
  .collect(Collectors.toList());

Which is all a long winded way of saying, it would make a lot of sense to provide collector implementations for targeting persistent collections.

It matches up with the standard java way of doing things and would be a decent usability win. The main downside is that to do this would require a major version bump to make Java 8 the minimum and maybe working out some of the other issues with regards to performance. Specifically, if there is ever added a notion of transients or some other kind of mutable intermediate representation it would be transparent to swap the implementation of a collector provided by the library, perhaps without exposing that functionality until it is fully designed or maybe never.

public interface PVector<E> extends PSequence<E> {
    public static <T> Collector<T, ?, PVector<T>> collector() {
        return Collector.of(Empty::pvector, PVector::plus, PVector::plusAll);
    }
    // ...
}

TreePVector.from(List.of(1, -1, 2, 3))
  .stream()
  .filter(x -> x <= 2)
  .sorted()
  .collect(PVector.collector());

There would be other benefits to moving to Java 8 as well - chaos and major version bumps are ladders. Perhaps a PVector.empty() could supplant Empty.pvector() or similar api improvements and deprecations.

@hrldcpr
Copy link
Owner

hrldcpr commented Aug 19, 2022

We've upgraded to Java 8 now 🚀 e08ad47

@mwilliamson
Copy link

I knocked something together for my own use that I think does the trick for PVector. Specifically, it creates a mutable builder class for vectors. I had a couple of questions (although other thoughts are welcome!):

  1. Are there any existing mutable builder classes available? I had a quick look through the code and issues, which would appear to suggest not, hence defining my own.

  2. The implementation below uses PVector as the intermediate representation. Would it be more efficient to use a normal Java list as the intermediate representation, and then only to convert that into a PVector as the finishing step?

import org.pcollections.Empty;
import org.pcollections.PVector;

import java.util.stream.Collector;

public class PCollectors {
    private PCollectors() {
    }

    public static <T> Collector<T, PVectorCollector<T>, PVector<T>> toVector() {
        return Collector.of(
            PVectorCollector::new,
            PVectorCollector::add,
            PVectorCollector::addAll,
            PVectorCollector::toVector
        );
    }

    private static class PVectorCollector<T> {
        private PVector<T> vector = Empty.vector();

        void add(T other) {
            vector = vector.plus(other);
        }

        PVectorCollector<T> addAll(PVectorCollector<T> other) {
            vector = vector.plusAll(other.vector);
            return this;
        }

        PVector<T> toVector() {
            return vector;
        }
    }
}

@bowbahdoe
Copy link
Author

bowbahdoe commented May 31, 2023

In the generics, exposing the intermediate as ? would be appropriate - especially when that class is private.

public static <T> Collector<T, ?, PVector<T>> toVector() {

Which lets us separate this sort of question

Would it be more efficient to use a normal Java list as the intermediate representation

From API questions like - should this live in a PCollectors sort of thing or on the interfaces themselves?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants