Merge with or incorporate parts of TupleVectors? #73

oschulz opened this issue May 18, 2021 · 5 comments
Comments

@oschulz
Contributor

oschulz commented May 18, 2021

In a conversation on Discourse @cscherrer and I noticed that there's a lot of overlap between TupleVectors.jl and TypedTables.jl.

Both store arrays of NamedTuples column-wise. TupleVectors doesn't have a Tables.jl interface yet, but it does have some GeneralizedGenerated tricks and a few other things that I don't think TypedTables.jl has.
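To make "arrays of NamedTuples stored column-wise" concrete, here is a minimal Base-only sketch. The type name `ColumnarVector` and its `columns` field are hypothetical and chosen for illustration; this is not the implementation or API of either TupleVectors.jl or TypedTables.jl, only the general idea the two packages share.

```julia
# Hypothetical illustration type; NOT the API of TupleVectors.jl or TypedTables.jl.
# Each field is stored as its own vector; rows are assembled on demand.
struct ColumnarVector{T<:NamedTuple, C<:NamedTuple} <: AbstractVector{T}
    columns::C  # NamedTuple of equally long vectors, one per field
end

function ColumnarVector(columns::NamedTuple)
    n = length(first(columns))
    all(c -> length(c) == n, columns) ||
        throw(DimensionMismatch("all columns must have the same length"))
    # Row type: same field names, element types taken from the columns.
    T = NamedTuple{keys(columns), Tuple{map(eltype, values(columns))...}}
    return ColumnarVector{T, typeof(columns)}(columns)
end

# Minimal AbstractArray interface.
Base.size(v::ColumnarVector) = (length(first(getfield(v, :columns))),)
Base.IndexStyle(::Type{<:ColumnarVector}) = IndexLinear()

# Indexing builds a NamedTuple row from the i-th entry of every column.
Base.getindex(v::ColumnarVector, i::Int) = map(c -> c[i], getfield(v, :columns))

v = ColumnarVector((a = [1, 2, 3], b = [2.0, 4.0, 6.0]))
row = v[2]   # (a = 2, b = 4.0)
```

The container behaves like an ordinary `AbstractVector` of rows, while each field stays in one contiguous column.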

@cscherrer encouraged me to open up an issue regarding a possible merge of the two packages (see the Discourse thread).

@andyferris
Member

Yes, the other obvious one in this category is StructArrays.jl, which I understand is well regarded.

The philosophy here has evolved towards mostly being just an AbstractArray of named tuples with columnar storage. Obviously the original intention has been to support manipulation of tabular data/relations/dataframes in a type-stable way. To that end we provide a different default show and a Tables.jl interface, and use functions in Base and e.g. those in SplitApplyCombine.jl to manipulate relational data.

I would generally welcome collaboration - I see a general need for this kind of container, and it's preferable to keep the ecosystem cohesive. However, it could potentially be disruptive to existing users to drop the "table-ness" here. I have also been experimenting with e.g. tables whose columns are dictionaries (each column having identical keys, obviously), partitioned/grouped tables, etc.

How would you see something like this working?

@cscherrer

I had planned to add a Tables interface to TupleVectors, so I don't think we'd need to drop the table-ness [insert SQL joke here].

I think the biggest thing would be connecting some methods from NestedTuples. For the show methods, maybe we could add a type parameter, so show could dispatch on that?

@andyferris
Member

I sometimes wonder if there should be a really lightweight Table that e.g. wraps an AbstractArray and defines the show methods and so on.

Or - if we can improve the Base definition for AbstractArray{<:NamedTuple}?
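As a rough sketch of what such a lightweight wrapper might look like, the following Base-only example wraps any `AbstractVector` of NamedTuples and only overrides the text display. The name `LightTable` is hypothetical, it assumes a concrete NamedTuple element type, and it is not an existing type in TypedTables.jl or Base.

```julia
# Hypothetical wrapper; assumes the wrapped vector has a concrete NamedTuple eltype.
struct LightTable{T<:NamedTuple, A<:AbstractVector{T}} <: AbstractVector{T}
    data::A
    LightTable(data::AbstractVector{T}) where {T<:NamedTuple} =
        new{T, typeof(data)}(data)
end

# Forward the minimal AbstractArray interface to the wrapped vector.
Base.size(t::LightTable) = size(t.data)
Base.IndexStyle(::Type{<:LightTable}) = IndexLinear()
Base.getindex(t::LightTable, i::Int) = t.data[i]

# Table-like text display: a header line, then one tab-separated line per row.
function Base.show(io::IO, ::MIME"text/plain", t::LightTable{T}) where {T}
    names = fieldnames(T)
    println(io, "LightTable with ", length(t), " rows and columns ", join(names, ", "))
    println(io, join(names, '\t'))
    for row in t
        println(io, join(values(row), '\t'))
    end
end

t = LightTable([(a = 1, b = 2.0), (a = 2, b = 4.0)])
shown = sprint(show, MIME("text/plain"), t)
```

Because the wrapper is itself an `AbstractVector`, generic functions from Base keep working on it; only the display changes.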

@oschulz
Contributor Author

oschulz commented May 19, 2021

Speaking as a user, I love the "table-ness"! :-)

@oschulz
Contributor Author

oschulz commented May 19, 2021

Speaking as a user, I love the "table-ness"! :-)

I mean the fact that it implements the Tables.jl interface (SplitApplyCombine I haven't used much).
