Feat: Binary relation collections (from CanCan). #219

Open
wants to merge 2 commits into
base: master

matthewhammer
Contributor

@matthewhammer matthewhammer commented Feb 10, 2021

Binary relation representation for base library, as:

  • Module with purely-functional representation (Rel)
  • Module with OO wrapper and useful idioms for a canister state's relations (RelObj), as in CanCan.

See comments in committed code for more details.

Before merging:

  • some unit tests for each new module
  • complete doc for each module.

Opening this PR now so that interested parties can see/use it while it's WIP. cc @gobengo

@gobengo

gobengo commented Feb 11, 2021

What do you think about including example usage of these or, better, a test? (I don't know if it's even possible to add a test in motoko)

Contributor

@rossberg rossberg left a comment

Looks good, I'd just tweak the naming a little, see below.

}
};

public func keyOf0<X, Y>( rel : Rel<X, Y>, x : X) : Trie.Key<X> {
Contributor

Some of the names are a bit cryptic. May I suggest:
keyOf => key
keyOf0 => keyLeft
keyOf1 => keyRight
getRelated0 => iterRelatedLeft
getRelated1 => iterRelatedRight

I'd also expect a has(rel, x, y) predicate to be available.
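
As a rough illustration of the suggested has(rel, x, y) predicate, here is a minimal Motoko sketch. It assumes a deliberately simplified, hypothetical representation (PairRel: an array of pairs plus one equality function per side) rather than the Rel type from this PR, which appears to build on Trie and whose full definition is not shown in this thread. The field names equalLeft and equalRight are invented for the example.

```motoko
// Hypothetical stand-in for the PR's Rel type (NOT the actual definition):
// the related pairs, plus caller-supplied equality for each side.
type PairRel<X, Y> = {
  pairs : [(X, Y)];
  equalLeft : (X, X) -> Bool;
  equalRight : (Y, Y) -> Bool;
};

// True if and only if the pair (x, y) is in the relation.
// Linear scan for clarity; an indexed representation would do a lookup instead.
func has<X, Y>(rel : PairRel<X, Y>, x : X, y : Y) : Bool {
  for ((a, b) in rel.pairs.vals()) {
    if (rel.equalLeft(a, x) and rel.equalRight(b, y)) { return true };
  };
  false
};

// Usage with Text on the left and Nat on the right.
let r : PairRel<Text, Nat> = {
  pairs = [("a", 1), ("a", 2)];
  equalLeft = func (p : Text, q : Text) : Bool { p == q };
  equalRight = func (p : Nat, q : Nat) : Bool { p == q };
};

assert (has<Text, Nat>(r, "a", 2));
assert (not has<Text, Nat>(r, "b", 1));
```

The point of the sketch is only the shape of the API: membership is asked about a pair, not about a key on one side.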

Contributor Author

I like it better when you choose the names --- Thanks!

src/Rel.mo (resolved)
Comment on lines +73 to +74
hash = hash_ ;
equal = equal_
Contributor

Suggested change
- hash = hash_ ;
- equal = equal_
+ hash = hash;
+ equal = equal;

Comment on lines +93 to +94
public func empty<X, Y>( hash_ : HashPair<X, Y>,
equal_ : EqualPair<X, Y>) : Rel<X, Y> {
Contributor

Suggested change
- public func empty<X, Y>( hash_ : HashPair<X, Y>,
- equal_ : EqualPair<X, Y>) : Rel<X, Y> {
+ public func empty<X, Y>(
+ hash : HashPair<X, Y>,
+ equal : EqualPair<X, Y>
+ ) : Rel<X, Y> {

Comment on lines +98 to +99
hash = hash_ ;
equal = equal_
Contributor

Suggested change
- hash = hash_ ;
- equal = equal_
+ hash = hash;
+ equal = equal;

src/Rel.mo (outdated, resolved)
///
/// See also: Rel module.
module {
public class RelObj<X, Y>(
Contributor

Why not name the class Rel as well?

Contributor Author

Sounds good to me!

case null { null };
case (?(trie, stack2)) {
switch trie {
case (#empty) {
Contributor

Nit: excessive indentation here, does this use tabs?

Contributor Author

No tabs.

Backstory: I'm using an emacs mode that insists on doing this to me. It's the same emacs mode we have in the repo. I've been hobbling along with it for two years and it's always been broken around switch/case indentation.

Contributor

@kritzcreek kritzcreek left a comment

Nice data structure but why does it need to go into base?

I think base should be minimal. It should wrap compiler provided functionality and primitives and provide the most basic data structures that 95% of programs are going to need.

I don't think a binary relation like this fits that bill, I can totally see the use-case but I doubt almost every Motoko program will need this. I'd expect this kind of a data structure to live in a library instead so I can pull it in when I need it.

@nomeata
Contributor

nomeata commented Feb 12, 2021

Might be worth putting on our weekly meeting agenda… pinging @stanleygjones on this.

@matthewhammer
Contributor Author

I'd expect this kind of a data structure to live in a library instead so I can pull it in when I need it.

Fair points, @kritzcreek. Parts of me agree with you.

If you made this into a package, what would you call this library (of two modules)?

What would the intended scope of this library be, precisely? These two modules, only?

While parts of me agree, parts of me also disagree, especially as I try to answer such questions.

Here are my arguments for inclusion in base--- Let's discuss in a meeting, as @nomeata suggests.

I needed this structure for CanCan, which is a tiny, simple app. I also needed them for the Produce Exchange. So, in 2/2 cases, I used these structures (or similar ones) in the same way. That seems to argue in favor of inclusion into base.

Perhaps we can separate the issue of "binary relations in base" versus these particular modules in base?

I admit that these modules are not strictly necessary to implement these apps; they are required for me to implement them. I like to speak about problems in terms of their mathematical structure, and most apps will have some domain-specific relational structure, whether it's modeled explicitly as a collection of binary relations using this API or not.

But if we want to represent binary relations (I argue that these belong in base, in some form), why not choose this implementation to start?

Co-authored-by: Andreas Rossberg <[email protected]>
@matthewhammer
Contributor Author

matthewhammer commented Feb 12, 2021

@gobengo asks

What do you think about including example usage of these or, better, a test? (I don't know if it's even possible to add a test in motoko)

See the PR's unchecked to dos: One of them is (and has been) "some unit tests for each new module". That's our standard for each module in base, at this point.

As far as example usage, the other unchecked to do item above is "docs...".

@kritzcreek
Contributor

If you made this into a package, what would you call this library (of two modules)?
What would the intended scope of this library be, precisely? These two modules, only?
While parts of me agree, parts of me also disagree, especially as I try to answer such questions.

I'll prefix this by saying that not knowing a good library name for a piece of code is not an argument for its inclusion into base.

Now to be a little more constructive 😄

If you made this into a package, what would you call this library (of two modules)?

The rest of the world seems to call it a bimap:

What would the intended scope of this library be, precisely? These two modules, only?

The Haskell, Rust, and JS libraries just contain the bimap data structure, the Java and C++ libraries are both examples of giant stdlib extensions.

I needed this structure for CanCan, which is a tiny, simple app. I also needed them for the Produce Exchange. So, in 2/2 cases, I used these structures (or similar ones) in the same way. That seems to argue in favor of inclusion into base.

I think the reason we need these relational structures more often for canisters than other ecosystems is to make good on our "no database" promise. Maybe there's a relational-algebra package waiting to be written that would include this bimap?

@matthewhammer
Contributor Author

matthewhammer commented Feb 12, 2021

The rest of the world seems to call it a bimap.

No.

A bidirectional map is not the same thing as a binary relation, though every bidirectional map is a (restricted kind of) binary relation.

Critically, the interesting relations on social graphs (follows, followers, etc) are not bimaps.

Your links are about the former. This PR is about the latter.

@matthewhammer matthewhammer changed the title from "Feat: CanCan collections using base that may belong here." to "Feat: Binary relation collections (from CanCan)." Feb 12, 2021
@matthewhammer
Contributor Author

matthewhammer commented Feb 12, 2021

Just to be clear: "bimaps" are for creating efficient bidirectional associations for the (very particular) case where there is a bijection between data in two datasets.

This PR is about a distinct case where directional (and thus bidirectional) associations are not unique. I can have several followers. I can follow several people.

This PR is about representing binary relations, generally (not invertible functions, specifically).
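
To make the distinction concrete, here is a small Motoko sketch of a "follows" relation that cannot be a bimap. It is only an illustration under stated assumptions: the Follows type (a plain array of pairs), the sample names, and the helper functions countFollowing and countFollowers are all hypothetical and are not part of the Rel module in this PR.

```motoko
// Hypothetical model for illustration only: each pair is (follower, followee).
type Follows = [(Text, Text)];

let follows : Follows = [
  ("alice", "bob"),
  ("alice", "carol"),
  ("dave", "bob")
];

// How many accounts does `who` follow? (left-to-right direction)
func countFollowing(rel : Follows, who : Text) : Nat {
  var n = 0;
  for ((a, _) in rel.vals()) {
    if (a == who) { n += 1 };
  };
  n
};

// How many accounts follow `who`? (right-to-left direction)
func countFollowers(rel : Follows, who : Text) : Nat {
  var n = 0;
  for ((_, b) in rel.vals()) {
    if (b == who) { n += 1 };
  };
  n
};

// alice follows two accounts and bob has two followers, so neither
// direction is a function. A bimap (a bijection) would allow at most
// one partner per element in each direction.
assert (countFollowing(follows, "alice") == 2);
assert (countFollowers(follows, "bob") == 2);
```

A bimap could store at most one of the two "alice" pairs and at most one of the two "bob" pairs; a binary relation keeps all three.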

@kritzcreek
Contributor

Critically, the interesting relations on social graphs (follows, followers, etc) are not bimaps.
Your links are about the former. This PR is about the latter.

Interesting! So this is even closer to a typical "n to m" relation in database schema design than I had thought.

All right, let's talk about it on Tuesday. I'm even more convinced now that this might be the start of a relational-algebra package.

@matthewhammer
Contributor Author

You say "relational algebra" (which makes perfect sense), and I anticipate hearing "graph queries" and GraphQL once we start unpacking the problem.

I guess the value of Rel here is that it's just very modest. It's just for binary relations (directed graphs) without any edge data.

I agree that as we consider graphs (or "hyper graphs", for non-binary relations) and edge data, this all starts to become something like a graph database pretty quickly.

@chenyan-dfinity
Contributor

Once we start to think about algebra, we can talk about differential operators, the change of the graph, the high order change of the graph, the incremental/reactive graph database :)

@matthewhammer
Contributor Author

matthewhammer commented Feb 13, 2021

Here's but one datapoint in the graphql tech space: https://hasura.io/ and their open source project: https://hasura.io/opensource/

@matthewhammer
Contributor Author

From https://graphql.org/

Describe what’s possible with a type system

Yeah, I think the GraphQL folks and I may have different definitions of what that phrase means, and in particular, what actually qualifies (mathematically) as a "type system". But I appreciate the attempt at PL theory lip service, nevertheless.

@kritzcreek
Contributor

kritzcreek commented Feb 15, 2021

and I anticipate hearing "graph queries" and GraphQL once we start unpacking the problem.

Certainly not from me :D But even if we were to look at GraphQL, you'll see that Hasura implements it by compiling it to a bunch of SQL, so we'd probably want to start with that either way.

the incremental/reactive graph database :)

Graph databases have the nichest of use-cases in the real world (at least for web-apps) and are overused by people that are bored of Postgres working too well for them. GraphQL (despite its name) has nothing to do with graph databases... So I'd say a graph database project sounds interesting, but not related to this PR.

@kritzcreek
Contributor

Graph databases have the nichest of use-cases in the real world (at least for web-apps) and are overused by people that are bored of Postgres working too well for them. GraphQL (despite its name) has nothing to do with graph databases... So I'd say a graph database project sounds interesting, but not related to this PR.

All right, that was way too strongly worded... my apologies (it was a knee-jerk reaction after having to salvage people's data from MongoDBs in a previous job).

I think a Graph Database is really interesting, but I'd like us to start off by exploring a "simple" version of tables and SQL queries. Now I understand I can't just allocate your time however I want, so I started an experiment of my own. This way we can evaluate our approaches against different use-cases and maybe improve both of them :)

I'm starting to realize that maybe your approach is the more Motoko-ish way of going about it, because I constantly find myself wanting to access raw memory and casting data around :D

@ggreif ggreif force-pushed the master branch 2 times, most recently from d52aecd to 08507fc Compare October 21, 2022 12:22
6 participants