Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GAP-24] Agreement Permissions Management #56

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

stranger80
Copy link
Contributor

The fundamental model in Golem assumes that visibility and rights to manage an Agreement, and all related domain entities (Activity, Invoice, DebitNote, Payment) are limited to the Requestor and Provider nodes which signed this Agreement. For more complex Golem application scenarios, however, a mechanism is required whereby the permissions can be delegated to a different node/identity. This GAP includes a summary of proposed permission model, enhancements to relevant APIs and some considerations on possible implementations.

@stranger80 stranger80 added the GAP: Feature GAP type: Feature label Aug 5, 2022
@stranger80
Copy link
Contributor Author

Notes from 26.01.2023:

  • Add comments re purpose of GAP-24 as a building block to enable higher-level "swarm supervisors" which will be subject of separate GAP. Point at the fact that "supervisor swarms" will need permissions management on higher level (GAOM descriptors? payment channels?), and some consensus protocols are required. This is not in scope of GAP-24.
  • prevent race conditions between setAgreementPermissions calls - make sure the call does include a method of prevention , like expect the call to include the latest ACL version number, and expect the API call to respond with error message "newer version exists"
  • how do we deal with "rogue grantee" - who can disconnect all other Grantees while Owner is offline
    • maybe to grant setAgreementPermissions further, there must be some consensus between all grantees?


**Note** that one of the parameters required for `AcceptDebitNote`/`AcceptInvoice` Payment API calls is an `allocationId`. The ACL allows for delegation of rights to an Agreement and associated entities, but there is no equivalent rights management for funds managed/owned by Agreement Owner.

Therefore a rights Grantee, who is willing to Accept an incoming Debit Note or Invoice must pass `allocationId` of their own Allocation.
Copy link
Contributor

@shadeofblue shadeofblue Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, in this simple model, it's really unclear how the payment process would work in case there is more than one grantee ... who pays? anyone? do we have a "race" to pay? what incentive would grantees have to be the ones to pay? what if multiple grantees sent their AcceptDebitNote or AcceptInvoice messages independently and agree to pay?

Copy link
Contributor Author

@stranger80 stranger80 Feb 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • we abstract from incentives and intents to pay - we only allow for different requestors to pay.

  • should we raise a "bug" to core team to ensure AcceptDebitNote/AcceptInvoice (as well as Reject) is race-resistant, ie. only a single accept on Invoice/DebitNote should end with 'OK', subsequent attempts should return info "invoice/debitnote already accepted". Must also ensure for this to be exercised on provider-side, as there must be a single-source-of truth and atomicity achieved for Accept operations.

    • we ignore scenarios where a malicious provider may want to change the sequence of accept/reject messages (or hold/ignore incoming messages)
  • how to solve the need to propagate Agreement/Activity related events over the "entitled" Requestors. We need to discuss this with the core team [ACTION - setup workshop with core team].

  • "Rogue grantee" problem. What options do we have?
    0. Do not consider "rogue grantee" problem at all - assume that 'grantees' are reliable

    1. Have a consensus mechanism over setAgreementPermissions
      a. any 'setAgreementPermissions' permission amendment must be controlled by a forum of grantees
      b. any permission amendment must be individually controlled by a forum
      c. only change in set of grantees must be controlled by a forum
    2. Implement a multi-grantee consensus protocol on an upper layer (eg. Supervisor protocol),
    3. Do not allow propagation of rights by 'grantees' (initially)
      • first implement ACLs without ability to grant rights by grantees (this will prevent fully autonomous dapps for time being)
      • subsequently solve the 'Rogue grantee' problem and allow for rights propagation (grantees grant rights)
    4. Go with 3. and specify requirements for a generic "credentials" mechanism which would attest consensus between multiple parties, etc. This calls for a dedicated GAP [ACTION: create placeholder] to specify requirements and suggest potential implementations which could be applied to any API call, or any Golem Entity, etc.

MB: I like 4 :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Rogue grantee" problem

  • Option II doesn't solve a problem. We can have consensus on upper layer, but yagna must be consensus aware, otherwise one grantee can still Agreement despite consensus being against him.
    • But in this case grantees keeping consensus correctly could drop the Agreement and recreate it with new Node without giving permissions to malicious actor

Other options:

  • (Initially) split upgrading permissions from downgrading. I would say that giving permissions to new Node is less harmful than stealing Agreement

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mtg w/Witek on 14.02.:

  1. On API "races" - this must simply be specified as a requirement, and included in test scenarios. NOTE: we should explictly list relevant Test Cases in the test suite. Also include "conflict races", ie. one requestor Accepts another Rejects...
  2. On entity and event propagation:
    a. currently yagna (on network level) does not have QoS features, ie. it does not support "retry" on messages sent between sides. This means that if Requestor is offline, the Provider is unable to reliably send TerminateAgreement message (currently one side is unable to Terminate Agreement while the other side is offline). This seems to be a gap from the point of view of "Offline Requestor model"!!!
    b. it seems there is a gap in yagna core which currently makes 'attach/detach' impossible
    c. agreementId/nodeId propagation:
    1. either the network must notify the new grantee with agreementIds
    2. or we add REST API call to "force synchronization" with a specified node

Copy link
Contributor

@approxit approxit Feb 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small collection of my afterthoughts, which I use in our next session:

Assuming that:

  • Requestor should connect to provider as a client-server pattern
  • Provider is responsible for final validation and enforcing ACL (aka backend validation)
  • Requestor should locally complain if some action is done regardless of no permission (aka frontend validation)
  • Provider will receive ACL with collection of permissions and grantees node ids

Finally... Requestor should receive similar set of ACL with similar data (but from grantee perspective) - permissions and provider id.
In result, requestor should at least have its own copy of ACL (with provider id), to continue working after kinda-disaster/kinda-offline recovery.

Having only activity id is not enough to revive provider id, without any help of access to supervising entity.

Copy link
Contributor Author

@stranger80 stranger80 Feb 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update on 24.02.:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

7.03.2023:

  • Ad ACL permissions, grantees, delegation - amend ACL to include various "credential sets" required (optional) where node identity is one of credential types, but also other credential types may be required (eg. a multi-sig artifact).
  • Kamil to propose an illustrative example of ACL
  • Ad. Agreement information propagation across network (required to forward the AgreementId to a new grantee)
    • Step 1: (suboptimal) each node knowing Agreement's ACL should periodically distribute it to affected parties (all grantees as listed in ACL) - can we extract all the affected parties straight.
    • Step 2: (optimal) A dedicated, generic GAP describing the "new node welcome notification" mechanism, as well as node "synchronization with network"
      • Ask Witek to draft brief input into the GAP - summarize the concept as discussed on the meeting

Copy link
Contributor

@approxit approxit Mar 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below is example for defining structure with separate collection of credential providing methods and separate collection of different permission sets.
Because those entries are interconnected and they need to refer from one to each other, using custom names help significantly.

Specific key names and values are provisional and are subject of change.

Relation between authTypes and permissionGroups could be included in permissionGroups (as shown below) or could be moved to analogical structure in authTypes.

{
	"version": 123,
	"authTypes": {
		"basic": {
			"type": "basic",
			"password": "..."
		},
		"token": {
			"type": "bearer",
			"token": "..."
		},
		"cert": {
			"type": "pem",
			"privateKey": "...",
		},
		"nodesOwned": {
			"type": "nodeId",
			"nodes": [
				"node_id_kj24",
				"node_id_9fd8",
				"node_id_gq68",
			]
		},
		"nodesNotOwned": {
			"type": "nodeId",
			"nodes": [
				"node_id_kj43",
				"node_id_ml23",
				"node_id_8s0d"
			]
		}
	},
	"permissionGroups": {
		"group1": {
			"authTypes": [
				"basic",
				"nodeOwned"
			],
			"permissions": [
				"req:createActivity",
				"req:exec",
				"req:destroyActivity"
			]
		},
		"group2": {
			"authTypes": [
				"token",
				"nodeNotOwned",
				"cert"
			],
			"permissions": [
				"req:terminateAgreement",
				"req:acceptInvoice",
				"req:rejectInvoice"
			]
		}
	}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorporated the proposed ACL improvements in the GAP draft, with some tweaks (eg. not "authTypes", but rather "attestationTypes", etc.)
We need to discuss the potentially useful attestation types, as we must not include any secrets in the ACL itself!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notes after 14.03.:

  • Continue on "generic attestations" - need to align on what do the "content-dependent" attestations include? The attestation artifact (eg. ACL list) must be:
    • Canonicalized to allow reliable verification
    • Specified element of the request - how is this specified?
      • API method does specify which element of it can be covered by attestation (eg. ACL content in SetAgreementPermissions)
      • ...any other option?

@stranger80
Copy link
Contributor Author

How do we ensure the Requestor R who was granted access to Agreement X hosted on Provider P - is aware of this Provider's Identity?

A. "Self-sustained", Provider-active Agreement propagation - ensure that the Requestor receives information about Agreements it is entitled to:

  1. Each setAgreementPermissions processed by eg. Owner shall trigger "Provider P has agreement A" messages from Owner node to all entitled Requestors
  2. Use GAP-30 Node presence notification:
  • A node and knows ACL for Agreement A
  • Requestor R joins the network, and sends presence notification "broadcast" to all nodes
  • Provider P or any other entitled node (listed in ACL) receives presence notification from R and realizes this R is mentioned on agreement ACL
  • ...therefore that node sends a "Provider P has Agreement A" message to R
  • R records info that Agreement A is hosted on P, so that if a REST client calls a "getAgreement" API, R knows where to forward this request (to P)

B. "Externally-populated" Agreement propagation - a REST API call must be made by Agent to notify the Requestor Daemon that "Provider P has Agreement A"

@stranger80
Copy link
Contributor Author

Add Both A. and B. scenarios above as alternatives in GAP. Analyze drawbacks, and suggest a way forward (choose one option).


This implementation requires that a newly entitled Grantee node is populated with `AgreementId->ProviderId` info, provided via a new method in Market API:

**New `synchronizeAgreement` method in Market API**
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call this attachAgreement

- Error otherwise

**Effect:**
- The `synchronizeAgreement` message is forwarded to Provider and validated. If the Agreement is hosted by that Provider, and Requestor/Grantee is entitled - its details (all relevant entities) are sent to the Grantee node.
Copy link
Contributor Author

@stranger80 stranger80 Mar 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also record the Requestor as interested in getting all relevant notfications - so that Provider notifies all relevant requestors actively.

@nieznanysprawiciel
Copy link
Contributor

From the Provider point of view, how do we if operation is finished and succeeded?
Let's take TerminateAgreement as an example. In current implementation operation succeeds if we are able to deliver termination message to Requestor daemon.
We are planing to extend the ownership of Agreement to multiple daemons, so should all messages be delivered, to treat Agreement as terminated?
If it is enough to deliver to only single Node, than how do we decide which one is enough?

Even if in case of consensus we have special Node to communicate with Provider, how does Provider knows which one is it?

@stranger80
Copy link
Contributor Author

From the Provider point of view, how do we if operation is finished and succeeded? Let's take TerminateAgreement as an example. In current implementation operation succeeds if we are able to deliver termination message to Requestor daemon. We are planing to extend the ownership of Agreement to multiple daemons, so should all messages be delivered, to treat Agreement as terminated? If it is enough to deliver to only single Node, than how do we decide which one is enough?

Even if in case of consensus we have special Node to communicate with Provider, how does Provider knows which one is it?

This concern seems to relate to a number of aspects:

  1. Is TerminateAgreement a "transactional" operation, and how is transaction consistency achieved (eg. what protocol?)
  2. Which nodes are actually sides of the "transaction" as mentioned in point 1 above?
  3. Which nodes need to be notified about the effect of the "transaction"?

A sample scenario involving TerminateAgreement has been illustrated here:
https://app.diagrams.net/#G1pmX9W4VBr6mrtdNskKgYLzaDAdhnGRVi

Re. 1 - do we have any transactional or quasi-transactional protocol implemented on TerminateAgreement operation? It seems implied from statement: In current implementation operation succeeds if we are able to deliver termination message to Requestor daemon. @nieznanysprawiciel are you able to explain what "transactional protocol" exists if any, in current implementation? Ie. are you able to comment on the sequence diagram linked above?
Also: which timestamp is considered the Agreement Termination date/time? It feels that it should also be confirmed during the "transaction"

Re. 2,3 - we agreed on the call that only Requestor sending the TerminateAgreement message, and The Provider, are the sides of the transaction. All other nodes (grantees) are only interested in the effect of the "transaction". In other words, the transaction must be resolved between Requestor and Provider, and once successful - the Grantees should be notified by a "AgreementTerminated" event messages. The transaction success must not depend on the delivery of event notification messages.

Another side note: It seems that only TerminateAgreement initiated by Requestor is a transactional operation. TerminateAgreement initiated by Provider does not require a two-sided transaction - in this case Provider is the source of truth.

@nieznanysprawiciel
Copy link
Contributor

Re. 1 Yagna daemon sends Agreement termination GSB message to other side. If we receive GSB acknowledge from other side, than we are saving event (and changing Agreement state) in database.

Entity initiating termination sends timestamp and other party accepts it <- Note that here we have problem with potential fraud, in which one entity claims that it terminated long before current timestamp. This problem remains unsolved here

Re.2,3 Am I understanding correctly, that Provider doesn't need to deliver message anywhere? He just claims that he terminated Agreement and grantees will later get the information if no one was present at the moment of termination?

In general I think that making Agreement termination a claim, that doesn't require other party's presence, would be beneficial, since we don't want to keep Agreements with someone that is not available.
But here we must deal with claims, that Agreement was terminated in the past (the problem that I mentioned in relation to timestamps synchronization). Of course this problems doesn't hurt as long as we don't need any arbitrage between sides of Agreement.

And the last think: I assume we take the same approach to all other operations between Nodes?

@stranger80
Copy link
Contributor Author

Re. 1 Yagna daemon sends Agreement termination GSB message to other side. If we receive GSB acknowledge from other side, than we are saving event (and changing Agreement state) in database.

Entity initiating termination sends timestamp and other party accepts it <- Note that here we have problem with potential fraud, in which one entity claims that it terminated long before current timestamp. This problem remains unsolved here

Re.2,3 Am I understanding correctly, that Provider doesn't need to deliver message anywhere? He just claims that he terminated Agreement and grantees will later get the information if no one was present at the moment of termination?

In general I think that making Agreement termination a claim, that doesn't require other party's presence, would be beneficial, since we don't want to keep Agreements with someone that is not available. But here we must deal with claims, that Agreement was terminated in the past (the problem that I mentioned in relation to timestamps synchronization). Of course this problems doesn't hurt as long as we don't need any arbitrage between sides of Agreement.

And the last think: I assume we take the same approach to all other operations between Nodes?

RE. 2,3. continued:
Am I understanding correctly, that Provider doesn't need to deliver message anywhere?

No, what we are saying is - the TerminateAgreement is a "transactional" operation between two nodes, and that requires "ack" messages, etc. And once the "transaction" is complete, the interested entitled nodes (eg. remaining Grantees) should receive AgreementTerminated event.

Re. I assume we take the same approach to all other operations between Nodes?

Yes, for all "transactional" operations.

Do we need to discuss the impact on event distribution mechanism? Eg. should all grantees be notified automatically, or should theyrequest for eventson a "pull" basis?

- Only possible when Grantee Id can be inferred from the ACL. Some attestation types (eg. based on X.509 certificates) do not explicitly indicate entitled node Ids. Therefore it may not always be possible to determine Grantee nodes and notify them appropriately.

**Recommendation:** Implementation 1.

Copy link
Contributor Author

@stranger80 stranger80 Apr 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A. Add attachAgreement to ACL permissions list

B. Add the following aspect:

  1. Event propagation

Provider is expected to actively notify a set of Grantees with Agreement (and respective entity) events. However the list of recipient nodes for events needs to be maintained by the Provider. Following logic is proposed:

  • A successful call to attachAgreement from a granteeId will add this Grantee to the list of event "subscribers"
    • A new grantee added to "subscribers" list triggers sending full history of relevant events to that grantee (* do we want to optimize this, eg. for a grantee who was offline and re-connects itself?)
  • Each event shall be forwarded by the Provider to all recipients from "event subscriber" list
    • In case event message delivery is failing for a given grantee (delivery confirmation to be decided during implementation) - that grantee is removed from "event subscriber" list
  • Whenever Grantee disconnects, and then reconnects itself - it must call the attachAgreement again
  • In situations where Grantee isn't aware it has been removed from "subscriber list" (eg. network went down, event message delivery failed, Provider removed the Grantee from list)
    • (Option 1) A simple "health check" mechanism is proposed - where Grantee side, for all Agreements in non-terminated state, performs a periodic "am I still subscribed" call to Provider. If False is returned, yagna daemon calls attachAgreement again, in order to get the latest state of the Agreement. (This option is specific to Agreement synchronization and is not a generic mechanism)
    • (Option 2) Implement message delivery guarantee on GSB transport level, which would need to have following features:
      • Message delivery can be either guaranteed or not guaranteed (as required)
      • Message delivery sequence can be either guaranteed or not guaranteed (as required)
        • User code is able to specify the message sequence indicator (as per functionality, eg. specify that Agreement timestamp is to be used as sequence indicator)
          If the above is available, the event message delivery is guaranteed, and the upper layer (Agreement-specific) does not need to worry about this.
          Option 2 is definitely a more substantial change, and requires a broader discussion... it probably should be considered a separate GAP...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GAP: Feature GAP type: Feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants