Draft
2 changes: 2 additions & 0 deletions .gitignore
@@ -2,6 +2,8 @@
Cargo.lock
.tgops.toml
.tgops
.idea
notes/
docs/book/
data/
gha-creds-*.json
4 changes: 2 additions & 2 deletions Cargo.toml
@@ -89,14 +89,14 @@ reqwest = { version = "0.12.12", default-features = false, features = [
] }
#
# webvh
-didwebvh-rs = "=0.1.10"
+didwebvh-rs = "0.1.10"

# Python bindings
pyo3 = { version = "0.26", features = ["serde"] }
pythonize = "0.26"

# serialize
-serde = { version = "1.0.220", features = ["derive"] }
+serde = { version = "1.0", features = ["derive"] }
serde_json = { version = "1.0" }
serde_with = { version = "3.14", features = ["base64"] }
bs58 = "0.5"
48 changes: 48 additions & 0 deletions docs/ADR-001-relationship-state-machine.md
@@ -0,0 +1,48 @@
# ADR 001: Relationship State Machine

## Status
Proposed

## Context
The current TSP SDK implementation lacks a formal state machine for managing relationship lifecycles. This leads to several issues:
1. **Undefined States**: The `ReverseUnidirectional` status is defined but rarely used, leading to ambiguity when a node receives a relationship request.
2. **Concurrency Issues**: If two nodes request a relationship with each other simultaneously, both end up in a `Unidirectional` state, with no clear resolution path.
3. **No Timeouts**: There is no mechanism to handle lost messages or unresponsive peers during the handshake process.
4. **Idempotency**: Duplicate control messages are not handled consistently.

## Decision
We will implement a formal `RelationshipMachine` to govern state transitions.

### 1. State Machine Definition

The state machine will transition based on `RelationshipEvent`s.

| Current State | Event | New State | Action/Notes |
| :--- | :--- | :--- | :--- |
| `Unrelated` | `SendRequest` | `Unidirectional` | Store `thread_id` |
| `Unrelated` | `ReceiveRequest` | `ReverseUnidirectional` | Store `thread_id` |
| `Unidirectional` | `ReceiveAccept` | `Bidirectional` | Verify `thread_id` matches. |
| `ReverseUnidirectional` | `SendAccept` | `Bidirectional` | Verify `thread_id` matches. |
| `Bidirectional` | `SendCancel` | `Unrelated` | |
| `Bidirectional` | `ReceiveCancel` | `Unrelated` | |
| `Unidirectional` | `SendRequest` | `Unidirectional` | Idempotent (retransmission) |
| `Unidirectional` | `ReceiveRequest` | *Conflict Resolution* | See Concurrency Handling |
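
The table above can be read as a pure function from `(state, event)` to a new state. The following is a minimal sketch of that reading, not the SDK's implementation: `State`, `Event`, `TransitionError`, and `transition` are hypothetical names, the states are simplified to carry only a `thread_id`, and `thread_id` is assumed to be a byte string.

```rust
/// Hypothetical, simplified states; the SDK's `RelationshipStatus` may carry more data.
#[derive(Clone, Debug, PartialEq)]
enum State {
    Unrelated,
    Unidirectional { thread_id: Vec<u8> },
    ReverseUnidirectional { thread_id: Vec<u8> },
    Bidirectional { thread_id: Vec<u8> },
}

/// Hypothetical events mirroring the "Event" column of the table.
#[derive(Clone, Debug, PartialEq)]
enum Event {
    SendRequest { thread_id: Vec<u8> },
    ReceiveRequest { thread_id: Vec<u8> },
    ReceiveAccept { thread_id: Vec<u8> },
    SendAccept { thread_id: Vec<u8> },
    SendCancel,
    ReceiveCancel,
}

#[derive(Debug, PartialEq)]
enum TransitionError {
    /// An accept referenced a different `thread_id` than the stored one.
    ThreadIdMismatch,
    /// Both sides sent a request; resolved by the concurrency rules.
    Conflict,
    /// The table has no row for this (state, event) pair.
    InvalidTransition,
}

fn transition(state: &State, event: &Event) -> Result<State, TransitionError> {
    use Event::*;
    use State::*;
    match (state, event) {
        // Rows 1 and 2: store the thread_id of the new request.
        (Unrelated, SendRequest { thread_id }) => Ok(Unidirectional { thread_id: thread_id.clone() }),
        (Unrelated, ReceiveRequest { thread_id }) => {
            Ok(ReverseUnidirectional { thread_id: thread_id.clone() })
        }
        // Rows 3 and 4: an accept must match the stored thread_id.
        (Unidirectional { thread_id: stored }, ReceiveAccept { thread_id })
        | (ReverseUnidirectional { thread_id: stored }, SendAccept { thread_id }) => {
            if stored == thread_id {
                Ok(Bidirectional { thread_id: stored.clone() })
            } else {
                Err(TransitionError::ThreadIdMismatch)
            }
        }
        // Rows 5 and 6: either side may cancel an established relationship.
        (Bidirectional { .. }, SendCancel | ReceiveCancel) => Ok(Unrelated),
        // Row 7: retransmitting a request leaves the state unchanged.
        (Unidirectional { .. }, SendRequest { .. }) => Ok(state.clone()),
        // Row 8: concurrent requests are handed to conflict resolution.
        (Unidirectional { .. }, ReceiveRequest { .. }) => Err(TransitionError::Conflict),
        _ => Err(TransitionError::InvalidTransition),
    }
}
```

Returning an error for pairs that have no row keeps unexpected messages from silently changing state; whether the real machine rejects or ignores them is a design choice this sketch does not settle.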

### 2. Concurrency Handling
When a node in the `Unidirectional` state (it has sent a request) receives a `RequestRelationship` from its target (the target has also sent a request), the conflict is resolved as follows:
- **Compare `thread_id`s**: The request with the lexicographically *lower* `thread_id` wins.
- **If the local `thread_id` < the remote `thread_id`**: The node ignores (or rejects) the remote request and expects the peer to accept the local request.
- **If the local `thread_id` > the remote `thread_id`**: The node accepts the remote request. It drops its pending request state and transitions to `ReverseUnidirectional` (effectively adopting the peer's flow).
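
As a rough illustration of this tie-break, the sketch below compares the two identifiers as byte slices, which Rust orders lexicographically. `Resolution` and `resolve_conflict` are hypothetical names, and treating `thread_id` as raw bytes is an assumption. The ADR does not say what happens when both identifiers are equal, so the sketch returns `None` for that case instead of guessing.

```rust
use std::cmp::Ordering;

/// Hypothetical outcome of resolving two concurrent relationship requests.
#[derive(Debug, PartialEq)]
enum Resolution {
    /// Local thread_id is lower: stay `Unidirectional` and wait for the peer's accept.
    KeepLocalRequest,
    /// Remote thread_id is lower: drop the local request and move to `ReverseUnidirectional`.
    AcceptRemoteRequest,
}

/// Lexicographic comparison of the two thread_ids; the lower one wins.
fn resolve_conflict(local: &[u8], remote: &[u8]) -> Option<Resolution> {
    match local.cmp(remote) {
        Ordering::Less => Some(Resolution::KeepLocalRequest),
        Ordering::Greater => Some(Resolution::AcceptRemoteRequest),
        // Equal identifiers are not covered by the rules above.
        Ordering::Equal => None,
    }
}
```

Because both peers evaluate the same comparison on the same pair of identifiers, they reach complementary decisions without exchanging any further messages.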

### 3. Timeout & Retry
- **Timeout**: A `request_timeout` field will be added to `VidContext`. If a `Unidirectional` state persists beyond the timeout (e.g., 60s), it transitions back to `Unrelated`.
- **Retry**: Before timing out, the system may attempt retransmissions.
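
One way to picture the timeout and retry rules is a polling check on a pending request. This is only a sketch under stated assumptions: `PendingRequest`, `PendingAction`, `poll`, and the `retry_interval` field are hypothetical, and only `request_timeout` is named in this ADR. The caller passes in the current time so the logic stays deterministic.

```rust
use std::time::{Duration, Instant};

/// Hypothetical decision for a request that is still awaiting an accept.
#[derive(Debug, PartialEq)]
enum PendingAction {
    Wait,
    Retransmit,
    /// The timeout has passed: fall back to `Unrelated`.
    GiveUp,
}

/// Hypothetical bookkeeping for a relationship in the `Unidirectional` state.
struct PendingRequest {
    first_sent: Instant,
    last_sent: Instant,
    request_timeout: Duration,
    /// Assumption: retransmissions are spaced by a fixed interval.
    retry_interval: Duration,
}

impl PendingRequest {
    fn poll(&self, now: Instant) -> PendingAction {
        if now.duration_since(self.first_sent) > self.request_timeout {
            PendingAction::GiveUp
        } else if now.duration_since(self.last_sent) >= self.retry_interval {
            PendingAction::Retransmit
        } else {
            PendingAction::Wait
        }
    }
}
```

With a 60 s timeout and a 10 s retry interval, polling 5 s after sending yields `Wait`, 10 s yields `Retransmit`, and 61 s yields `GiveUp`.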

### 4. Idempotency
- **Duplicate Request**: If a node in `ReverseUnidirectional` or `Bidirectional` receives a `RequestRelationship` with the same `thread_id` as the one it already stored, it ignores the message or resends its previous response.
- **Duplicate Accept**: If a node in `Bidirectional` receives an `AcceptRelationship` with the same `thread_id`, it ignores the message.
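
The duplicate checks can be sketched as a predicate evaluated before any state transition. The names `Status`, `Incoming`, and `is_duplicate` are hypothetical, the states are simplified, and `thread_id` is again assumed to be a byte string. What to do with a detected duplicate (ignore or resend) is left to the caller.

```rust
/// Hypothetical, simplified view of the stored relationship state.
#[derive(Debug, PartialEq)]
enum Status {
    Unrelated,
    Unidirectional { thread_id: Vec<u8> },
    ReverseUnidirectional { thread_id: Vec<u8> },
    Bidirectional { thread_id: Vec<u8> },
}

/// Hypothetical incoming control messages, each carrying its thread_id.
enum Incoming<'a> {
    Request(&'a [u8]),
    Accept(&'a [u8]),
}

/// Returns true when the message repeats one that was already processed.
fn is_duplicate(status: &Status, msg: &Incoming<'_>) -> bool {
    match (status, msg) {
        // A request we have already received (and possibly accepted).
        (
            Status::ReverseUnidirectional { thread_id } | Status::Bidirectional { thread_id },
            Incoming::Request(incoming),
        ) => thread_id.as_slice() == *incoming,
        // An accept for a relationship that is already established.
        (Status::Bidirectional { thread_id }, Incoming::Accept(incoming)) => {
            thread_id.as_slice() == *incoming
        }
        _ => false,
    }
}
```

Checking the `thread_id` matters here: a request with a different `thread_id` is a new request, not a retransmission, and should go through the normal transition rules.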

## Consequences
- **Robustness**: Relationship establishment will be reliable under network jitter and concurrency.
- **Complexity**: The `store.rs` logic will become more complex.
- **Breaking Changes**: Existing tests that manually manipulate state might fail and need updating to respect the state machine.
12 changes: 6 additions & 6 deletions examples/tests/cli_tests.rs
@@ -113,7 +113,7 @@ fn test_send_command_unverified_receiver_default() {
"receive",
&marc_did,
])
-.timeout(Duration::from_secs(2))
+.timeout(Duration::from_secs(20))
.assert()
.stderr(predicate::str::contains("received relationship request"))
.stdout(predicate::str::contains("Oh hello Marc"))
@@ -158,7 +158,7 @@ fn test_send_command_unverified_receiver_ask_flag() {
"--ask",
])
.write_stdin(input)
-.timeout(Duration::from_secs(2))
+.timeout(Duration::from_secs(20))
.assert()
.stderr(predicate::str::contains(
"Message cannot be sent without verifying the receiver's DID",
@@ -182,7 +182,7 @@ fn test_send_command_unverified_receiver_ask_flag() {
"--ask",
])
.write_stdin(input)
-.timeout(Duration::from_secs(2))
+.timeout(Duration::from_secs(20))
.assert()
.stdout(predicate::str::contains(
"Do you want to verify receiver DID",
@@ -199,7 +199,7 @@ fn test_send_command_unverified_receiver_ask_flag() {
"--one",
&marc_did,
])
-.timeout(Duration::from_secs(2))
+.timeout(Duration::from_secs(20))
.assert()
.stderr(predicate::str::contains("received relationship request"))
.success();
@@ -256,7 +256,7 @@ fn test_webvh_creation_key_rotation() {
"receive",
&bar_did,
])
-.timeout(Duration::from_secs(2))
+.timeout(Duration::from_secs(20))
.assert()
.stderr(predicate::str::contains("received relationship request"))
.stdout(predicate::str::contains("Oh hello Marc"))
@@ -293,7 +293,7 @@ fn test_webvh_creation_key_rotation() {
"receive",
&bar_did,
])
-.timeout(Duration::from_secs(2))
+.timeout(Duration::from_secs(20))
.assert()
.stdout(predicate::str::contains("Oh hello Marc"))
.failure();
2 changes: 1 addition & 1 deletion tsp_sdk/Cargo.toml
@@ -87,7 +87,7 @@ arbitrary = { workspace = true, optional = true }
async-trait = "0.1.88"

# webvh
-didwebvh-rs = { workspace = true, optional = true }
+didwebvh-rs = { optional = true, version = "0.1.10" }

# not used directly, but we need to configure
# the JS feature in this transitive dependency
2 changes: 1 addition & 1 deletion tsp_sdk/src/definitions/mod.rs
@@ -44,7 +44,7 @@ pub struct MessageType {
}

#[cfg_attr(feature = "serialize", derive(Serialize, Deserialize))]
-#[derive(Clone, Debug)]
+#[derive(Clone, Debug, PartialEq)]
pub enum RelationshipStatus {
_Controlled,
Bidirectional {
2 changes: 2 additions & 0 deletions tsp_sdk/src/error.rs
@@ -25,6 +25,8 @@ pub enum Error {
InvalidRoute(String),
#[error("Relationship Error: {0}")]
Relationship(String),
+#[error("Relationship State Error: {0}")]
+State(#[from] crate::relationship_machine::StateError),
#[error("Error: missing private vid {0}")]
MissingPrivateVid(String),
#[error("Error: missing vid {0}")]
3 changes: 3 additions & 0 deletions tsp_sdk/src/lib.rs
@@ -125,6 +125,9 @@ pub use secure_storage::AskarSecureStorage;
pub use secure_storage::SecureStorage;

pub use definitions::{Payload, PrivateVid, ReceivedTspMessage, RelationshipStatus, VerifiedVid};
+pub use relationship_machine::{RelationshipEvent, RelationshipMachine};
+
+pub mod relationship_machine;
pub use error::Error;
pub use store::{Aliases, SecureStore};
pub use vid::{ExportVid, OwnedVid, Vid};