feat: add kinesis log stream #86

Merged 5 commits on Dec 13, 2022

@vasco-santos (Contributor) commented Dec 9, 2022

Adds a Kinesis UCAN log stream to the ucanto service. Once the service has handled a UCAN invocation, the invocation is sent to Amazon Kinesis Data Streams for post-processing (as JSON with the invocation CID, invocation bytes, and decoded invocation).

The Kinesis log stream has its own stack, UcanStreamStack, which will include the resources needed for post-processing of UCAN stream operations. ApiStack depends on UcanStreamStack because it writes to the stream and, further down the line, will use the stream's data to provide content such as user-facing stats.

Per https://www.notion.so/UCAN-LOG-0f3870fc4b404f5cbf646bf16b463365

Implementation details:

  • Invocation view content
    • { carCid: string, value: { att: UCAN.Capabilities, aud: 'did:${string}:${string}', iss: 'did:${string}:${string}' } }
    • Having att, aud (audience), and iss (issuer) should be enough for all the operations we intend to perform. Skipped: prf, exp, nbf, fct, nnc, v, and the signature.
    • see format in comment below
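
The view described above could be derived roughly as follows. This is a minimal sketch, not the code from this PR: `toInvocationView` and the shape of its `ucan` argument are hypothetical, assuming a decoded UCAN exposes `att`, `aud` and `iss` as plain fields. The CID and DIDs are made-up placeholders.

```javascript
// Hypothetical helper: keeps only the fields the stream consumers need.
// Assumes `ucan` is an already-decoded UCAN with plain att/aud/iss fields.
const toInvocationView = (carCid, ucan) => ({
  carCid,
  value: {
    att: ucan.att,
    aud: ucan.aud,
    iss: ucan.iss
  }
})

// Example input with extra fields (exp, nnc) that the view drops.
const view = toInvocationView('example-car-cid', {
  att: [{ can: 'store/add', with: 'did:key:zSpace', nb: { size: 225 } }],
  aud: 'did:key:zService',
  iss: 'did:key:zAgent',
  exp: 1670955551,
  nnc: 'nonce'
})

console.log(JSON.stringify(view, null, 2))
```

Picking fields explicitly, rather than spreading the whole UCAN, keeps the record small and avoids leaking fields that consumers are not meant to rely on.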

Other notes:

Follow ups:

Needs:

@vasco-santos changed the title from "feat/add kinesis log stream" to "feat: add kinesis log stream" on Dec 12, 2022

@vasco-santos (Contributor Author) commented Dec 12, 2022

EDIT: this format was modified later; see the comments below.

Data written to the Kinesis stream:

{
    "carCid": "bafyreibr5w4fjaxg5da5gwovujsz6vmxp4dwtsm73iiiwki7ym5gon7p5q",
    "data": {
        "att": [
            {
                "nb": {
                    "link": {
                        "code": 514,
                        "version": 1,
                        "hash": {
                            "0": 18,
                            "1": 32,
                            "2": 132,
                            "3": 245,
                            "4": 3,
                            "5": 183,
                            "6": 72,
                            "7": 151,
                            "8": 231,
                            "9": 183,
                            "10": 40,
                            "11": 33,
                            "12": 244,
                            "13": 229,
                            "14": 22,
                            "15": 202,
                            "16": 14,
                            "17": 50,
                            "18": 153,
                            "19": 111,
                            "20": 251,
                            "21": 187,
                            "22": 157,
                            "23": 131,
                            "24": 194,
                            "25": 123,
                            "26": 227,
                            "27": 195,
                            "28": 143,
                            "29": 11,
                            "30": 74,
                            "31": 2,
                            "32": 43,
                            "33": 98
                        }
                    },
                    "size": 225
                },
                "can": "store/add",
                "with": "did:key:z6Mkf4Tc5v4LWessC2cExyQNjprsxJPGfv82pvNmzUXchdPM"
            }
        ],
        "aud": "did:key:z6MkhcbEpJpEvNVDd3n5RurquVdqs5dPU16JDU5VZTDtFgnn",
        "iss": "did:key:z6MkmE8f1sWQoxvMuFC1K3LABAHK8xzdLjGys5ad8AxSNt9i"
    }
}

@olizilla (Contributor) commented:

Could we toString CIDs before sending them to the stream?
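
For context on why the payload above looks the way it does: `JSON.stringify` treats a `Uint8Array` as a plain object keyed by index, which is presumably where the 34-entry `hash` object comes from. A minimal sketch, using a made-up `link` object rather than a real CID instance:

```javascript
// Stand-in for a CID-like object. The field names mirror the payload above,
// but this is NOT a real multiformats CID instance.
const link = {
  code: 514,
  version: 1,
  hash: new Uint8Array([18, 32, 132])
}

// Typed arrays are serialized as objects keyed by index,
// not as arrays or strings.
const serialized = JSON.stringify(link)
console.log(serialized)
// → {"code":514,"version":1,"hash":{"0":18,"1":32,"2":132}}
```

Converting the link to its string form before serializing avoids this byte-by-byte expansion.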

@olizilla (Contributor) commented:

What is the plan for events like the ObjectCreated from S3 when the CAR is written? It feels like we can only know the state of the system by following the event bus and the UCAN log.

My feeling is that we live in a world where not every event has an associated UCAN (yet), and we should design for that rather than setting ourselves the task of turning every non-UCAN event into an invocation. Perhaps we could see this stream as the events stream, which includes UCAN and non-UCAN events.

But... if we wanted to, in the awkward case of store/add we could sign the store/add UCAN into the object metadata for the presigned PUT URL, which would then require the user to provide that UCAN again as an amz metadata header when making the request... or it could be a new store/put UCAN... but whatever it is, we'd need it sent in the store/add invocation so we can get it signed into the URL, as we can't otherwise validate it on PUT... With something like that in the object metadata, we could then tie the put event back to the user invocation.

@alanshaw (Member) commented:

I think we should use receipts for this. We already know that we want "events" to be sent back to the user for when certain async operations have happened within the system and I think receipts might be what that is. A series of signed "receipts" allows us to get the accountability we need, but also communicate information back to the user about their upload over a websocket or something.

IMHO this time round we should build the system so that things don't really happen without some UCAN traceability, and I'd like to at least see how far we can get with that until it's a big enough pain or simply not possible. In this case I think there is some real utility in generating a UCAN receipt for the put event that can be run through the UCAN log as well as communicated back to the user.

We should definitely add the store/add invocation CID to the meta so we can backfill receipts if needs be.

@vasco-santos (Contributor Author) commented:

@olizilla / @alanshaw all the comments above are really related to #83 ... This PR is built on top of it, as mentioned above. Let's please keep the context there. I will address the comments made here over there.

/**
 * @param {any} value
 */
export const replaceAllLinkValues = (value) => {
@vasco-santos (Contributor Author) commented on this line:

I looked into the npm modules available but could not find anything useful for this. If you know of modules that could help, please let me know.

This small function walks the object through its full depth and replaces every Link instance with a JSON representation, based on the proposal in storacha-network/ucanto#171 + multiformats/js-multiformats#228.

Once this lands in multiformats, we will be able to drop this adapter function.
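
A recursive replacement of this kind might look like the sketch below. It is not the implementation from this PR: `FakeLink`, `isLink` and `replaceAllLinkValuesSketch` are hypothetical stand-ins so that the example runs without dependencies, and the `{ "/": "<cid string>" }` output shape is taken from the later payload in this thread. Real code would need to check against the actual multiformats link type instead.

```javascript
// Hypothetical stand-in for a CID/Link so the sketch has no dependencies.
class FakeLink {
  constructor (text) { this.text = text }
  toString () { return this.text }
}

// Hypothetical predicate; real code would need a proper link check.
const isLink = (value) => value instanceof FakeLink

/**
 * Walks `value` depth-first and returns a copy in which every link
 * is replaced by { "/": "<cid string>" }.
 * @param {any} value
 */
const replaceAllLinkValuesSketch = (value) => {
  if (isLink(value)) return { '/': value.toString() }
  if (Array.isArray(value)) return value.map((item) => replaceAllLinkValuesSketch(item))
  // Leave raw bytes alone rather than expanding them into index-keyed objects.
  if (value instanceof Uint8Array) return value
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) => [key, replaceAllLinkValuesSketch(inner)])
    )
  }
  return value // primitives pass through unchanged
}

const nb = {
  root: new FakeLink('bafk-root'),
  shards: [new FakeLink('bag-shard')],
  size: 225
}
const replaced = JSON.stringify(replaceAllLinkValuesSketch(nb))
console.log(replaced)
// → {"root":{"/":"bafk-root"},"shards":[{"/":"bag-shard"}],"size":225}
```

The sketch returns a new structure instead of mutating its input, so the original invocation object stays usable after the record is built.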

@vasco-santos (Contributor Author) commented Dec 13, 2022

@olizilla @alanshaw we should be done here. I think the one decision we still need to make is whether we also want proofs added to the stream.

Note: CI is currently failing because it cannot install Node.js https://nodejs.org/dist/v16.17.1/node-v16.17.1-linux-x64.tar.xz ...

@vasco-santos (Contributor Author) commented:

Added a ts (timestamp) property and a prf (proofs) array, as discussed with @alanshaw:

{
    "carCid": "bafyreibfxctwjqwm5e6mlgr3wgvbazk2srjonlz5a3sj4lyuddxz2qnfau",
    "value": {
        "att": [
            {
                "nb": {
                    "root": {
                        "/": "bafkreibsuf3ruarx3di42cwa2m5h6wxwhoxc5wc46evivzuagabvvwp4za"
                    },
                    "shards": [
                        {
                            "/": "bagbaiera5txbleogmuwm3qrqarhbhomddigohbcsg4ijddvieafn2qcfa43a"
                        }
                    ]
                },
                "can": "upload/add",
                "with": "did:key:z6Mkf4Tc5v4LWessC2cExyQNjprsxJPGfv82pvNmzUXchdPM"
            }
        ],
        "aud": "did:key:z6MkhcbEpJpEvNVDd3n5RurquVdqs5dPU16JDU5VZTDtFgnn",
        "iss": "did:key:z6MkmE8f1sWQoxvMuFC1K3LABAHK8xzdLjGys5ad8AxSNt9i",
        "prf": [
            {
                "/": "bafyreihbmenadjeqpvosrehclwyerqm3sbrmyaqzoj3rzthdkpkvoan6va"
            },
            {
                "/": "bafyreihbmenadjeqpvosrehclwyerqm3sbrmyaqzoj3rzthdkpkvoan6va"
            }
        ]
    },
    "ts": 1670955551515
}

https://us-east-2.console.aws.amazon.com/kinesis/home?region=us-east-2#/streams/details/pr86-w3infra-ucan-stream/dataViewer (latest 2 messages)
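
For post-processing, a consumer could decode records of this shape roughly as sketched below. This is an illustration under stated assumptions, not code from this PR: `countCapabilities` is a hypothetical name, and the event shape assumes a Lambda consumer, where Kinesis record data arrives base64-encoded under `Records[].kinesis.data`. The sample record is a trimmed-down, made-up version of the one above.

```javascript
// Hypothetical consumer: tallies invocations per capability (`can`).
// Assumes a Lambda-style Kinesis event with base64-encoded record data.
const countCapabilities = (event) => {
  const counts = {}
  for (const record of event.Records) {
    const json = Buffer.from(record.kinesis.data, 'base64').toString('utf8')
    const { value } = JSON.parse(json)
    for (const capability of value.att) {
      counts[capability.can] = (counts[capability.can] || 0) + 1
    }
  }
  return counts
}

// Build a fake event from a made-up record in the format shown above.
const sample = {
  carCid: 'example-car-cid',
  value: {
    att: [{ can: 'upload/add', with: 'did:key:zSpace' }],
    aud: 'did:key:zService',
    iss: 'did:key:zAgent',
    prf: []
  },
  ts: 1670955551515
}
const fakeEvent = {
  Records: [
    { kinesis: { data: Buffer.from(JSON.stringify(sample)).toString('base64') } }
  ]
}

const counts = countCapabilities(fakeEvent)
console.log(counts)
```

Because links are already plain `{ "/": ... }` objects in the record, the consumer only needs `JSON.parse` and no CID-aware decoding.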

@alanshaw (Member) left a review comment:

LGTM

upload-api/ucan-invocation.js (outdated review thread, resolved)
@vasco-santos merged commit 3393ab2 into main on Dec 13, 2022
@vasco-santos deleted the feat/add-kinesis-log-stream branch on December 13, 2022 20:13