Time interval usage metrics by logs from UCAN Stream #190
Available Options
1. Kinesis Data Analytics for Apache Flink + Amazon Timestream
References
2. Kinesis Data Firehose + S3 Data Lake + Athena
References
This was referenced Apr 13, 2023
vasco-santos changed the title from "Real time usage metrics by logs from UCAN Stream" to "Time interval usage metrics by logs from UCAN Stream" on Apr 13, 2023
travis added a commit that referenced this issue on Oct 11, 2023
…le queries over the UCAN logs (#191)

This PR has an implementation of option 2 from #190 (comment).

This encompasses a fair amount of functionality: partitioning the UCAN logs into S3 buckets, configuring a Glue database and tables, adding example queries to Athena and more. A partial list of functionality follows:

- implement UCAN log partitioning in S3
  - first partition by `type`: everything in "workflows" shows up in "receipts", so this reduces the amount of data scanned by ~50%
  - next partition by `op` to allow us to create tables that only query a specific operation (i.e., `store/add` or `provider/add`)
    - this lets us add operation-specific Glue table schemas with much less clutter in result types than we'd need if we tried to define all possible inputs and outputs in a single table
  - finally partition by date to allow queries to only load recent data
- use these partitions to implement standalone tables for receipts in general and the `store/add`, `upload/add` and `provider/add` UCANs specifically
- add queries that demonstrate the use of all of these tables
- add dynamo connector so we can join the UCAN logs to our Dynamo tables in queries
- add queries that demonstrate using the Dynamo and Glue tables together

---------

Co-authored-by: Travis Vachon <[email protected]>
Co-authored-by: Travis Vachon <[email protected]>
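The partitioning order described in the commit message (by `type`, then `op`, then date) can be sketched as an S3 key prefix builder. This is only an illustration under stated assumptions: the commit message does not give the actual key layout, so the Hive-style `key=value` segments, the partition names `type`, `op` and `day`, the sample value `receipt`, and the helper `partition_prefix` are all hypothetical. Percent-encoding the operation name is also an assumption, made here because names such as `store/add` contain a slash.

```python
from datetime import datetime, timezone
from urllib.parse import quote


def partition_prefix(record_type: str, op: str, ts: datetime) -> str:
    """Build a hypothetical Hive-style S3 prefix: type, then op, then day.

    `ts` is expected to be a timezone-aware datetime.
    """
    # Normalize to UTC so one record always maps to one day partition.
    day = ts.astimezone(timezone.utc).strftime("%Y-%m-%d")
    # Operation names such as "store/add" contain a slash; percent-encode
    # them so each partition value stays within a single path segment.
    safe_op = quote(op, safe="")
    return f"type={record_type}/op={safe_op}/day={day}/"


if __name__ == "__main__":
    when = datetime(2023, 10, 11, 12, 0, tzinfo=timezone.utc)
    print(partition_prefix("receipt", "store/add", when))
```

With a layout like this, a query that filters on one type, one operation and a recent date range only needs to read objects under the matching prefixes, which is the scan reduction the commit message describes.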
Context
Within the w3up infrastructure, we track system-wide metrics and space metrics through the UCAN stream. These metrics let us monitor the system overall and let w3up users see their total usage. However, we have no visibility into the real-time volume of usage by each space. Based on our experience operating older APIs, we see value in knowing usage volume, so that we can proactively prevent abuse and learn usage patterns.
Requirements