Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Individual service-scoped metrics POC #16237

Draft
wants to merge 8 commits into
base: develop
Choose a base branch
from

Conversation

mhess-swl
Copy link
Member

@mhess-swl mhess-swl commented Oct 30, 2024

Description

This prototype attempts to provide a mechanism for a service to create and own its own metrics. In effect, this means that any metrics created/owned by a single service would not be accessible to other services.

Background

Modularizing the services codebase was done, in part, to manage the interwoven complexity of the node software by creating service boundaries. These service boundaries are, by design, minimally dependent on other services. However, the metrics we currently use are not organized along these boundaries. For example, the OpWorkflowMetrics class creates a handful of metrics for every transaction type; similarly, the ThrottleMetrics class disregards any notion of service ownership.

This usage of metrics hasn't necessarily been problematic, but it doesn't lend itself to the stated design goal (i.e. minimizing inter-service dependencies).

A Potential Solution

Various functionalities common across services are currently channeled to query/transaction handlers via <X>Context interfaces (e.g. HandleContext, FeeContext). We also use a carefully-constructed readable/writable store pattern in which each service can only write to the pieces of state that it owns. Applying these two ideas to metrics could give us a mechanism that operates inside the domain of each service.

This POC leverages a new factory interface called the ServiceMetricsFactory to enforce privacy of metrics between services. Using the StoreFactory as a guide, the ServiceMetricsFactory uses a transaction body, ServicesScopeLookup, and a service-specific interface to locate and access each service's proprietary metrics. This metrics factory uses the service name, derived from each transaction body, to verify that the service requesting the metrics is the service that owns them.

The ServiceMetricsFactory is then injected into the appropriate context objects–in this PR, a new ServiceMetricsContext and the HandleContext–providing each service with familiar access to its own defined metrics within its own handlers. The factory is then invoked via a call similar to context.writableStore(<Entity-Specific Store>.class): context.metrics(<Service-Specific Metrics Interface>.class). The handlers of a service can then dictate how that service's metrics apply to a handler's designated transaction type.

Contract Metrics and Token Metrics
This PR implements proprietary metrics for the Smart Contract service and the Token service. The Contract Service makes use of the new ServiceMetricsContext in some of its pureCheck methods, incrementing a simple counter indicating how many times each transaction type was parsed and checked by pureCheck. The token service implements a slightly more useful metric; acting in TokenMintHandler's handle() method, this metric counts the total number of unique NFT serial numbers as they are successfully minted.

It's worth noting that both services have defined their metric interfaces and behavior in their own modules. For the Contract Service, ExampleContractMetrics is defined as an interface in the hedera-smart-contract-service API module; ExampleTokenMetrics is likewise defined in Token Service's API module. The implementations of these interfaces live in hedera-smart-contract-service-impl and hedera-token-service-impl. The necessary 'glue' to register and create these metrics lives in the SPI and app implementation modules.

Finally, it's also worth noting that the NftTransferSuite#transferNfts hapi test works correctly with this implementation in both test and testEmbedded mode, appropriately invoking the metric in its expected place. This code also correctly rejects a request in a Token Service handler to retrieve a Contract Service's metrics (uncomment the line in TokenMintHandler#handle to see this behavior).

Potential Design Flaws

This is a POC; some shortcuts were taken for convenience and speed. This PR is meant to solicit feedback on the concepts embodied in this code more than to critique the code itself. Pending further review and discussion, the following issues are noted and should likely change:

  • A new com.hedera.node.app.spi.MetricsService interface extends the Service interface, which is then extended by the RpcService interface, to facilitate registration and initialization of each service's metrics. These extensions are not necessary; a production implementation would likely define MetricsService entirely separate from the Service hierarchy
  • Registering a metrics instance on a Service interface may not be appropriate, because each service could still in theory manipulate metrics of another service it depends on inside of initMetrics(). This has to be balanced with Dagger's call graph, however, in that 1️⃣ a Metrics instance needs to exist, 2️⃣ the Metrics object needs to somehow be provided to each service, and 3️⃣ there must be a place for service-specific metrics initialization following startup but prior to the beginning of transaction handling
  • The ServiceMetricsFactory implementation maintains an awkward relationship of serviceName -> metricsInterface and metricsInterface -> metricsInstance
  • Various documentation, test updates, and tests are missing. I would like to get sign-off on these concepts before making the implementation more rigorous
  • Factory registration of service metrics instances in the Hedera#init method may be better-suited to a different location

Related issue(s):

Closes #16224

@mhess-swl mhess-swl added the Feature Enhancement Enhancing an existing feature driven by business requirements. Typically backwards compatible. label Oct 30, 2024
@mhess-swl mhess-swl added this to the v0.57 milestone Oct 30, 2024
@mhess-swl mhess-swl self-assigned this Oct 30, 2024
@mhess-swl mhess-swl changed the title Per service metrics poc feat: Individual service-scoped metrics POC Oct 30, 2024
@mhess-swl mhess-swl marked this pull request as ready for review October 30, 2024 16:36
@mhess-swl mhess-swl requested review from a team and tinker-michaelj as code owners October 30, 2024 16:36
@mhess-swl mhess-swl marked this pull request as draft October 30, 2024 16:37
Signed-off-by: Matt Hess <[email protected]>
@mhess-swl mhess-swl removed this from the v0.57 milestone Oct 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature Enhancement Enhancing an existing feature driven by business requirements. Typically backwards compatible.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

POC for enabling services to manage their own metrics
1 participant