feat: Individual service-scoped metrics POC #16237
Draft
Description
This prototype attempts to provide a mechanism for a service to create and own its own metrics. In effect, this means that any metrics created/owned by a single service would not be accessible to other services.
Background
Modularizing the services codebase was done, in part, to manage the interwoven complexity of the node software by creating service boundaries. These service boundaries are, by design, minimally dependent on other services. However, the metrics we currently use are not organized along these boundaries. For example, the `OpWorkflowMetrics` class creates a handful of metrics for every transaction type; similarly, the `ThrottleMetrics` class disregards any notion of service ownership.

This usage of metrics hasn't necessarily been problematic, but it doesn't lend itself to the stated design goal (i.e. minimizing inter-service dependencies).
A Potential Solution
Various functionalities common across services are currently channeled to query/transaction handlers via `<X>Context` interfaces (e.g. `HandleContext`, `FeeContext`). We also use a carefully-constructed readable/writable store pattern in which each service can only write to the pieces of state that it owns. Applying these two ideas to metrics could give us a mechanism that operates inside the domain of each service.

This POC leverages a new factory interface called the `ServiceMetricsFactory` to enforce privacy of metrics between services. Using the `StoreFactory` as a guide, the `ServiceMetricsFactory` uses a transaction body, `ServicesScopeLookup`, and a service-specific interface to locate and access each service's proprietary metrics. This metrics factory uses the service name, derived from each transaction body, to verify that the service requesting the metrics is the service that owns them.

The `ServiceMetricsFactory` is then injected into the appropriate context objects (in this PR, a new `ServiceMetricsContext` and the `HandleContext`), providing each service with familiar access to its own defined metrics within its own handlers. The factory is then invoked via a call similar to `context.writableStore(<Entity-Specific Store>.class)`: `context.metrics(<Service-Specific Metrics Interface>.class)`. The handlers of a service can then dictate how that service's metrics apply to a handler's designated transaction type.

Contract Metrics and Token Metrics
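Before the two concrete examples, the ownership check behind `context.metrics(...)` can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this PR: every type and method name in it (`ScopedMetricsRegistry`, `TokenMetricsSketch`, `ContractMetricsSketch`, `recordMintedSerials`, and so on) is hypothetical, and the requesting service's name is passed in directly as a string, standing in for the name that the real factory derives from the transaction body via `ServicesScopeLookup`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical marker for any service-owned metrics interface. */
interface ServiceMetricsSketch {}

/** Stand-in for a token-service metrics interface; method names are invented. */
interface TokenMetricsSketch extends ServiceMetricsSketch {
    void recordMintedSerials(int count);
    long mintedSerials();
}

/** Stand-in for a contract-service metrics interface; method names are invented. */
interface ContractMetricsSketch extends ServiceMetricsSketch {
    void recordPureCheck();
    long pureChecks();
}

final class CountingTokenMetrics implements TokenMetricsSketch {
    private final AtomicLong minted = new AtomicLong();
    @Override public void recordMintedSerials(int count) { minted.addAndGet(count); }
    @Override public long mintedSerials() { return minted.get(); }
}

final class CountingContractMetrics implements ContractMetricsSketch {
    private final AtomicLong checks = new AtomicLong();
    @Override public void recordPureCheck() { checks.incrementAndGet(); }
    @Override public long pureChecks() { return checks.get(); }
}

/** Records which service owns each metrics interface, alongside the live instance. */
final class ScopedMetricsRegistry {
    private static final class Entry {
        final String owner;
        final ServiceMetricsSketch instance;
        Entry(String owner, ServiceMetricsSketch instance) {
            this.owner = owner;
            this.instance = instance;
        }
    }

    private final Map<Class<?>, Entry> entries = new HashMap<>();

    <T extends ServiceMetricsSketch> void register(String owner, Class<T> type, T instance) {
        entries.put(type, new Entry(owner, instance));
    }

    /** Returns the metrics only if the requesting service is the registered owner. */
    <T extends ServiceMetricsSketch> T metrics(String requestingService, Class<T> type) {
        Entry entry = entries.get(type);
        if (entry == null) {
            throw new IllegalArgumentException("No metrics registered for " + type.getSimpleName());
        }
        if (!entry.owner.equals(requestingService)) {
            throw new IllegalArgumentException(
                    requestingService + " does not own " + type.getSimpleName());
        }
        return type.cast(entry.instance);
    }
}

final class ScopedMetricsDemo {
    public static void main(String[] args) {
        ScopedMetricsRegistry registry = new ScopedMetricsRegistry();
        registry.register("TokenService", TokenMetricsSketch.class, new CountingTokenMetrics());
        registry.register("ContractService", ContractMetricsSketch.class, new CountingContractMetrics());

        // A token handler asks for its own metrics: allowed.
        registry.metrics("TokenService", TokenMetricsSketch.class).recordMintedSerials(3);

        // A token handler asks for contract metrics: rejected.
        try {
            registry.metrics("TokenService", ContractMetricsSketch.class);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

The sketch keeps the owner and the instance in a single map entry purely for brevity; how the actual `ServiceMetricsFactory` stores these relationships is a separate question.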
This PR implements proprietary metrics for the Smart Contract service and the Token service. The Contract Service makes use of the new `ServiceMetricsContext` in some of its `pureCheck` methods, incrementing a simple counter indicating how many times each transaction type was parsed and checked by `pureCheck`. The token service implements a slightly more useful metric; acting in `TokenMintHandler`'s `handle()` method, this metric counts the total number of unique NFT serial numbers as they are successfully minted.

It's worth noting that both services have defined their metric interfaces and behavior in their own modules. For the Contract Service, `ExampleContractMetrics` is defined as an interface in the `hedera-smart-contract-service` API module; `ExampleTokenMetrics` is likewise defined in Token Service's API module. The implementations of these interfaces live in `hedera-smart-contract-service-impl` and `hedera-token-service-impl`. The necessary 'glue' to register and create these metrics lives in the SPI and app implementation modules.

Finally, it's also worth noting that the `NftTransferSuite#transferNfts` hapi test works correctly with this implementation in both `test` and `testEmbedded` mode, appropriately invoking the metric in its expected place. This code also correctly rejects a request in a Token Service handler to retrieve a Contract Service's metrics (uncomment the line in `TokenMintHandler#handle` to see this behavior).

Potential Design Flaws
This is a POC; some shortcuts were taken for convenience and speed. This PR is meant to solicit feedback on the concepts embodied in this code rather than on the code itself. Pending further review and discussion, the following issues are noted and should likely change:
- `com.hedera.node.app.spi.MetricsService` interface extends the `Service` interface, which is then extended by the `RpcService` interface, to facilitate registration and initialization of each service's metrics. These extensions are not necessary; a production implementation would likely define `MetricsService` entirely separate from the `Service` hierarchy
- `Service` interface may not be appropriate, because each service could still in theory manipulate metrics of another service it depends on inside of `initMetrics()`. This has to be balanced with Dagger's call graph, however, in that 1️⃣ a `Metrics` instance needs to exist, 2️⃣ the `Metrics` object needs to somehow be provided to each service, and 3️⃣ there must be a place for service-specific metrics initialization following startup but prior to the beginning of transaction handling
- `ServiceMetricsFactory` implementation maintains an awkward relationship of serviceName -> metricsInterface and metricsInterface -> metricsInstance
- `Hedera#init` method may be better-suited to a different location

Related issue(s):
Closes #16224