feat_: add ability to process prometheus metrics in telemetry client #5782
base: develop
Conversation
Force-pushed from d10769e to e8f9704 (Compare)
telemetry/client.go
Outdated
func (c *Client) ProcessReuglarStoryRetrievedMsgs(data MetricPayload) {
	fmt.Println(data)

	postBody := map[string]interface{}{
One thing to consider is to drop any processing on the client and just push the metric as it is structured by Prometheus to the telemetry server - then we can do whatever we want/need with it on the server - this simplifies the client quite a lot.
That would be great
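A minimal sketch of that "no client-side processing" idea, under the assumption that the telemetry server accepts the gathered metric families as-is; the endpoint path and function name here are hypothetical, not part of this PR:

```go
package telemetry

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
)

// pushRawMetrics gathers everything recorded in the default Prometheus
// registry and POSTs the metric families unmodified, leaving any filtering
// or aggregation to the telemetry server.
func pushRawMetrics(serverURL string) error {
	families, err := prometheus.DefaultGatherer.Gather()
	if err != nil {
		return fmt.Errorf("gather metrics: %w", err)
	}

	body, err := json.Marshal(families)
	if err != nil {
		return fmt.Errorf("marshal metrics: %w", err)
	}

	// "/prometheus-metrics" is an assumed endpoint for illustration only.
	resp, err := http.Post(serverURL+"/prometheus-metrics", "application/json", bytes.NewReader(body))
	if err != nil {
		return fmt.Errorf("push metrics: %w", err)
	}
	defer resp.Body.Close()
	return nil
}
```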
for {
	select {
	case <-ctx.Done():
		fmt.Println("exit")
should probably remove
telemetry/client.go
Outdated
@@ -413,3 +450,26 @@ func (c *Client) UpdateEnvelopeProcessingError(shhMessage *types.Message, proces
		c.logger.Error("Error sending envelope update to telemetry server", zap.Error(err))
	}
}

func (c *Client) ProcessReuglarStoryRetrievedMsgs(data MetricPayload) {
should this be ProcessRegularStoreRetrievedMsgs?
w.wg.Add(1)
go func() {
	w.wg.Done()
	defer w.wg.Done()
not sure how this wait group is intended to behave cc @richard-ramos
Adding the defer is correct. This is a bug I introduced when doing the refactoring.
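For context, a small standalone sketch of the corrected pattern being discussed: call Add before starting the goroutine and defer Done inside it, so the counter is decremented exactly once when the goroutine exits rather than at startup.

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup

	wg.Add(1)
	go func() {
		defer wg.Done() // runs when the goroutine returns, keeping Wait() blocked until then
		fmt.Println("working")
	}()

	wg.Wait() // blocks until the goroutine has finished
	fmt.Println("done")
}
```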
telemetry/client.go
Outdated
@@ -180,6 +193,8 @@ func (c *Client) Start(ctx context.Context) {
	c.telemetryCacheLock.Unlock()

	if len(telemetryRequests) > 0 {
		d, _ := json.MarshalIndent(telemetryRequests, "", " ")
		fmt.Println(string(d))
✂️
telemetry/client.go
Outdated
}

//client.promMetrics.Register("waku_connected_peers", GaugeType, nil, nil)
✂️
telemetry/client.go
Outdated
@@ -413,3 +450,26 @@ func (c *Client) UpdateEnvelopeProcessingError(shhMessage *types.Message, proces
		c.logger.Error("Error sending envelope update to telemetry server", zap.Error(err))
	}
}

func (c *Client) ProcessReuglarStoryRetrievedMsgs(data MetricPayload) {
	fmt.Println(data)
✂️
gatherer := prometheus.DefaultGatherer
metrics, err := gatherer.Gather()
if err != nil {
	log.Fatalf("Failed to gather metrics: %v", err)
Not sure if we should use Fatalf. If I remember correctly, this panics, and IMO not being able to push metrics isn't an end-of-the-world situation!
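One way to address this note, sketched here under the assumption that the caller has a zap logger available (as the rest of telemetry/client.go does); the function name is illustrative, not the PR's actual code:

```go
package telemetry

import (
	"github.com/prometheus/client_golang/prometheus"
	dto "github.com/prometheus/client_model/go"
	"go.uber.org/zap"
)

// gatherMetrics logs the failure and returns the error instead of terminating
// the process, since a missed metrics push should not stop the node.
func gatherMetrics(logger *zap.Logger) ([]*dto.MetricFamily, error) {
	metrics, err := prometheus.DefaultGatherer.Gather()
	if err != nil {
		logger.Error("failed to gather prometheus metrics", zap.Error(err))
		return nil, err
	}
	return metrics, nil
}
```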
Force-pushed from e8f9704 to acf238e (Compare)
Thank you for opening this pull request! We require commits to follow the Conventional Commits convention. Details:
@@ -52,7 +52,7 @@ var (

func init() {
	prom.MustRegister(EnvelopesReceivedCounter)
	prom.MustRegister(EnvelopesRejectedCounter)
Was this metric supposed to be removed?
@@ -1347,7 +1351,7 @@ func (w *Waku) OnNewEnvelopes(envelope *protocol.Envelope, msgType common.Messag
		trouble = true
	}

	common.EnvelopesValidatedCounter.Inc()
	common.EnvelopesValidatedCounter.With(prometheus.Labels{"pubsubTopic": envelope.PubsubTopic(), "type": msgType}).Inc()
Note that we need the meaning of common.MissingMessageType to be consistent moving forward, otherwise it can mess up our metrics; i.e. only messages returned by the periodic store query can have this type.
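For reference, a sketch of what the extended counter in the diff above implies: a CounterVec keyed by the pubsub topic and message type labels. The exact declaration, help text, and registration live elsewhere in the waku common package and may differ; the helper function here is purely illustrative.

```go
package common

import "github.com/prometheus/client_golang/prometheus"

var EnvelopesValidatedCounter = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "waku2_envelopes_validated_total",
		Help: "Number of envelopes processed and validated", // assumed help text
	},
	[]string{"pubsubTopic", "type"}, // labels used for filtering on the telemetry side
)

func init() {
	prometheus.MustRegister(EnvelopesValidatedCounter)
}

// recordValidated increments the counter with labels. Per the review note,
// the "missing" type must only ever be set for messages returned by the
// periodic store query, otherwise the metric becomes meaningless.
func recordValidated(pubsubTopic, msgType string) {
	EnvelopesValidatedCounter.With(prometheus.Labels{
		"pubsubTopic": pubsubTopic,
		"type":        msgType,
	}).Inc()
}
```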
This PR adds a general ability to the telemetry client to work with Prometheus metrics exposed by status-go and the underlying libraries.

It then uses an extended waku2_envelopes_validated_total metric (the extension is storing the pubsub topic and message type in labels, which allows us to filter for messages received due to regular store queries - status-im/telemetry#23) to showcase how this new feature works: basically snapshotting and filtering the recorded values for that metric based on MessageType = missing, and then adding them to the telemetry request cache (which is regularly published to the telemetry service).

This allows us to add new metrics via Prometheus and hence make them standardized and reusable (and also tap into metrics provided by libraries, e.g. go-waku).

This is still a draft for the most part.
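A rough sketch of the snapshot-and-filter step the description refers to: gather the current state of waku2_envelopes_validated_total and keep only the samples whose "type" label is "missing". The function name and the exact label value are assumptions for illustration; the PR's real implementation feeds the result into the telemetry request cache instead of returning it.

```go
package telemetry

import (
	"github.com/prometheus/client_golang/prometheus"
	dto "github.com/prometheus/client_model/go"
)

// snapshotMissingMessages takes a point-in-time snapshot of the extended
// counter and filters it down to messages retrieved by the periodic store query.
func snapshotMissingMessages() ([]*dto.Metric, error) {
	families, err := prometheus.DefaultGatherer.Gather()
	if err != nil {
		return nil, err
	}

	var filtered []*dto.Metric
	for _, fam := range families {
		if fam.GetName() != "waku2_envelopes_validated_total" {
			continue
		}
		for _, m := range fam.GetMetric() {
			for _, label := range m.GetLabel() {
				// keep only samples recorded for periodic store query results
				if label.GetName() == "type" && label.GetValue() == "missing" {
					filtered = append(filtered, m)
				}
			}
		}
	}
	return filtered, nil
}
```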