OpenTelemetry integration #699
…ck latency of data update events feat(prometheus_metrics.py): create data_update_latency histogram to monitor latency of data update events
…l into prometheus_integration
…etrics to use opal_server.metrics.prometheus_metrics for better organization chore(requirements.txt): add prometheus_client to dependencies for metrics tracking functionality
…c to track updates per topic feat(prometheus_metrics.py): introduce data_update_count_per_topic counter for monitoring data updates by topic
… to enhance observability fix(api.py): increment policy bundle request count and measure latency for bundle generation fix(callbacks.py): observe size of changed directories in policy update notifications fix(task.py): track policy update count and latency when triggering policy watcher
Hey @psardana, thank you for this contribution! 💎 Can you please add documentation about the metrics and explain how to set them up? Note that there are conflicts against the main branch, so please make sure to rebase from master. Looking forward to this! 🙏
Thank you for the review! I have added commits for documentation, docker compose and fixed the label names.
danyi1212
left a comment
Looks very good! 🌟
I've left some comments about specific areas and improvements.
Upon review, some of the instrumented parts appear to align more with tracing than with pure metrics. This led to some unnatural metrics and to duplication, with separate latency, count, and error metrics for the same operation.
To address this, we suggest exploring OpenTelemetry, which offers native Prometheus integration alongside robust tracing capabilities.
Here's a proposed mapping of the current metrics to OpenTelemetry:
- opal_server_data_update -> Trace
- opal_server_policy_update -> Trace
- opal_server_policy_bundle_request -> Trace
- opal_server_policy_bundle_size -> Metric
- opal_server_active_clients -> Metric
- opal_client_data_subscriptions -> Metric
- opal_client_data_update_trigger -> Trace
- opal_client_data_update_apply -> Trace (new)
- opal_client_policy_update_apply -> Trace (new)
- opal_client_policy_store_status -> Metric
We believe this approach will provide a more comprehensive observability solution.
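To illustrate why the operations above fit tracing better than separate metrics, here is a minimal sketch in plain Python. It does not use the OpenTelemetry API. The `span` context manager, `SpanRecord`, `summarize`, and `apply_data_update` are all hypothetical stand-ins, and only the span name `opal_server_data_update` comes from the mapping. The point is that one span per operation already carries duration and success status, so count, error, and latency figures can be derived from it instead of being maintained as three separate instruments.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class SpanRecord:
    """Hypothetical stand-in for a finished trace span."""
    name: str
    duration_s: float
    ok: bool
    attributes: dict = field(default_factory=dict)


# In-memory sink for finished spans (a real setup would export them instead).
FINISHED_SPANS = []


@contextmanager
def span(name, **attributes):
    """Record the duration and outcome of the wrapped block as one span."""
    start = time.perf_counter()
    ok = True
    try:
        yield attributes
    except Exception:
        ok = False
        raise
    finally:
        FINISHED_SPANS.append(
            SpanRecord(name, time.perf_counter() - start, ok, attributes)
        )


def summarize(name):
    """Derive count, error count, and total latency from recorded spans."""
    spans = [s for s in FINISHED_SPANS if s.name == name]
    return {
        "count": len(spans),
        "errors": sum(1 for s in spans if not s.ok),
        "total_latency_s": sum(s.duration_s for s in spans),
    }


def apply_data_update(topic, fail=False):
    """Hypothetical operation; the span name is taken from the mapping."""
    with span("opal_server_data_update", topic=topic):
        if fail:
            raise RuntimeError("simulated update failure")


apply_data_update("policy_data")
try:
    apply_data_update("policy_data", fail=True)
except RuntimeError:
    pass  # the failure is still recorded on the span

summary = summarize("opal_server_data_update")
print(summary)
```

Values such as `opal_server_policy_bundle_size` or `opal_server_active_clients` describe a quantity or a current state rather than the lifetime of an operation, which is presumably why the mapping keeps them as metrics.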
Please let us know your thoughts, and let's work together to enhance OPAL's observability 💎
danyi1212
left a comment
Looks very good! Kudos on the work. I've left a few small details to fix and improve. Let me know if you have questions, and when they are ready.
Fixes Issue
closes #701