
INVALID_ARGUMENT: One or more TimeSeries could not be written: Points must be written in order. #247

Closed
joao-melo-ingka opened this issue May 16, 2023 · 9 comments

@joao-melo-ingka

joao-melo-ingka commented May 16, 2023

I've been experimenting with this and found the following issue.

To reproduce, I just injected the OpenTelemetry agent + this extension + a custom configuration to set up a specific resource (global).

shadow.com.google.api.gax.rpc.InvalidArgumentException: shadow.io.grpc.StatusRuntimeException: INVALID_ARGUMENT: One or more TimeSeries could not be written: Points must be written in order. One or more of the points specified had an older start time than the most recent point.: global{} timeSeries[3]: workload.googleapis.com/process.runtime.jvm.classes.loaded{instrumentation_source:io.opentelemetry.runtime-metrics,instrumentation_version:1.25.1-alpha}; Points must be written in order. One or more of the points specified had an older start time than the most recent point.: global{} timeSeries[6]: workload.googleapis.com/http.client.duration{instrumentation_source:io.opentelemetry.google-http-client-1.19,http_method:GET,http_status_code:200,net_peer_name:metadata.google.internal,instrumentation_version:1.25.1-alpha}; Points must be written in order. One or more of the points specified had an older start time than the most recent point.: global{} timeSeries[1]: workload.googleapis.com/process.runtime.jvm.classes.unloaded{instrumentation_source:io.opentelemetry.runtime-metrics,instrumentation_version:1.25.1-alpha}
	at shadow.com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:92)
	at shadow.com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:98)
	at shadow.com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:66)
	at shadow.com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
	at shadow.com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:67)
	at shadow.com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1132)
	at shadow.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at shadow.com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at shadow.com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at shadow.com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
	at shadow.io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:574)
	at shadow.io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:544)
	at shadow.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
	at shadow.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
	at shadow.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
	at shadow.com.google.api.gax.grpc.ChannelPool$ReleasingClientCall$1.onClose(ChannelPool.java:541)
	at shadow.io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:576)
	at shadow.io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
	at shadow.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:757)
	at shadow.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:736)
	at shadow.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at shadow.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
	Suppressed: shadow.com.google.api.gax.rpc.AsyncTaskException: Asynchronous task failed
		at shadow.com.google.api.gax.rpc.ApiExceptions.callAndTranslateApiException(ApiExceptions.java:57)
		at shadow.com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112)
		at shadow.com.google.cloud.monitoring.v3.MetricServiceClient.createTimeSeries(MetricServiceClient.java:1729)
		at shadow.com.google.cloud.monitoring.v3.MetricServiceClient.createTimeSeries(MetricServiceClient.java:1661)
		at shadow.com.google.cloud.opentelemetry.metric.CloudMetricClientImpl.createTimeSeries(CloudMetricClientImpl.java:40)
		at shadow.com.google.cloud.opentelemetry.metric.InternalMetricExporter.createTimeSeriesBatch(InternalMetricExporter.java:219)
		at shadow.com.google.cloud.opentelemetry.metric.InternalMetricExporter.export(InternalMetricExporter.java:204)
		at shadow.com.google.cloud.opentelemetry.metric.GoogleCloudMetricExporter.export(GoogleCloudMetricExporter.java:90)
		at io.opentelemetry.sdk.metrics.export.PeriodicMetricReader$Scheduled.doRun(PeriodicMetricReader.java:162)
		at io.opentelemetry.sdk.metrics.export.PeriodicMetricReader.shutdown(PeriodicMetricReader.java:92)
		at io.opentelemetry.sdk.metrics.SdkMeterProvider.shutdown(SdkMeterProvider.java:133)
		at io.opentelemetry.sdk.OpenTelemetrySdk.shutdown(OpenTelemetrySdk.java:105)
		at io.opentelemetry.sdk.OpenTelemetrySdk.close(OpenTelemetrySdk.java:112)
		... 1 more
Caused by: shadow.io.grpc.StatusRuntimeException: INVALID_ARGUMENT: One or more TimeSeries could not be written: Points must be written in order. One or more of the points specified had an older start time than the most recent point.: global{} timeSeries[3]: workload.googleapis.com/process.runtime.jvm.classes.loaded{instrumentation_source:io.opentelemetry.runtime-metrics,instrumentation_version:1.25.1-alpha}; Points must be written in order. One or more of the points specified had an older start time than the most recent point.: global{} timeSeries[6]: workload.googleapis.com/http.client.duration{instrumentation_source:io.opentelemetry.google-http-client-1.19,http_method:GET,http_status_code:200,net_peer_name:metadata.google.internal,instrumentation_version:1.25.1-alpha}; Points must be written in order. One or more of the points specified had an older start time than the most recent point.: global{} timeSeries[1]: workload.googleapis.com/process.runtime.jvm.classes.unloaded{instrumentation_source:io.opentelemetry.runtime-metrics,instrumentation_version:1.25.1-alpha}
	at shadow.io.grpc.Status.asRuntimeException(Status.java:539)
	... 14 more
@psx95
Contributor

psx95 commented May 17, 2023

Hi @joao-melo-ingka, could you clarify what exactly you mean by 'this extension'?

Also, could you share your custom configuration?

I'm assuming your overall goal here is to auto-instrument your application using the OpenTelemetry Java agent and export the telemetry data to Google Cloud.

@psx95 psx95 self-assigned this May 17, 2023
@dashpole dashpole added the bug and priority: p2 labels Aug 14, 2023
@dashpole
Contributor

Closing as obsolete. If you are still impacted, feel free to reopen and respond to #247 (comment)

@KengoTODA

I faced a similar error with the Library Instrumentation for HikariCP:

One or more TimeSeries could not be written: Points must be written in order.
One or more of the points specified had an older start time than the most recent point.:
generic_node{node_id:localhost,location:global,namespace:} timeSeries[0-153]:
workload.googleapis.com/db.client.connections.use_time{pool_name:my-database,instrumentation_version:1.32.0-alpha,instrumentation_source:io.opentelemetry.hikaricp-3.0,service_name:my-service}
errors:{status:{code:3} point_count:13}

In our case, the exporting service runs on Cloud Run and is written in Kotlin/Ktor, and the collector also runs on Cloud Run as a separate service. The OpenTelemetry auto-instrumentation javaagent runs in the service that exports the HikariCP metrics.

We use otel/opentelemetry-collector-contrib:0.89.0 as the container image of the collector service, with the following configuration:

receivers:
  otlp:
    protocols:
      grpc:
      http:
processors:
  batch:
    send_batch_size: 8192
    timeout: 10s

exporters:
  googlecloud:

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [googlecloud]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [googlecloud]

Here is a diagram showing the relationships among the services:

flowchart LR
  subgraph CloudRun
    subgraph exporting service
      instance
      another-instance
    end
    another-instance --> collector
    instance --> collector
  end
  collector --> monitoring[Cloud Monitoring]

I expected that adding an instance ID to the metric attributes would solve this issue, so I introduced the Google Cloud official Resource Detectors, but that caused another error: Duplicate TimeSeries encountered. Only one point can be written per TimeSeries per request.

That's all I know at this point. I'll post an update when I make some progress.

@dashpole dashpole reopened this Dec 11, 2023
@dashpole
Contributor

@KengoTODA which monitored resource are your metrics being written to (the ones that succeed)?

@dashpole
Contributor

generic_node{node_id:localhost,location:global,namespace:}

This means it is missing either service.name or service.instance.id. Are those both set?
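
For reference, a minimal sketch of setting both attributes when the SDK is built in code (the service name and instance id below are placeholders; the OTEL_SERVICE_NAME and OTEL_RESOURCE_ATTRIBUTES environment variables can do the same without code changes):

import io.opentelemetry.api.common.AttributeKey
import io.opentelemetry.api.common.Attributes
import io.opentelemetry.sdk.resources.Resource
import java.util.UUID

// Placeholder values: service.name identifies the service, and
// service.instance.id must differ per instance so that each Cloud Run
// instance writes its own time series instead of all of them colliding
// on generic_node{node_id:localhost,location:global,namespace:}.
val serviceResource: Resource = Resource.getDefault().merge(
    Resource.create(
        Attributes.builder()
            .put(AttributeKey.stringKey("service.name"), "my-service")
            .put(AttributeKey.stringKey("service.instance.id"), UUID.randomUUID().toString())
            .build()
    )
)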

@KengoTODA

generic_node{node_id:localhost,location:global,namespace:}

This means it is missing either service.name or service.instance.id. Are those both set?

Probably not, because the instrumentation for HikariCP is not Google Cloud specific. And when I set them via the Google Cloud official resource provider, I ran into the following error:

I expected that adding an instance ID to the metric attributes would solve this issue, so I introduced the Google Cloud official Resource Detectors, but that caused another error: Duplicate TimeSeries encountered. Only one point can be written per TimeSeries per request.

@KengoTODA

It seems that my current configuration is not enough to use the Google Cloud official Resource Detectors. Unlike Auto-Configuration for OpenTelemetry in Google Cloud, the Resource Detectors are not designed as an extension for the javaagent, so I cannot enable them via the otel.javaagent.extensions system property.

As a trial, I enabled the Auto-Configuration for OpenTelemetry in Google Cloud as a javaagent extension. It worked, but the metrics still had no instance ID or service ID in their attributes: generic_node{location:global,node_id:localhost,namespace:}

I guess what I need is guidance on how to use the Google Cloud official Resource Detectors with the OpenTelemetry javaagent. Please share if you have any pointers or references. Thanks in advance!

@KengoTODA

I'd like to share the current progress on my side:

I guess what I need is guidance on how to use the Google Cloud official Resource Detectors with the OpenTelemetry javaagent.

I initially considered using the OpenTelemetry Java agent but decided against it for the following reasons:

  • In order to add instance-specific metadata to the OpenTelemetry resource attributes, it is necessary to include the Google Cloud official Resource Detectors in the classpath of the Java agent. However, this library does not provide a shaded package, so I was concerned that maintaining the Dockerfile would become complex.
  • I find it risky to add JAR files to the classpath of the Java agent, since (as I understand it) that implies adding them to the bootstrap classpath.

Instead, I have opted for manual instrumentation to use the resource detectors and the necessary instrumentations. I have confirmed that instance-specific metadata is now set on the resource in metrics and delivered to the collector running on the other Cloud Run service. For example:

// Koin module for OpenTelemetry
private val openTelemetryModule = module {
    single<OpenTelemetry> {
        if (isDevelopmentEnv()) {
            return@single OpenTelemetry.noop()
        }

        val resource = GCPResource().createResource(DefaultConfigProperties.create(mapOf()))
        logger.debug("Detected attributes are: {}", resource.attributes)

        return@single AutoConfiguredOpenTelemetrySdk.builder()
            .addMeterProviderCustomizer { providerBuilder, _ ->
                providerBuilder.addResource(resource)
            }
            // ...
            .build()
            .openTelemetrySdk
    }
}
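
As a possible follow-up (a sketch only, not something I have verified): merging an explicit service.name and service.instance.id into the detected resource should map each Cloud Run instance to its own monitored resource, which is what the generic_node mapping above was missing. The attribute values are placeholders, and GCPResource / DefaultConfigProperties are the same classes used in the snippet above:

import io.opentelemetry.api.common.AttributeKey
import io.opentelemetry.api.common.Attributes
import io.opentelemetry.sdk.resources.Resource
import java.util.UUID

val detected: Resource = GCPResource().createResource(DefaultConfigProperties.create(mapOf()))

// Attributes passed to merge() take precedence over the detected ones;
// a per-instance service.instance.id keeps concurrent instances from
// writing points to the same TimeSeries.
val resource: Resource = detected.merge(
    Resource.create(
        Attributes.builder()
            .put(AttributeKey.stringKey("service.name"), "my-service")
            .put(AttributeKey.stringKey("service.instance.id"), UUID.randomUUID().toString())
            .build()
    )
)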

@dashpole
Contributor

Opened #278 to track adding support for resource detection with the Java agent. It might be blocked on #266.
