
[batchprocessor] Batch processor timeout accumulates drift over time, causing misaligned batching windows #13717

@franm7

Description


Component(s)

processor/batch

Describe the issue you're reporting

Problem description

The batch processor's timeout-based flushing mechanism accumulates delays over time, causing batches to drift significantly from their intended schedule. This creates issues for use cases that require consistent time alignment.

Current behaviour

When using timeout-based batching (e.g. timeout: 60s), we observe that:

  • Individual batches are delayed: each batch arrives at the next processor slightly after the configured interval (60.2-60.3 seconds instead of 60.0)
  • Delays accumulate: each uncorrected delay shifts the batching window further over time

Evidence

Real-world timing data with a 60s timeout configuration, logging the time at which metrics arrive at the next processor:

1:09:08.684
1:10:08.733 (49 ms drift)
1:11:08.980 (296 ms drift)
1:12:09.487 (803 ms drift)
1:13:09.962 (1278 ms drift) 

...

2:07:39.094 (30+ seconds drift)

(drift = difference between ideal batch start time and actual start time)

Script to reproduce the problem: github.com/franm7/batch-processor-script
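
For reference, the drift in the table above can be computed with a few lines of Go. This is a simplified stand-in for the linked script, not taken from it; the recordArrival helper is made up for illustration:

package main

import (
    "fmt"
    "time"
)

// recordArrival (hypothetical helper) reports how far a batch arrival has
// drifted from the ideal schedule: anchor + n*interval.
func recordArrival(anchor time.Time, interval time.Duration, n int, arrival time.Time) {
    ideal := anchor.Add(time.Duration(n) * interval)
    fmt.Printf("batch %d arrived %s, drift=%v\n", n, arrival.Format("15:04:05.000"), arrival.Sub(ideal))
}

func main() {
    // Using the first two figures from the table: 60s timeout, first batch at 1:09:08.684.
    anchor, _ := time.Parse("15:04:05.000", "01:09:08.684")
    arrival, _ := time.Parse("15:04:05.000", "01:10:08.733")
    recordArrival(anchor, 60*time.Second, 1, arrival) // prints drift=49ms
}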

Root Cause Analysis

Current implementation in startLoop:

func (b *shard[T]) startLoop() {
    ...
    for {
        select {
        ...
        case <-timerCh:
            if b.batch.itemCount() > 0 {
                b.sendItems(triggerTimeout)
            }
            b.resetTimer()
        }
    }
}
When the timer fires (timerCh receives a value), the shard flushes the current batch (if non-empty) and only then resets the timer for the next full interval:

func (b *shard[T]) resetTimer() {
    if b.hasTimer() {
        b.timer.Reset(b.processor.timeout)
    }
}

Small delays from goroutine scheduling, flush processing time, and timer inaccuracy compound over time because individual delays are never accounted for: when a delay happens, nothing corrects it, so it shifts all future batching windows.
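
For illustration only, here is a small standalone Go sketch (not the processor's code; the function names and the simulated 5 ms delay are made up) contrasting the two timer strategies: resetting the timer to a full interval after each flush, versus computing each deadline from a fixed anchor. With the first strategy the per-cycle delay accumulates; with the second it does not:

package main

import (
    "fmt"
    "time"
)

// flushDrifting mimics the current approach: after each flush the timer is
// reset to a full interval, so flush latency is added to every cycle.
func flushDrifting(interval time.Duration, cycles int) {
    start := time.Now()
    timer := time.NewTimer(interval)
    defer timer.Stop()
    for i := 1; i <= cycles; i++ {
        <-timer.C
        time.Sleep(5 * time.Millisecond) // simulated flush / scheduling delay
        timer.Reset(interval)            // next window starts "now", not on schedule
        drift := time.Since(start) - time.Duration(i)*interval
        fmt.Printf("drifting cycle %d: drift=%v\n", i, drift)
    }
}

// flushAligned computes every deadline from a fixed anchor, so per-cycle
// delays do not accumulate: the i-th flush is scheduled at anchor + i*interval.
func flushAligned(interval time.Duration, cycles int) {
    anchor := time.Now()
    timer := time.NewTimer(interval)
    defer timer.Stop()
    for i := 1; i <= cycles; i++ {
        <-timer.C
        time.Sleep(5 * time.Millisecond) // same simulated delay
        next := anchor.Add(time.Duration(i+1) * interval)
        timer.Reset(time.Until(next)) // re-anchor to the fixed schedule
        drift := time.Since(anchor) - time.Duration(i)*interval
        fmt.Printf("aligned cycle %d: drift=%v\n", i, drift)
    }
}

func main() {
    const interval = 100 * time.Millisecond // shortened from 60s for the demo
    flushDrifting(interval, 5)
    flushAligned(interval, 5)
}

An actual fix would have to fit into the processor's existing timer handling (for example, the paths where the timer is stopped for size-based flushes), so this sketch is only meant to show the principle of anchoring deadlines to a fixed schedule.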

Impact

The drift makes it difficult to aggregate data that should be grouped together: when batches are expected to arrive at consistent intervals for time-based processing, the accumulated delay leads to incorrect grouping.

Questions:

  1. Are maintainers aware of this drift accumulation behavior? Is it considered expected or unintended?
  2. Would the community be interested in exploring solutions for use cases requiring time-aligned batching?

