Who is this for and what problem do they have today?
When the storage_min_free_bytes threshold is hit:

- Redpanda logs reject produce requests: `rejecting produce request: no disk space; bytes free less than configurable threshold`
- The metric `redpanda_storage_disk_free_space_alert` goes to a degraded state.
- Clients see a variety of errors and behaviours, and customers (at least two recently) have spent time troubleshooting this.
- With franz-go the producer simply hangs (until it times out); you have to turn on DEBUG logging to see what is happening:
```
14:33:46.086 DEBUG wrote Produce v7 {"broker": "2", "bytes_written": 123, "write_wait": "26.571µs", "time_to_write": "23.094µs", "err": null}
14:33:46.087 DEBUG read Produce v7 {"broker": "2", "bytes_read": 62, "read_wait": "52.466µs", "time_to_read": "958.691µs", "err": null}
14:33:46.087 DEBUG retry batches processed {"wanted_metadata_update": true, "triggering_metadata_update": true, "should_backoff": false}
14:33:46.087 DEBUG produced {"broker": "2", "to": "jason-test[0{retrying@-1,1(BROKER_NOT_AVAILABLE: The broker is not available.)}]"}
14:33:46.087 INFO metadata update triggered {"why": "produce request had retry batches"}
```
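For context, the broker-side condition behind these rejections can be sketched as follows. This is a minimal illustration only, not Redpanda's actual implementation; the data directory path and the 5 GiB threshold are assumed example values:

```python
import shutil

# Assumed example values -- not Redpanda defaults.
DATA_DIR = "/var/lib/redpanda/data"
STORAGE_MIN_FREE_BYTES = 5 * 1024**3  # 5 GiB

def disk_space_alert(path: str = DATA_DIR,
                     min_free: int = STORAGE_MIN_FREE_BYTES) -> bool:
    """Return True when free bytes on the data directory's filesystem
    drop below the threshold -- the condition under which produce
    requests start being rejected."""
    free = shutil.disk_usage(path).free
    return free < min_free

def handle_produce(path: str = DATA_DIR,
                   min_free: int = STORAGE_MIN_FREE_BYTES) -> str:
    # The clearer, client-visible message this issue is asking for,
    # instead of a generic BROKER_NOT_AVAILABLE / timeout.
    if disk_space_alert(path, min_free):
        return "cannot write to redpanda - disk full"
    return "ok"
```

The point of the sketch is that the broker knows exactly why it is rejecting the write, so it could surface a disk-specific error instead of the generic retryable error clients currently see.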
With transactions (Confluent.Kafka / .NET), the commit fails with a generic timeout rather than a disk-space error:

```
10:59:34.532 CharityWorker ERROR CharityWorker.Services.OutboxBackgroundService Error when sending outbox messages, error=Error when writing to kafka
System.InvalidOperationException: Error when writing to kafka
 ---> Confluent.Kafka.KafkaTxnRequiresAbortException: 1 message(s) timed out on geo_charity-charity_transaction_detail-itp1 [3]
   at Confluent.Kafka.Impl.SafeKafkaHandle.CommitTransaction(Int32 millisecondsTimeout)
   at CharityWorker.Kafka.TitanProducer.BatchWrite(IEnumerable`1 titanMessages) in /app/CharityWorker/Kafka/TitanProducer.cs:line 87
```
What are the success criteria?
In the disk-full scenario, clients see an error message similar to:
cannot write to redpanda - disk full
Why is solving this problem impactful?
Customers can, and have, spent a fair bit of time troubleshooting without realising that it is a disk-free-space issue.
Customers have specifically asked whether anything could be changed in the product to make the error more specific for clients, e.g. something like `cannot write to redpanda - disk full`.
Additional notes
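As a diagnostic aid in the meantime, the threshold and the alert metric mentioned above can be inspected from the command line. The commands below are an ops/config sketch; the admin API port 9644 and the `/public_metrics` endpoint are the Redpanda defaults, so adjust for your deployment:

```shell
# Read the current cluster-level threshold
rpk cluster config get storage_min_free_bytes

# Check the alert metric on a broker's metrics endpoint
curl -s localhost:9644/public_metrics | grep redpanda_storage_disk_free_space_alert
```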