During a long-running test with lots of data to push to ES, I got into a situation where the smallfile wrapper yielded several test results, but those results never made it to ES. This is bad because we lose valuable information about what went wrong, along with valuable partial results. For example: for uuid ca33d6d7-7cf1-5b08-91f6-95ca34bbdac8 on the dev ES server, I saw that several results were missing for the rename operation in sample 1 when I ran:
```
python3 analyze-smf-test-results.py ca33d6d7-7cf1-5b08-91f6-95ca34bbdac8
```
However, when I looked in the pod log file for pod 10 here, I saw that the rename test completed successfully and that an ES document was indeed generated for it (via yield). Later in sample 1 of that test, though, the cleanup operation raised an exception (a redis timeout), which aborted the pod, I suspect before the in-flight documents could reach ES. Specifically, I never saw this log message:
In all other cases where the test finishes, it completes with 0 duplicates, 0 failures, 0 retries, and thousands of results.
I'd like some mechanism to checkpoint the ES documents when an exception occurs, so that any in-flight documents reach ES before the test proceeds to the next operation. Do people agree? What's the most economical way to get this behavior? Can we catch the exception somehow before the pod exits and have it finish sending the in-flight documents to ES?
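One economical option might be a try/finally around each operation's result loop that flushes any buffered documents before the exception propagates. Below is a minimal sketch, not the wrapper's actual code: `run_operation`, the buffering scheme, and the index name are all hypothetical stand-ins, and it assumes documents are pushed with elasticsearch-py's `helpers.bulk`.
```
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch()   # assumed connection settings
buffered_docs = []     # documents yielded but not yet indexed

def flush_to_es():
    """Checkpoint: push any in-flight documents to ES."""
    if buffered_docs:
        helpers.bulk(es, buffered_docs, index="smallfile-results")
        buffered_docs.clear()

def run_checkpointed(run_operation):
    """Run one smallfile operation; flush docs even if it raises."""
    try:
        for doc in run_operation():          # e.g. rename, cleanup, ...
            buffered_docs.append(doc)
    finally:
        # Runs on success *and* on exception (e.g. a redis timeout),
        # so partial results still make it to ES before the pod exits.
        flush_to_es()
```
The finally block covers both the normal exit path and the exception path with a single flush call; if the flush itself can fail (e.g. ES unreachable), it may be worth guarding it so it doesn't mask the original exception.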