Tolerate more non-completed measurements for CPD uploads
The CPD team recently rolled out the upload completion token feature to all users, which increased pressure on the system. It is now more common for an upload to complete while, because of Datastore limitations, we cannot see confirmation of it for some measurements.
I've checked 6 recent failures. In all of them, fewer than 3% of the measurements timed out (fewer than 15 in absolute numbers; the highest failure percentage was for an upload of 80 measurements, 2 of which timed out, i.e. 2.5%).
Bug: b/182111579
Change-Id: Ia5af367870d1cf7d28b9422c4114c6b85c41f865
Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/228562
Reviewed-by: Artem Titov <titovartem@webrtc.org>
Reviewed-by: Mirko Bonadei <mbonadei@webrtc.org>
Commit-Queue: Andrey Logvin <landrey@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#34749}
diff --git a/tools_webrtc/perf/catapult_uploader.py b/tools_webrtc/perf/catapult_uploader.py
index a10dd84..6818bd1 100644
--- a/tools_webrtc/perf/catapult_uploader.py
+++ b/tools_webrtc/perf/catapult_uploader.py
@@ -122,8 +122,8 @@
# failed. Check it, so it doesn't increase flakiness of our tests.
# TODO(crbug.com/1145904): Remove check after fixed.
def _CheckFullUploadInfo(url, upload_token,
- min_measurements_amount=100,
- max_failed_measurements_amount=1):
+ min_measurements_amount=50,
+ max_failed_measurements_percent=0.03):
"""Make a HTTP GET requests to the Performance Dashboard to get full info
about upload (including measurements). Checks if upload is correct despite
not having status "COMPLETED".
@@ -135,8 +135,8 @@
for the status check.
min_measurements_amount: minimal amount of measurements that the upload
should have to start tolerating failures in particular measurements.
- max_failed_measurements_amount: maximal amount of failured measurements to
- tolerate.
+ max_failed_measurements_percent: maximal percent of failured measurements
+ to tolerate.
"""
headers = _CreateHeaders(_GenerateOauthToken())
http = httplib2.Http()
@@ -160,9 +160,10 @@
])
if (measurements_cnt >= min_measurements_amount and
- not_completed_state_cnt <= max_failed_measurements_amount):
- print('Not all measurements were uploaded. Measurements count: %d, '
- 'failed to upload: %d' %
+ (not_completed_state_cnt / (measurements_cnt * 1.0) <=
+ max_failed_measurements_percent)):
+ print('Not all measurements were confirmed to upload. '
+ 'Measurements count: %d, failed to upload or timed out: %d' %
(measurements_cnt, not_completed_state_cnt))
return True
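
For reviewers, the tolerance rule introduced by this change can be sketched in isolation as below. This is a minimal illustration, not the code in catapult_uploader.py: the helper name `is_upload_tolerable` is hypothetical, and the guard against a zero measurement count is an added assumption. The default values mirror the new defaults in the diff. Note that `max_failed_measurements_percent` is compared as a fraction (0.03 means 3%), as in the diff.

```python
def is_upload_tolerable(measurements_cnt,
                        not_completed_state_cnt,
                        min_measurements_amount=50,
                        max_failed_measurements_percent=0.03):
    """Hypothetical standalone version of the tolerance check.

    Returns True when the upload is large enough and the share of
    measurements without a completed state is within the tolerated limit.
    """
    # Assumption (not in the diff): avoid dividing by zero if a caller
    # passes min_measurements_amount=0 together with an empty upload.
    if measurements_cnt <= 0:
        return False
    if measurements_cnt < min_measurements_amount:
        return False
    # float() keeps the division fractional under Python 2 as well,
    # which is what the `* 1.0` in the diff achieves.
    failed_fraction = not_completed_state_cnt / float(measurements_cnt)
    return failed_fraction <= max_failed_measurements_percent


if __name__ == '__main__':
    # The worst case from the commit message: 80 measurements, 2 timed out.
    print(is_upload_tolerable(80, 2))
    # The same upload under the previous defaults (at least 100
    # measurements) would not have reached the tolerance branch.
    print(is_upload_tolerable(80, 2, min_measurements_amount=100))
```

With the new defaults, the 80-measurement upload with 2 timeouts (2.5%) is tolerated, while the old minimum of 100 measurements would have rejected it regardless of how few measurements failed.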