mozilla :: #taskcluster

15 Apr 2017
01:43jonasfjstandups: processed a huge backlog of github PRs...
01:43standupsOk, submitted #44825 for https://www.standu.ps/user/jonasfj/
08:59grenadeif there's anyone around who knows how to resolve 400 errors from the queue when uploading artifacts, help would be much appreciated with bug 1356771
08:59firebothttps://bugzil.la/1356771 NEW, nobody@mozilla.org queue refusing artifacts from gecko-1-b-win2012-beta
09:28grenadegps: i've never deliberately upgraded these drivers. i'll give it a shot
14:13garndtgrenade: did you start getting this to work? As I was digging through logs to determine what was going on (I didn't find the cause), I noticed that there was a successful task uploading odyssey.zip https://tools.taskcluster.net/task-inspector/#PZrXgDaHThK01s1f_aIX0Q/
14:14garndtunfortunately the 400's earlier seem to be coming from this block: https://github.com/taskcluster/generic-worker/blob/ecc94782fb3b53cc74bf40d36437b3bcc1730184/artifacts.go#L372-L378
14:14garndtand that doesn't give any more information about the response error text, just the code
14:14garndtthere are multiple spots in the create artifact process in the queue that can return an "input error", which is what will give the 400 I believe
14:15garndtoh I see now
14:15garndtartifact expiration was > task expiration
14:16garndtso the queue does give back that error, and it probably would be useful to dump that in the task log rather than panicking and causing the worker to crash
14:16garndthttps://github.com/taskcluster/taskcluster-queue/blob/master/src/artifacts.js#L143
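[Editor's note] A minimal Go sketch of the suggestion above: keep the response body together with the status code when an upload is rejected, so the reason can be written to the task log instead of being lost in a panic. This is not generic-worker's actual code. The helpers formatFailure and describeFailure are hypothetical names, and the response text in main is invented for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// formatFailure (hypothetical helper) builds a log-friendly message from a
// status code and whatever body the server sent back.
func formatFailure(status int, body []byte) string {
	text := strings.TrimSpace(string(body))
	if text == "" {
		text = "(empty response body)"
	}
	return fmt.Sprintf("HTTP %d: %s", status, text)
}

// describeFailure (hypothetical helper) returns nil for 2xx responses and an
// error carrying both the status code and the response body otherwise.
func describeFailure(resp *http.Response) error {
	if resp.StatusCode >= 200 && resp.StatusCode < 300 {
		return nil
	}
	// Cap the read so an unexpectedly large body cannot flood the log.
	body, err := io.ReadAll(io.LimitReader(resp.Body, 4096))
	if err != nil {
		return fmt.Errorf("HTTP %d (could not read response body: %v)", resp.StatusCode, err)
	}
	return fmt.Errorf("%s", formatFailure(resp.StatusCode, body))
}

func main() {
	// Simulate a 400 without touching the network; the body text is made up.
	rec := httptest.NewRecorder()
	rec.WriteHeader(http.StatusBadRequest)
	rec.WriteString("made-up reason text from the server\n")

	resp := rec.Result()
	defer resp.Body.Close()

	if err := describeFailure(resp); err != nil {
		// A worker could write this to the task log and resolve the task
		// as failed, rather than panicking.
		fmt.Println("artifact upload rejected:", err)
	}
}
```

Returning an error here leaves the caller free to decide whether a 4xx should fail the task or stop the worker.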
14:24pmoore|PTO-back-18-Aprilgrenade: i suspect this 400 problem is because we are using httpbackoff http library to upload very large artifacts
14:24pmoore|PTO-back-18-Aprilgarndt: ^
14:24garndtpmoore|PTO-back-18-April: in this case it was that the artifact expiration was > task expiration, which results in a 400 from the queue, and I believe the worker panics
14:24pmoore|PTO-back-18-AprilWhich reads into memory so it can retry
14:24pmoore|PTO-back-18-AprilAh ok
14:25garndtit treats all non-500 errors as a panic situation
14:25pmoore|PTO-back-18-AprilNext week when I'm back i can fix the memory issue if it still is open issue by using read seeker
14:26garndtI entered this where we can discuss the handling of the 400's https://bugzilla.mozilla.org/show_bug.cgi?id=1356800
14:26firebotBug 1356800 NEW, nobody@mozilla.org [generic-worker] Handle some errors during artifact uploading more gracefully
14:26pmoore|PTO-back-18-AprilOr go-got to keep jonasfj happy
14:27garndtthe artifact he was originally trying to upload (> 40gb) would have failed anyways even without the memory error because I believe the limit for a single put on a s3 object is 5gb
14:28pmoore|PTO-back-18-Aprilgarndt: artifact expiration being after task expiration is handled by generic worker and it should not panic but handle correctly as malformed payload
14:28pmoore|PTO-back-18-AprilI believe there is even a test for it
14:28garndthrm, well then perhaps the reason this was resolved as worker-shutdown is a different reason: https://tools.taskcluster.net/task-inspector/#V-qWmL-PSzaogy8YTcOqaQ/
14:29pmoore|PTO-back-18-AprilI don't have computer just wanted to give a heads up it shouldn't be that
14:29garndtunfortunately the worker only logs the status code in the panic message to papertrail, not the reason that was sent with the status code, so I had to do some guessing
14:30garndttask expires "2018-03-29T14:03:22.791Z"
14:30garndtartifact expires "2018-04-15T04:17:16.209Z"
14:30garndtbut it got to the point of uploading that artifact and did not report a malformed payload error
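[Editor's note] A small Go sketch of the check being discussed, using the two timestamps quoted above: reject an artifact whose expiry is later than the task's expiry before any upload is attempted. The function name checkArtifactExpiry is hypothetical, and this is only an illustration of the comparison, not how generic-worker or the queue implements it.

```go
package main

import (
	"fmt"
	"time"
)

// checkArtifactExpiry (hypothetical helper) returns an error when the
// artifact would outlive its task. Both arguments are RFC 3339 timestamps.
func checkArtifactExpiry(taskExpires, artifactExpires string) error {
	// time.Parse accepts fractional seconds in the input even though the
	// RFC3339 layout does not list them.
	te, err := time.Parse(time.RFC3339, taskExpires)
	if err != nil {
		return fmt.Errorf("invalid task expiry: %w", err)
	}
	ae, err := time.Parse(time.RFC3339, artifactExpires)
	if err != nil {
		return fmt.Errorf("invalid artifact expiry: %w", err)
	}
	if ae.After(te) {
		return fmt.Errorf("artifact expires %s, after task expiry %s",
			ae.Format(time.RFC3339), te.Format(time.RFC3339))
	}
	return nil
}

func main() {
	// The values quoted in the log above.
	err := checkArtifactExpiry("2018-03-29T14:03:22.791Z", "2018-04-15T04:17:16.209Z")
	if err != nil {
		// Reporting this up front avoids discovering it via a 400 mid-upload.
		fmt.Println("malformed payload:", err)
	}
}
```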
14:30pmoore|PTO-back-18-AprilAre you sure?
14:30pmoore|PTO-back-18-AprilIt doesn't log the response body?
14:31garndthttps://github.com/taskcluster/generic-worker/blob/master/artifacts.go#L372-L378
14:31garndthttpResponseCode and the artifact payload
14:31pmoore|PTO-back-18-AprilNo worries, I'm away from my machine
14:32pmoore|PTO-back-18-AprilI have to go now but good luck!
14:32garndtthanks sir
14:33pmoore|PTO-back-18-Aprilwcosta has some generic worker knowledge
14:34garndtI do not think it's as urgent anymore because I see some tasks he created after the fact where he adjusted the artifact expiration to be 6 months from now but before task expiration, and it looks like they were successful
14:34garndtI was just poking my head in to see if he was still blocked
14:35garndthave a good weekend, sir. See you on Tuesday
14:42grenadeThanks guys. Yes, as usual i managed to bugger stuff up on my own. Sorted now, as you saw.
14:45garndtNo problem. Sorry I jumped into it late
17:15jonasfjpmoore|PTO-back-18-April: go-got is great for small requests... Because it stores everything in buffers, so retries are easy... It's sort of bad for file uploads...
17:16jonasfjIt's trying to make simple things easy... And leaving complex things as an exercise for the user... :)
16 Apr 2017
No messages
   