mozilla :: #taskcluster

12 Sep 2017
08:04whimboowcosta: so I re-checked and still high cpu load for automountd with latest central!
08:07wcostawhimboo: so, I am out of ideas
08:07wcostadid you file a bug?
08:08whimboowcosta: not yet. maybe it's by the nature of the test?
08:09wcostawhimboo: really no idea, what does the test do?
08:10whimbooit forces a chrome crash
08:10whimbooand is doing that over and over again
08:10whimboowhen repeating the test
08:11whimbooi'm currently investigating bug 1395504
08:11firebothttps://bugzil.la/1395504 NEW, nobody@mozilla.org Infinite hang of web content process when parent process crashes [@ CrashStatsLogForwarder::Log() |
08:13wcostawhimboo: you can file a bug and cc me, I will take a look when I have some time, but it may take a while
08:13whimbooi will first skim over the already existent bug
08:13whimbooso I know which information to provide
08:13wcostaroger that
08:21whimboowcosta: strange that automountd doesnt show up in activity monitor
08:21wcostawhimboo: isn't it a kernel component?
08:22* wcosta making random guesses
08:22whimboono idea
08:22whimbooi only see the high cpu usage by using top
08:23wcostaweird, top shows it up, but not activity monitor, what about ps?
08:24whimboops works too.
08:24whimboomaybe i should restart the loaner
08:24whimbooand try fresh
08:26wcostaso, must be something with activity monitor, I guess
11:27sfraserAre there any metrics collected, or nice graphs to look at, about taskcluster api calls? I'm trying to figure something out with creating tasks in the queue
12:36dustinthere's a signalfx account
12:58gerard-majaxhello
12:58sfraseris there data about how long a task creation call normally takes?
12:58gerard-majaxdustin, garndt, sorry to bother, but I have a lot of tasks failing because of what looks like network conditions
12:59gerard-majaxdustin, garndt, e.g., https://tools.taskcluster.net/groups/VDjI7JuKQpKApPabVnbTUQ
12:59dustinsfraser: I'm not sure.. but it's not quick :)
12:59sfraserdustin: we had to back out the partials in-tree because it made the decision task overrun its maxruntime :(
13:00dustingerard-majax: maybe the task is downloading too much?
13:00dustinsfraser: yeah, I saw, and the fix had a typo?
13:01gerard-majaxdustin, like, it used to work well, and then all of a sudden ?
13:01dustingerard-majax: that can be an issue with an EC2 instance, or with EC2 in general, or with what you're downloading from
13:01dustinwere they all the same worker?
13:01gerard-majaxthey are all deepspeech-worker
13:01sfraserdustin: it did, but increasing maxRunTime also seems like postponing the issue, so I've been looking at why it might take too long
13:01dustingerard-majax: that's the workerType
13:02dustinworkerGroup/workerId is what I'm wondering about
13:02gerard-majaxlast week there was already an issue, garndt had to restart something, because tasks were not starting properly
13:02dustinsfraser: from scrollback it looks like 4.5k tasks
13:02gerard-majaxdustin, hm I need to check
13:02dustingerard-majax: that sounds unrelated
13:02sfraserdustin: that's when it ran out of time. It's 5040, but I can add the filter I wanted to, to make it nearer 4700
13:02gerard-majaxI can't find workerGroup
13:03gerard-majaxah found it
13:03dustinsfraser: the max parallelism on the client side is 50x, and the max on the server side is defined by the number of heroku dynos
13:03dustinwhich is less than 50
13:03dustinand also shared with other users
13:03gerard-majaxdustin, seems to be us-east-1 but not the same worker Id each time
13:03gerard-majaxdustin, do you have ways to check if it's EC2-side?
13:03dustingerard-majax: ok, maybe check whatever url's they're hitting
13:03dustinno
13:04dustinsadly, amazon considers even quite substantial failure rates "normal" so it won't even appear on their status page
13:04dustinsfraser: queue has 25 webheads running
13:04* gerard-majax adds amazon to the list of things to eradicate once I run the world
13:04dustinso practically probably 20x parallelism
13:04dustinlol, good luck
13:05dustinso roughly (250)(createTask time)
13:05gerard-majaxdustin, all I know so far is that tasks failed since yesterday evening at best, and early, so during tensorflow deps pulling
13:05gerard-majaxdustin, is there a trick to restart all tasks in a taskGroup ? having to manually go into each is a bit tedious :/
13:05dustinand if 2400s was too short, then createTask time is at least 10s
13:06dustinwhich seems long, but within an order of magnitude
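The estimate dustin walks through above can be reproduced in a few lines. This is only a back-of-envelope sketch using the figures quoted in the chat (5040 tasks, roughly 20x effective parallelism, a 2400s budget); it assumes the whole decision task budget is spent in createTask calls, which overstates things since graph generation also takes time. The variable names are made up for illustration.

```python
# Figures quoted in the conversation; everything else is an assumption.
total_tasks = 5040       # size of the task graph (sfraser)
parallelism = 20         # practical server-side parallelism (dustin's guess)
max_run_time_s = 2400    # decision task budget that was overrun

# Number of sequential "waves" of createTask calls.
rounds = total_tasks / parallelism

# If the budget was exhausted, each wave took at least this long on average.
min_create_task_s = max_run_time_s / rounds

print(f"rounds ~ {rounds:.0f}")
print(f"createTask >= {min_create_task_s:.1f}s per call")
```

With these inputs the result is about 250 rounds and a little under 10s per call, which matches the numbers in the discussion.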
13:06dustinit's doing a bunch of "synchronous" stuff: adding to the table, adding to the queue, etc.
13:07dustin(as in "x must complete before y begins" to ensure consistency)
13:07dustinsfraser: all of which is to say, i think increasing the decision task time is the right fix
13:08sfraserdustin: it does seem that way, yes. But it also seems like we'll just keep having to increase it each time we add more tasks to the graph
13:08dustinat some point making createTask faster would be good -- we've talked about moving the queue to a local postgres backend, which might help (or might not) -- but for the moment it's as fast as it is
13:08dustinor slow
13:08dustinwell, hopefully we won't be adding 1000's of tasks every week :)
13:08sfraseris there a specific part that's slow?
13:08dustinthose additional tasks also cost $$$ and/or run on limited pools
13:08gerard-majaxdustin, please, tell me, you plugged an AI to taskcluster and irc
13:09gerard-majaxdustin, so that failures are automagically fixed when I complain here :)
13:09dustinJonas would know the details, but I think it's just a complex process
13:09dustinwe use azure tables and azure queues on the backend
13:09dustinand I think the task gets inserted into a few queues
13:09sfraserahh
13:09dustinazure has some nice consistency guarantees, but speed is not great
13:10dustinalso, it's in a MS datacenter when everything else is in AWS so there's some network latency too (but that's not going to be a big part of 5-10s)
13:10dustinanyway, you're welcome to dig into that and see if you can find a way to optimize :)
13:10dustinI can give you signalfx access, although I don't think that has any resolution *within* the API call
13:10dustinI think it does measure the duration of the API calls tho
13:11dustinif you want to know more details about what goes on within, Jonas is the person to talk to
13:12dustinhttps://github.com/taskcluster/taskcluster-queue/blob/master/src/api.js#L538
13:13dustinso following the `await`s there, you can see what it's doing
13:13dustininstrumenting that with some more statsum measurements would probably be a mergeable PR
13:14dustingerard-majax: so I guess it worked this time? If so, I'd have to guess it's probably whatever bazel is hitting.. are you caching? could you cache more?
13:15sfraserYeah, I spotted the awaits, and briefly was sad that we couldn't use async in the decision task.
13:15gerard-majaxdustin, I'm not caching, I don't know how that is supposed to be done
13:16gerard-majaxdustin, I'm waiting a few more minutes before calling victory
13:16dustingerard-majax: I'm sure bazel has ways to cache stuff
13:17gerard-majaxdustin, bazel is weird
13:17dustinsounds java-y
13:17dustinhttps://docs.taskcluster.net/reference/workers/docker-worker/docs/caches
13:17dustinoh, I'm thinking of maven
13:17dustinanyway
13:19gerard-majaxdustin, https://tools.taskcluster.net/groups/VDjI7JuKQpKApPabVnbTUQ/tasks/R_WD4qaFSm6X6I5joo1cRw/runs/0/logs/public%2Flogs%2Flive_backing.log#L20844
13:20gerard-majaxdustin, downloading those on my laptop, the file from github is with the wrong checksum
13:21gerard-majaxdustin, even at home
13:23gerard-majaxdustin, people are having the same behavior on tensorflow's IRC :(
13:24dustingerard-majax: sorry, why are you pinging me?
13:25dustinis there something I can help with?
13:25gerard-majaxdustin, nope, just telling you it's unrelated to taskcluster in fact :)
13:26gerard-majaxsince I bothered you thinking it was a tc issue at first :/
13:27dustinok :)
13:27dustinbad checksums on github sounds pretty bad :(
13:28gerard-majaxespecially when the other mirror returns the good one, and that the github one looked okay
13:28dustinjust making sure i hadn't missed something
13:28dustinyeah, distributed errors can be confusing
13:36garndtUgh that sucks
14:51jonasfjdustin: I pushed 5 commits for review on bug 1329282, I know it's big and maybe we need someone else to say r+ for the new mach commands... But I'm hoping you're a good start :)
14:51firebothttps://bugzil.la/1329282 NEW, jopsen@gmail.com Deploy QEMU engine to build docker images
14:51dustincool!
14:51whimboowcosta: so your patch on bug 1338651 modified the docker image. When I have an old loaner, how can I get it updated to use the changes?
14:51jonasfjnote: it's still missing a final piece changing the docker image building tasks; but there is so much let's just get some preliminary reviews started..
14:51firebothttps://bugzil.la/1338651 FIXED, wcosta@mozilla.com taskcluster cross-compiled OS X builds create perf problems when not stripped (Talos performance reg
14:52wcostawhimboo: newer builds automatically use the newer docker image
14:53whimboowcosta: the tests i run under /Users/cltbld/. is that the problem?
14:53whimboodo i have to use a different folder
14:53gerard-majaxdustin, if there's something broken, it's github
14:53wcostawhimboo: I don't think so, what is the problem you are having?
14:53gerard-majaxdustin, I have the same kind of checksum mismatch on archive tarball exposed by raspberrypi for the toolchain: https://tools.taskcluster.net/groups/VUjwJI6DTQat9udJw2AFdw/tasks/NghROH8aRhiKL8Olkk-xWA/runs/0/logs/public%2Flogs%2Flive_backing.log#L21626
14:53whimboowcosta: still this high cpu load of automountd
14:56wcostawhimboo: if you run the test against a newer version of firefox build, then the bug I referenced you surely is not the problem
14:57garndtjonasfj: will that actually switch our production docker image building over to that? we should talk before landing something like that
14:57RyanVMWTF, I lost my ability to add jobs again
14:58RyanVMTaskcluster: Supplied credentials do not satisfy authorizedScopes; credentials have scopes
14:58dustinRyanVM: can you pastebin the whole error?
14:59RyanVMhttps://pastebin.mozilla.org/9032218
15:01whimboowcosta: looks like its the crash reporter which causes it
15:02whimbooted: ^ could the crash reporter on osx and our osx workers cause a high cpu load of the automountd process?
15:03Aryxhi, Windows buildbot test jobs are failing: https://treeherder.mozilla.org/logviewer.html#?job_id=130378439&repo=mozilla-inbound
15:03bstackdustin: wait, do a client's scopes need to be a superset of the authorized scopes? I assumed that you would just end up with an intersection.
15:03dustinno, superset
15:04dustinintersection of scopes isn't defined
15:04bstackO_o
15:04bstackugh
15:04bstackwe should give sheriffs notify:* scopes then
15:04bstackor perhaps just give everyone that
15:05dustinyeah
15:05dustinmakes sense
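The superset rule dustin describes (the client's scopes must cover every entry in authorizedScopes, rather than the two sets being intersected) can be sketched as below. This is a simplified illustration, not the taskcluster-auth implementation: `satisfies` and `covers` are hypothetical names, only the trailing `*` prefix rule is modelled, and `assume:` role expansion is ignored. The scope strings in the example are illustrative.

```python
def covers(pattern, scope):
    """A scope ending in '*' covers anything sharing its prefix."""
    if pattern.endswith("*"):
        return scope.startswith(pattern[:-1])
    return pattern == scope


def satisfies(have, required):
    """True if every required scope is covered by some scope in `have`."""
    return all(any(covers(h, r) for h in have) for r in required)


# Hypothetical client and authorizedScopes, for illustration only.
client_scopes = ["queue:create-task:*"]
authorized = ["queue:create-task:*", "queue:route:notify.email.*"]

print(satisfies(client_scopes, authorized))  # client lacks the notify route
print(satisfies(client_scopes + ["queue:route:notify.email.*"], authorized))
```

Under this rule a request fails as soon as one authorized scope is not covered, which is why granting the missing scope to the user's role was the proposed fix.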
15:09bstackdustin: r+ on adding queue:route:notify.email.* to https://tools.taskcluster.net/auth/roles/mozilla-user%3A* ?
15:10dustinor moz-tree:level:1
15:11garndtcatlee: do you know who could look at the buildbot windows failure Aryx mentioned above? looks like a package error
15:11Aryxgarndt: seems to be buildbot, so i moved the discussion to #releng. Bug is https://bugzilla.mozilla.org/show_bug.cgi?id=1399151
15:11firebotBug 1399151 NEW, nobody@mozilla.org Windows buildbot test fail: Could not install python package: C:\slave\test\build\venv\Scripts\pip i
15:11garndtoh ok, thanks!
15:11bstackok, I'll add it to moz-tree:level:1
15:12bstackRyanVM: can you try again now?
15:13RyanVMbstack: nope - do I need to logout/login first?
15:14bstackyou shouldn't need to. Is the error the same as before?
15:15RyanVMlooks like it
15:15dustinoh, I'm sorry
15:15dustinusers don't have moz-tree:level:N
15:15bstackhah, ok
15:15dustinjust add it to mozilla-user:*
15:15* bstack removes it from that
15:15bstackkk
15:16bstackok, one more try RyanVM.
15:16RyanVMnope
15:18bstackcan you paste the error again, sometimes they are subtly different
15:21bstackhuh, why didn't that work
15:23bstackRyanVM: if you'd like, link me to the job and I can run whatever action on it you need for the time being
15:23dustinI can look more after the meeting
15:24RyanVMbstack: it's not urgent
15:24RyanVMhappy to use it as a guinea pig for now
15:24bstackoh, good :)
15:51catleedustin, sfraser: what about changing visit_postorder to do a breadth-first search?
15:51catleeor adding a new method to visit in dependency order
15:51dustinaren't postorder and breadth-first the same?
15:51sfraserI'm wondering about storing the kind order somewhere, and using that to prioritise
15:52dustinsorry, I missed something here
15:52dustinwhat would this do?
15:52catleewe're not parallelizing task submission properly
15:52catleebecause of the way graph traversal works
15:52sfraseron nightly only, it seems
15:52catleeyeah
15:55dustinah
15:55catleeit seems we don't parallelize submission of l10n jobs because they each have their own chain of dependencies
15:55dustinbecause it's using a kind of silly thing with the futures
15:55dustinright
15:55tedwhimboo: i don't know why it would, except possibly as relates to that bug that wcosta fixed by moving the build directory from /home to /build
15:55dustinyeah, being smarter about that would be good
15:55dustinI think there's a visit_preorder now, too
15:55dustinor
15:56dustinright, that's in bug 1585880 that might not land before I retire..
15:56dustinyou could cherry-pick https://reviewboard.mozilla.org/r/172780/diff/4#index_header though
15:56dustinor I can pull it into a separate bug and land it
15:56dustineasy enough
15:57whimbooted: for the loaner I have right now, i use /Users/cltbld? Is that the wrong folder?
15:57dustinhm, that won't help actually
15:57whimbooted: if yes, I might have to get it re-imaged?
15:57tedwhimboo: i mean for for the build in taskcluster
15:58tedthe root cause of https://bugzilla.mozilla.org/show_bug.cgi?id=1338651
15:58dustinthe visit_ functions assume sequential ordering, which you don't have for parallelism
15:58firebotBug 1338651 FIXED, wcosta@mozilla.com taskcluster cross-compiled OS X builds create perf problems when not stripped (Talos performance reg
15:58dustinsfraser: I'd brute-force it: use a deque, put all the tasks in, and when you pop a task and its dependencies aren't done yet, put it back at the end of the queue
15:58whimbooted: oh, so its not dependent on where firefox actually runs in when running the tests
15:59tedwhimboo: right
15:59dustinthat runs the risk of busy-looping when the queue is almost empty, but maybe that's fixable with some creativity
15:59tedwhimboo: see https://bugzilla.mozilla.org/show_bug.cgi?id=1383805
15:59dustininserting them in postorder might minimize the task shuffling required
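The brute-force deque idea dustin describes could look roughly like this. It is a sketch under assumptions, not the in-tree code: `submit_all`, `deps` and `create_task` are hypothetical names, `create_task` stands in for whatever performs the createTask call, and the short sleep is only a crude guard against the busy-looping he mentions. A dependency that has not been submitted yet is treated as not done, and a cycle or unknown dependency would make this loop forever.

```python
import time
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def submit_all(tasks, deps, create_task, max_workers=50):
    """Submit tasks in parallel, each only after its dependencies are created.

    tasks: iterable of labels (postorder keeps re-queuing low)
    deps:  dict mapping a label to the labels it depends on
    """
    pending = deque(tasks)
    futures = {}  # label -> Future of its create_task call
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending:
            label = pending.popleft()
            dep_fs = [futures.get(d) for d in deps.get(label, ())]
            if all(f is not None and f.done() for f in dep_fs):
                for f in dep_fs:
                    f.result()  # re-raise a failed dependency creation
                futures[label] = pool.submit(create_task, label)
            else:
                pending.append(label)  # not ready: back to the end of the queue
                time.sleep(0.001)      # avoid spinning when little is left
        for f in futures.values():
            f.result()                 # surface any remaining errors
    return list(futures)               # labels in submission order
```

Feeding the labels in postorder means most tasks find their dependencies already submitted, so fewer of them cycle through the queue.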
15:59firebotBug 1383805 DUPLICATE, nobody@mozilla.org [macOS]: Opendirectoryd hangs when Nightly is running.
16:00tedwhimboo: what branch are these builds you're using from?
16:00whimbooted: strange then, but I lack expertise to further dive into this problem. shall I file a bug, or may it be expected? Frankly I have no idea here
16:00whimbooted: latest nightly even
16:00sfraserdustin: could still use visit_postorder, but not as a generator - get the full list, and replace calls to .result for all(f.done() for f in deps_fs)
16:00tedwhimboo: huh, interesting
16:00sfraserthat sort of thing?
16:00tedwhimboo: wcosta's fix landed ~2 weeks ago
16:01tedhttps://hg.mozilla.org/mozilla-central/rev/8cf125b4fa04
16:01whimbooted: correct. that's why we are wondering
16:01dustinsfraser: I don't really understand that, but general direction, yes
16:01* dustin in another meeting now
16:01tedwhimboo: that machine doesn't have a /build listed anywhere, does it?
16:01whimbooted: i could hand you the loaner details in case you want to have a look
16:01tedwhimboo: not particularly :)
16:02whimboono, it doesnt have
16:02tedif you dont see /build in `mount` then the answer is no
16:03whimbooted: there is /builds and /home
16:04tedwhimboo: OK! then that's why
16:04whimboohttps://irccloud.mozilla.com/pastebin/ZN6LZO2W/
16:04whimbooso i have to ask RelEng to re-image?
16:04tedwhimboo: er, i don't see any /builds there?
16:04teddid it get cut off?
16:05whimboosorry /builds is just a folder, and not mounted in
16:05tedah
16:05whimbooand contains git-shared hg-shared mercurial-certs slave tooltool.py tooltool_cache
16:07tedis anything in there a symlink to /home ?
16:08tedhttps://superuser.com/a/426719/29304
16:08tedsuggests that can be the case
16:09sfrasercatlee: This sort of thing? https://irccloud.mozilla.com/pastebin/UoeURdeB/
16:09tedin any event, i suspect the issue is related to this machine having a /builds directory
16:09tedsince that's where the builds are done now
16:11catleesfraser: looks sane to me
16:11catleedoes it work? :)
16:11sfraserrunning locally now
16:14sfraserseems to work
16:14catleeI don't think you want as_completed
16:15sfraserhm, you're right
16:15catleeif not all([f.done() for f in deps_fs])?
16:15catleeif any(not f.done() for f in deps_fs)?
16:16sfraserI've gone for the first
16:16sfraserbut am happy with either. I'll push it to date
16:16sfraserthen I will have to check on it while out
16:17catleefinishes in 2 minutes on my laptop
16:17catlee:)
16:17catleeI'll keep an eye on date
16:18sfrasertriggered a nightly
16:18catleegreat idea to reuse visit_postorder
16:19sfraserd.ustin was right, I think it'll minimise queue cycling
16:19catleeyeah
16:19dustinphew :)
16:19dustinthanks for working on that
16:19dustinI forgot that algorithm was so suboptimal
16:20catleeit says so right in the comments!
16:20dustinhaha, that's something eh?
16:22sfraserbusted!
16:24sfraserdeps_fs won't contain all the dependencies, just the ones that have already been submitted
16:24sfraserI do need to head out now, though. I'll check later
16:25dustinjonasfj: ^^ did scheduler have some fancy algorithm for this, or just topo-sort + sequential insert?
16:25jonasfjlol, are you guys not figuring it out...
16:26jonasfjtasks = [...]
16:26jonasfjtasks_done = set()
16:28jonasfjdef schedule_tasks():
16:28jonasfj for task in tasks:
16:28jonasfj if task in tasks_done or task.dependencies not in tasks_done:
16:28jonasfj continue
16:28jonasfj create_task(task).then(lambda: tasks_done.add(task) and schedule_tasks())
16:28jonasfjor something like that...
16:29sfraserlast line looks more javascript than python ;)
16:29jonasfj1) track tasks created
16:29jonasfj2) create tasks whose dependencies are created
16:29jonasfj3) when a task is done goto (2), if all tasks are created, we're done
16:30jonasfjyeah, I'm slightly language confused :)
16:30dustinmakes sense to me
16:31dustinO(n^2) in the number of tasks, that's not too bad at n~10000
16:31dustinand no busy-loops :)
16:32jonasfjyeah it's O(n^2)
16:33jonasfjI'm sure it's possible to do better, but probably not really worth it..
16:33dustinyep
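A Python rendering of the three steps jonasfj lists might look like the following. It is one possible reading of his pseudocode rather than the scheduler's actual code: `schedule_tasks`, `deps` and `create_task` are hypothetical names, the `.then(...)` callback is replaced by blocking on `concurrent.futures.wait`, and the dependency check is written as a subset test. Each pass rescans all tasks, so it stays O(n^2) as noted in the chat, but it sleeps in `wait` instead of busy-looping.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def schedule_tasks(deps, create_task, max_workers=50):
    """deps maps each task label to the set of labels it depends on."""
    created = set()   # 1) tasks whose creation call has finished
    in_flight = {}    # Future -> label, for calls still running
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while len(created) < len(deps):
            # 2) submit every task whose dependencies are all created
            submitted = set(in_flight.values())
            for label, needs in deps.items():
                ready = set(needs) <= created
                if ready and label not in created and label not in submitted:
                    in_flight[pool.submit(create_task, label)] = label
            if not in_flight:
                raise ValueError("dependency cycle or unknown dependency")
            # 3) block until at least one call finishes, then rescan
            done, _ = wait(list(in_flight), return_when=FIRST_COMPLETED)
            for fut in done:
                fut.result()  # propagate errors from create_task
                created.add(in_flight.pop(fut))
    return created
```

Independent chains, such as the l10n jobs mentioned earlier, end up in flight together because every ready task is submitted on each pass.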
17:10tedsfraser: I wrote a slightly better futures.as_completed at some point in the past
17:11tedhttps://hg.mozilla.org/mozilla-central/annotate/994b97c9de37/toolkit/crashreporter/tools/symbolstore.py#l377
17:33RyanVMbstack: any news? :)
17:34bstackStill poking. Just had a meeting in the middle of the debugging though.
17:34bstackI have another bunch I'll try in a sec
17:37bstackhunch*
17:41bstackok, that didn't work
17:41bstacktrying another thing
17:43bstackRyanVM: can you try again now?
17:43RyanVMnegative
17:43bstackwtf
17:43bstackmaybe try log-out/log-in now?
17:43garndtis there anything I can poke at?
17:44bstacknot sure why that would help but I'm deeply confused
17:44RyanVMnope, didn't help
17:44RyanVMhttps://pastebin.mozilla.org/9032238 if it helps
17:44bstackgarndt: ryan.vm keeps getting https://pastebin.mozilla.org/9032218 when trying to use an action task
17:45bstackaha, so the new one is different
17:45bstackthat's kinda good news
17:45bstackthe problem now is that you don't have assume:repo:hg.mozilla.org/try:*
17:46garndtah yea, I was just about to say
17:46garndtbut if you expand the scopes it should have it
17:47bstackyeah, but it looks like the scopes are being expanded? https://github.com/taskcluster/taskcluster-auth/blob/a1d8a6b197371841a0e073ec9418f2656a2f1b74/src/signaturevalidator.js#L152-L166
17:47garndthrm
17:47garndtI don't see mozilla-group:all_scm_level_1 in our roles
17:47garndtI see "active_scm_level_1" but not "all_scm_level_1"
17:48garndtthose groups are new to me, so I'm not sure what they should be
17:50dustinoh
17:50dustinthis is my fault
17:50garndtyea I was just checking out https://bugzilla.mozilla.org/show_bug.cgi?id=1395320
17:50firebotBug 1395320 FIXED, dustin@mozilla.com Switch to active_scm_level_N groups
17:51dustinmaybe not
17:51dustinhm
17:51garndtso I'm not sure why Ryan has no active_ scm groups
17:51garndtbut has all_ instead
17:51dustinI was thinking that the old scm_level_3 had assume:scm_level_2
17:51dustinbut I don't think it did
17:52dustinyeah, that is def weird
17:52garndtI'm pretty sure it did
17:52dustinwait, so RyanVM has *no* active_* groups?
17:52dustinthat's a different problem
17:52garndtright
17:53dustinI wonder if these new groups aren't ready for prime time
17:53garndthe just has assume:mozilla-group:all_scm_level_1,assume:mozilla-group:all_scm_level_2,assume:mozilla-group:all_scm_level_3
17:53dustinyeah, that doesn't make sense -- all should be the union of active and expired
17:53garndtactive does not appear in that scope list he gave in the error message
17:53garndtyea
17:53dustinit wasn't found in enumerating LDAP groups either
17:55* dustin asking in #iam
17:56garndtthanks dustin
17:56garndtand TIL about that channel
17:57garndthah
17:57garndtwell then... "You were kicked by &ChanServ: You are not permitted to be on this channel."
17:57dustinyeah, it's infra so closed by default
17:59garndtfine by me, one less channel for me to worry about
17:59dustin13:56:43 <henx> [jabba] he doesn't have any of the mercurial bits
17:59dustinRyanVM: this is the account you do hg pushes with, right?
18:00dustinlooking like there's something messed up with it in LDAP.. seeing if we can straighten it out, and who else is affected
18:00dustinit = your account
18:00RyanVMdustin: no, I use my personal gmail account for pushing - I had to have level 3 added to my moco account specifically for previous scopes issues
18:00dustinahh
18:00dustindo you have a ref to that bug?
18:00dustinlikely it got messed up then
18:01dustinif jabba can look at the bug he can probably fix it up - just needs to see the original request
18:01RyanVMbug 1389275
18:01firebothttps://bugzil.la/1389275 FIXED, ludovic@mozilla.com Please add Level 3 commit access to my moco LDAP account
18:01dustindanke
18:01RyanVMoh heh, we specifically didn't add hg access
18:01garndtwell that time is here! :) https://bugzilla.mozilla.org/show_bug.cgi?id=1389275#c2
18:02dustinRyanVM: ah! that's the issue
18:02bstackoh
18:02dustinRyanVM: also, you'll need to keep the access active (so, push sometimes)
18:03RyanVMugh
18:04dustinI was hopeful we'd get linked accounts with the shift to auth0, but it looks like at best not soon
18:04RyanVM:(
18:04dustinthe other thing we could do is add your non-moco LDAP to vpn_sheriffs and whatever other groups are required
18:04RyanVMI was really hoping to not have to deal with juggling two different ldap accounts and associated ssh keys
18:04dustinyeah
18:05dustinyou're not the only person who has double-account woes
18:05dustinalthough you are the only person (I know of) with this particular issue
18:06dustinhopefully nobody else was smart enough to ask for membership in scm_level_3 without LDAP push rights, or whatever black magic ludo did :)
18:06dustin*hg push rights
18:06RyanVMhah
18:06RyanVMjust felt like unnecessary exposure at the time
18:07dustinyeah, it's sensible
18:08dustinbtw I think getting vpn_sheriff and whatever else added to your gmail account is the better long-term fix, and then just only use that acct to login
18:10RyanVMI've been trying to maintain a distinction between employee-level access and contributor-level access, but maybe it's futile :)
18:10dustinyeah, and here you're trying to do things that require some of the rights of each, I think
18:11dustinjabba said he updated your account but there's a cron-script that runs hourly that will need to run before that's fixed
18:11dustinis that OK to wait?
18:11RyanVMok
18:11RyanVMso check back after 3?
18:12dustinI'm not sure it's on the hour, so after 12:11p US/Pacific :)
18:12RyanVMok
18:13dustingarndt: we're not sure if there are others in this state, so as a datapoint, the situation is seeing someone in all_ but not active_ or expired_
18:13dustin..and you can see the whole set of groups in the tc-login logs
18:14garndtand you get active_ and/or expired_ by having the right hg push stuff set on your account?
18:14dustinyeah
18:14dustinvia a crontask of some sort
18:15dustinI think there's an "hg object" associated with each LDAP account that has push permissions, and the hg server can update that object to indicate last date used
18:15dustinso something's syncing back from that object to group membership
18:15garndtmmm I see
18:15RyanVMdoes anybody know offhand what the timeframe is for marking accounts inactive? wondering how often I'll need to do a dummy push to keep the bits active
18:15dustin6mo
18:15RyanVMah, nice
18:15dustinI don't remember if there are warning emails
18:16RyanVMone would hope!
18:16RyanVMmeh, I'll just throw a reminder on my calendar for next march
18:16dustinI think there might not be, since we don't want to encourage non-participating contribs to do trivial pushes just to keep it alive
18:16dustinthat said, yeah.. you will want to do that
18:16dustinI *think* a user repo push will do the trick
18:17RyanVMexcellent, just gave myself a reminder for march 4
19:14RyanVMbstack, dustin, garndt: https://www.youtube.com/watch?v=usfiAsWR4qU
19:15bstack\o/
19:15bstackAwesome
19:15bstackThanks, Dustin/garndt
19:15dustinsweet :)
19:16* dustin tries to appreciate the small victories while losing the war
19:16RyanVMthanks everyone, always fun being the odd duck
19:16garndtwoo++
20:00RyanVMhmm, wonder if there's CoT issues with trying to add new jobs on Try. I tried to add some new Windows tests to a push, build went fine, but the signing task failed (and as a result, tests won't run)
20:02garndtRyanVM: example?
20:02RyanVMhttps://treeherder.mozilla.org/#/jobs?repo=try&revision=347b95e67089f89d0bf81faad0a6b732e26261cf&group_state=expanded
20:02garndtI think there was a bug entered somewhere about cot and action tasks
20:02garndtcc aki-food
20:16garndthrm, I think this might be the bug
20:16garndthttps://bugzilla.mozilla.org/show_bug.cgi?id=1393277
20:17firebotBug 1393277 NEW, aki@mozilla.com cot-verifiable action tasks
20:17garndtoh, maybe not...
20:17dustinI didn't know tests depended on signing
20:18dustinI think there's code in CoT 1.0 to handle action tasks
20:18akixpcshell does
20:18dustinI really don't know tho
20:18akiRyanVM: "add tasks" should work now, unless they've changed since my treeherder patch landed
20:18RyanVMdunno, I triggered all of those ~1hr ago
20:18dustinah "requires-signed-builds"
20:18dustincool :)
20:21RyanVMaki: unless it depends on a TH patch that's still on stage or something
20:21akiyeah, the action task appears to have changed
20:22akisince 13 days ago https://github.com/mozilla/treeherder/commit/8c50b4fff295f5831b04df5f788e042712b0cfa4
20:22akii'll revisit action task verification once it settles; aiui b.stack is still working on em
20:23bhearsumis there someone around who can grant me a few scopes for a github repo?
20:26garndtbhearsum: I can take care of it if you can enter a bug under taskcluster::service request so I don't forget about it
20:26garndtaki: so this is something just related to the action task format?
20:26bhearsumgarndt: yup, sure
20:26bhearsumTIL about that component
20:27garndt:)
20:27akigarndt: yes, cot only supports old add-tasks, and no other action task cot verification, atm. i'm holding off on more action task verification til b.stack is done making changes
20:30bhearsumgarndt: https://bugzilla.mozilla.org/show_bug.cgi?id=1399243
20:30firebotBug 1399243 NEW, nobody@mozilla.org please grant me some scopes for my balrog test repo
20:31garndtaki: ok, we might be nearing the end of changes to the actual action task layout, but we'll need to confirm, maybe I'm mistaken
20:31akiok
20:31garndtI think the most recent change was around the workspace directory being used
20:32akii think actions -> action.json was still in progress as of a week or two ago
20:41bhearsumgarndt: i think you missed secrets:set:repo:github.com/testbhearsum/balrog:*
20:41garndtoh
20:42garndtI copied and pasted what was in the bug, and both of the secrets scopes are "get" :)
20:42garndtI'll fix
20:42bhearsumoh, sorry, heh
20:42garndtshould be ok now
20:44bhearsumlooks like it - thanks for the quick turnaround!
20:45garndtno problem sir
21:02garndtaki: we've moved over to actions.json for backfilling and adding new jobs...I think those tasks should be fairly stable now (assuming no major problem found). Retrigger and cancel do not result in an action task (yet)
21:02akiyeah, retrigger was the main one
21:03akiwaiting for that, and i'll revisit
13 Sep 2017
No messages