mozilla :: #taskcluster

8 Aug 2017
08:57glandiumwindows workers don't have scopes?
10:26tedi really wish the task creator would give me some feedback about why my task definition isn't valid beyond just making "create task" unclickable
14:08armenzgdustin: hi! A couple of weeks ago I tried using "schedule missing tests" and did not work; it was known at the time; is this something that is now fixed?
14:08dustinmaybe!
14:08dustinglandium: windows workers have scopes...
14:14armenzg:)
14:20armenzgdustin: you pointed me to a bug with regards to scopes (you being able to schedule but not other people) - here's the issue I was facing https://treeherder.mozilla.org/#/jobs?repo=try&author=armenzg@mozilla.com&selectedJob=118083870
14:21dustinhttps://bugzilla.mozilla.org/show_bug.cgi?id=1383066
14:21firebotBug 1383066 FIXED, bstack@mozilla.com Set schedulerId on action tasks
15:22jonasfjhassan, Eli: t.ed has a point... the task-creator does display schema errors, but it doesn't do anything for YAML errors -- not sure when that disappeared, probably when we switched from JSON to YAML :)
15:55hassanEli: i can look into it if you haven't jumped on it already ^
15:55Elihassan: go for it :)
15:55hassan++
17:32armenzg_lunchdustin: bstack what is the scope of the re-working of action tasks?
17:33dustinI'm not sure what the question means
17:34armenzg_mtgdustin: I want to know what things will be fixed in-tree as changes to Treeherder
17:34armenzg_mtgis there also a test plan for this?
17:35dustinI filled out the TPS reports
17:35armenzg_mtgwhat is TPS?
17:36dustinit's a reference to Office Space :)
17:36dustinit sounds like you're asking what PM paperwork we've done
17:36dustini don't think there's a Test Plan
17:36dustinI'm not really sure what that is
17:36bstackMy plan is that everything should work and nothing should not work :p
17:37bstackI see your updates to the bugzilla stuff today. I'll make sure to take that all into consideration as I poke at it further.
17:37bstackIs there something else specific we should keep in mind?
17:38garndtdustin: you don't want to know what a test plan is
17:38garndtI have written way too many of them, and they suck :)
17:39bstackhere's the fundamental issue of action tasks and why a test plan (if it is what I think it is) will be hard to make:
17:39bstackAn action task is not well defined beyond the realm of "it used to do X, I think" and everyone has a different X
17:39dustinand forgets the Y and Z it also used to do
17:39bstackyeah
17:40dustine.g., retriggering cancelled tasks
17:40bstackit's the same issue as try and such
17:40dustinand, oh yeah, failed tasks
17:40dustinyep
17:40bstackso the only real way forward is to build a thing and then wait for the bug reports to roll in :/
17:40dustin..and also not be solely responsible for fixing them
17:41bstackhopefully the new action tasks will make it easier for us to not be entirely responsible
17:41bstackbut also, what is a test plan?
17:42dustinthere are 8 ISO standards for Test Plans
17:43jgrahamISO or IEEE?
17:44dustingood catch, IEEE
17:44dustin*someone* is paying attention!
17:53jonasfjhassan: I only found two minor things in the queue PR -- otherwise everything looks awesome :)
17:55armenzgbstack: dustin garndt maybe my use of the term "test plan" is incorrect; I wasn't referring to any formal term. Let me unpack; are there any tests being run (or planned to) that will run to help these action features not to regress
17:55hassanjonasfj: awesome, thanks for the review!
17:56armenzgalso, I want to know how far are your intents to fix aiming for
17:56jonasfjhassan: btw have you seen: https://sentry.prod.mozaws.net/operations/taskcluster-queue/issues/634613/#
17:56armenzgfor instance, are you going to replace the current menu items (e.g. "add missing jobs")
17:56armenzgto a call to "missing test" action
17:58armenzgI was asking for a general kind of document so I could read it first
17:58hassanjonasfj: yeah. dustin iirc that error was fixed here: https://github.com/taskcluster/taskcluster-queue/commit/f2f7d79e3d4d7a770ae184537d00ddc09a116193
17:58jonasfjhassan: oh, nvm I see dustin marked it as resolved... dustin any idea what happened with provisionerId == undefined ?
17:59dustinyeah
17:59dustinit was stupid :(
17:59dustinwe wrote 11 provisioner rows into the TaskDependency table
17:59dustinso trying to scan that table as if it was the Provisioners table caused exceptions
18:00jonasfjoh, so we've polluted the TaskDependency table in production?
18:00tedhah, do l3 users seriously not have scopes to create tasks at any priority except "very-low"?
18:00dustinand would have caused exceptions scanning it as TaskDependencies if I hadn't managed to delete the 11 rows before the nightly run
18:00dustinyep
18:00dustinted: we can give you "wicked low" if you want
18:00jonasfjokay, thanks for cleaning those up... :)
18:00jonasfjdustin: ++
18:00teddustin: haha
18:00dustinseriously, no, I think they have very-high
18:00Callekgarndt: ping, if I have a n-i to set to someone about a tc windows builder (try) thats handled via OCC who is my best bet?
18:01Callekgarndt: I know pete and grenade are both pto
18:01jonasfjpriority is per workerType...
18:01teddustin: the task creator gives me an error if i try to put any priority in, actually
18:01garndtCallek: wcosta can try to help out if it's urgent. What's the problem?
18:01bstackarmenzg: anything we have written down about this stuff is in bugzilla
18:02bstackand _should_ all be linked to the one root bug that you've seen
18:02bstackbut I may have missed something somewhere
18:02Callekgarndt: tl;dr try jobs in buildbot (and for other platforms) have `gapi.data` even on try builders, but a test is failing on win testing due to that missing for try workers
18:02CallekI was n-i'd on it and was briefly confused, so not sure what the path forward is but hopefully <someone> can help advise with those concerned with the test.
18:02wcostaCallek: do you have the link for the task?
18:02armenzgbstack: do you want me to file any bugs I think are missing and then decide if this is something that could be tackled?
18:02wcostaOr the bug
18:03Callekwcosta: https://bugzilla.mozilla.org/show_bug.cgi?id=1385613
18:03firebotBug 1385613 NEW, nobody@mozilla.org Intermittent test_safe_browsing_initial_download.py TestSafeBrowsingInitialDownload.test_safe_browsi
18:03armenzgI would add it to the root
18:03Callekwcosta: I'm going to put a comment in there, I'll n-i you
18:03wcostaCallek: ok
18:03garndtwcosta: we'll have to figure out how those tokens get in there and what the right one is for try instances. I think it might be in OCC
18:04bstackarmenzg: yeah, that seems like a good place for it :)
18:04garndtI had to run out but could help debug later
18:04armenzgthank you bstack
18:04jonasfjted: a schema error? or?
18:04bstackno problem
18:04tedjonasfj: "you do not have permission..."
18:04tedsorry, lost the exact error
18:04jonasfjwhat workerType
18:05tedgecko-t-linux-xlarge
18:05garndtIt's possible the level 3 role is still using the legacy create task scope
18:06jonasfjyeah, I think we're only giving: queue:create-task:very-low:aws-provisioner-v1/gecko-1-*
18:06jonasfj..
18:07jonasfjso tree-level-1 only has very-low (probably not much runs at higher priority)
18:07jonasfj(but I think we want to give more, so we can do loaners at higher priority)
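The scope jonasfj quotes ends in `*`, which Taskcluster treats as a prefix wildcard. Below is a minimal Python sketch of that one rule only; role expansion and any priority handling in the queue are left out. The function names and the `gecko-1-b-linux` worker type are hypothetical, and the `high` priority in the second check is just an illustrative value.

```python
def scope_satisfies(granted: str, required: str) -> bool:
    """Return True if a single granted scope covers the required scope."""
    if granted.endswith("*"):
        # A trailing '*' matches any scope that shares the prefix before it.
        return required.startswith(granted[:-1])
    # Without a wildcard the scopes must be identical.
    return granted == required


def any_scope_satisfies(granted_scopes, required: str) -> bool:
    """Return True if at least one granted scope covers the required scope."""
    return any(scope_satisfies(g, required) for g in granted_scopes)


# The grant quoted in the channel.
GRANTED = ["queue:create-task:very-low:aws-provisioner-v1/gecko-1-*"]

# Hypothetical level-1 worker type at very-low priority.
level1 = "queue:create-task:very-low:aws-provisioner-v1/gecko-1-b-linux"
# The worker type ted mentions, at an illustrative higher priority.
ted = "queue:create-task:high:aws-provisioner-v1/gecko-t-linux-xlarge"

print(any_scope_satisfies(GRANTED, level1))
print(any_scope_satisfies(GRANTED, ted))
```

With only that grant, the first check passes and the second fails, since neither the priority nor the worker type falls under the granted prefix.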
18:12armenzgcatlee: should cancel requests go through TaskCluster or BuildAPI?
18:12armenzgbstack: ^
18:12armenzg"cancel all" jobs
18:12armenzgrather than "cancel *this* build"
18:13bstackif it is an action task we're going to define it in terms of tc-only
18:13bstackfor simplicity and functionality's sake
18:13armenzgbstack: is 'mozilla-taskcluster' defunct?
18:13armenzgor not yet?
18:13bstacknot yet
18:13bstackbut it will be
18:13armenzgwhat's left?
18:14bstackthings that are getting replaced by action tasks and something tailing the pushlog
18:14armenzgalso is "cancel all" handled via an action task?
18:22tedjonasfj: not a critical thing, i've just been creating this task many times and noticed it took a while to get scheduled sometimes
18:24bstackarmenzg: we haven't written a cancel-all thing yet
18:25bstackthat probably doesn't make as much sense to be an action task
18:25armenzgbstack: how will you handle it?
18:25bstackbut rather just a thing that interacts with the tc api
18:25armenzgright now 'mozilla-taskcluster' takes care of it
18:25armenzgno?
18:25bstackalthough... maybe cancel-all is enough work that an action task would make sense
18:26bstackI assume that's a moz-tc thing, yeah
18:26armenzgbstack: also, create new nightly builds on a revision
18:26armenzgis that being handled?
18:26bstackgarndt: ^ is cancel-all a moz-tc thing?
18:26garndtI believe cancel all just issues individual cancel requests as pulse messages that moz-tc listens to
18:26bstackthe new-nightly-builds thing is not planned work for me currently
18:26garndtIn this case, it should just call cancel API for each tc task I think
18:26bstackbut it can be if we think it should be
18:27bstackalthough I think that might be a better task for somebody who knows what those words mean :p
18:27garndtThat seems out of scope for what you were originally taking care of
18:28garndtThis was mostly to make backfilling, retrigger, cancel, and adding new jobs easier when they're defined in tree
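garndt's suggestion for cancel-all (call the cancel API once per task) can be sketched roughly as follows. This is an illustration, not how mozilla-taskcluster or any action task implements it: `cancel_all` and the `cancel_task` callable are hypothetical names, and in practice the callable might wrap the queue's task-cancel endpoint. The task IDs in the usage lines are placeholders.

```python
def cancel_all(task_ids, cancel_task):
    """Issue one cancel request per task and report what happened.

    cancel_task is any callable taking a task ID (hypothetical here);
    a failure on one task does not stop the remaining requests.
    """
    cancelled = []
    failed = {}
    for task_id in task_ids:
        try:
            cancel_task(task_id)
        except Exception as exc:
            # Record the error and keep going with the other tasks.
            failed[task_id] = str(exc)
        else:
            cancelled.append(task_id)
    return cancelled, failed


def fake_cancel(task_id):
    """Stand-in for a real API call, so the sketch runs offline."""
    if task_id == "task-b":
        raise RuntimeError("task already resolved")


cancelled, failed = cancel_all(["task-a", "task-b", "task-c"], fake_cancel)
print(cancelled)
print(failed)
```

Collecting failures instead of raising on the first one means a single already-resolved task does not leave the rest of the push running.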
18:28armenzgOK
18:28armenzgI'm going to document everything I know
18:28armenzgbecause I forget
18:29armenzgand secret features arise
18:29armenzgthe day we shut off services
18:29tedhah
18:30armenzgwhat a wild west :S
18:30armenzgI bet not many even know about triggerbot!
18:30garndtWhat the heck is that?
18:30armenzgexactly :)
18:35armenzgchmanchester: does trigger-bot still run?
18:35armenzgI can't see it in the list of Heroku apps
18:36chmanchesterarmenzg: I checked on it about a week ago, it's still going
18:37chmanchesterI thought it was going to be a no-op since the windows move, but there are still tests running in buildbot
18:37armenzgchmanchester: the main purpose is to re-run orange jobs on try
18:37armenzganything else?
18:37armenzgchmanchester: win8/win10 are still running but they're scheduled via BBB
18:37armenzgwhat does it do with non-BBB jobs?
18:38armenzghrmm
18:38chmanchesterarmenzg: it implements "--rebuild" and "--rebuild-talos", which I think is the main thing it's doing now
18:38chmanchesterarmenzg: nothing, I'd think!
18:38armenzgchmanchester: does it rerun orange jobs anymore?
18:39chmanchesterarmenzg: I believe it would for buildbot tests
18:39armenzgOK
19:28jonasfjgps: so is mozreview a publishing repository? (I'm trying to guess why histedit won't work for me)
19:29dustinit's not
19:30dustin`hg phase` can reset the phases when they get messed up
19:30wcostaCallek: https://bugzilla.mozilla.org/show_bug.cgi?id=1385613#c11
19:30firebotBug 1385613 NEW, nobody@mozilla.org Intermittent test_safe_browsing_initial_download.py TestSafeBrowsingInitialDownload.test_safe_browsi
19:31Callekwcosta: interestingly I n-i'd you because garndt pointed me at you when I asked about pete and rob both being away :/
19:32wcostaCallek: well, I am covering them, but occ is something completely out of my knowledge
19:32Callekdamn
19:32Callekwe should increase bus factor.... somehow...
19:32wcostabut if nobody else can dig in it, it is wcosta's duty
19:33jonasfjdustin: yeah, it's not phases... I listed them.. maybe it's because I have bookmark or something..
19:33wcostawhen I talked to pete, there was a very small probability I had to touch occ for anything serious but change g-w version
19:35* wcosta is just *that* unlucky guy
19:37jonasfjhttps://irccloud.mozilla.com/pastebin/MdfedRqL/
19:37jonasfjdustin: ^any clues?
19:38jonasfjnormally I would expect "hg histedit" to just magically work...
19:38dustintry just 'hg histedit'
19:38jonasfjI tried..
19:38dustinit should just edit the draft commits
19:38dustinwhat does it do?
19:38jonasfjit allows me to edit just the first commit
19:38jonasfjie. the "Fix ups mach error" commit
19:39jonasfjI used to have multiple bookmarks... but deleted those...in case they created something weird..
19:39dustinthey don't
19:40dustinhuh
19:40dustindo you have a hash collision?
19:40garndtwcosta: I had to run out unexpectedly but let's talk about occ when I'm back
19:40dustinline 5 and line 15 look the same
19:41jonasfjoh, so they have the same parent..
19:41jonasfjso I didn't base my fix up on top of my changes...
19:41* jonasfj confused...
19:41dustinhaha
19:41jonasfjI keep forgetting that "hg log" is garbage
19:41dustinindeed
19:41jonasfjit's not log of what I am at...
19:41jonasfjit's log of repo...
19:41dustinright
19:41dustinI have an 'hg wip' that's not bad
19:42dustinhttps://irccloud.mozilla.com/pastebin/UmRO5Ij9/
19:42jonasfjhg wip looks wrong here too:
19:42jonasfjhttps://irccloud.mozilla.com/pastebin/lPPpYRnn/
19:43dustinwell, it's accurate, it just shows that your repo is funny-shaped :)
19:43dustinhg rebase can fix that
19:43jonasfj@ <-- is where I am (my fixup)
19:43jonasfjoh...
19:43dustinkeep in mind that hg rebase doesn't use half-open intervals like git does
19:43jonasfjso I can't edit 99c9b5dd6453 because it has two children
19:43jonasfjnow it makes sense..
19:43dustinright
19:44jonasfjso this seems unfixable now... :) maybe I can rebase it into position...
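jonasfj's diagnosis comes down to a fork in the draft commits: one changeset with two children. Here is a small Python sketch of spotting such forks from a child-to-parents map. It models the graph only, not Mercurial's own histedit checks, and every name other than 99c9b5dd6453 is made up for illustration.

```python
from collections import defaultdict


def revisions_with_multiple_children(parents):
    """Map each revision that has more than one child to its sorted children.

    parents maps a revision ID to the list of its parent revision IDs.
    """
    children = defaultdict(list)
    for rev, rev_parents in parents.items():
        for parent in rev_parents:
            children[parent].append(rev)
    return {rev: sorted(kids) for rev, kids in children.items() if len(kids) > 1}


# Hypothetical history: the fixup was committed on top of 99c9b5dd6453
# instead of on top of the later work, so 99c9b5dd6453 has two children.
history = {
    "99c9b5dd6453": ["public-base"],
    "later-work": ["99c9b5dd6453"],
    "fixup": ["99c9b5dd6453"],
}

print(revisions_with_multiple_children(history))
```

Rebasing the stray child onto the tip of the other branch makes the history linear again, and the function then reports no forks.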
19:46gps`hg show work` from 4.2+ is equivalent to wip
19:46gpsthe version in 4.3 can be used as a replacement for wip
19:47wcostagarndt: no worries, it feels like the bug is low priority
19:47wcosta^Callek
19:48jonasfjwow, by pure luck I guessed the right "hg rebase -r .. -d .." command...
19:48jonasfj:)
19:48dustinheh
19:49garndtwcosta: it's been open for 10 days, that's my assumption as well
19:49jonasfjrebase is so hard (in git too) because there are more concepts than I care to read about...
19:49jonasfjbase, source, destination, revision, I give up...
19:49wcostamaybe Gods were just waiting the "right" time
19:49Callekgps: `hg show work` is pretty crappy for my needs....
19:49wcostaand it is set as P5, so....
19:49Callekgps: https://irccloud.mozilla.com/pastebin/SvXRS90W/
19:50Callektail'd line from show work has `26ca0 (MOBILE470b1_2016042520_RELBRANCH) Added FENNEC_47_0b1_RELEASE FENNEC_47_0b1_BUILD1 tag(s) for changeset b5dc09e14bc3. DONTBUILD CLOSED TREE a=release` at the end of a TON of `:` stuff
19:51jonasfjbut now hg histedit worked and everything is awesome :)
19:52dustin`hg show work` does nothing for me
19:52dustin4.2.2
19:53gpsdo you have the "show" extension enabled?
19:53gpshg --config extensions.show= show work
19:55dustinah, no
20:33glandiumdustin: scopes are not appearing on the task inspector, and I don't see any scope added for sccache for generic-worker, contrary to docker-worker... so I really don't know how it's supposed to be working
20:33dustinwhat task?
20:34dustinmost test tasks don't require any scopes
20:34glandiumdustin: windows builds
20:34glandiumpicking a random one from inbound: https://tools.taskcluster.net/groups/fX2rqWfyT5ytQjuNan52Cw/tasks/YwXY7fDhRpOrqNLeA65pDA/details
20:35glandiumcompare with: https://tools.taskcluster.net/groups/fX2rqWfyT5ytQjuNan52Cw/tasks/Oe2aoWr_Qw-A60xPSiy8bQ/details
20:37glandiumand there is no equivalent to https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/transforms/task.py#599 for generic-worker
20:46dustinah, I thought ted had that working
20:46dustinalthough if it requires taskclusterProxy, probably not
20:55glandiumdustin: it does work on windows.... but I don't know how
20:55dustinme neither!
20:57glandiummy only guess is that it's handled at the IAM level, and I hope we have different pools of workers for level 1 and level 3
20:57dustinbuilders, yes
20:57dustinthat sounds vaguely familiar
22:20hassanjonasfj: pushed the last nit :)
22:20jonasfjCool,
22:21jonasfjhassan: I think you can merge.... But maybe watch for errors on paper trail and heroku and be ready to rollback just in case..
22:23dustinwcpgw
22:20hassanok, i'll merge tomorrow morning then
9 Aug 2017
No messages
   