mozilla :: #taskcluster

13 Jul 2017
01:05akii'm having trouble signing into the hooks service; login.tc.net sends me back to login.tc.net after clicking the okta button
01:06jonasfjHooks service?
01:07jonasfjjust go to tools.tc.net and login with okta... just checked it works on my end
01:07jonasfjyou can try the task-creator that's where I signed in... but any page should work..
01:07akiyeah, i did via https://tools.taskcluster.net/hooks/project-releng/nightly-desktop-win64%2Fdate
01:08akii log in via okta, and it gives me credentials, but no login
01:08* aki tries chrome
01:09akichrome works =\
01:10* aki restarts nightly and tries again
01:12akistill broken in nightly, but restarting without addons works.
01:13jonasfjworks fine in nightly for me...
01:13jonasfj56.0a1 (2017-07-11) maybe if I update it...
01:14akii think it's because of containers. my problem
01:14jonasfjstill fine..
01:14akithanks for the double check!
01:16jonasfjah, interesting...
01:16jonasfjbtw, wow you guys have a lot of cron jobs :)
01:16aki:)
01:17akione per project, and then a bunch of more customized ones
02:45glandiumthe new live log view cuts out the last line (or more?)
02:45glandiumcan only see the top of it
02:52dustinaki: so it was containers? you're the third one to have that problem..
02:52dustinglandium: yeah, it's a scrollbar
02:52dustinI meant to file a bug about that, but if you could that'd be great
02:54glandiumdustin: filed 1380540
02:55dustinthanks
02:55akidustin: yeah, i have okta open in the work container, and i may have had hooks open in the default container, so i get logged in but the hooks page doesn't have access to the cookie
02:55dustinahhh
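
The container problem aki describes can be pictured with a toy model. This is only a sketch under assumptions: the class, the cookie name "session", and the value are hypothetical, and it is not how Firefox or the login service is implemented. It just shows a cookie jar keyed by container, so a cookie set in the work container is invisible from the default container.

```python
class CookieStore:
    """Toy cookie jar where every cookie is scoped to a container."""

    def __init__(self):
        self._jar = {}

    def set(self, container, host, name, value):
        # The container is part of the key, so containers never share cookies.
        self._jar[(container, host, name)] = value

    def get(self, container, host, name):
        # Returns None when this container has no such cookie.
        return self._jar.get((container, host, name))


store = CookieStore()
# Hypothetical: the login flow ran in the "work" container.
store.set("work", "login.tc.net", "session", "abc123")

in_work = store.get("work", "login.tc.net", "session")
in_default = store.get("default", "login.tc.net", "session")
```

Here in_work holds the value while in_default is None, which matches the symptom of getting credentials but no login on a page opened in a different container.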
07:56glandiumdammit, third time today that I quit firefox because I'm trying to do ctrl+q to unstop a ctrl+s in a loaner shell
10:03jhfordglandium: didn't that *finally* get fixed in nightly?
10:04jhfordi was always told "well we have session restore so it's nbd if you accidentally cmd+q instead of cmd+w"...
10:05glandiumjhford: I'm on beta
11:58gerard-majaxhello tc people
11:58gerard-majaxdo you have some tooling to track over time the time to completion associated with a tc-github worker type?
12:00jhfordmy understanding is that there's only one tc-github worker type, so the stats would be an aggregation of all tc-github jobs
12:01gerard-majaxjhford, I may be misled by the nomenclature here :)
12:01jhfordentirely possible ;)
12:01jhfordso, we have one default 'worker-type', and my understanding is that we do have stats for each 'worker-type'
12:02jhfordi suspect you are asking for stats on the jobs run in one specific repository that uses tc-github
12:02gerard-majaxjhford, https://github.com/mozilla/tensorflow/blob/v1.0.0-warpctc/.taskcluster.yml#L16
12:02jhfordahh, we should have stats for that in signal I suspect, let me see if I can log into it
12:02gerard-majaxjhford, does that qualify as a specific tc-github worker type?
12:02jhfordyes
12:02gerard-majaxokay, I feared it was some higher-level stuff :)
12:02jhfordi thought you were talking about the general worker type we use when you don't specify one :)
12:03gerard-majaxI'm curious to see: how things got improved when we switched the underlying aws types for that workerType, and how it degrades when we add stuff in the PR against master
12:04jhfordlet me see if I can pull that up
12:04jhfordi'm not an expert on worker stats, but i'll give it a go
12:07jhfordgerard-majax: i think i've found it
12:08jhfordhow many days back do you care about?
12:08gerard-majaxjhford, as much as you can :)
12:08gerard-majaxjhford, how long can you go back at most ?
12:08gerard-majaxI guess 3 months would be enough
12:09jhfordi have a series from oct 2016 to may 2017 and then a break, then until now
12:10jhfordgerard-majax: did you get my pm?
12:10jhfordit's for the 95th percentile and 50th percentile for runtime
12:10jhfordi could break it down by instance type too i guess
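
For anyone wanting to reproduce the 50th and 95th percentile figures from their own task data, a minimal sketch follows. The runtimes are made-up numbers, the function name is hypothetical, and this is not how the stats service computes its series; it only uses the standard library.

```python
import statistics


def runtime_percentiles(runtimes_s):
    """Return (p50, p95) for a list of task runtimes in seconds."""
    if len(runtimes_s) < 2:
        raise ValueError("need at least two data points")
    # quantiles(n=100) returns 99 cut points: index 49 is the 50th
    # percentile and index 94 is the 95th.
    cuts = statistics.quantiles(runtimes_s, n=100, method="inclusive")
    return cuts[49], cuts[94]


# Made-up runtimes in seconds for one worker type.
sample = [600, 620, 640, 700, 710, 730, 800, 900, 1500, 3000]
p50, p95 = runtime_percentiles(sample)
```

Breaking the numbers down by instance type would amount to grouping the runtimes by instance type first and calling the function once per group.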
12:31jlorenzopmoore|away: hi! I noticed XPCShell tests can't print out strings on the windows generic-worker https://bugzilla.mozilla.org/show_bug.cgi?id=1380627. Have you seen a similar issue before?
12:31firebotBug 1380627 NEW, nobody@mozilla.org [windows] XPCShell tests: Cannot print out a string in the logs
13:51pmoorejlorenzo: i haven't seen that before - but could it be an xpcshell problem, rather than a worker problem?
13:51pmooreiiuc, this isn't logging? https://hg.mozilla.org/try/rev/92744b4bc185b0314b0238aa8f3c7fec07419cd3
13:52jlorenzopmoore: that's right, but only on tc-win
13:52jlorenzothe other platforms, like tc-linux and bb-win are logging
13:52jlorenzoso, I'm unsure where to investigate
14:02pmoorejlorenzo: https://bugzilla.mozilla.org/show_bug.cgi?id=1380627#c2
14:03firebotBug 1380627 INVALID, nobody@mozilla.org [windows] XPCShell tests: Cannot print out a string in the logs
14:03pmoore;)
14:04pmooreit looks like the test seems to think the PATH entries are on the C: drive, and finds them on the Z: drive
14:04pmooreon buildbot, they are on C: drive, in taskcluster, they are on Z: drive (SSD storage)
14:04* jlorenzo facepalms
14:04pmooreor something like that
14:05jlorenzowow, thank you for pointing this out!
14:05jlorenzothe Z: drive sounds like a probable explanation
14:06pmooreyeah, it was hard to understand the output of the error, but i expect something like that is at play
14:07* jlorenzo triggers a new try run, with the logs at the right place
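
A quick way to check the drive-letter theory is to list the PATH entries that are not on the expected drive. This is a sketch under assumptions: the example PATH value and the function name are invented for illustration, and ntpath is used so the check also runs on a non-Windows machine.

```python
import ntpath


def entries_off_drive(path_value, expected_drive="C:"):
    """Return PATH entries whose drive differs from expected_drive."""
    off = []
    for entry in path_value.split(";"):
        if not entry:
            continue  # skip empty segments such as a trailing ';'
        drive, _rest = ntpath.splitdrive(entry)
        if drive.upper() != expected_drive.upper():
            off.append(entry)
    return off


# Hypothetical PATH value mixing C: and Z: entries.
example_path = r"C:\Windows\system32;Z:\task_123\build\bin;C:\mozilla-build\python"
unexpected = entries_off_drive(example_path)
```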
15:03jhfordhttps://public.etherpad-mozilla.org/p/dustin-worker-dashboard
17:40bstackgps: iiuc, pushlog_id per project in hg is guaranteed to be (N + X) later than N, but there is _not_ a guarantee that given an N, you can find the next push by going to (N + 1). Does that seem correct?
17:42gpsbstack: correct
17:42bstacksweet, ty
17:42gpswell, usually
17:42gpsif we reset repos, the pushlog can reset
17:42gpsbut we only do that for e.g. projects/*
17:43bstackmakes sense
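
The ordering guarantee discussed above (later pushes have larger IDs, but IDs are not necessarily consecutive) suggests looking up the next push by search rather than by N + 1. A minimal sketch, assuming you already hold a list of known pushlog IDs for one repository; the function name is hypothetical and it does not handle the repo-reset case gps mentions.

```python
import bisect


def next_push_id(known_ids, n):
    """Return the smallest known pushlog ID greater than n, or None."""
    ids = sorted(known_ids)
    # bisect_right gives the index of the first ID strictly greater than n.
    i = bisect.bisect_right(ids, n)
    return ids[i] if i < len(ids) else None


# Example with gaps: 12 and 13 do not exist.
ids = [10, 11, 14, 20]
after_11 = next_push_id(ids, 11)
after_20 = next_push_id(ids, 20)
```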
18:29gpsgarndt, et al: i suspect the tree closure may have to do with a TC worker change
18:29gpshttps://bugzilla.mozilla.org/show_bug.cgi?id=1380381#c11
18:29firebotBug 1380381 ASSIGNED, mshal@mozilla.com build symbols missing on macOS/OS X, unhelpful crash signatures like [@ XUL + 0xddb7c]
18:31bstackg.ardnt is pto at the moment, but I think we can wild-goose-chase with you
18:32* bstack starts checking for geese
18:33mshalty!
18:36bstackwcosta: you are the resident osx expert on the team (sorry) ^
18:36bstackcan you help look into this?
18:37bstackmaybe a worker change 21 days ago or so?
18:37* wcosta scrolls back
18:37bstackI'm not quite sure how all of the osx stuff happens :|
18:38bstackit's that bug right there ^ that explains it all
18:38bstackbrb
18:45bstackany thoughts, wcosta? Can I be helpful in some way?
18:46wcostabstack: I am looking for all possible related TC changes we made on Jun 22. I think there's nothing you can do right now
18:46wcostaI'll take care of it
18:46bstackok, ty. I will eat food in that case.
18:47bstackjust yell if I can do something and I'll hop back on
18:48wcostaok, enjoy your food
18:58gpsis there a worker inspector tool? looking to view history of a particular worker
19:09jonasfjNope, but hassan is in the process of planning APIs and internal data store necessary to build one...
19:12wcostagps: I am preparing a test patch to run llvm-dsymutil under strace to hopefully figure out why it fails sometimes
19:13Callekdustin: following on from yesterday, https://tools.taskcluster.net/groups/IIyV1KCaRxyXMMkXWykKyA/tasks/IIyV1KCaRxyXMMkXWykKyA/details click Actions->Edit Task, then change the deadline to July 14th, then click update timestamps, it *forces* it down to +1 hour from `now`
19:13Callekdustin: so yea, I can confirm that's an issue
20:16Callekdustin: if garndt is still out : can you triage the n-i in Bug 1380381?
20:16firebothttps://bugzil.la/1380381 ASSIGNED, mshal@mozilla.com build symbols missing on macOS/OS X, unhelpful crash signatures like [@ XUL + 0xddb7c]
20:16Callekplz and ty
20:16dustinsure
20:17dustinCallek: re deadlines
20:17dustincreated: '2017-07-13T20:16:54.421Z'
20:17dustindeadline: '2017-07-13T21:16:54.421Z'
20:17dustinthat's only 1 hour apart already, unless my eyes are crossed
20:17dustinoh, hang on
20:17Callekdustin: it wasn't initially, and yea if you push it past that 1 hour and then update again it will revert back to 1 hour
20:17dustinyeah, https://tools.taskcluster.net/groups/IIyV1KCaRxyXMMkXWykKyA/tasks/IIyV1KCaRxyXMMkXWykKyA/details has the deadline 1h after created
20:18dustinhmm
20:18dustinok, yeah
20:18dustinI wonder if that feature got reverted in the Great Tools Rewrite
20:20CallekIt's not a big deal overall, but figured I'd call it out as a surprise :-)
20:20dustinhttps://bugzilla.mozilla.org/show_bug.cgi?id=1359468
20:20dustinyeah
20:20firebotBug 1359468 REOPENED, dustin@mozilla.com In task-creator, adjust *all* timestamps relative to created
20:20dustinI'm surprised too since I fixed that :)
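
The behaviour the bug title asks for, adjusting all timestamps relative to created, can be sketched as shifting every timestamp by the same delta so a hand-edited deadline keeps its offset. This is an illustration only, not the tools code: the function name is hypothetical, the task is reduced to the two fields quoted above, and the 24-hour deadline stands in for Callek's edit to July 14th.

```python
from datetime import datetime, timedelta, timezone


def shift_timestamps(task, new_created):
    """Return a copy of task with every datetime moved by the same delta."""
    delta = new_created - task["created"]
    return {
        key: value + delta if isinstance(value, datetime) else value
        for key, value in task.items()
    }


task = {
    "created": datetime(2017, 7, 13, 20, 16, 54, 421000, tzinfo=timezone.utc),
    # Deadline edited by hand to 24 hours after created.
    "deadline": datetime(2017, 7, 14, 20, 16, 54, 421000, tzinfo=timezone.utc),
}
new_created = datetime(2017, 7, 13, 22, 0, 0, tzinfo=timezone.utc)
shifted = shift_timestamps(task, new_created)
```

With this approach the deadline stays 24 hours after created instead of being forced back to one hour.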
20:20dustinok
20:20dustinbuild symbols, one moment
20:23dustinCallek: ok, I'm up to date.. what's the question/request?
20:23dustinwell I read the bug anyway
20:23Callekdustin: the build symbols stuff is more a "what could have changed...." kind of fishing expedition
20:23Callekdustin: gps is helping to drive it, and aiui trees are closed for that bug
20:24CallekI only saw the comment because I watched the tcmigration `related` bugs
20:24* Callek has to run now, but be back later
20:24dustinit looks like wander figured it out
20:25dustingps: btw, worker explorer RFC is https://github.com/taskcluster/taskcluster-rfcs/issues/74
20:32jmaherI was wondering who is working on getting talos to run on windows via BBB; I believe we need to modify configs to make that work
20:32dustingrenade mostly?
20:32dustinthat'd be my first guess anyway
20:33jmaherok
20:33jmahergrenade: are you around?
20:33jmaherthanks dustin
20:34catleejmaher: I have it running on date
20:34catleehttps://treeherder.mozilla.org/#/jobs?repo=date&filter-searchStr=talos
20:35catleeat least I think I do
20:35* dustin counts his guess as wrong :)
20:35catleehttps://bugzilla.mozilla.org/show_bug.cgi?id=1379789
20:35firebotBug 1379789 NEW, catlee@mozilla.com Enable Windows BB tests on date branch
20:36catleehttps://bugzilla.mozilla.org/show_bug.cgi?id=1379661
20:36firebotBug 1379661 NEW, nobody@mozilla.org Run Windows Talos, HW and other unmigrated tests via BBB
20:43jmahercatlee: oh great; that is not a lot of changes
20:44catleejmaher: yeah, it was surprisingly easy
20:44jmaheryay taskcluster
20:44catleeI was hoping to tackle the rest of the tests this week
20:44catleebut things have been a bit hectic at home
20:46jmahercatlee: yeah, I will validate perf on the builds one more time
20:47catleeawesome, thank you!
20:47* catlee crosses fingers
20:47catleeI guess there should be data in perfherder now from date
20:48jmaheroh, let me see, that might work
21:22jmahercatlee: if you are still around, do you know if the builds on date are pgo or opt?
21:25catleejmaher: I think we have both pgo and non pgo builds
21:25catleelet me check which ones the talos tests are running off of
21:25jmaheryeah, I see pgo, opt, nightly builds
21:25jmaherI am trying to trace down the task that generated the build we are using
21:26jmaherok, win 2012 x64 opt (B)
21:26catleeyeah, I think it's the non-pgo build
21:27catleeI guess we should run on pgo builds too
21:28jmaherit appears the win7 stuff is using the nightly build
21:28catleewhich win7 stuff?
21:28jmahersorry, the talos jobs on date
21:28catleeI haven't done anything there in particular
21:29catleeI think we trigger talos from nightly too
21:29jmaheryeah, I will validate a few things- a few early data points show that tc is much faster than bb builds
21:29catleeyeah
21:29catleehttps://hg.mozilla.org/projects/date/rev/538d1fc9337354100284535d37084f0ffaaaede9
21:30catleenightly would be PGO
21:30jmahergot it
21:31jmaherhmm, will have to sort that out over time, let me get a few other data points- I should have enough data for an analysis tomorrow am
21:33jmaheractually I only see win7 data for nightly builds and it looks like win10 (x64) gets two data points per push when there is a nightly- one for opt another for nightly
21:34dmoseon one-click loaners, when I select the "clone gecko" option from the wizard, does it leave the exact mozconfig for the build that's failing tests somewhere?
21:34jmaherahal: ^ ?
21:35catleejmaher: ok, so we should detangle the opt and nightly data
21:35catleeI wonder how this is working for linux
21:35dmosei'm trying to debug a race, which is what makes having the right mozconfig so important
21:36catleeah, I bet talos wasn't turned on for on-push win32
21:39tomprinceWhat bugzilla component would changes to http://gecko.readthedocs.io/en/latest/taskcluster/taskcluster/caches.html be under?
21:40gpsbstack, wcosta: i think TC is off the hook for this dsymutil issue
21:41gpsit was a worthwhile wild goose chase though. thanks for helping.
21:41bstack\o/
21:41wcosta\o/
21:41bstacknp, I didn't really do anything :p
21:41wcostaneither did I
21:41wcostainteresting how many weird ci issues we had this year
21:43wcostacross compile performance, dsymutil, home directory change busting valgrind tests....
21:46dmosegps: do you know about 1-click loaners and how to get the correct mozconfig?
21:48gpsdmose: define "correct mozconfig." what are you trying to do?
21:48dmosegps: i have a failing test that's racy. i can reproduce it on a 1-click loaner.
21:49dmosegps: i want to now modify the code to test various hypotheses. since it's a race, i want it to be as close as possible to binary running the test
21:49dmoseso that i can actually continue to reproduce it
21:49dmose(i have never been able to repro it locally)
21:50gpsdmose: i don't think you can easily compile from a *test* 1 click loaner
21:50dmosegps: huh. then why does the wizard offer to clone gecko for me?
21:50gpsbut if you want to build from a build loaner, you can load up the logs for an automation job and set MOZCONFIG to whatever it uses
21:51gpsso you can use mach
21:51gpsmaybe you can compile from a test loaner. i've never done it.
21:51gpsi didn't think it had all the packages needed for building
21:51dmoseit might well not
21:51dmosehmm, maybe it'll be easier to unzip the omnijar, edit the js, and rezip
21:51dmoseyucko
21:52gpsif you do an artifact build, it should "just work"
21:53dmosegps: in that case, i could just create a simple mozconfig, yeah? (ie and not worry about all the details)
21:54jonasfjtomprince: if you want new scopes issued, taskcluster / service-requests; if you just added more docs, I'm guessing "taskcluster / task configuration" and well, just r? dustin I suspect
21:55gpsdmose: yes, just ac_add_options --enable-artifact-builds
21:55* tomprince was trying to figure out HG_STORE_PATH, and thought they should leave documentation for the next person.
21:55dmosegps: awesome; thanks!
21:55jonasfjtomprince++
21:58dmosegps: where would i need to poke to teach the artifact builder to pull from pine instead of mozilla-central?
21:59dmoseor is that useless because there won't be artifacts for pine available to download?
22:09gpsin theory it is possible
22:10gpsbut i'm looking at the source code for `mach artifact install` and i'm scratching my head
22:10gpsyou can hack that up in python/mozbuild/mozbuild/mach_commands.py around line 1666
22:10gpsor in mozbuild/artifacts.py
22:10gpsset "tree" to "pine"
22:15dmosegps: awesome; thanks
22:38nthomasis there a general explanation for hg clone failures like https://mozilla-release-logs.s3.amazonaws.com/mozilla-beta/devedition-55.0b9/build2/Generate_funsize-update-generator_docker_image-all-P22kPsDmQCaVPNOtvzBg7g-3 ?
22:38nthomashg update I guess
22:40dustinrobustcheckout isn't that robust? :)
22:45dustinseriously though I think hg is just one of those things that fails sometimes
23:05dmosegps: it's weird, i had to add it to candidate trees in artifacts.py, which got it to actually check pine, but it's not finding the pushhead
23:05dmosegps: which is weird because the pushhead is definitely there
23:07gpsnthomas: https://github.com/mozilla/normandy/issues/757#issuecomment-303526887
23:08gpsso it's a bug in the task config not using the helper function to prepare a task for vcs checkouts
23:08nthomasah hah, thanks for the pointer
23:09dmose i wonder if i confused the artifact cache
23:10dmosebah, blowing away the pickled caches didn't fix
23:12dmoseooh, i bet it has to be projects/pine
23:14dmosew00t!
23:15nalexanderdmose: w00t!
23:15dmose:-)
23:23dmoseis there any way to easily request that a specific one-click loaner NOT be a spot VM that can be killed?
23:23dmosei've had three of these things get killed out from under me
23:23dmoseand I need to debug this race
23:38gpshttps://tools.taskcluster.net/index/artifacts/gecko.v2.mozilla-central.pushlog-id.-1/decision-nightly-mochitest-valgrind good times