mozilla :: #taskcluster

11 Sep 2017
06:45 <whimboo> hi all. hm is the index broken? When I select builds from eg 170413 I also get builds listed from other days
06:45 <whimboo> https://tools.taskcluster.net/index/artifacts/gecko.v2.autoland.pushdate.2017.04.13
08:26 <Aryx> sfraser: hi, busted gecko decision task for 1342392 https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=7f5ae5ff11812721ae7521715729489ddb2643f3&filter-resultStatus=exception&filter-resultStatus=usercancel&filter-resultStatus=runnable&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=retry
08:28 <sfraser> Aryx: yeah, just spotted that. an hg rename hadn't made it into the patch. Sorry. Changing now
08:28 <Aryx> thank you, use "CLOSED TREE" in the commit message
08:30 <sfraser> Aryx: do I need an r= as well?
08:31 <sfraser> examining the history indicates no
08:32 <sfraser> Aryx: the patch left a .orig in there, too. Could you back out and I'll resubmit with a cleaner version
08:34 <sfraser> nvm
08:34 <sfraser> too many changes in too short a time period
08:35 <Aryx> sfink: ok
08:35 <Aryx> sorry, meant sfraser.
08:56 <sfraser> Aryx: I've fixed the issues, am I ok to push again?
08:56 <Aryx> also the flake8 one? then yes
08:57 <sfraser> yup, also the flake8 one. Am unsure how to run with those flake8 parameters locally, it was insanely spammy
08:57 <sfraser> but the ones in the logfile are done
08:58 <sfraser> Honestly, I figured no-one linted the files based on how many flake8 reports I had been getting.
16:00 <armenzg> hassan: Eli I want my web app to make it easy to switch between data sources
16:00 <armenzg> localhost:8011 versus somesite.staging.mozilla.com or something else
16:01 <armenzg> any recommendations?
16:01 <armenzg> should this be by adding more entries to package.json?
16:06 <Eli> armenzg: you want that to change based on how you build?
16:11 <armenzg> Eli: I think so; what is the alternative?
16:11 <armenzg> if you want to use localhost start like "command A", if you want to use production start like "command B"
16:11 <Eli> armenzg: no, that is what i would recommend too
16:14 <dustin> or .neutrinorc?
16:14 <Eli> armenzg: like we do in tools, you can define variables to be injected into your code based on environments: https://github.com/taskcluster/taskcluster-tools/blob/master/.neutrinorc.js
16:14 <dustin> https://github.com/eliperelman/taskcluster.net/blob/master/.neutrinorc.js#L53
16:14 <Eli> xDDDD
16:15 <Eli> i like dustin's example better ^
16:18 <armenzg> that helps for production but for local development I want to be able to either use localhost or production
16:18 <dustin> yeah, it was written by someone pretty smart
16:18 <armenzg> lol
16:18 <dustin> armenzg: in general the idea is to have a conditional in .neutrinorc.js that looks at an env var and sets whatever values you want
16:18 <dustin> so you can say RUN_AGAINST='local' or RUN_AGAINST='staging'
16:18 <dustin> or whatever
16:18 <armenzg> hrmm
16:19 <armenzg> BACKEND='localhost:8011' yarn start
16:19 <armenzg> would that be acceptable?
16:19 <dustin> yep
16:19 <Eli> yes
16:19 <armenzg> and that becomes process.env.BACKEND
16:19 <Eli> you still need to tell the .neutrinorc.js to inject that into your code
16:19 <armenzg> oh
16:19 <armenzg> it doesn't do it automatically
16:20 <Eli> we don't arbitrarily inject all environment variables into code, it's a security hazard
16:21 <Eli> if you want to provide a list of environment variables you *do* want, you can use the neutrino env middleware: https://www.npmjs.com/package/neutrino-middleware-env
16:21 <Eli> so then in a nutshell, you can pass the following in your .neutrinorc.js: ['neutrino-middleware-env', ['BACKEND']]
16:23 <armenzg> Eli: you DON'T WANT ALL THE envs? :)
16:23 <armenzg> if you print them in your logs you will get much longer logs
16:24 <armenzg> (Buildbot reference)
16:24 <Eli> lol
16:24 <dustin> haha
16:24 <dustin> it helps to print them for each step
16:24 <dustin> along with three or four versions of the command you're running
16:25 <dustin> one for "copy paste" which won't actually work when copy-pasted
16:25 <dustin> good times
16:40 <armenzg> Eli: dustin it seems that shipit-uplift serves the page over https and I get net::ERR_INSECURE_RESPONSE
16:40 <armenzg> how can I overcome this?
16:40 <Eli> how are you communicating with it?
16:41 <armenzg> https://irccloud.mozilla.com/pastebin/oCPcM6YF/
16:41 <armenzg> a simple fetch
16:43 <Eli> you'll probably want to open that page in its own tab to see what ssl errors it has
16:43 <armenzg> https://cl.ly/0r1L0v0Q3Z2T
16:44 <dustin> you can't use https on localhost
16:44 <dustin> well, not easily
16:44 <dustin> load localhost with http
16:44 <dustin> you can talk to an https site from an http one, just not the other way around
16:44 <dustin> (btw ngrok.io is an alternative that lets you use https, but has some other complications)
16:48 <armenzg> I get connection reset if I try to use http
16:53 <Eli> if it's a self-signed cert, you'll probably have to click advanced and accept the cert
16:56 <armenzg> OK; that made some progress
16:56 <armenzg> I will add that to the docs
16:58 <armenzg> ty!
18:00 <catlee> gps, dustin: do either of you have a clever idea for how we can implement the "periodic updates" job on taskcluster? Right now this job regularly runs some scripts to update in-tree HSTS/HPKP files, and then pushes to hg.
18:02 <catlee> we've discussed a special kind of hg scriptworker that has credentials to make commits and push them
18:02 <aki> we've also brainstormed that maybe it pushes to reviewboard or phabricator rather than directly to m-c
18:03 <catlee> yeah
18:03 <catlee> or we generate fine-grained permissions that only allow it to make changes to a limited set of files
18:03 <catlee> oooh, action task for version bumping
18:03 <catlee> and merge day
18:05 <aki> sure :)
18:06 <aki> i definitely want a full audit before blanket vcs push perms though
18:10 <catlee> yeah
18:10 <catlee> everything should be an action task
18:10 <catlee> cron
18:10 <catlee> decision
18:10 <catlee> releases
18:12 <jmaher> garndt: I saw your mail about the new timings, but I don't see a tab with a title for revision 3dc98e77386986e771f615d8418348adceb65c75
18:14 <garndt> sorry, I should be consistent with how I'm naming these tabs, it should start with "9_11_"
18:16 <jmaher> ok, let me look
18:16 <jmaher> garndt: I am on this spreadsheet: https://docs.google.com/spreadsheets/d/1vvjMDRY_XKoX9st6R86JqX73eCQAuwg2oHaZhBpEN4c/edit#gid=811735407
18:17 <garndt> yes
18:17 <garndt> go to the last tab
18:17 <garndt> 9_11_timings_non_e10s
18:19 <jmaher> wait, did you just add that? :)
18:19 <garndt> heh, I just had to give the spreadsheet the death stare
18:20 <Callek> garndt: offhand, is there a hardware difference between what Buildbot used for Windows spot instances and what TC uses for windows instances? (builders)
18:20 <garndt> it's quite possible
18:20 <garndt> what instances does buildbot use?
18:21 <Callek> I know we both use AWS here, but I'm seeing failures when running spidermonkey suites via taskcluster that don't manifest when run with a job count of 1 (rather than the default, scaled-by-cpu count)
18:21 <jmaher> garndt: ok, that tab looks better - 55 hours added for win7/debug non-e10s tests
18:21 <Callek> garndt: c3.4xlarge afaict
18:22 <garndt> Callek: I'm pretty sure that our builders use c4/m4 2xl
18:23 <garndt> actually 4xl
18:23 <garndt> "instanceType": "c4.4xlarge",
18:24 <garndt> we also log the instance type at the top of the log: https://tools.taskcluster.net/groups/duauXaj9QAqPsKPofGpUYA/tasks/SM4gVb-WQum_YUTz4JAC3A/runs/0/logs/public%2Flogs%2Flive.log#L7
18:25 <Callek> garndt: so maybe ssd has a role here?
18:26 <Callek> this is doing expensive javascript codepath testing aiui
18:26 <Callek> higher failure rate in debug
18:27 <garndt> they're both SSD (or should be), one is provisioned using EBS and the other has instance storage attached I think
18:27 <Callek> yea, the c3 has ssd instance storage
18:27 <Callek> anyway, not sure what's going on here (I'm not too familiar with the spidermonkey suites, I just know I want them off buildbot!)
18:27 <Callek> thanks for the information on the differences between our two sets
18:27 <garndt> no problem!
18:41 <catlee> Callek: maybe the build directory matters :)
19:04 <whimboo> wcosta: hi
19:04 <wcosta> whimboo: hi
19:04 <whimboo> wcosta: i have an osx loaner for a while to reproduce a bug under mac os
19:04 <whimboo> so it's a tc worker
19:05 <whimboo> i noticed a really high cpu load for automountd
19:05 <whimboo> do you know why this is the case?
19:05 <whimboo> it's 30-40% during a test run
19:05 <wcosta> whimboo: no idea, isn't it an old build of firefox that tries to look for symbols at /home ?
19:06 <wcosta> actually not firefox, firefox cross builds
19:06 <whimboo> hm, maybe. the test I currently run in a loop is a crash test
19:06 <wcosta> whimboo: https://bugzilla.mozilla.org/show_bug.cgi?id=1338651
19:06 <firebot> Bug 1338651 FIXED, wcosta@mozilla.com taskcluster cross-compiled OS X builds create perf problems when not stripped (Talos performance reg
19:07 <whimboo> so it sounds plausible
19:07 <wcosta> it landed a couple of weeks ago, but if your test is still referring to an old build
19:07 <wcosta> it can be the case
19:08 <Callek> catlee: boy i hope not :)
19:08 <whimboo> wcosta: yeah, i do a regression test. the build is from Apr 13th, so pretty old
19:09 <whimboo> thanks
19:09 <wcosta> whimboo: so, probably it is, you can confirm by picking a newer firefox build and checking
19:09 <whimboo> i will do that later, and verify if possible
19:09 <whimboo> thanks for fixing this
19:10 <whimboo> it really slows down the test times
19:23 <dustin> 14:10:36 <catlee> everything should be an action task
19:24 <dustin> everything that involves adding tasks to an already-existing "push" (meaning, roughly, a big row in treeherder)
19:25 <dustin> regarding periodic updates, yeah, I think a scriptworker instance or two is a good choice
19:29 <dustin> {"repo": "mozilla-central", "operation": "hsts-bump"}
19:29 <dustin> *very* limited inputs
19:29 <dustin> I suspect most of these would run from .cron.yml
19:31 <catlee> the problem is that the code for *how* to do those lives in-tree too
19:32 <catlee> it actually runs xpcshell I think...
19:33 <dustin> hm
19:33 <dustin> accepting a patch as input kinda sucks :(
19:34 <aki> for merge day, it would run the merge day script
19:34 <dustin> maybe there is a preliminary in-tree task that prepares the new content (hsts list, etc.), and then the hsts-bump knows how to apply that in such a way that any input can at worst result in a bogus hsts list
19:34 <aki> and yeah, maybe a try-/review-like repo to grab a revision from?
19:35 <dustin> not sure what that means..
19:35 <aki> rather than a bare patch. i suppose that repo would require open access
19:35 <dustin> yeah, I'm not sure that's a great help
19:35 <dustin> I don't want to have input that *could* potentially affect anything, and we try to whitelist only certain sorts of changes
19:36 <dustin> I'd rather the input only be able to affect the minimum.. then we just have to verify we haven't written something that can be fooled by bogus input
19:36 <gps> catlee, aki: pushing to Phabricator and having something auto-review is the preferred solution
19:36 <aki> +1
19:36 <aki> or having a human review in some cases
19:36 <dustin> json.dump(hsts_data, open("path/to/hsts.json", "w")) is pretty hard to fool
19:36 <dustin> (I know it's more complicated than that, just illustrative)
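To make dustin's point a bit more concrete, here is a minimal sketch of the two-stage idea he describes: a preliminary in-tree task produces the new list as plain data, and a tightly scoped bump step only knows how to serialize that data into the expected file, so bogus input can at worst yield a bogus list rather than arbitrary tree changes. The file path and field shape are hypothetical, not the actual HSTS preload format.

```python
import json

def apply_hsts_bump(hsts_data, path="path/to/hsts.json"):
    """Write the prepared HSTS entries to the in-tree file.

    The constrained worker only performs this serialization step; it never
    applies an arbitrary patch, so the worst a bad `hsts_data` input can do
    is produce a bogus list in this one file. (Hypothetical sketch.)
    """
    # Accept only a flat list of host names; reject anything else outright.
    if not isinstance(hsts_data, list) or not all(isinstance(h, str) for h in hsts_data):
        raise ValueError("hsts_data must be a flat list of host names")
    with open(path, "w") as f:
        json.dump(sorted(hsts_data), f, indent=2)
```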
19:37 <dustin> I like review, yeah, where it's not going to be a great burden to reviewers
19:37 <gps> we want all pushes to go through the autoland service, so please be cognizant about introducing new things that `hg push`
19:37 <catlee> yeah, that's why we're talking about it now :)
19:37 <catlee> version bumps too
19:37 <dustin> gps has mentioned bumping base OS versions on docker images too fwiw
19:37 <dustin> I suspect if bumping becomes possible with .cron.yml, uses for it will come out of the woodwork :)
19:38 <jgraham> I feel there might be something I should pay attention to here; we want to make the wpt-sync more automatic
19:38 <gps> i just wrote up https://docs.google.com/document/d/14ogSVp3KSMxrr2809-2Rrbic5DsVRtZdVJk85Ojie8U/edit# today
19:38 <dustin> hm, that might have sounded pejorative -- I really love it when ideas come out of the woodwork
19:39 <dustin> jgraham: yeah, that'd be an example
19:39 <dustin> even if the "bump" is just changing a revision somewhere in-tree that a later task uses to pull down and integrate the tests
19:39 <dustin> (so I would be uncomfortable with a bumper doing pulls and merges)
19:40 <jgraham> So I am missing all the context here, but for wpt the process is pretty involved; we need to pull in pre-prepared metadata updates from try pushes
19:40 <catlee> I meant gecko version bumps, but all the other versions need bumping too
19:41 <jgraham> So making it run entirely in-tree on TC seems hard
19:41 <dustin> jgraham: yeah, it wouldn't be the easiest of the applications of bumper functionality
19:43 <jgraham> (so from my point of view I guess the big question is "if we try to launch this at some point, is someone going to turn around and say that having a service that makes commits that land on an integration repo is now forbidden")
19:44 <jgraham> (and an answer like "no, but in 2018 you will have to push to Phabricator instead, and have a script that verifies all changes only touch the expected files, and autoreviews the change before landing to inbound" seems fine)
19:46 <dustin> I think the relevant people who might forbid it are going to be part of the conversation
19:47 <dustin> is autoreview a Phabricator thing?
19:47 <dustin> if that's "a thing" that sounds great
19:47 <dustin> if it's a hack to screen-scrape Phabricator, it sounds awful
19:47 <catlee> is that something we can use now?
19:47 <dustin> I don't think Phabricator's even live for testing yet, is it?
19:49 <garndt> I think you have to opt into using it by talking with mcote
19:51 <dustin> ah
19:56 <gps> jgraham: we want 100% of commits to be pushed via autoland
19:56 <gps> autoland is a set of HTTP services. so the next question becomes who can make calls to it
19:57 <gps> dustin: phabricator has a rich HTTP API. and you can create rules in Phabricator to react to certain events
19:57 <gps> e.g. if a file changes, send an HTTP callback
19:57 <gps> this can be used to script auto reviews
19:57 <gps> and presumably auto landing
19:57 <rillian> ted: thanks for the upload task link; it's good to have an alternative.
19:57 <ted> yw
19:58 <ted> took me a bit of wrangling to get it working the way i wanted
19:58 <dustin> gps: that's supercool
19:58 <dustin> the autoreviewing stuff
19:58 <dustin> having to land everything on autoland I'm still unhappy about
19:58 <jgraham> gps: That sounds fine for this use case
19:59 <rillian> ted: one question about taskgraph generation: the docs say tasks within the same kind can't have interdependencies, but both the clang build and the sccache build are toolchain tasks, and sccache depends on clang. Is that a bug?
19:59 <dustin> it's a docs bug
19:59 <dustin> there's nothing preventing it, and nothing fundamentally wrong with it
19:59 <rillian> dustin: ok, thanks.
19:59 <dustin> I originally used kinds as a way to try to make sense of the spaghetti of dependencies -- basically a partition of the graph
20:00 <dustin> but times have changed
20:00 <dustin> I'd r+ the heck out of a docs fix
20:00 <dustin> other things on my mind at the moment tho
20:09 <rillian> dustin: where are the dependency hashes calculated? Now that my try push has succeeded once, subsequent pushes with different task definitions are just using the old artifact.
20:12 <dustin> for a toolchain?
20:12 <rillian> dustin: yes. it doesn't rebuild the toolchain.
20:12 <rillian> e.g. when worker.env changes
20:12 <dustin> oh, hm, that's not too surprising
20:13 <dustin> I don't know the details, but the hash is over the in-tree contents
20:13 <rillian> not counting the .yml task definition?
20:14 <dustin> maybe not -- it probably should
20:22 <rillian> looks like that's per-kind, in e.g. transforms/job/toolchain.py
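As a rough illustration of what dustin describes (not the actual taskgraph implementation), the reuse decision amounts to hashing the in-tree files a toolchain depends on and looking for an existing artifact indexed under that digest; if the .yml task definition is not part of the hash, changes such as worker.env never produce a new digest and the old artifact keeps being reused. Function and parameter names here are hypothetical.

```python
import hashlib

def toolchain_digest(file_paths, task_definition=None):
    """Hash the inputs that should invalidate a cached toolchain artifact.

    Hypothetical sketch: hash the in-tree source files, and optionally the
    task definition itself so that edits like a changed worker.env also
    force a rebuild (the behavior rillian found missing).
    """
    h = hashlib.sha256()
    for path in sorted(file_paths):
        with open(path, "rb") as f:
            h.update(f.read())
    if task_definition is not None:
        # Including the serialized task definition means .yml changes
        # also change the digest, so the cached artifact is not reused.
        h.update(repr(sorted(task_definition.items())).encode())
    return h.hexdigest()
```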
20:37 <jmaher> garndt: one other thing to do is look at the overall utilization of the osx pool, especially since we have had the extra 80 machines lately
20:37 <jmaher> but I believe not as of today
22:35 <arr> jmaher: we no longer have the extra capacity. they were shut down today
22:40 <garndt> bye bye mac minis, loved ya when you were around, didn't want to see you go
22:41 <sfraser> so nightly seems to be busted because the decision task's execution time is now over 30 minutes. The trigger for this is likely my in-tree partials generation changes, as it's a lot of extra task submission, and task submission is one-at-a-time
22:59 <KWierso> dustin: ^
23:05 <jonasfj> sfraser: where are task submissions one-at-a-time? (the decision task uses concurrent requests last I checked)
23:06 * jonasfj curious...
23:06 <aki> e.g. https://tools.taskcluster.net/tasks/dxOw2_k3Sqq2vl5z_sO9zQ
23:08 <sfraser> it seems to be doing about 3 task submissions per second
23:08 <sfraser> I've increased the runtime of the decision task to 2400 seconds and triggered the hook again
23:08 <sfraser> once I wake up in the morning I will look at shortening the decision task
23:09 <sfraser> https://public-artifacts.taskcluster.net/dxOw2_k3Sqq2vl5z_sO9zQ/0/public/logs/live_backing.log took 29 minutes of its runtime just submitting tasks
23:09 <aki> ok. ty
23:11 <jonasfj> wow, yu
23:11 <jonasfj> 4.5k tasks
23:12 <sfraser> cron uses create_task (https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/create.py#98), which is not concurrent. create_tasks() does appear to be concurrent, but that's not what cron uses
23:13 <aki> cron creates a decision task, which probably uses create_tasks()
23:13 <jonasfj> yeah, I would think so too
23:13 <sfraser> ah, I was going by https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/cron/__init__.py#79
23:13 <aki> go sleep :)
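For illustration only (a generic sketch, not the actual taskgraph create.py), the difference sfraser is pointing at is serial submission versus a concurrent pool: at roughly 3 calls per second, 4.5k tasks take on the order of 25 minutes serially, while overlapping the HTTP round-trips brings that down dramatically. create_task_via_api is a hypothetical stand-in for the queue createTask call.

```python
from concurrent import futures

def create_serially(tasks, create_task_via_api):
    # One createTask call at a time: ~4.5k tasks at ~3/s is roughly 25 minutes.
    for task_id, definition in tasks:
        create_task_via_api(task_id, definition)

def create_concurrently(tasks, create_task_via_api, workers=50):
    # Overlap the HTTP round-trips with a thread pool, which is the general
    # idea behind the concurrent create_tasks() path mentioned above.
    with futures.ThreadPoolExecutor(max_workers=workers) as pool:
        fs = [pool.submit(create_task_via_api, task_id, definition)
              for task_id, definition in tasks]
        for f in futures.as_completed(fs):
            f.result()  # re-raise any submission error
```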
12 Sep 2017
No messages