mozilla :: #ateam

7 Sep 2017
06:02whimboophilor: hi
06:02whimboophilor: regarding https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=fcd3a7c6694c50996b0c6b6f702eb944741e31b5&selectedJob=129059857
06:02whimboothe problem here is the crash
06:02whimbooi will file a bug for that
06:25whimboobdahl: hi. so we stopped running marionette headless tests on Linux?
06:25whimboojmaher|afk: ^
08:53jmaher|afkwhimboo: bdahl: I see MnH jobs on mozilla-central for linux64: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter-searchStr=marionette%20headless&selectedJob=129081423
08:55whimboojmaher|afk: hm, is SETA controlling what I see in add new jobs?
08:55jmaher|afkwhimboo: for add new jobs, no
08:55whimboolooks like i can only trigger what has already been run
08:55whimboosee https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=41f763899ab82c9891237001b4426a44aea44a87&filter-searchStr=mnh
08:55jmaher|afkwhimboo: but I have noticed odd stuff in add new jobs
08:55whimbooups
08:55whimboofilter
08:56whimbooso we don't run headless for 32bit
08:56jmaher|afkwhimboo: I went back on m-c to August 16th and we have consistently run MnH on linux64
08:57whimbooyes, all fine for that platform
08:57whimboosorry
09:00jmaher|afkwhimboo: if you want it on linux32, you would need to add it to the test-sets.yml list for linux32: http://searchfox.org/mozilla-central/source/taskcluster/ci/test/test-sets.yml#261
09:00jmaher|afkalthough in general we are running much less on linux32 than previously
09:01whimboonot necessary. i just thought we run it there too
13:04whimbooAutomatedTester: i think I found the issue for Marionette in stopping after 60s of trying to connect
13:04whimboojmaher: ^
13:05jmaherwhimboo++
13:05jmahersounds promising
13:05davehuntwhimboo: ooh?
13:05whimboowe indeed use 60s but only when called via start_session
13:05whimboonot when letting Marionette launch the application
13:05whimboohave to figure out where those 60s are coming from
13:06jmaheroh, and all our automation uses start_session I believe
13:07whimbooAutomatedTester: your patch modified mozharness, but it should be in Marionette
13:07whimboojmaher: yes
13:08whimbooi haven't seen a use of raise_for_port or wait_for_port
13:09whimbooi hate that we have to define those constants at two places
13:32whimboojmaher: https://bugzilla.mozilla.org/show_bug.cgi?id=1362293#c29
13:32bugbotBug 1362293: Marionette, normal, nobody, NEW , Intermittent IOError: Process killed because the connection to Marionette server is lost. Check gecko.log for errors (Reason: Timed out waiting for connection on localhost:2828!)
13:33whimbooreally silly behavior
13:36jmaherwhimboo: good find!
13:37whimbooi think i will keep the 120s for the client and harness
13:37whimbooand we could adjust for our TC jobs by taking AutomatedTester's patch if necessary
13:37whimboojmaher: ^ does that sound good?
13:37whimbooi have never seen locally that it takes longer than 120s
13:37jmaherwhimboo: that sounds reasonable
13:37whimboogreat. so patches upcoming soon
13:38jmaherideally we should work on figuring out why it takes so long to connect, I would expect it to always be <10s
13:38whimbooyes, one issue i found in the past
13:38jmaherbut we can tackle that later- maybe we could have a log message to indicate the marionette server startup time?
13:38whimboowhich was a heavy load of the worker
13:38whimboomarionette doesn't log anything right now
13:39whimboobut I work on a bug to enable a bit of logging
13:39whimboowith marionette I mean the driver and harness, not the server component
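jmaher's suggestion at 13:38 (log how long the Marionette server takes to accept a connection) could be sketched roughly as follows. This is not Marionette's actual code: the helper name `wait_for_port`, its parameters, and the polling interval are assumptions made for illustration. Only the 120s default and port 2828 come from the discussion and the bug summary above.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, poll_interval=0.1):
    """Poll until a TCP connection to (host, port) succeeds.

    Hypothetical helper, not the Marionette client API.
    Returns the elapsed time in seconds on success, or None if the
    timeout expires before the port accepts a connection.
    """
    start = time.monotonic()
    deadline = start + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=poll_interval):
                return time.monotonic() - start
        except OSError:
            # Not listening yet (refused or timed out): wait and retry.
            time.sleep(poll_interval)
    return None
```

Calling `wait_for_port("localhost", 2828)` and logging the returned value would give the startup-time message jmaher asks for; a `None` result corresponds to the "Timed out waiting for connection" case.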
18:18tedgbrown: do we have anything that tracks the runtime of specific test suites?
18:19gbrownted: I don't. ekyle might?
18:19ekyleted: we have the times for the chunks of the test suites
18:19tedgbrown: i was curious whether we fixed the underlying issue in https://bugzilla.mozilla.org/show_bug.cgi?id=1331049
18:19bugbotBug 1331049: General, normal, nobody, NEW , DeadlockDetector death test block for 90s each and cause gtest failure due to timeout on osx debug TC build
18:19tedif so we could revert the timeout change
18:20tedekyle: oh, that'd be perfect, is that something mortals can use or would it be faster for me to just ask you to do the query?
18:20ekyleted: probably faster for me to do the query, um what suite chunk, or is it a specific test?
18:21tedekyle: it's mac debug gtests
18:21tedi'm curious to know whether the runtime dropped ~10 days ago
18:21ted(that's when https://bugzilla.mozilla.org/show_bug.cgi?id=1338651 merged to central)
18:21bugbotBug 1338651: Build Config, normal, wcosta, RESOLVED FIXED, taskcluster cross-compiled OS X builds create perf problems when not stripped (Talos performance regressions vs. buildbot builds)
18:22ekyleted: looking now....
18:31ekyleted: nope
18:31ekylehttps://irccloud.mozilla.com/file/Uwz0P2aa/image.png
18:31ekyleted: from https://activedata.allizom.org/tools/query.html#query_id=dUZu6ZRI
18:31tedekyle: ah, bummer
18:32ekyleted: oh! but those are not the mac versions. Looking more...
18:33tedekyle: oh! phew
18:33tedthe theory fits way too well, so if they didn't get faster i am really suspicious
18:35gbrownekyle: might also be good to limit to mozilla-central (or at least eliminate mozilla-beta, etc)
18:35ekylegbrown: sure, good idea
18:42ekyleted: can't tell. I doubt it:
18:43ekylehttps://irccloud.mozilla.com/file/dMoXgJ57/image.png
18:43tedOK
18:43ekylefrom: https://activedata.allizom.org/tools/query.html#query_id=7Hy0d2s8
18:43tedekyle: oh, is this filtering just debug jobs?
18:44ekyleted: the number of runs is too low. So I do not know if those spikes and troughs are part of pattern
18:44ekyleted: this is anything running on buildbot
18:45ekyleted: only a few times a day
18:45tedoh
18:45tedwe're running most of them in taskcluster now, aren't we?
18:45tedand you don't have data on those?
18:45ekyleted: yes, by a large margin
18:45ekyleted: yes I have everything in task cluster
18:45ekyle(above)
18:45tedoh
18:46ekyleted: but none are mac
18:46ekyleted: only linux and windows run gtest on taskcluster
18:47tedekyle: i don't think that's true
18:47tedhttps://tools.taskcluster.net/groups/S7auQQ9sR1qx3JfuCZnibw/tasks/YiOXeZl1QleSevejvPapDg/runs/0/logs/public%2Flogs%2Flive_backing.log
18:48ekyleted: yes, you appear correct, looking more...
19:06tedactivedata doesn't seem to have any data for mac on inbound
19:06tedhttps://irccloud.mozilla.com/pastebin/ENpd7usY/
19:06tedthat query produces zero rows
19:06ekyleted: https://activedata.allizom.org/tools/query.html#query_id=DO1D+bK0
19:06tedi started with just build.branch == mozilla.inbound and narrowed it down
19:07ekyleI think those are them, there is something strange about the markup on the mac machines from taskcluster
19:07tedaha
19:07ekyleted: you know how to use ActiveData :)
19:08tedekyle: i'm a quick learner ;-)
19:08ekylehttps://irccloud.mozilla.com/file/1KhoaQrH/image.png
19:08ted(having something to start with and tweak helps a lot)
19:09ekyleted: so I can not interpret that chart, maybe it makes sense to you. Want to zoom into the day with the spike?
19:09tedlemme fiddle with it a little more
19:09tedthanks!
19:13tedekyle: is task.id the taskcluster task id? (the gibberish-looking string)
19:14ekyleted: yes, but ActiveData is always behind, do not try to get tasks from today
19:14tedOK
19:16ted"suite":{"fullname":"gtest","name":"gtest"},
19:16tedodd
19:16tedoh, you covered that
19:16tedhah
19:19tedaha
19:20tedhttps://activedata.allizom.org/tools/query.html#query_id=6xgJw1kN
19:20tedekyle: is there a built-in way to graph that?
19:21ekyleted: we can try redash! ....
19:21tedekyle: just wondering if there was a simple way to generate those graphs you've been pasting or if you've just been copy/pasting into a spreadsheet :)
19:22tedeyeballing the data i think it proves my point, which is nice!
19:22ekyleted: just cut and paste to spreadsheet. let me find the test-system with redash...
19:22ted28-Aug-2017 1737.2094999551773 6
19:22ted29-Aug-2017 860.5429999828339 5
19:22tedekyle: no worries, thanks for the help!
19:23ekyleted: yw!
19:23ekylehttps://irccloud.mozilla.com/file/WdW4bsDB/image.png
19:24ekylefor an hour-by-hour look at near aug 27th spike
19:24tedekyle: so your query didn't seem to be filtering by branch
19:24tedif you look at mine i refined it a little more so it's just inbound debug mac gtests
19:24ekyleted: true
19:24tedso many facets!
19:24tedwe had this exact same problem with build timing data in perfherder
19:25tedhad to keep splitting out buckets to get useful data
19:25ekyleted, well you proved it: data + expertise is awesome!
19:26ekyleted: yes, lots of facets! let the machine deal with the facets, and we only need to look at them when we have a question.
19:27ekyleted: I must go now
19:27ekyleted: you have a good rest of day
19:28tedhttps://docs.google.com/spreadsheets/d/1HtdXwwZef1SnSoEDNCCtTXeC_fn2u6bLzSaLlY4cmyw/edit?usp=sharing
19:28tedthanks!
19:28tedi made a pretty graph
19:28tedwell, pretty-ish
19:32tedalso TIL how to use activedata, which is helpful
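The comparison ted eyeballs at 19:22 can be reproduced from the two rows he pasted. The sketch below assumes the columns are date, mean runtime in seconds, and number of runs; the log does not label them, and the function names are illustrative only.

```python
from datetime import datetime

# The two rows pasted at 19:22, copied verbatim from the log.
ROWS = """\
28-Aug-2017 1737.2094999551773 6
29-Aug-2017 860.5429999828339 5
"""


def parse_rows(text):
    """Parse 'DD-Mon-YYYY mean_seconds runs' lines into tuples.

    The column meanings are an assumption; the log does not state them.
    """
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        day, mean_seconds, runs = line.split()
        rows.append(
            (datetime.strptime(day, "%d-%b-%Y").date(), float(mean_seconds), int(runs))
        )
    return rows


def relative_drop(before, after):
    """Fraction by which the runtime decreased from `before` to `after`."""
    return 1.0 - after / before


rows = parse_rows(ROWS)
drop = relative_drop(rows[0][1], rows[1][1])
print(f"{rows[0][0]} -> {rows[1][0]}: mean runtime dropped by {drop:.1%}")
```

With only 6 and 5 runs per day the sample is small, which matches ekyle's earlier caution about low run counts.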
20:26gpsahal: do we still need to use mozunit for pytest tests? i'm writing new tests from scratch and `mach python-test` is complaining: TEST-UNEXPECTED-FAIL | No test output (missing mozunit.main() call?): /home/gps/src/firefox/python/...
20:26gpsfurthermore, the "def test_...():" in the test file isn't being executed
20:27ahalgps: yeah, it still sets some common configuration
20:27gpsbecause my pdb import isn't triggering :/
20:27ahalwe could probably move that over to python/mach_commands.py though
20:27ahalthe main blocker would be those last two tests that still use unittest
20:28gpsis the scaffolding documented anywhere?
20:29ahalno (though there's not much to document)
20:29ahalassuming you mean pytest specific stuff
20:30ahalhttps://dxr.mozilla.org/mozilla-central/source/config/mozunit.py#218
20:31ahalit would be better if we just invoked pytest directly from python/mach_commands.py
20:32* gps goes to meeting. may ask questions later
21:10gpsahh, so mozunit.main() now automagically uses pytest. nice.
21:23ahalah yeah, guess I could have mentioned that in my post
21:25gpsahal|afk: what about relative module imports. if i have a module with common fixtures, how do i import that?
21:25gpsi'm currently getting an import error because "attempted relative import in non-package"
21:25gpsdirectory structure is <path with setup.py>/test and there exists a test/__init__.py
21:26gpsmy guess is sys.path doesn't have the "test" directory on it? and/or it registers "test" as the package name
21:26gpsor "test" is on sys.path and each test file is a package not a module
21:27gpsthe latter i think
21:30gpsoh, hmmm. `mach python-test` behaves differently from `python -m pytest`
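The scaffolding gps and ahal discuss amounts to a pytest-style test file that still ends with a `mozunit.main()` call. A minimal sketch, assuming the file is run by `mach python-test` inside mozilla-central: the test function and the `ImportError` fallback are illustrative additions, and only the `mozunit.main()` call itself comes from the log.

```python
def test_addition():
    # Plain pytest-style test function; no unittest.TestCase needed.
    assert 1 + 1 == 2


if __name__ == "__main__":
    try:
        # mozunit lives in mozilla-central (config/mozunit.py) and is only
        # importable inside the tree's Python environment.
        import mozunit
    except ImportError:
        # Illustrative fallback for running this sketch outside the tree.
        print("mozunit not available; run via `mach python-test` in-tree")
    else:
        # Per the log, mozunit.main() hands the file to pytest.
        mozunit.main()
```

Without the final `mozunit.main()` call, the harness reports the "No test output (missing mozunit.main() call?)" failure quoted at 20:26.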