mozilla :: #ateam

14 Mar 2017
03:27pyangerahm: thanks for lightning reply :)
10:25Tomcat|sheriffdutyjmaher|afk: ping
10:39jmaher|afkTomcat|sheriffduty: hi
10:39Tomcat|sheriffdutyjmaher: hi, do you know why in https://treeherder.mozilla.org/#/jobs?repo=try&revision=e620ff722ab006fdc5ec478c39af46f60381e04f no pgo builds are triggered
10:39Tomcat|sheriffdutyi used +mk_add_options MOZ_PGO=1 but it seems it didn't work
10:41jmaherTomcat|sheriffduty: you have pgo builds, we don't have pgo options for try, so the builds are pgo and the tests use those
10:41jmaherlook at the compile time for win xp opt
10:42Tomcat|sheriffdutyjmaher: ah ok
10:42bcjmaher: Happy PI day!
10:42Tomcat|sheriffdutyhmmm ok then this run seems to be ok re: timeouts
10:42jmaherbc: thanks!
10:42jmaherpie for pi day
10:42Tomcat|sheriffdutyjmaher: trying the next set of bisects then
10:42jmaherTomcat|sheriffduty: ok
10:44jmaherbc: 34 and raining, a bit of sleet and lots of wind- was not an enjoyable day for running; but it is PI day
10:45jmaherok, afk for a bit more
10:45bcat least it didn't snow on day.
10:51Tomcat|sheriffdutybc: yeah stay safe in this snowstorm
10:52bcSame here. Rain and 33. Maybe a bit icy but no snow here. I can finally take the blade off of my lawn tractor. I think this is our March snow and we're going to be scot-free from here on out.
10:54Tomcat|sheriffdutybc: hehe even the german air force one canceled the stay in the US for this storm :)
10:55bcafk for a bit
12:10AutomatedTesterjgraham: hey, can you look at the error from https://gist.github.com/barancev/0c8f1a03feddfdfdf751888f85c2001c ?
12:11AutomatedTesterI have asked for a verbose log
12:11AutomatedTesterbut the traceback leaves a lot to be desired
12:33AutomatedTester jgraham: verbose log is https://gist.github.com/barancev/72ec27d213a0e58fe3979b156e6f146b
12:35jgrahamAutomatedTester: Sure
12:35AutomatedTesterjgraham: thanks
12:36AutomatedTesterjgraham: let me know if you need more info
12:41jgrahamAutomatedTester: Hmm, that error looks like it's marionette claiming the response is N bytes but sending fewer than N bytes of data
12:41jgrahamOr no response at all
12:43AutomatedTesterjgraham: https://github.com/SeleniumHQ/selenium/blob/master/java/client/test/com/thoughtworks/selenium/corebased/TestClickAt.java#L38-L40 is what is causing it
12:43jgrahamidk if the PipeError is significant
12:44AutomatedTesterato: FYI, someone is seeing dead object on element#text from Marionette
12:44AutomatedTesterato: getting more info (looks weird)
12:44atoAutomatedTester: Probably related to remoteness changes.
12:45AutomatedTesterprobably
12:45whimbooato: as figured out shortly we also have remoteness changes for file urls!
12:45AutomatedTesterit's definitely related to window changes
12:45jgrahamAutomatedTester: You might want to get ato or maja_zf|afk to look at that log. I'm not sure we're going to be able to do much without actually running the test
12:46jgrahamato: Does marionette normally log the response it's going to send?
12:46atoAutomatedTester: When a remoteness change occurs, the outerWindowID we use to internally identify a window is invalid, and thus we cannot find the web element ID.
12:47AutomatedTesterato: I would expect a element not found or stale element error
12:47AutomatedTesternot Dead Object
12:47atowhimboo: Sounds like it might be related to why you were having problems getting the appropriate events with about: and file: then.
12:47whimbooato: no. file is still special
12:47whimboono events are sent
12:47whimbooevents for all about: pages are fine now locally
12:48whimbooato: hm, when are you about to leave for pto?
12:48atoAutomatedTester: Right, yes. We're holding on to some object reference that was destroyed.
12:48whimbooato: i would like to get at least a quick feedback on the new solution for page load
12:48whimboowhich is still in content
12:48atojgraham: Yes, it logs it in trace level.
12:49atowhimboo: I have reviewed that at least three times now. What changed?
12:49atowhimboo: On PTO from tomorrow.
12:50atojgraham: I didn't log the log, but .length on a string in JS is not reliable with Unicode data.
12:50atos/log/look/
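A minimal sketch of the point about `.length`, assuming a length-prefixed frame of the form `<length>:<payload>` as jgraham's "N bytes" description suggests. The helper names `frame`, `codeUnits`, and `utf8Bytes` are hypothetical and are not taken from Marionette. `String.prototype.length` counts UTF-16 code units, while a byte-oriented reader counts encoded bytes, so the two can disagree for non-ASCII payloads.

```javascript
// Hypothetical helpers for illustration only; not Marionette's actual code.

// Frame a payload as "<length>:<payload>" using a caller-supplied length function.
function frame(payload, lengthOf) {
  return `${lengthOf(payload)}:${payload}`;
}

// Counts UTF-16 code units, which is what String.prototype.length reports.
const codeUnits = (s) => s.length;

// Counts UTF-8 encoded bytes, which is what a byte-oriented reader would expect.
const utf8Bytes = (s) => new TextEncoder().encode(s).length;

const ascii = '{"value":"ok"}';
const unicode = '{"value":"na\u00efve \u2713"}';

// For pure ASCII the two counts agree; for the second payload they differ.
console.log(codeUnits(ascii), utf8Bytes(ascii));
console.log(codeUnits(unicode), utf8Bytes(unicode));
console.log(frame(unicode, codeUnits));
console.log(frame(unicode, utf8Bytes));
```

If the sender computed the prefix one way and the receiver counted the other way, the receiver would wait for bytes that never arrive or stop reading early, which matches the symptom described above.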
12:50whimbooato: you checked the code in chrome
12:53jgrahamato: Well I can't see a log from marionette that it's sending a response. I wonder if it accidentally closes the connection (maybe if the action causes a navigation?)
12:55atojgraham: It should also log a message when the connection closes, but I guess history has shown that the XPCOM notifications we get are not entirely reliable in exceptional circumstances.
12:56jgrahamato: The relevant part of the log is
12:56jgraham1489494617022 Marionette TRACE conn0 -> [0,23,"performActions",{"actions":[{"actions":[{"duration":100,"origin":{"element-6066-11e4-a52e-4f735466cecf":"aa4f53dd-655b-4c98-82f0-c6325fed0ad4"},"type":"pointerMove","x":10,"y":5},{"button":0,"type":"pointerDown"},{"button":0,"type":"pointerUp"}],"id":"default mouse","parameters":{"pointerType":"mouse"},"type":"pointer"}]}]
12:56jgraham[GPU 18260] WARNING: pipe error: 109: file c:/builds/moz2_slave/m-cen-w64-ntly-000000000000000/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
12:56jgraham1489494643598 webdriver::server DEBUG Deleting session
12:56jgraham1489494643597 Marionette DEBUG Closed connection conn0
12:56jgrahamSo it seems like we get an actions message to marionette
12:56atoInteresting.
12:56AutomatedTestercrash?
12:56atoI don't know if the pipe error warning is relevant. It might not be.
12:56jgrahamThen WebDriver decides to close the session
12:57atoProbably not a crash since it closes the connection.
12:57jgrahamBut the order of log lines is not necessarily reliable
12:57atoThat too.
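A rough sketch of how the quoted log lines could be put in timestamp order when reading such a log. The parser `parseLogLine` is hypothetical and assumes the `<timestamp> <component> <LEVEL> <message>` shape seen in the excerpt. Timestamps may come from different processes, so the sorted order is only a hint and not proof of what happened first.

```javascript
// Hypothetical parser for lines shaped like the excerpt quoted above.
function parseLogLine(line) {
  const match = /^(\d+)\s+(\S+)\s+([A-Z]+)\s+(.*)$/.exec(line);
  // Lines without a leading timestamp (such as the GPU warning) are skipped.
  if (!match) return null;
  const [, timestamp, component, level, message] = match;
  return { timestamp: Number(timestamp), component, level, message };
}

// Two lines from the excerpt, in the order they were printed.
const excerpt = [
  "1489494643598 webdriver::server DEBUG Deleting session",
  "1489494643597 Marionette DEBUG Closed connection conn0",
];

// Sort the parsed entries by timestamp rather than by print order.
const byTime = excerpt
  .map(parseLogLine)
  .filter(Boolean)
  .sort((a, b) => a.timestamp - b.timestamp);

console.log(byTime.map((e) => `${e.timestamp} ${e.component}: ${e.message}`));
```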
12:57atoBest way to investigate is to add dump() statements in driver.js and action.js.
12:58jgrahamYeah
12:58jgrahamAutomatedTester: Is this urgent? Is it something you can file a bug on and let, say, maja_zf|afk look at?
12:58AutomatedTesterjgraham: we can totally get maja_zf|afk to look at
12:59AutomatedTesterjgraham: just wanted an initial investigation
12:59jgrahamOK, that sounds like a plan. I suspect there's something not quite trivial going on
12:59AutomatedTesterthe first gist didn't look useful for debugging
12:59jgrahamMy totally random guess is that navigation in the middle of an actions sequence causes badness
12:59atoYes, I suspect it might be something more sinister than a bug in Marionette.
12:59atoMarionette would normally catch errors and return them.
13:00AutomatedTesterok, I will raise a bug in a minute then
13:00AutomatedTesterthanks ato and jgraham
13:00atoWe could have a situation where we forget to catch an error, but that seems unlikely since all the new actions code uses the modern message dispatch primitives.
13:01* AutomatedTester is heading home
13:01atoAutomatedTester: Regarding the earlier dead object bug: That is a JS error and should come annotated with a stacktrace.
13:01atoAutomatedTester: That will tell us where we are trying to access an object that has been GCed.
13:02AutomatedTesterato: I know that, but how/why are we returning that exact error
13:02AutomatedTesterato: https://gist.github.com/BeyondEvil/a8ae064a121b987e9e0c618281536332
13:02AutomatedTesterato: they are going to reduce the issue
13:03AutomatedTesterthere is a page transition in the test
13:03AutomatedTesterso the stack is kinda meaningless without the page
13:03atoMaybe foundEls gets GCed because of a page transition? Possibly because the page navigates whilst we're looking for elements?
13:04AutomatedTesterso many possibilities...
13:04AutomatedTesterit's not a priority atm
13:04atoWe probably want to escape the findElements promise if a page load occurs.
13:04AutomatedTester(at least until we get a reproducible test case
13:04AutomatedTester)
13:04atoAlso we probably want to pack the elements with weakrefs.
13:04atoRelevant code: http://searchfox.org/mozilla-central/source/testing/marionette/element.js#250
13:05AutomatedTesterthanks
13:06AutomatedTesternow definitely heading home ;)
13:06atoSo it looks like we get hold of an element reference that gets GCed by the time we try to serialise it for return to the user.
13:06atoSo I think you're absolutely right it's to do with a page load.
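One way to picture the "escape the findElements promise if a page load occurs" idea is a race between the search and a navigation signal. This is a sketch under assumptions, not the code in element.js: `findUnlessNavigated`, `NavigationError`, and the two promises passed in are hypothetical names.

```javascript
// Hypothetical names throughout; this is not Marionette's implementation.
class NavigationError extends Error {}

// Resolve with the search result, or reject if navigation is signalled first.
function findUnlessNavigated(search, navigated) {
  const aborted = navigated.then(() => {
    throw new NavigationError("page navigated while searching for elements");
  });
  return Promise.race([search, aborted]);
}

// A promise that never settles, standing in for "no navigation" or "search still running".
const never = new Promise(() => {});

// The search finishes and no navigation happens: the elements are returned.
findUnlessNavigated(Promise.resolve(["el-1"]), never)
  .then((els) => console.log("found", els));

// Navigation is signalled while the search is still pending: the caller gets an error
// instead of references to elements from a page that is gone.
findUnlessNavigated(never, Promise.resolve())
  .catch((err) => console.log("aborted:", err instanceof NavigationError));
```

The design choice here is to fail fast with a distinct error type, so the caller can map it to a stale-element style error rather than surfacing a dead object.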
13:11atoAutomatedTester: Also, you should ask that person to use Nightly. The stack appears to be from an older Firefox version.
13:12atoAutomatedTester: From before I rewrote the implicit wait code.
13:31marcojmaher: what is blocking enabling more tests in the ccov build?
13:35jmahermarco: getting the coverage instrumented properly and tests annotated, there are only a few things not done though
14:03AutomatedTesterato: jgraham when we add a visual "Firefox is under WebDriver control" indicator, we need to make sure we don't do a door hanger or a new ribbon on the browser chrome. Chromium have just added it and people can't click on the part of the page where the ribbon is
14:03AutomatedTesterand then chromium isn't rendering the part under the ribbon after it is dismissed
14:03atoHah, that's incompetent.
14:03jgrahamAutomatedTester: That sounds bad
14:04AutomatedTesteryup...
14:04atoI suggest we talk to some UX people before we do this.
14:04AutomatedTesterbut in other news people are shouting expletives at Chromium and not me
14:04AutomatedTesterand that's
14:04atoGood day for you? :D
14:05jgrahamAutomatedTester: But ChromeDriver Just Works. We keep getting bug reports telling us that
14:05AutomatedTesterjgraham: yea... that's why I keep disabling their tests in the selenium test suite
14:05jgraham(well not that many actually but some. I did wonder if it was all the same person with different GH handles)
14:05atoYes, I am well informed that chromedriver works.
14:05AutomatedTesterit "just works" when you don't use certain features
14:06jgraham(but more reasonable to assume that it's actually legit complaints from multiple people)
14:06AutomatedTesteryou can't do alerts in Chrome
14:06AutomatedTesterthey have a nasty race condition that is breaking alerts
14:07atoI bet that chromedriver will work less if they flip their W3C WebDriver compat switch, but less so than geckodriver used to because we have gone down the hard and thorny path of making Selenium comply.
14:07atoWell, to be fair, our alerts handling is also broken.
14:07atoYou can't have alerts in two tabs at the same time because our code stores alert state globally.
14:07atoBut I guess fewer people try doing that.
14:08AutomatedTesteris that a thing people will do?
14:08atoWell I wrote a WPT test for it ages ago and found this out (-:
14:08* ato does it
14:08AutomatedTesterwhile everything is an edge case... that's an interesting edge case
14:40atojgraham: Still waiting on feedback on my last changes in https://reviewboard.mozilla.org/r/111308/#comment149042.
14:40atojgraham: I see you have given an r+, but not closed the issue.
14:45mcoteemorley: just fyi, bug 1068447 has landed and has been deployed. the bug is still open just because there's a clean-up patch that has yet to land
14:45bugbotBug https://bugzilla.mozilla.org/show_bug.cgi?id=1068447 Pulse, normal, cdawson, NEW , [PulseGuardian] Allow multiple Users per PulseUser
14:46mcotecamd, nudge ;)
14:46mcoteoh it was merged
14:46mcotecamd: guess you can close the bug then :)
14:46emorleymcote: ah I hadn't even noticed the bug was open, I was adding the dependency more so people could follow the breadcrumbs and read more about the feature if they wished
14:48mcoteah ok
14:55jgrahamato: Oh I missed that, sorry
14:55jgrahamDidn't realise you were still waiting
14:56atojgraham: There are other things I need to do too, so not waiting for that in particular.
14:57atojgraham: If I was blocked I would be more proactive (-:
15:01nishu-tryinghardi know that marionette has a client and server module. But i don't see how there modules are initialized or how they communicate. Where is the server instance running ?
15:01nishu-tryinghardthese*
15:04nishu-tryinghardi have seen the code in marionette.js where a new instance of MarionetteServer is created. But don't know where it is running, is it running along with the browser itself?
15:04AutomatedTesternishu-tryinghard: yes, in the browser
15:04AutomatedTesternishu-tryinghard: it starts up when the browser starts up
15:07nishu-tryinghardAutomatedTester, also how is it integrated with browser? can i get more info on that. I have this driver which gives us all the functionality to interface with JS engine.
15:07nishu-tryingharddriver.js*
15:07AutomatedTesternishu-tryinghard: sorry, what?
15:08AutomatedTesteractually... just seen the time. bbl, school run
15:09nishu-tryinghardAutomatedTester, okay ty, cya later
15:21atonishu-tryinghard: Most of Firefox is written in a special flavour of JS called XPCOM.
15:22atonishu-tryinghard: XPCOM offers a JS to C++ API for the various things Marionette needs to do.
15:23atonishu-tryinghard: The code in testing/marionette is packaged into what's called a jar, which is included in the Firefox omnijar. This is source code available under resource:// and chrome:// schemes that provides internal Firefox functionality.
15:24atonishu-tryinghard: Other examples following this approach are Firefox devtools, the Firefox UI itself, bookmarks, history, &c.
15:25whimbooato: i got side-tracked with mozmill-ci and mozdownload work. so not sure if I can provide a first WIP for you
15:26whimbooi'm just going through the r? so you can land your patches
15:26atowhimboo: WIP for what?
15:26whimbooato: the content process based page load algorithm changes which were necessary for refresh()
15:26atowhimboo: I think https://bugzilla.mozilla.org/show_bug.cgi?id=1333014 is done, but I could've missed something.
15:26bugbotBug 1333014: Marionette, normal, ato, ASSIGNED , Return element click intercepted error when clicking obscured element
15:28atowhimboo: Modulo some test failures that I need to fix up when I get time, so none of this is currently under time pressure.
15:28wlachjmaher: attempting to do some custom retriggers now
15:28atowhimboo: OK, I can probably have a look later in the week whilst I'm away, but I won't be on IRC.
15:29atowhimboo: Also will probably be looking at it at odd times.
15:29jmaherwlach: thanks, I ni? you for ones that could help out possibly
15:29wlachyup
15:29jmaherwlach: and that makes a good proving ground :)
15:29wlachI think it will take a bit of learning/tweaking to figure out the best way to take advantage of this new toy
15:30wlachI'm already realizing some things are frustrating (like having to trigger a "basis" job to hang the retrigger off of on try)
15:35jmaherekyle: have you done the activedata *bump*
15:35jmahertoday that is
15:35ekylejmaher: let me check...
15:39ekylejmaher: wow! that machine lost connection to any wifi. it is catching up now
15:39jmaherekyle: ack
15:40nishu-tryinghardato: ty so much that info was very useful and what i was looking for.
15:41whimbooato: k
15:42atonp
15:45gbrownhttps://pastebin.mozilla.org/8982011
15:54gbrownhttps://pastebin.mozilla.org/8982013
15:54ekylehttps://people-mozilla.org/~klahnakoski/testfailures/test.html#search=dom/media/mediasource/test/test_SeekToLastFrame_mp4.html&sampleMax=2017-03-14&sampleMin=2017-03-01
16:09whimbooi love the summary of bug 1029886 :D
16:09bugbotBug https://bugzilla.mozilla.org/show_bug.cgi?id=1029886 Safe Browsing, normal, nobody, RESOLVED FIXED, tracking bug for tracking protection
16:28atoMuch meta in there (-:
17:08atowhimboo: OK, apparently it fixed itself. The try run looks green where it matters: https://treeherder.mozilla.org/#/jobs?repo=try&revision=8e993425bda2&group_state=expanded
17:10atoCannot click invisible element
17:11atoI do sometimes wonder about the sanity of some of our geckodriver users.
17:13whimbooato: that's always good :)
17:13whimboomaybe retrigger some jobs
17:13whimbooato: i'm about to upload a wip for the refresh patch now
17:14atoOK, I will likely be able to look at that tonight.
17:15whimbooato: btw. our /slow page is not that accurate for slow downloads :)
17:15whimboojust figured out with my refactoring
17:15atoDownloads?
17:15whimboowhen it gets loaded from the cache it's not timing out :)
17:15whimbooslow page loads
17:16whimbooi have to implement a better slow loading test case
17:17atoOh right.
17:17atoOne could possibly force a reload through setting some header?
17:17whimboowith the new event approach we see that
17:17atoI don't know how these things work.
17:17whimboonot for goBack and goForward
17:18atoIt doesn't hit the server at all in that case?
17:18jmahergbrown: fyi, tomcat is looking into aurora win7-pgo* failures; seems that we have issues after the merge
17:18whimbooato: right. we need a js approach
17:18gbrownjmaher: thanks, was just going to ping
17:19atowhimboo: What problem are you trying to solve?
17:20whimbooato: my page load refactor is doing the next step before I can investigate chrome
17:20whimbooit changes from polling the readystate to listening for events
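The difference between polling `readyState` and listening for events can be sketched as follows. This is an illustration under assumptions, not whimboo's patch: `FakeDocument` is a hypothetical stand-in for a real document, and `waitByPolling` and `waitByEvent` are made-up names. `readystatechange` is the standard DOM event fired on a document when `readyState` changes.

```javascript
// Hypothetical stand-in for a document; a real one is provided by the browser.
class FakeDocument extends EventTarget {
  constructor() {
    super();
    this.readyState = "loading";
  }
  finishLoading() {
    this.readyState = "complete";
    this.dispatchEvent(new Event("readystatechange"));
  }
}

// Polling: check readyState on a timer until it reports "complete".
function waitByPolling(doc, intervalMs = 10) {
  return new Promise((resolve) => {
    const timer = setInterval(() => {
      if (doc.readyState === "complete") {
        clearInterval(timer);
        resolve("polled");
      }
    }, intervalMs);
  });
}

// Events: resolve when readystatechange reports "complete", with no timer involved.
function waitByEvent(doc) {
  return new Promise((resolve) => {
    if (doc.readyState === "complete") {
      resolve("already complete");
      return;
    }
    const onChange = () => {
      if (doc.readyState === "complete") {
        doc.removeEventListener("readystatechange", onChange);
        resolve("event");
      }
    };
    doc.addEventListener("readystatechange", onChange);
  });
}

const doc = new FakeDocument();
Promise.all([waitByPolling(doc), waitByEvent(doc)]).then((how) => console.log(how));
setTimeout(() => doc.finishLoading(), 25);
```

The event-based version reacts as soon as the state changes, whereas the polling version only notices on its next tick.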
17:21whimbooi hate this, really... HTTP Error 500: Internal Server Error
17:21whimboothat's the only thing I currently get :(
17:21whimboomcote: ^
17:21atoNot sure I follow, but I guess it will be evident from your patch (-:
17:21atomozreview? I get that all the time when the US wakes up too.
17:22whimboohurray ~20th try worked
17:22atohttps://bugzilla.mozilla.org/show_bug.cgi?id=1338530
17:22bugbotBug 1338530: General, normal, glob, NEW , Push is failing on "Error 500: Internal Server Error", however the review request appears to have worked in Review Board
17:22whimbooyes
17:23atoI find it's usually better in European mornings.
17:26globwhimboo: does it take a while before you get the error?
17:26whimbooglob: about 10-15s
17:26whimbooi think
17:26globok, so not a timeout
17:26whimbook, sorry i have to head out for dinner
17:26whimboono, definitely not
17:27globno worries. i've spent a ton of time trying to track this down; sorry it's taking so long
17:27whimbooglob: if you want some info please ni? me on the bug
17:27whimboohappy to provide logs
17:27whimboobut have to know what to do
17:27whimbook, out for today!
17:27whimboohappy remaining day
18:47wlachted: do you have any bright ideas on how I could fix https://bugzilla.mozilla.org/show_bug.cgi?id=1347177 I'm testing my mochitest retriggering stuff now, and it works well for reproducing failures except for the fact that the failing jobs appear green :/
18:47bugbotBug 1347177: Task Configuration, normal, wlachance, NEW , Custom mach command should cause job to fail when it fails
18:47wlachted: it appears that mochitest doesn't actually propagate an error code when it fails, and just relies on mozharness (or something like it) to parse its log
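The two signals wlach describes, the process exit code and the parsed log, can be combined in a sketch like the one below. `jobFailed` is a hypothetical helper, not code from mozharness or mach, and treating the `TEST-UNEXPECTED-` prefix as the failure marker is an assumption about the log format.

```javascript
// Hypothetical helper; not the logic used by mozharness or mach.
// Assumption: failing tests are reported on log lines containing "TEST-UNEXPECTED-".
const UNEXPECTED_RESULT = /TEST-UNEXPECTED-/;

// A job counts as failed if the process exited non-zero OR the log shows
// an unexpected result, so a zero exit code alone cannot turn the job green.
function jobFailed(exitCode, logText) {
  if (exitCode !== 0) {
    return true;
  }
  return logText.split("\n").some((line) => UNEXPECTED_RESULT.test(line));
}

const log = [
  "TEST-PASS | test_a.html | loaded",
  "TEST-UNEXPECTED-FAIL | test_b.html | assertion failed",
].join("\n");

console.log(jobFailed(0, log));
console.log(jobFailed(0, "TEST-PASS | test_a.html | loaded"));
```

A wrapper around the custom mach command could use a check of this kind to decide its own exit status.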
18:52jgrahamI have some vague memory that return codes were considered untrustworthy for some reason
18:53jgrahamSeems like the kind of thing that emorley remembers
18:55emorleyHmm I know there were a few bugs years ago, but I thought they were fixed
18:56wlachthere's this line here: http://searchfox.org/mozilla-central/source/testing/mozharness/scripts/desktop_unittest.py#715
18:57* wlach worries that this problem does not have a simple solution
18:59dmosedo artifact builds somehow behave differently w.r.t. noticing changes in browser.ini files?
19:01dmosewould anyone like to pair with me on vidyo for a few mins to help figure out why my skip-ifs in a browser.ini aren't working?
19:01dmosei will compensate in beer
19:25RyanVMemorley: pretty sure we still have instances where identical failures are red on one platform and orange on another
19:36jgrahamRyanVM: Yeah I have seen that
19:40wlachhmm it seems like sometimes mochitest does return an exit code, just not all the time apparently
19:41wlachI really wish our testing code wasn't such a layer cake, though maybe it's unavoidable
19:47jmaherwlach: thanks for looking at those intermittent bugs- I assume after a dozen or so of these you will have enough information to understand an easy workflow
19:49wlachjmaher: yeah, I guess the real question is whether this extra logging is useful to developers. at the very least this seems like an easier path to reproducing the failure
19:50dmosegrrr fencepost error
19:50jmaherwlach: that would be interesting if the original goal was not useful, but we found a super tool for reproducing failures
19:50dmoseif you have a single test in your browser.ini
19:51dmoseskip-if silently fails
19:51wlachjmaher: yeah hard to say, the intermittents I have thrown this at so far have been highly reproducible
19:51wlachjmaher: maybe I should try this against some that are more elusive
19:55jmaherwlach: yeah, you should keep a spreadsheet or notes about the ones you do- platforms it fails on, failure rate, easy to reproduce, etc.
19:55wlachjmaher: good idea
21:02jmaherjgraham: Ms2ger: would you know which bugzilla component https://github.com/w3c/web-platform-tests/blob/master/html/semantics/scripting-1/the-script-element/nomodule-set-on-async-classic-script.html should live in?
21:12wlachjmaher: jgraham: gbrown: here's what i have so far https://public.etherpad-mozilla.org/p/mochitest-retrigger-testing
21:16jmaherwlach|afk: great!
21:24mcotehm my post hasn't shown up on Planet Automation yet