mozilla :: #ateam

15 Feb 2017
10:44jmaher|afkxidorn: thanks for the patch for stylo failures; I don't understand the error file too well- these look like files, with keywords and numbers, not specific error messages
10:45xidornjmaher|afk: those in `pattern` are substring of error messages
10:45xidornjmaher|afk: since error messages are long, and they are quite different between each other, using some common substring should be more efficient
10:45jmaher|afkso |test_inherit_computation.html `animation` [6]| has 6 TEST-UNEXPECTED-FAIL messages mathing .*animation.* ?
10:46xidornjmaher|afk: yes
10:46xidornactually TEST-UNEXPECTED-{FAIL,PASS}
10:46xidornand they are turned into TEST-KNOWN-FAIL
10:47jmaher|afkxidorn: cool; and all tests have to be .html ?
10:48xidornjmaher|afk: I think currently they are all .html, but it should support other files... at least making it support other type should be a straightforward change
10:49jmaher|afkxidorn: I also see things like |test_initial_computation.html `grid` [*]|
10:49jmaher|afkthat ignores all `grid` mathes?
10:49jmaher|afkmatches
10:49xidornjmaher|afk: that means there would be some failures (non-zero)
10:49xidornjmaher|afk: tes
10:49xidornyes
10:50xidornmarking all `grid` UNEXPECTED as KNOWN-FAIL, since it is not fully supported yet...
10:51jmaher|afkxidorn: when you gathered this data, was it from a try push and how many times did you run each test?
10:51jmaher|afkI want to make sure you are not overlooking high intermittents
10:51jmaher|afkand I also want to make sure you are not ignoring regular intermittents
10:51xidornjmaher|afk: they are pretty stable. basically tests run locally and data from try push are identical
10:52xidornjmaher|afk: given majority of them have exact number listed, intermittents shouldn't be ignored
10:52xidornshouldn't be ignored silently, I mean
10:52jmaher|afkok
10:57jmaher|afkxidorn: one other question in the parser, I see |(test_\S+|\.{3}) # test name|, does {3} match .html ?
10:57xidornjmaher|afk: \.{3} is for "..."
10:57xidornwhich means using the previous test name
10:57jmaher|afkok, that clears it up
10:58jmaher|afkI had overlooked the | in that line originally
10:58xidorn:)
11:02jmaher|afkxidorn: should we hook this up to mach so the failure pattern file gets passed through automatically?
11:02jmaher|afkfor local testing that is
11:03xidornjmaher|afk: I'm not sure...
11:04jmaher|afkxidorn: one other thing I see here is we have many test files that have the same name in the source tree- I see some issues there
11:05xidornjmaher|afk: the failure pattern file is in the same directory of manifest file
11:05xidornthere shouldn't be much tests with same name under a single manifest file
11:05jmaher|afkso this is only for 1 directory of tests?
11:05xidornjmaher|afk: yes
11:06jmaher|afkwill it expand to other directories?
11:06xidornonly layout/style/test
11:06xidornjmaher|afk: I think it is expandable. we can just put a failure pattern file in the directory we want to test
11:07xidorn"--failure-pattern-file" specifies a file name, not a path
11:08jmaher|afkok, I see: "mochitest-style": ["--disable-e10s", "--failure-pattern-file=stylo-failures.md", "layout/style/test"]
11:08xidornjmaher|afk: yes, and pat_file = os.path.join(os.path.dirname(test['manifest']), options.failure_pattern_file)
11:09jmaher|afkso my concern for this being applied to all mochitests is invalid- but that also assumes we don't mess things up in mozharness/taskcluster land
11:23jmaher|afkxidorn: one more question- if we are expected 6 failures, and we only get 5, will we report test-unexpected-pass?
11:23xidornjmaher|afk: it reports unexpected-fail
11:23xidornI can change it to unexpected-pass when there are fewer failures than expected, though...
11:24jmaher|afkI think that would be more intuitive, but let me think a bit more on it
11:24xidornI just thought it doesn't make much difference... so just keep it all unexpected-fail
11:25jmaher|afkyeah, that might work out just fine
11:25xidornif you prefer the different way, I can do that as well
11:25xidornshouldn'
11:25xidornshouldn't be too hard
11:26jmaher|afkyeah, give me a few minutes- that was just a thought I had pending and it works out good that you are on irc right now :)
11:32jmaher|afkxidorn: I am fine with the test-unexpected-fail
11:33xidorn:)
11:33jmaher|afkxidorn: how will we surface the fact in the failures that we were expecting failures that we didn't see
11:33jmaher|afkso if you push to try and fix stuff for example :)
11:33xidornjmaher|afk: that's a good question...
11:35xidornin general, if the number changes significantly, it usually means something new is implemented, and one should disable the related patterns and rerun the test to check the raw UNEXPECTEDs
11:36jmaher|afkwould it be possible to print out something in the log when using the pattern file to help remind people?
11:37xidornthat is probably doable
11:38xidornprobably we can collect all failures recorded by a pattern, and if the pattern number doesn't match, print all recorded failures at the end of test?
11:39jmaher|afkfor example adding |Error: we are marking many failures as known-fail with layout/style/test/stylo-failures.md, please make sure any errors listed here are cross referenced to that file
11:39jmaher|afkand that would show up in the failure summary log, but not cause the job to be orange
11:40xidornjmaher|afk: oh, you mean that
11:40xidornI misunderstood
11:40xidornhow to make something show up in the failure summary? just use Error?
11:41jmaher|afkyes, I believe so, I see it often :)
11:41jmaher|afkit would always be there, but the job would only be orange if there was a real failure
11:41xidornand... I don't quite understand what do you really mean by "please make sure any errors listed here are cross referenced to that file"
11:41jmaher|afkwell, maybe a different message
11:42xidorndo you mean to ask people check stylo-failures.md when they see UNEXPECTED in the result?
11:42jmaher|afkso if there are failures, then we need to ensure we fix them as they are new, or fix the annotations in the stylo-failures.md file
11:42jmaher|afkyes, what you said
11:43xidornthen I guess the message should probably be at the very beginning?
11:43jmaher|afkyes, early on when we get the failure file
11:44xidornok
11:47jmaher|afkok, submitted the review with a few comments- really close, this irc chat helped a lot
11:52xidornjmaher|afk: thanks :) I'll address them tomorrow
11:54jmaher|afkxidorn: great, looking forward to this being green and getting stylo up and running :)
13:01Tomcat|mtgAutomatedTester: ping :)
13:45whimboojmaher: wtf? https://bugzilla.mozilla.org/show_bug.cgi?id=1339347#c5
13:45bugbotBug 1339347: Loan Requests, normal, hskupin, NEW , Slave loan request for Windows 8 x64 opt buildbot test slave machine to whimboo
13:45Gijshow does autoland determine on what cset to land to try?
13:46jmaherwhimboo: I don't think we use mozilla-build in automation
13:46whimboojmaher: how do we run tests on windows machines then?
13:47whimbooi do not see another Python installation
13:47jmaherwhimboo: with mozharness
13:47whimboowhich requires Python
13:47jmaheryes, but we install all the modules with mozharness
13:47whimbooand python comes with mozilla-build
13:47whimbooso we use 2..7.3
13:48jmaheryeah
13:48whimbooso why aren't we updating the mozilla-build installs on test machines?
13:48jmaherwhimboo: we rarely update test machines
13:49jmaherthat requires a lot of testing and maintenance
13:49whimboocurerntly we have mb 2.2 and this is 1.6
13:49jmahermb?
13:49whimboomozilla-build
13:49AutomatedTesterhey jmaher, I was wondering how quantum related work (webrender and stylo) were impacting stockwell
13:49jmaheroh mozillabuild
13:49jmaherAutomatedTester: I don't know to be honest- I do know as of monday we went from a score of 0.3 to a score of 0.0 :)
13:50AutomatedTesterjmaher: we are seeing that quantum related jobs can be orange for a few hours (since they are tier 2 they dont get backed out)
13:50AutomatedTesterand could skew OF score
13:52jmaherjgraham: we have a really long running wpt-e10s-3 job on osx debug and there are dozens of test-errors and timeouts, but the job is green: https://archive.mozilla.org/pub/firefox/tinderbox-builds/autoland-macosx64-debug/1487154705/autoland_yosemite_r7-debug_test-web-platform-tests-e10s-3-bm106-tests1-macosx-build229.txt.gz
13:52jgrahamjmaher: Sounds plausible
13:52jmaherjgraham: specifically with /WebCryptoAPI/generateKey/failures* tests, is this expected- wcosta and I have spent more than a minimal amount of timetrying to track down these failures
13:53jmaherand then to find out they happen ALL the time
13:53jgrahamjmaher: If they aren't logging as TEST-UNEXPECTED- then they are expected
13:53jmaherAutomatedTester: stockwell doesn't necessarily track OF, it tracks actual bugs, we can easily ignore linux64-stylo/qr builds
13:54AutomatedTesterjmaher: ahh
13:54AutomatedTesterjmaher: in that case I won't put any more thought into this ;)
13:54jgrahamI think the .worker. tests there will error until we support WebCrypto in workers
13:54jmaherjgraham: ok, that is really confusing- can we fix that so we don't have TEST-ERROR in the logs? I can understand the need for timeouts
13:54jmaherAutomatedTester: cool, thanks for checking
13:54jgrahamjmaher: Well the tests do error
13:54jgrahamIt's a bug in Firefox (typically)
13:55jmaherbut if that is test-pass, then isn't that test-pass not test-error?
13:55jgrahamWe log the actual status that the test gave
13:55jmahercan we switch that to TEST-ERROR[-NOT-REALLY] ?
13:55jgraham-UNEXPECTED- is the sign of something unexpected happening
13:55jgrahamjmaher: But it did really error
13:55jgrahamThere is really a bug
13:56jgrahamIt just isn't a regression
13:56jmaherI am confused
13:56jmaherif this is a bug, why isn't it a regression?
13:56jgrahamBecause it always errored
13:56jgrahamThe bug in this case is "we don't support WebCrypto in workers"
13:56jmaherthen why do we run these? we have very limited resources, especially on osx
13:57jgrahamBecause we are running everything we import that isn't unstable by default
13:57jmaherlets fix that then, running automation because it exists is a waste of our resources
13:57jgrahamOtherwise I have to know upfront which tests we support and people have to remember to re-enable tests when they work on a feature
13:58jmaherwell if we had real test owners for these tests then they would implement the bugs and turn on the tests
13:58jgrahamHah
13:58jgrahamWe can get real owners
13:58jmaherwell please turn off these tests which we do not support in firefox
13:58jgrahamGetting them to remember at some future time to switch on some tests won't happen though
13:58jgrahamIn the same way that we don't ever un-disable unstable tests
13:59jgrahamjmaher: If there is a special concern with OSX capacity I can accept that turning them off on that platform is a necessary evil
13:59jgrahamOtherwise I think that's a bad idea
14:02jmaherwell we differ, but discussing it will not change either of our minds, so lets move on for now and let others make the final call
15:30whimboojgraham: hi. are you around?
15:43jgrahamwhimboo: Yes
15:45whimboojgraham: maybe you can have a look at https://bugzilla.mozilla.org/show_bug.cgi?id=1322383#c74
15:45bugbotBug 1322383: Marionette, normal, hskupin, ASSIGNED , Chrome context methods (driver.js) miss checks for valid window - no NoSuchWindow failure thrown
15:51whimboojgraham: any way o being able to print python stack output would be great
15:56ekylegbrown: I found the problem for activedata slowness: A majority of the data is on a single node, which means one machine is answering the bulk of the queries. The unbalance is caused by a lack of drive space, which I solved yesterday. Now, we must wait for the automation to better balance the data.
15:58jgrahamwhimboo: I'm not sure why there's no stack, but if the problem is that the list of window handles isn't updated by the time that execute script returns, that seems like a bug?
15:59jgrahamIn marionette I mean
15:59whimboojgraham: window.close() is not synchronous!
15:59whimbooso you should not use execute_script at all
16:00whimboojgraham: what kind of window is that?
16:01whimboo a chrome window?
16:01whimboowe have self.marionette.close_chrome_window() which can do it
16:01jgrahamwhimboo: No it's just a content window
16:02jgrahamBut I don't know the window handle ofc
16:02whimbooso what I really need is a way to dump output to the console
16:02whimbooi assume its a problem with the threading
16:02whimbooso any idea how I can get this done?
16:02whimboootherwise debugging is hard
16:02jgrahamwindow.close looks pretty sync in the spec
16:03jgrahamwhimboo: Try removing the script close and see if that fixes the problem. Or ignore the race condition in close_other_windows. Or both
16:18gbrownthanks ekyle
16:36whimboojgraham: nope. its not sync... simply see bug 1311350
16:36bugbotBug https://bugzilla.mozilla.org/show_bug.cgi?id=1311350 Marionette, normal, hskupin, RESOLVED FIXED, close() and closeChromeWindow() do not wait until the underlying window has disappeared
16:36whimboojgraham: you have to wait for the appropriate events to occur
16:37whimboootherwise you run into races
16:37whimboojgraham: so I got logigng working to the console
16:37whimbooi will have a look in how executormarionette.py is working and how we can get the situation improved for window handling
16:37jgrahamwhimboo: That doesn't seem the same as the spec says
16:37jgrahamMaybe the spec is wrong
16:38whimboojgraham: which spec?
16:38jgrahamHTML
16:38jgrahamWhich defines window.close()
19:06jgrahamjmaher: It turns out that the WebCrypto slowness is entirely due to a bug in the harness that I fixed some time ago but that never landed
19:06jgrahamWell in testharness.js
19:07jgrahamIt will go away when I do the next wpt update
19:10jgrahamWhich I am starting now
19:10jmaherjgraham: thanks for looking into that- I will keep an eye out for the update!
19:30RyanVMjgraham: will that fix need backporting?
19:31jgrahamWell an uplift would save some machine time
20:52jmaherdoes glob still keep a logbot around?
20:53jmaheroh, found it, logs.glob.uno
20:55Standard8jmaher: see /topic ;-)
20:56jmaherheh, thanks Standard8, it was staring me in my face
23:45xidornit seems treeherder uses this to generate failure summary https://github.com/mozilla/treeherder/blob/master/treeherder/log_parser/parsers.py#L348
23:45xidornI wonder whether it is possible to add any warning into the failure summary
23:50Aryxxidorn: should be, cc KWierso. #treeherder is the best choice for these questions
23:50xidornAryx: ok, thanks
16 Feb 2017
No messages
   
Last message: 127 days and 16 hours ago