w3 :: #testing

20 Apr 2017
08:16annevkjgraham: did I write that MathML test?
08:16annevkjgraham: your rewrite looks okay on the face of it, but none of that looks familiar to me
08:45jgrahamannevk: No, but I needed someone to give a quick review because it's blocking updating wpt on m-c
08:46jgrahamThe test was causing OOM on windows-7 debug which somehow caused the test runner to get into a loop and the job to never finish
08:46jgraham(which might be a bug in the test runner, but fixing the test so it doesn't just time out on all platforms also seems valuable)
08:46annevkIt looks okay from a quick glance, but it does seem a little hacky
08:47jgrahamI'm not sure what the less hacky approach is
08:48annevkLess duplication, but maybe that's okay too
08:48annevkIt's all fixed sets anyway
08:49jgrahamI mean I guess it would be possible to encode the doctype in the test url as a query string and use <meta name=variant>, but I think that&#39;s sufficiently less obvioius that it&#39;s not worth it
09:23annevkjgraham: anyway, LGTM, I guess I&#39;ll figure out later how to clear the needinfo thing
09:42jgrahamannevk: You go to https://reviewboard.mozilla.org/r/131878/, click &quot;Finish Review&quot;, change the first dropdown at the top to r+, and press publish
09:42jgrahamYeah, it&#39;s terrible UI, but it does help with the papertrail a bit if you do that now :)
09:47jgrahamTakk!
09:47annevkThanks for the instructions
10:13jgrahamgsnedders: r? ^
10:15gsneddersjgraham: will look soon, need to finish packing
10:25jgrahamgsnedders: Thanks
10:39jgrahambobholt: Overall I&#39;m pretty happy with the SC stuff, just a few changes that I think will simplify the code
12:28annevkHmm, I guess now I need to learn a whole new set of git commands while we transition away from submodules
12:29annevkAnd there&#39;s no documentation
12:29gsneddersFor what?
12:29annevkFor the record, git pull fails because merging would overwrite untracked files
12:29gsneddersjgraham: ^^
12:30gsneddershttps://github.com/w3c/web-platform-tests/pull/5607 didbn&#39;t land seemingly
12:30* gsnedders still isn&#39;t a fan of cramming more-and-more into the README because he never finds what he wants in it because it&#39;s so full of useful-once information
12:30annevkNone of git clean -f, git rebase origin/master or so work either
12:31gsneddersannevk: see that PR
12:31annevkWell create SUBMODULES.md, just have it somewhere
12:31gsneddershttps://github.com/w3c/web-platform-tests#submodules
12:32* gsnedders realises we&#39;ve split the set-up documentation, with the Windows stuff now detached from everything else
12:32gsneddersBut hey, that&#39;s all duplicated between docs/ and README anyway ;_;
12:59jgrahamgsnedders: Are you going to clean up the documetnation now that we don&#39;t have N repos?
13:00gsneddersjgraham: oh, that&#39;s a good idea
13:01jgrahama/win 8
13:02gsneddersnot b?
13:03jgrahamNope either a or win 8
13:14annevkBlocked by Chrome bot from landing a change take frustrating: https://github.com/w3c/web-platform-tests/pull/5619
13:16jgrahamannevk: Yeah :/ I think we need to either fix that or disable it by default
13:16jgrahamOr make it not block PRs
13:16jgrahamI&#39;ll send email
13:17annevkI also strongly agree with Domenic about the noise from bots
13:18annevkI suspect that all these little paper cuts make it less likely for contributors to come back, except those that are forced to deal with it somehow like us :/
13:19annevkInstead of &quot;let&#39;s do it live&quot; which might work with less code and less folks involved we should go for a &quot;let&#39;s keep it stable&quot; approach, even if that means that some infrastructure patches take longer due to testing and such
13:21gsneddersgiven a fair bit of our problems come down to Chrome/ChromeDriver, that may mean staying on slightly older releases for longer
13:21gsneddersjgraham: can you review dbaron&#39;s PR?
13:22jgrahamI mean in this case we did test it. And it seemed to work.
13:23jgrahamgsnedders: He marked it reviewed. I can merge it once the tests run
13:23gsnedderswe&#39;re meant to have a link to the review, right?
13:24jgrahamgsnedders: There is one
13:24gsneddersoh
13:24gsneddersI didn&#39;t read closely enough
13:26jgrahamannevk: I think that making the coverage check only happen for parts of the repo that actually have test coverage is the low hanging fruit there
13:26jgrahamannevk: The other parts are harder. We could make the lint check only comment if it fails
13:27jgrahamMoving to updadting the GH reviewers instead of the comment for OWNERS doesn&#39;t seem trivial since it has permissions requirements
13:28annevkOnly commenting on failure seems like a plus
13:28annevkBundling reports in a single comment that you update seems like a plus
13:28annevk(You won&#39;t get notifications for updates to a comment)
13:29jgrahamUpdating comments is possible, but bundling them is difficult
13:29gsneddersalso GH reviewers is also an experimental feature at the API level
13:29jgrahamYou either need state in the server or some way to parse the existing comment to work out which part to replace
13:29gsneddersjgraham: most things just use comments for that
13:30annevkjgraham: if when you make the comment you leave easily replaceable bits in comments you can work that out I think
13:30annevkjgraham: <!-- SOME RANDOM STRING THAT INDICATES THIS IS WHERE CHROME-BOT STATUS GOES -->
13:31jgrahamYeah, maybe there&#39;s a possibility there, although I note that most bots don&#39;t have to deal with freeform user-supplied text in the comment body
13:34jgrahamAnyway, I guess we can make it work with some effort
13:34jgrahambobholt: Is that something you are likely to have time to work on?
13:34annevkI think if we reduced it to a single bot comment that would already be a lot better than what we have now
13:35annevkBecause now it&#39;s 5 bot comments
13:36annevkNo, 6!
13:36annevkLook at all that noise: https://github.com/w3c/web-platform-tests/pull/5620
13:36annevkThat&#39;s not inviting for reviewers either
13:39bobholtjgraham: i can probably do that after we get sauce sorted
13:40bobholtmeeting with folks today to set my next sprint, but i think my directive is generally to improve the usability and stability of the infra
13:40jgrahamannevk: I think we can make this better. I also think that you are overstating how much of a problem it is.
13:41Domenicjgraham: you have two of your most prolific contributors telling you it&#39;s a really big problem. This kind of dismissive attitude is pretty upsetting.
13:41jgrahamDomenic: I&#39;m not being at all dismissive
13:42jgrahamI think it&#39;s a legitimate concern
13:42jgrahamI think you are being pretty dismissive of the benefits of these various tools
13:42annevkThe notification spam from web-platform-tests is definitely a drain of sorts on my productivity
13:43annevkI don&#39;t think I ever dismissed the benefit of the tooling, just how it&#39;s presented
13:43DomenicI think the benefits are maybe there for you and a couple other people, and you should not foist them on other people.
13:43annevkAnd I&#39;m dismissive of the flaky tooling since it seems I can never land anything after I&#39;m told by the reviewer I can without first asking here
13:44jgrahamannevk: I agree that the problems with the Chrome tooling are real and serious
13:44jgrahamannevk: I&#39;ll make that not block PRs today
13:45jgrahamDomenic: Are you just talking about reviewable there? Or are you arguing that stability checking tests isn&#39;t a benefit to more than a couple of people?
13:45Domenicjgraham: reviewable and codecov
13:45jgrahamCodecov is a bug
13:46jgrahamFallout from merging submodules which is a huge win for lots of people in the long run
13:46jgrahamI totally disagree about reviewable and think that you are specifically overstating that as a problem now that it doesn&#39;t affect the status check
13:47jgrahamIt doesn&#39;t generate notifications and doesn&#39;t affect status. That seems pretty non-intrusive to me
13:49DomenicIt generates advertisements that take up more space than most commit messages.
13:51jgrahamIt generates a link to a tool that is very useful for working with the PR. Yes I wish it was smaller. But I don&#39;t honestly believe the presence of that link is affecting your productivity.
13:51DomenicAnd I am frusrated that you don&#39;t believe me when I tell you things about my own productivity.
13:51DomenicIs there an escalation path for this?
13:52DomenicAdvertising your favorite web services is just not an appropriate use of WPT infra.
13:52jgrahamHah
13:52gsneddersRealistically the right solution to Reviewable is make GitHub&#39;s review tools not suck.
13:52jgrahamI wish it was one of my favourite web services
13:52jgrahamBut &quot;less sucky then GitHub code review&quot; is a very low bar
13:53annevkI quite like GitHub code review
13:53DomenicYou can use third-party tools to manage your GitHub experience all you want. I have a neato browser extension that makes notifications easier to deal with. That&#39;s fine. I don&#39;t advertise it using shared infra.
13:54annevkI think the governance question is fair though, although I suspect there might not be an answer for it
13:54jgrahamDomenic: I still don&#39;t have any explaination for why the current setup is causing you an actual issue
13:54jugglinmikejgraham annevk I&#39;ll mention this on the e-mail thread, but the &quot;very short term&quot; solution would be to instruct people to `git commit --allow-empty -m &#39;Trigger CI&#39;`
13:55Domenicjgraham: can you understand why advertising LastPass would cause you an actual issue? If not, then let&#39;s start doing that please.
13:55jgrahamDomenic: That is such a disingenuous example that I don&#39;t think you&#39;re arguing in good faith
13:55DomenicI don&#39;t understand why you can&#39;t see the comparison. It&#39;s strictly 1:1
13:56gsnedderswhat does LastPass have to do with PRs, in any way shape or form?
13:56jugglinmike...although maybe allowing Chromium failures is so simple that it&#39;s not worth suggesting the workaround
13:56jgrahamWhat a tool that is unrelated to a specific pull request vs a link to a review ui for that pull request
13:56Domenicgsnedders: it makes my GitHub experience easier
13:56gsneddersReview tools at least have a direct application specific PRs.
13:56DomenicThis review tool has no application to my PRs
13:56DomenicLastPass is much more related to my PRs, and those of most contributors who aren&#39;t jgraham.
13:57jgrahamNo it&#39;s not
13:57jgrahamAnyway I&#39;m going to go fix some actual problems rather than have this argument
13:57jgrahamPlease feel free to escalate or whatever you like
13:57DomenicI use LastPass to login to GitHub to review everyone&#39;s PRs. It&#39;d be super-nice if on computers where I hadn&#39;t installed it yet there was a quick link to the extension. That&#39;d be way more useful to me than an advertisement for reviewable on those PRs.
13:58annevkjgraham: how would you escalate it?
13:58jgrahamannevk: Talk to MikeSmith or plh or someone I guess.
13:58annevkI think escalation is the right thing to do here, since this discussion seems a little personal
13:58annevkDomenic: does that work for you?
13:58DomenicYeah, sounds good!
13:58annevkGreat
14:01gsneddersI presume some of the problem, realistically, if that those who want something that tracks specific issues across edits and rebases, etc., don&#39;t have the power to influence GitHub&#39;s tools being addressed to overcome those shortcomings
14:02annevkI think once you are aware of how GitHub&#39;s review system works, you tend to put nits as line comments and larger issues that need to be tracked across smaller edits in the review box
14:12* jgraham is aware of how GitHub&#39;s review system works
14:19aygHow many people actually use Reviewable in practice? If it&#39;s rarely used, then it doesn&#39;t seem necessary to have a link on every PR.
16:01jugglinmikejgraham: It looks like testharness.js does not guarantee consistent sub-test ordering in the reported JSON array. Can you confirm?
16:07jgrahamjugglinmike: I.m not sure which JSON array you mean but in any case I believe that&#39;s correct; everything I can think of either uses a hashmap or insertion order in an array
16:18jugglinmikejgraham: oh right. I was a little vague http://testthewebforward.org/docs/testharness-library.html#callback-api
16:18jugglinmike&quot;add_completion_callback(callback) - callback called with an array of tests and a status object&quot;
16:21jgrahamjugglinmike: Yeah, that gets populated with the tests from the Tests object which are added when a Test is created which happens when test() or async_test() or whatever is called
16:21jgrahamWhich doesn&#39;t have to be deterministic
16:21jugglinmikeok
16:21jugglinmikeSo to compare against &#39;expected&#39; output, I&#39;ll need to ignore order and also null out the &quot;index&quot; property
16:22jgrahamWell it depends I guess
16:22jgrahamIf you set your tests up so that things always happen in a deterministic order I would expect the output to also be deterministic
16:23jgrahamBut if you have multiple async tests started from unsynchronised events, I don&#39;t expect it to be deterministic
16:23jugglinmikejgraham: This is for the testharness.js test suite, and I think that we have to account for that case because we&#39;re testing Worker integration
16:26jgrahamjugglinmike: OK
16:27jugglinmikeAw c&#39;mon. This is exciting stuff!
16:27jugglinmikeA test harness for the test harness!
16:29gsneddersWhat test infra are we going to test that test infra under?
16:29jugglinmikeOpen question gsnedders. I&#39;m going to create a pull request in a few minutes to discuss
16:39gsneddersFWIW, I&#39;m out tomorrow
16:44jugglinmikeboo
16:46jgrahamjugglinmike: It is exciting :)
16:47jgrahamgsnedders: Oh the .yml file was invalid
16:47jgrahamPushed a new version, but idk if the ignore rules work
16:48jgrahamThey might cause everything to be ignored; the docs don&#39;t make much sense to me
16:52jugglinmikejgraham: this only gets you so far, but it can help catch some errors quicker https://lint.travis-ci.org/
16:53jugglinmike...which I suppose is obvious given the name &quot;lint&quot;
16:54jugglinmikejgraham: gsnedders Ran out of time prior to my meeting. I&#39;ll submit that pull request shortly after I&#39;m done there
16:57jgrahamjugglinmike: This is for the codecov stuff
16:57jgrahamIt turns out that it also has a lint
17:08jgrahamjugglinmike: Have we tried the environment variable hack suggested in https://bugs.chromium.org/p/chromedriver/issues/detail?id=1699#c17
17:30aygWhat is codecov testing the coverage *of*? How many lines of the tests themselves get run, to detect dead code?
17:36jgrahamayg: No, it&#39;s testing the unittest coverage of the tools/ directory
17:36jgrahamayg: Since tools merged in to wpt (yesterday) it appears on all PRs
17:36jgrahamayg: This is a bug of course
17:54jugglinmikejgraham: if we implement the workaround, we may never see a fix from Chromium. The bug will continue to trip up application developers. I think ignoring Chromium failures is the better solution because it avoids the workflow interruptions, places some pressure on the Chromium team, and (importantly) is something we actually understand.
17:57jgrahamjugglinmike: If the workaround works I think I&#39;d prefer to let rbyers decide what action to take. It doesn&#39;t really affect me if tests land while they&#39;re unstable in Chrome, but if we had the same situation but s/Chrome/Firefox/ I would be extremely keen to land the workaround (although I like to think that in that situation I would be actively fixing the bug on the geckco side too).
17:59jgrahamIf chromedriver is resource starved and chrome is having difficulties importing tests because of instability then landing the workaround seems like a win in the bigger picture. If the firefox stability check is good enough or it doesn&#39;t matter much then it might make sense to delay longer.
18:03jugglinmikejgraham: I&#39;m inquiring about taking time to fix ChromeDriver personally. Given my background, this is probably not very efficient, though I would enjoy trying. That said, I&#39;m not sure how the use of the workaround will affect the appetite for seeing a real fix
18:04jgrahamjugglinmike: Understood. I added a comment to the issue; hopefully you and rbyers and foolip can work out the best course of action.
18:07jugglinmikeah, that&#39;s a good idea. I&#39;ll add my comment as well
18:35jgrahamjugglinmike: https://cs.chromium.org/chromium/build/scripts/slave/runtest.py?type=cs&q=DBUS_SESSION_BUS_ADDRESS&l=90
18:37jugglinmikehm
19:53jugglinmikegsnedders: Not sure if you&#39;ve left for the weekend, so I didn&#39;t formally request your review. But here&#39;s the testharness.js patch I mentioned earlier https://github.com/w3c/web-platform-tests/pull/5627
21:06jgrahambobholt:
21:07jgrahamjugglinmike: Are you more inclined to try and start dbus or to set the environment variable to something random?
21:07jgrahamI think you&#39;ve read more about this than me now
21:08jugglinmikejgraham: I have probably read more, but I haven&#39;t understood most of it
21:08jugglinmikeWhich would be necessary to adequately answer your question
21:08jgraham:)
21:09jgrahamBTW I&#39;m more inclinded to move this into wptrunner than into .travis.yml
21:09jugglinmikeeven the empty string is enough on my machine
21:10jgrahamWith appropriate checks that we are on linux and so on
21:15jugglinmikeWell I suppose that would shield more people from this bug
21:20jgrahamjugglinmike: OK well the PR currently has both options. If you develop preference then feel free to add it to the bug and/or review the associated changes; I think we can land this tomorrow
21:20jgrahamIf you have decided that it&#39;s OK to land at all :)
21:23jugglinmikejgraham: now CC&#39;ing you on an ongoing discussion with Chrome folks
21:27jgrahamjugglinmike: Thanks
21:27* jgraham is now going to sleep
21:39jugglinmikeGood night!
21 Apr 2017
No messages
   
Last message: 95 days and 23 hours ago