mozilla :: #balrog

11 Oct 2017
13:09allan-silvahey, bhearsum, do you think you'll have time tomorrow to talk about emergency shut off?
15:00glassercI guess the sentry error in Balrog stage is related to my code
15:01glassercI guess it's possible to have a scheduled change that isn't an insert that doesn't match any rule
15:01bhearsumhttps://sentry.prod.mozaws.net/operations/stage-admin/issues/621258/ ?
15:01glasserchttps://sentry.prod.mozaws.net/operations/stage-admin/issues/666801/
15:01glassercSo yes
15:01bhearsumi think those might be related to the prod -> stage dump we did yesterday
15:02bhearsumi had relud run a query to remove the required signoffs, and they showed up right after that...
15:02bhearsumso maybe it's a bug that only happens if you don't have any required signoffs
15:02* bhearsum adds one
15:08glassercI think it's because there is a scheduled change to a required signoff that isn't there any more
15:08bhearsumah
15:08bhearsumok, so that's an invalid state
15:09bhearsumi can probably remove those with some manual API requests...
15:09glassercSide note, is it just me or is it really hard to think through the implications of "stacking" scheduled changes and required signoffs?
15:11bhearsumare you talking about how you need to fulfill the time requirement + signoff requirements before a scheduled change can be enacted?
15:12glassercI'm not sure.. something about it seems very recursive and mindbending to me
15:12glassercLike, you have to get some people to sign off on changing the act of some people signing off
15:14bhearsumoh, yes
15:14bhearsumthat's a security thing
15:14bhearsumthe whole goal of multiple signoffs was to protect against single bad actors or compromised accounts from doing bad things
15:15bhearsumso you have to ensure that no single person can modify stuff, or signoff on their own
15:15bhearsumand you need to ensure they also can't _remove_ the signoff requirements by themselves
15:16glassercDo signoff requirements change so often that we need them to be modifiable through the database? Is this maybe a candidate for something that could be done in code?
15:17bhearsumi can see an argument for doing it through the IAM service for sure, possibly directly in code or the app config
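The enactment rule discussed above (the scheduled time has passed and the signoff requirements are met, with no single person able to satisfy them alone) can be sketched roughly as follows. This is a minimal illustration, not Balrog's implementation: the function names, the `{role: count}` shape of the requirements, and the `(username, role)` signoff pairs are all assumptions made for the example.

```python
from collections import defaultdict


def signoffs_satisfied(required, signoffs):
    """Return True if every role has enough signoffs from distinct users.

    required: hypothetical mapping of role -> number of signoffs needed.
    signoffs: hypothetical list of (username, role) pairs given so far.
    """
    users_by_role = defaultdict(set)
    for username, role in signoffs:
        # A set means one user signing twice under a role still counts once.
        users_by_role[role].add(username)
    return all(
        len(users_by_role[role]) >= needed for role, needed in required.items()
    )


def can_enact(required, signoffs, when, now):
    """A change is enactable only once its time has come AND signoffs are met."""
    return now >= when and signoffs_satisfied(required, signoffs)
```

Changes to the signoff requirements themselves would go through the same check, which is the "recursive" part: removing a requirement is itself a change that needs signoffs.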
15:23bhearsumthere we go - i removed the scheduled changes and the exceptions are gone \o/
15:27glassercOK.. if we think this is a common occurrence, then I can make the code more robust
15:27glassercBut I guess you said it was an invalid state
15:27bhearsumyeah, i think it's okay to ignore. the only way i'm aware that we can get into this state is by messing with the db by hand
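The invalid state described here (a scheduled change that is not an insert, yet targets a required signoff that no longer exists) could be detected with a check along these lines. This is only a sketch under assumptions: the dictionary keys (`product`, `channel`, `role`, `change_type`) and the function name are hypothetical and are not taken from Balrog's schema or code.

```python
def find_orphaned_changes(scheduled_changes, required_signoffs):
    """Return scheduled changes that modify a required signoff that is gone.

    Both arguments are lists of dicts using hypothetical column names.
    """
    existing = {
        (rs["product"], rs["channel"], rs["role"]) for rs in required_signoffs
    }
    orphans = []
    for sc in scheduled_changes:
        key = (sc["product"], sc["channel"], sc["role"])
        # Inserts are expected to have no matching row yet; anything else
        # (an update or delete) needs an existing row to act on.
        if sc["change_type"] != "insert" and key not in existing:
            orphans.append(sc)
    return orphans
```

Run against a database dump, a check like this would list the rows that had to be removed by hand in the conversation above.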
15:42203BAJ482balrog-web #200: building mozilla/balrog:master-36cb816f5d3270472f6f4aeb77bbfc9e5cfa68b7
15:52cloudops-ansiblebalrog-admin #225: master-36cb816f5d3270472f6f4aeb77bbfc9e5cfa68b7 deployed to stage /cc relud bhearsum
15:52cloudops-ansiblebalrog-web #200: mozilla/balrog:master-36cb816f5d3270472f6f4aeb77bbfc9e5cfa68b7 deployed to stage /cc relud bhearsum
17:00alweezybhearsum: Hello, Based on your comments on https://github.com/mozilla/balrog/pull/406, does it mean that on renavigating to rules you don't need the url to point to the current pr_ch but the filter should have it?
17:01bhearsumhi alweezy, i don't have time to answer questions today, sorry -- but if you drop something in the PR i'll reply when i can
17:02alweezyOkay, Thanks!
17:45Avedis777Hey, I'm back!
17:45Avedis777Where can I find the "Add Scheduled Change for Rule" form?
17:52bhearsumAvedis777: that's over on http://localhost:8080/rules/scheduled_changes
17:53bhearsumi think it's at the top of http://localhost:8080/rules, too
17:53Avedis777I mean where is the code for it?
17:54bhearsumif you're talking about the backend, the entry point is https://github.com/mozilla/balrog/blob/master/auslib/web/admin/views/rules.py#L202, which ends up in https://github.com/mozilla/balrog/blob/master/auslib/web/admin/views/scheduled_changes.py#L67
18:01Avedis777Is there an id for the date field?
18:07Avedis777I want to add a condition to check if that id exists. https://github.com/mozilla/balrog/blob/36cb816f5d3270472f6f4aeb77bbfc9e5cfa68b7/auslib/db.py#L2522 before that loop. If it does exist, I can then check if it has been changed and I can append it to the body variable.
18:07Avedis777Am i looking at this the right way? Or is there a different approach
18:08Avedis777Oh and im working on this bug in case anyone wants to help :) https://bugzilla.mozilla.org/show_bug.cgi?id=1386400
18:10bhearsumAvedis777: what do you mean by "id for the date field"?
18:13Avedis777an id or a variable to reference the date in the form
18:14bhearsumah
18:14bhearsumok, so i don't think you need to think about the web form at all here - everything in db.py is at a lower level than that
18:15bhearsumso what you're looking for instead is the database column name, which is "when" (defined at https://github.com/mozilla/balrog/blob/36cb816f5d3270472f6f4aeb77bbfc9e5cfa68b7/auslib/db.py#L901)
18:18Avedis777So am I on the right track?
18:18bhearsumyou're looking in the right area of db.py, yep
18:19bhearsumit's just important to realize that in that file, you're a bit far removed from the web layer - so you can't think of things in terms of what the HTTP request looks like
18:30Avedis777Python is hurting my head, ive never seen or used it before
18:35Avedis777how can I print table["time"]?
18:36Avedis777Can I use the console to see messages that I trigger?
18:39bhearsumAvedis777: you should be able to see anything you've printed in the docker-compose output
18:39bhearsumAvedis777: i won't be able to reply for a bit - i'm about to start a deployment. you're on the right track though!
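For the approach Avedis777 describes (checking whether the "when" column changed and appending that to the body), a minimal sketch might look like the following. It works on plain dicts of column values and does not reproduce the actual loop in db.py; the function name and the output format are assumptions for illustration only.

```python
def describe_changes(old_row, new_row):
    """Build a text body listing every column whose value changed.

    old_row / new_row: dicts of column name -> value, e.g. including "when".
    """
    lines = []
    for column in sorted(set(old_row) | set(new_row)):
        old, new = old_row.get(column), new_row.get(column)
        if old != new:
            lines.append("%s: %r -> %r" % (column, old, new))
    return "\n".join(lines)


if __name__ == "__main__":
    # Anything printed here would show up in the docker-compose output
    # when run inside the containers, as noted above.
    print(describe_changes({"when": 1000, "rate": 10}, {"when": 2000, "rate": 10}))
```

Because "when" is treated like any other column, no special case for the date is needed.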
18:45cloudops-ansiblebalrog-web #200: mozilla/balrog:master-36cb816f5d3270472f6f4aeb77bbfc9e5cfa68b7 canary deployed to prod /cc relud bhearsum
18:46cloudops-ansiblebalrog-admin #225: master-36cb816f5d3270472f6f4aeb77bbfc9e5cfa68b7 deployed to prod /cc relud bhearsum
18:54bhearsumlooks like everything is OK so far, the iop spike is from some failed nightlies that had to be respun - nothing to be concerned about
19:06reludoh, cool. glad you recognize what that's from. also, note that it's only having an effect on the replica, which is not in use in any way.
19:06reludbhearsum: lgtm. you wanna promote now, or give it more time to stew?
19:07bhearsumlooks like the effect on the replica is negligible anyways - just a few ms
19:08bhearsumi'm happy to proceed now - no new exceptions nor anything of note on datadog
19:09reludk, promoting
19:09cloudops-ansiblebalrog-web #200: please check balrog canary and promote to full deploy /cc relud bhearsum
19:16cloudops-ansiblebalrog-web #200: mozilla/balrog:master-36cb816f5d3270472f6f4aeb77bbfc9e5cfa68b7 deployed to prod /cc relud bhearsum
19:16bhearsumrelud: the good news about these extra nightlies is that they seem to be instantly proving that the perf problems are fixed
19:17reludnice
19:27reludbhearsum: tags not named master-* or latest will now build stage
19:27bhearsumawesome
19:27bhearsumlet me make a quick one by hand to give it a try
19:38* bhearsum waits on stage to respond
19:46bhearsumrelud: should i expect an irc notification about the stage deploy still?
19:51reludyes
19:51reludI'll check on it in a moment
19:51reludbhearsum: you have to push agent too, or it won't work
19:51reludshould generally push agent first
19:52bhearsumaaaah, okay
19:52bhearsumwill anything bad happen if i push agent afterwards?
20:01* bhearsum doesn't chance it
20:04reludbhearsum: okay, i found the bug on my side, and fixed it.
20:04bhearsumah, okay
20:04cloudops-ansiblebalrog-admin #226: building bhearsumtest2
20:04bhearsumsweet
20:04reludif you push again, it should trigger a build, and you don't have to push a new tag for it to work
20:04cloudops-ansiblebalrog-admin #226: failed in build step /cc relud
20:05cloudops-ansiblebalrog-web #201: failed in build step /cc relud
20:07relud^ those are because the image pushed was not built by CI
20:07cloudops-ansiblebalrog-admin #227: building bhearsumtest3
20:08bhearsumwhen you say CI is that my CI or yours?
20:08reludyours
20:08bhearsumahhh
20:08cloudops-ansiblebalrog-admin #227: failed in build step /cc relud
20:09cloudops-ansiblebalrog-web #202: failed in build step /cc relud
20:09bhearsumokay, i'll have to wait until i land my CI changes to create version tags there to verify this, i guess
20:11cloudops-ansiblebalrog-web #203: building mozilla/balrog:relud
20:12cloudops-ansiblebalrog-web #203: failed in build step /cc relud
20:12relud*sigh* retagging an image causes the check to fail
20:12cloudops-ansiblebalrog-admin #228: failed in build step /cc relud
20:17reludso it can't even be a re-tag. i'm going to ask if anyone else in my team has a solution to that :(
20:19bhearsumhmmm, bhearsumtest3 was a new tag i thought
20:19bhearsumor does "retag" mean tagging an existing rev?
20:22reludit means that
20:22bhearsumah
20:22reludso like: docker pull mozilla/balrog:latest && docker tag mozilla/balrog:latest mozilla/balrog:relud && docker push mozilla/balrog:relud
20:22reludand that results in a failure to verify, because it changes the digest
20:22bhearsumah, yeah - that's what i just did
20:23reludyeah, i thought that would work v_v
20:23bhearsumi'm going to try to get my other patches merged tomorrow - they should let us do tagging in response to github tags
20:26reludcool. do you want me to change stage behavior until then?
20:27bhearsumnaw, we can leave it
20:57bhearsumi'm going to step out for a bit, i'll be back before the nightlies start reporting in
22:38bhearsumlooks like we had our usual ~8 UTC load spike (from the cronjob IIRC), but everything is going well still
22:38bhearsumand nightlies are running, probably close to an hour before we start to see repacks hammer balrog
22:48cloudops-ansiblebalrog-admin #2: latest DEV FAILED
22:49cloudops-ansiblebalrog-admin #3: latest DEV STARTED
22:49relud^ yesssss
22:49bhearsumwhee!
22:50cloudops-ansiblebalrog-admin #3: latest DEV FAILED
22:51relud*sigh* it's something tho
22:58* bhearsum tries to trigger the CI based tagging
22:58bhearsumversion tagging, that is
23:06cloudops-ansiblebalrog-admin #4: master-ecb6110ec6b51ce43047af6c36241397392b21b6 DEV STARTED
23:06cloudops-ansiblebalrog-web #204: building mozilla/balrog:v2.41
23:06cloudops-ansiblebalrog-admin #4: master-ecb6110ec6b51ce43047af6c36241397392b21b6 DEV FAILED
23:07cloudops-ansiblebalrog-web #204: failed in build step /cc relud
23:08cloudops-ansiblebalrog-admin #229: failed in build step /cc relud
23:08bhearsumwoo, at least we got the stage deploy cycle triggered with my tag
23:08bhearsumi can probably force a new rev somehow to work around the issues
23:09cloudops-ansiblebalrog-web #1: master-ecb6110ec6b51ce43047af6c36241397392b21b6 DEV FAILED
23:13cloudops-ansiblebalrog-web #2: latest DEV STARTED
23:14relud^ me
23:15reludbhearsum: the build for v2.41 actually worked for the important thing. it was something on my side that failed intermittently
23:15bhearsumoh sweet!
23:16reludi think just pulling and repushing v2.41 should rebuild
23:17bhearsumhmm, i'm not sure what that means. deleting v2.41 and then repushing it?
23:17cloudops-ansiblebalrog-web #205: building mozilla/balrog:v2.41
23:18* bhearsum throws pom poms around to encourage stage
23:18reludyou do this (if #205 works): docker pull mozilla/balrog:v2.41 && docker push mozilla/balrog:v2.41
23:18cloudops-ansiblebalrog-web #205: failed in build step /cc relud
23:18bhearsumoh, huh
23:18relud*sigh*
23:18reluddocker why
23:18bhearsumi didn't know that would do anything
23:18reludyeah, that triggers a push event
23:18cloudops-ansiblebalrog-admin #230: failed in build step /cc relud
23:19reludbut apparently it also changes digest too >:(
23:19bhearsumhehe
23:19cloudops-ansiblebalrog-admin #5: latest DEV STARTED
23:19bhearsumdo you want me to try a new tag?
23:20bhearsumi can fire the same thing as before by creating a new Release in Github
23:23cloudops-ansiblebalrog-web #2: latest DEV FAILED
23:24cloudops-ansiblebalrog-admin #5: latest DEV FAILED
23:31reludbhearsum: sure
23:31reludis there a command to init an empty db for dev?
23:32reludupgrade-db throws an exception
23:32bhearsumyup, create-db should do it
23:32bhearsumi'm pretty sure that will bring it up to the latest version too
23:33bhearsumok, a v2.41-real tag should come along shortly
23:34reludcool
23:39cloudops-ansiblebalrog-admin #6: master-ecb6110ec6b51ce43047af6c36241397392b21b6 DEV STARTED
23:40bhearsumcome on dev, you can do it!
23:43cloudops-ansiblebalrog-admin #231: failed in build step /cc relud
23:45reluddev web is working now: curl https://aus5.dev.mozaws.net/__version__ -siw '\n'
23:45bhearsumsweet
23:46cloudops-ansiblebalrog-admin #6: master-ecb6110ec6b51ce43047af6c36241397392b21b6 DEV FAILED
23:46bhearsumah, looks like it ended up on the wrong rev because of the new tag i created for stage
23:46bhearsumi'll have to fix that script to not create a master-abcdef tag
23:46cloudops-ansiblebalrog-web #3: master-ecb6110ec6b51ce43047af6c36241397392b21b6 DEV FAILED
23:49bhearsumwe're getting a bit of nightly traffic to prod now, not a ton yet
23:49bhearsumeverything is good so far
23:51cloudops-ansiblebalrog-web #206: mozilla/balrog:v2.41-real deployed to stage /cc relud bhearsum
23:54relud^ yessss
23:56cloudops-ansiblebalrog-web #3: master-ecb6110ec6b51ce43047af6c36241397392b21b6 DEV FAILED
23:56bhearsum:D
23:57cloudops-ansiblebalrog-web #5: master-ecb6110ec6b51ce43047af6c36241397392b21b6 DEV STARTED
23:58cloudops-ansiblebalrog-web #5: master-ecb6110ec6b51ce43047af6c36241397392b21b6 DEV FAILED
23:59cloudops-ansiblebalrog-web #5: master-ecb6110ec6b51ce43047af6c36241397392b21b6 DEV FAILED
12 Oct 2017
No messages
   