mozilla :: #balrog

13 Sep 2017
12:59catleebhearsum: a good first bug?
13:00bhearsumyeah, could be
13:15milesanother prod rds CPU alert
13:15milessame thing, steady 70%
13:21catleel10n repacks rolling in?
13:21bhearsumit sort of correlates with nightly submission, but it doesn't hit until they've been submitting for awhile
13:21bhearsumi wonder if it's nightly submission + instance recycling at the same time
13:22bhearsum+ increased traffic because of the east coast coming online
13:22bhearsummiles: this is new as of this week, isn't it?
13:23bhearsumi actually see a spike every hour if i drill down to the hour view in datadog, eg:
13:25bhearsumeven going back a month it's there:
13:34bhearsummiles: i guess the spikes have gotten a little bit higher or longer, since you're getting alerts about them now?
13:36milesyeah, the alert threshold is avg 70% for over 5min
13:36milesand yeah, I see what you're saying
13:36milesdatadog smoothing over time makes it hard to see
13:37milesI think the correlation with east coast coming online makes sense
13:37miles9am there both times
15:13hhhh1612bhearsum: hi, whenever you have time, please ping me, I have some silly questions.
15:18bhearsumhey hhhh1612
15:18bhearsumwhat's up?
15:21hhhh1612great !
15:23hhhh1612hey, how do applications from different channels request an update? is it through web workers?
15:24bhearsumthey just make a simple http request to a URL like
15:24bhearsum"channel" is just one part of that (in the above example, it's "release")
15:25bhearsumyou can see the different URL formats that are used in
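[The example URL bhearsum pasted didn't survive the log. As a rough sketch, Firefox update requests encode their fields as path segments of a single URL; the exact segment order and the `default/default` tail below are assumptions, not a verified template.]

```python
# Sketch of how a Firefox update request URL is structured.
# The segment order here is an assumption based on public Balrog docs;
# the point is that "channel" is just one path segment among several.
fields = {
    "product": "Firefox",
    "version": "55.0.3",
    "buildID": "20170824053622",
    "buildTarget": "WINNT_x86_64-msvc",
    "locale": "en-US",
    "channel": "release",   # <-- the "channel" segment bhearsum mentions
    "osVersion": "Windows_NT 10.0",
}

url = (
    "https://aus5.mozilla.org/update/3/{product}/{version}/{buildID}/"
    "{buildTarget}/{locale}/{channel}/{osVersion}/default/default/update.xml"
).format(**fields)

print(url)
```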
15:31hhhh1612alright! so those are rules that are matched against incoming requests to validate them.
15:31bhearsumyou probably noticed that most of the columns on the Rules table line up with fields from the update URLs
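[A minimal sketch of that correspondence, assuming a simplified model where each non-wildcard rule column must equal the matching update-URL field; the function and column names here are hypothetical, not Balrog's actual matching code.]

```python
# Hypothetical sketch: a rule matches a request when every non-wildcard
# rule column equals the corresponding field parsed from the update URL.
def rule_matches(rule: dict, request: dict) -> bool:
    return all(
        value is None or request.get(column) == value  # None acts as a wildcard
        for column, value in rule.items()
        if column not in ("mapping", "fallbackMapping", "backgroundRate")
    )

rule = {"product": "Firefox", "channel": "release", "locale": None,
        "mapping": "Firefox-56.0-build6", "backgroundRate": 100}
request = {"product": "Firefox", "channel": "release", "locale": "de"}
print(rule_matches(rule, request))  # wildcard locale matches any request locale
```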
15:33hhhh1612& we need the Admin API over all this
15:34hhhh1612to control all of it.
15:34hhhh1612using the Admin UI
15:34hhhh1612and a database to store all the history related to previous rules, in the form of history tables.
15:42hhhh1612are those requests by applications automatic, or triggered?
15:44bhearsumthey're automatic, generally. Firefox checks for updates twice a day in the background, if i remember correctly
15:47hhhh1612bhearsum: what is a fallback mapping? I didn't clearly get it in the context of users.
15:49bhearsumhhhh1612: did you see the description of it on
15:51bhearsumso, fallbackMapping, mapping, and backgroundRate are all sortof tied together
15:51bhearsumif backgroundRate is less than 100, we roll a die to determine whether you get mapping or fallbackMapping
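[The die roll bhearsum describes can be sketched like this; the helper name is hypothetical, and this is an illustration of the idea rather than Balrog's actual implementation.]

```python
import random

# Sketch of the backgroundRate die roll: when backgroundRate < 100, a
# request gets `mapping` roughly backgroundRate% of the time and
# `fallbackMapping` the rest of the time.
def pick_release(mapping, fallback_mapping, background_rate, rng=random):
    if background_rate >= 100:
        return mapping  # fully rolled out: everyone gets the mapping
    if rng.uniform(0, 100) < background_rate:
        return mapping
    return fallback_mapping

counts = {"Firefox-56.0": 0, "Firefox-55.0.3": 0}
rng = random.Random(42)  # seeded so the demo is reproducible
for _ in range(10_000):
    counts[pick_release("Firefox-56.0", "Firefox-55.0.3", 25, rng)] += 1
print(counts)  # roughly a 25/75 split between mapping and fallback
```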
16:01glassercAre there circumstances in which a change is implicitly signed off by the person who made the change?
16:01glassercI thought I saw something but can't find it now
16:03bhearsumglasserc: yup, if the user holds one (and only one) of the required roles, that will happen when the change is created
16:03bhearsumit doesn't happen on updates to scheduled changes at the moment, but that's a bug
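[The "one (and only one) of the required roles" condition can be sketched as a set intersection; the function name and role names below are hypothetical, not Balrog's actual code.]

```python
# Hypothetical sketch of the implicit-signoff rule bhearsum describes:
# when the user holds exactly one of the roles required for a change,
# that role's signoff is recorded automatically at creation time.
def implicit_signoff(user_roles, required_roles):
    held = set(user_roles) & set(required_roles)
    if len(held) == 1:
        return held.pop()  # sign off with the single matching role
    return None  # zero or multiple matching roles: no implicit signoff

print(implicit_signoff({"releng"}, {"releng", "relman"}))            # releng
print(implicit_signoff({"releng", "relman"}, {"releng", "relman"}))  # None
```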
16:09glassercI see, it's in the DB code
16:10bhearsumdoes it seem odd to you that it's in the db code?
16:10glassercA little bit, because I think of DB code as being contextless
16:10glassercI was looking for it in the views
16:11glassercOn the other hand, it does make a good central place to enforce such a constraint
16:12bhearsumi usually put that sort of thing in the db layer because it's centrally enforced, it's definitely not ideal though
16:12bhearsumjlorenzo filed a bug a while back about it, actually
16:14glassercIn my mind, the DB layer in Balrog is like an ORM, which is to say that it should have certain kinds of business logic, namely defining what kinds of queries are available and how to map them to the database structure
16:15glassercI usually get nervous whenever people talk about "service" or "middleware" layers
16:16glassercI guess it can go either way, because in WSGI-land, middlewares are generic and flexible
16:16glassercBut at my last job, the "services" layer existed because the front-end team didn't want to be bothered actually talking to more than one backend service at a time
16:17glassercI get a bit of a J2EE vibe when I hear about "services", like they cover one bad API with another, differently bad API
16:17bhearsumoh, we definitely don't want that :)
16:18bhearsumif i remember jlorenzo's suggestion correctly, it's more like models on top of the database that help fulfill specific use cases. i think the idea is that they're particularly helpful if you need to do things that touch multiple different objects
16:18glassercYeah, that makes sense to me
16:18glassercI'm not 100% averse to the idea
16:19glassercI just don't want "services" to end up as the catch-all bucket that code goes because there's no clear place for it
16:19glassercLike "helpers" in Ruby on Rails
16:19jlorenzo+1 too :)
16:19bhearsumi have a strong aversion to dumping grounds
16:19jlorenzoI agree with what bhearsum said above
16:19bhearsumwe had (have?) a file in our buildbot repo called "" that was ~8000 lines long
16:20bhearsumit scarred me.
16:20jlorenzoouch, never seen that file
16:22jlorenzoI confess the service pattern was from a previous J2EE experience. I'm open to new ideas :)
16:23glassercIf I have any, I'll let you know, but don't let my complaining distract you from actually improving software :)
16:23jlorenzothe original point was to let that layer fetch all the required objects for a single request
16:23jlorenzosounds good!
16:23bhearsumglasserc: no worries, i love getting other perspectives like yours
16:25jlorenzoso do I ;D
16:25bhearsummiles: i just realized that it's probably the end of the summer dip that's causing those load spikes to be longer compared to a couple of weeks ago...
16:26bhearsumand i'm not sure the alerts were in place before the summer ADI dip happened
16:46hhhh1612bhearsum: I'm ready for a fresh balrog install now.
16:59bhearsumhhhh1612: sorry, what do you mean?
17:00hhhh1612should i try `docker-compose up` as before
17:00bhearsumif you're planning to work on another bug, that's a good idea :)
17:05hhhh1612yeah hoping this time it goes well.
17:13hhhh1612that's the part where things break
18:08bhearsumcan you paste with more context? npm is clearly having some issues
18:34bhearsummiles: any idea when we'll push today?
19:05milesbhearsum: I think I missed your message asking for it!
19:05bhearsumwhoops, sorry
19:05milesor is it standard to prod push on wednesdays?
19:05bhearsumi filed
19:05bhearsumit's usually the day we push, yeah
19:06bhearsumsorry, usually i just file the bug and relud does it
19:08milesgotcha - i will handle this!
19:09milesdoes order matter (web/admin) for this push?
19:09bhearsumthere are no migrations, so either order should be fine - IIRC relud prefers to start with admin though
19:19cloudops-ansiblebalrog-admin #205: master-f705320369ef6c12d499e881906fcf9936c96f1c deployed to prod /cc relud bhearsum
19:29cloudops-ansiblebalrog-web #180: mozilla/balrog:master-f705320369ef6c12d499e881906fcf9936c96f1c canary deployed to prod /cc relud bhearsum
19:32bhearsumso far so good
19:34milesbhearsum: looks like the canary is healthy, shall i proceed
19:34bhearsummiles: let's give it another ~10min just to be sure, if you don't mind
19:34milesis there a standard verification process?
19:35milesi do this infrequently enough that my understanding of it is ... limited
19:35milesand has somewhat the consistency of alphabet soup
19:35bhearsumi'm not sure what relud does - i usually drill down into the past hour on datadog, and do some functional verification on admin
19:36bhearsumthe only thing we can really check for on web while we're at 1% is error rates, i think
19:47bhearsummiles: i'm good to go whenever now
19:49cloudops-ansiblebalrog-web #180: please check balrog canary and promote to full deploy /cc relud bhearsum
19:49milesbhearsum: hitting ze button
19:49bhearsumi hope it's big and red!
19:55cloudops-ansiblebalrog-web #180: mozilla/balrog:master-f705320369ef6c12d499e881906fcf9936c96f1c deployed to prod /cc relud bhearsum
19:57bhearsumthanks miles, and sorry for the confusion earlier
19:58milesno worries! thanks for clarifying :)
20:29cloudops-ansiblebalrog-admin #206: building master-682ddcb176ab18f210cafd34410022fa5eab321e
20:30cloudops-ansiblebalrog-web #182: building mozilla/balrog:master-16d40758cc761fe68ba9bad813fe6f3d5b9bb8e3
20:31cloudops-ansiblebalrog-web #182: failed in build step /cc relud
20:37milesthanks for handling the bug!
20:37milesi'll kick that new build
20:38cloudops-ansiblebalrog-web #183: building mozilla/balrog:master-16d40758cc761fe68ba9bad813fe6f3d5b9bb8e3
20:41cloudops-ansiblebalrog-admin #207: master-16d40758cc761fe68ba9bad813fe6f3d5b9bb8e3 deployed to stage /cc relud bhearsum
20:41cloudops-ansiblebalrog-web #181: mozilla/balrog:master-682ddcb176ab18f210cafd34410022fa5eab321e deployed to stage /cc relud bhearsum
20:41cloudops-ansiblebalrog-admin #206: master-682ddcb176ab18f210cafd34410022fa5eab321e deployed to stage /cc relud bhearsum
20:41cloudops-ansiblebalrog-admin #206: admin_scaledown in stage failed /cc relud
20:42cloudops-ansiblebalrog-admin #207: admin_scaledown in stage failed /cc relud
20:44cloudops-ansiblebalrog-web #184: building mozilla/balrog:master-6d22c6df2f74dc85a5e864a0f8237a2a93285692
20:48cloudops-ansiblebalrog-web #183: mozilla/balrog:master-16d40758cc761fe68ba9bad813fe6f3d5b9bb8e3 deployed to stage /cc relud bhearsum
20:55cloudops-ansiblebalrog-web #184: mozilla/balrog:master-6d22c6df2f74dc85a5e864a0f8237a2a93285692 deployed to stage /cc relud bhearsum
20:56cloudops-ansiblebalrog-admin #208: master-6d22c6df2f74dc85a5e864a0f8237a2a93285692 deployed to stage /cc relud bhearsum
14 Sep 2017
No messages