mozilla :: #mdndev

18 May 2017
13:33jwhitlockgood morning #mdndev
16:22sheppyWow, the "Learn the best of web development" and other bold text is very very bold on the MDN home page.
16:28github[kumascript] Elchi3 opened pull request #182: Update compat macro to work in the new world (master...compat)
16:31rjohnsonfscholz: awesome! i was just starting to do the same work and wanting to talk to you about it, but you already did it!
16:32sheppyrjohnson: That's because fscholz is like a ninja!
16:55bensternthal"cat like typing detected"
17:03rjohnsonfscholz: i'll take a look at PR #182 right now
17:07fscholzrjohnson: great!
17:23github[kumascript] Elchi3 closed pull request #163: bcmacrotest (master...bcmacrotest)
17:32fscholzrjohnson: I added based on your comments in the meeting. Feel free to add more thoughts to it :)
17:33shobsonfscholz: I had been working in CompatDataTable.ejs to add the new compat tables, but it looks like you moved table generation to compat.ejs in is that correct?
17:34fscholzI was trying to create *one* macro :D Actually Will also did merge macros when he refactored for the new schema
17:34rjohnsonfscholz: beautiful! thank you! in fact, i already started a bit of the work yesterday (
17:35fscholzrjohnson: ohhh amazing :)
17:36fscholzshobson: so, yes, that's correct
17:40fscholzAfk for now, though I might be back later. Excited to find out how to publish that pre-release package.
17:44shobsonjwhitlock: Is "index, follow" correct in this comment?
17:44firebotBug 1259725 FIXED, [spam] Make user profile not indexable
17:47jwhitlocksomeday I'll search for a bug and not find 5 other bugs that need closing or updating
17:50shobsonAnd that will be because you took the time to close or update those related bugs.
18:14shobsonI'm going to spend this afternoon logged off of all the things.
18:15shobsonIf you need anything from me now's the time.
18:19jwhitlockshobson: oops, meant to do a prod push already
18:20shobsonI have time for that if we do it soonish
18:20shobsonThen... maybe... the zone CSS stuff is... done?
18:20jwhitlockyeah, that's the main thing going out
18:20jwhitlockk8s readiness endpoints
18:21jwhitlocklet's upgrade bleach too
18:21rjohnsona prod push sounds good to me
18:22jwhitlockhuh no KS merges since Monday
18:23mdnstagepushoh god, jwhitlock is pushing mdn-stage 101b0cffe30b828c366b414eb28ecddf1fc589e3
18:26mdnstagepushFinished: setup_dependencies (166.954s)
18:26mdnstagepushthe push is now going to the webheads!! (101b0cffe30b828c366b414eb28ecddf1fc589e3 jwhitlock)
18:26mdnstagepushFinished: update_info (2.150s), pre_update (171.346s)
18:27mdnstagepushFinished: update_assets (56.368s)
18:27mdnstagepushFinished: update_locales (13.173s), database (2.992s), update (72.533s)
18:27mdnstagepushFinished: rsync_project (6.266s), checkin_changes (6.298s)
18:27mdnstagepushFinished: deploy_app (4.842s), restart_web (4.285s)
18:28mdnstagepushFinished: restart_kumascript (18.708s)
18:28mdnstagepushjwhitlock pushed mdn-stage 101b0cffe30b828c366b414eb28ecddf1fc589e3
18:30jwhitlockok we've got bugs 944186, 1346462, and 1363916
18:30jwhitlockI guess I have to spoon feed firebot
18:30jwhitlockthat's bug 944186, bug 1346462, and bug 1363916
18:30firebot REOPENED Disable interface for managing custom Zone CSS and migrate all rules to /media
18:30firebot NEW upgrade to Bleach 2.0
18:30firebot NEW, rjohnson Add separate HTTP endpoints for use as Kubernetes "liveness" and "readiness" probes
18:32jwhitlockediting still works, so bleach isn't blowing up
18:32jwhitlockcurl -v # works
18:33jwhitlockcurl -v # works
18:33* shobson restarts Firefox before she can spot check zones
18:34fscholz"so bleach isn't blowing up"
18:36jwhitlockthanks for the push song
18:41jwhitlock"generic zones" look good to me -
18:43jwhitlockshobson: ready for prod?
18:43shobsonjwhitlock: yep, the ones I tested are good.
18:43* jwhitlock checks jenkins
18:44jwhitlockhmm failed
18:45jwhitlockall search failures
18:45jwhitlockok search is unavailable on staging
18:46jwhitlockmy guess is that the ElasticSearch upgrade went through some time this week
18:49jwhitlocktrying to confirm that, repopulate
18:54jwhitlockok, no ES on staging at the moment
18:54jwhitlockwill investigate later
18:55jwhitlockbut I think the push is OK for prod
18:55mdnprodpushI miss jbalogh, but anyway, jwhitlock is pushing mdn 101b0cffe30b828c366b414eb28ecddf1fc589e3
18:56jwhitlockericz: any info about elasticsearch on MDN staging?
18:57ericzjwhitlock: I'm making a mess of it now, packages are upgraded but the nodes don't yet want to talk to each other. Work in progress in other words.
18:57jwhitlockOK, good to know it is not this new kuma code
18:57jwhitlockericz: does it look like hours, days, weeks?
18:58mdnprodpushthe push is now going to the webheads!! (101b0cffe30b828c366b414eb28ecddf1fc589e3 jwhitlock)
18:58mdnprodpushFinished: setup_dependencies (157.728s), update_info (2.194s), pre_update (162.255s)
18:58ericzjwhitlock: It should be this week hopefully.
18:58ericzWorst case, next week.
18:59mdnprodpushFinished: update_assets (53.405s)
18:59mdnprodpushFinished: update_locales (11.783s)
18:59mdnprodpushFinished: database (3.213s), update (68.402s), rsync_project (4.096s), checkin_changes (4.130s)
18:59jwhitlockericz: thanks and good luck
18:59mdnprodpushFinished: deploy_app (23.387s)
18:59mdnprodpushFinished: restart_web (11.524s)
19:00mdnprodpushFinished: restart_kumascript (18.968s)
19:00mdnprodpushjwhitlock pushed mdn 101b0cffe30b828c366b414eb28ecddf1fc589e3
19:08wbambergthanks jwhitlock for fixing my BrowserSetting page move hell. everything seems fine now
19:08jwhitlockglad to hear it wbamberg
19:20salhello one of the mdn rabbit queues spiked
19:21salis anyone available to help?
19:21jwhitlocksal: do you know if it is prod or stage?
19:22saldeveloper-prod mdn_purgeable
19:22salgetting a bug ready
19:23jwhitlockI started a DB index on stage, but that will fail because elasticsearch isn't available
19:23saljwhitlock: need me to let webops know?
19:24jwhitlockwell, that was stage, not prod. give me 5 min to see if I can figure out what's up before we escalate
19:25jwhitlocksal: is #mdndev a good room for this, or would you prefer another?
19:29salyea should be good, mmm what component should I file this under?
19:30jwhitlockI usually use the IT request form, just a sec
19:31saljwhitlock: ill just do the usual
19:31jwhitlockInfrastructure & Operations, WebOps:Other
19:32jwhitlockI'm not seeing a spike in graphite
19:32salmm its at 31821 right now
19:32sal31804 actually
19:32firebotBug 1366044 NEW, server-ops-webops@mozilla-org.bugs developer-prod mdn_purgeable spiked 30k+ messages
19:36safwanMaybe a lot of celery tasks!
19:38jwhitlockyes, but why
19:39safwanjwhitlock: Maybe a lot of revision comparisons again
19:46jwhitlockmost are cachebacj.tasks.refresh_cache
19:46jwhitlocksorry cacheback
19:48ericzI'm around if I can help at all
19:51jwhitlockit looks like they are processing, total count going down
19:51jwhitlockI need to step out, back in 30 min
20:14fscholz this is a cool calculator. I think writing "1.0.x" in your package.json would always give you the latest 1.0.1337 version etc.
20:15fscholzlike "1.0.30000670 is the latest of 765 releases"
20:16rjohnsonfscholz: yeah, that is cool
20:17fscholzIf we change the data structures or the schema, we would release 1.1.x or even 2.0.x
20:17fscholzAnd my understanding is that 1.0.x would not load it, but stick to 1.0.1337
20:21rjohnsonfscholz: that sounds great, yes, that's my understanding as well. i wonder what most people use for backwards-incompatible changes, do they use 1.1 --> 1.2 or 1 --> 2? in my experience outside of npm, 1 --> 2 is more common for changes that would break old code
20:24fscholz says that "1.1" would be new features but not breaking existing features and "2.0" would break backwards compat.
20:24rjohnsonfscholz: that makes sense
20:24fscholzSeems like a repo about compat data should stick to these rules :D
20:24rjohnsonfscholz: yes! :)
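
[Editor's note] The "1.0.x" behaviour discussed above can be illustrated with a small sketch. It is a simplified stand-in for x-range matching, not npm's actual semver implementation; the function names and the list of published versions are hypothetical.

```python
def matches_x_range(version, pattern):
    """Return True if a three-part version satisfies an x-range like "1.0.x"."""
    v_parts = version.split(".")
    p_parts = pattern.split(".")
    if len(v_parts) != 3 or len(p_parts) != 3:
        return False
    # An "x" (or "*") in the pattern accepts any value in that position.
    return all(p in ("x", "X", "*") or p == v for p, v in zip(p_parts, v_parts))


def latest_matching(versions, pattern):
    """Pick the highest version that satisfies the pattern, or None."""
    candidates = [v for v in versions if matches_x_range(v, pattern)]
    return max(
        candidates,
        key=lambda v: tuple(int(n) for n in v.split(".")),
        default=None,
    )


# Hypothetical release history: a schema change shipped as 1.1.0 and 2.0.0.
published = ["1.0.0", "1.0.1337", "1.1.0", "2.0.0"]
print(latest_matching(published, "1.0.x"))  # 1.0.1337
```

Under these assumptions a consumer pinned to "1.0.x" keeps receiving patch releases but never picks up 1.1.0 or 2.0.0, which matches the understanding stated in the conversation.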
20:25jwhitlockok, we're at 27,848 pending tasks
20:27jwhitlocklet's say 70 tasks a minute
20:27jwhitlockor let's say 73.259
20:28jwhitlock380 minutes to get through them all
20:28jwhitlockor six hours
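
[Editor's note] The estimate above is a simple division of the pending count by the processing rate. A minimal sketch using the figures quoted in the channel, and assuming no new tasks are queued while the backlog drains (the function name is hypothetical):

```python
def drain_time_minutes(pending_tasks, tasks_per_minute):
    """Estimate minutes needed to clear a queue at a constant processing rate."""
    if tasks_per_minute <= 0:
        raise ValueError("processing rate must be positive")
    return pending_tasks / tasks_per_minute


minutes = drain_time_minutes(27848, 73.259)
print(f"{minutes:.0f} minutes, about {minutes / 60:.1f} hours")
# 380 minutes, about 6.3 hours
```

If tasks keep arriving during the drain, the real time would be longer than this estimate.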
20:28rjohnsonjwhitlock: did one or more workers stop processing for a while and that's why it built-up?
20:28jwhitlockgood question, let me look at NR
20:32jwhitlocksal: do you have a graph of the count over time?
20:32jwhitlockor ericz? ^
20:33ericzI don't know of a graph for that.
20:34salDon't think so let me look tho
20:34* safwan Thinking its thrilling to work as SysAdmin!
20:34jwhitlockI had, but shows 0
20:35jwhitlockrjohnson: no, all 4 celery workers look like they've been busy
20:37jwhitlockthe document zone task is the new one, so it's the most likely suspect
20:40jwhitlockmeh, it's a mix
20:41ericzYeah I don't see anything in graphite or newrelic but seems like a great candidate for graphite. You'd think there's some sort of pre-built thing for rabbit.
20:41ericzin collectd or statsd.
20:42* jwhitlock googles it
20:43jwhitlockok, that may be a cool thing for a different day
20:51jwhitlockhmm I wonder if DocumentNearestZoneJob does the right thing for a page that is not zoned
20:52jwhitlockmy guess is the right thing is "cache that there is no zone"
20:52jwhitlockand the wrong thing would be "I don't know, but ask me next time too"
20:54jwhitlockbleach update requires rebuilding the docker image
20:56jwhitlockthe other possible thing is that 3 hours was a good choice for DocumentZoneURLRemapsJob, but DocumentNearestZoneJob should be a day or longer
20:58rjohnsonjwhitlock: just got out of my 1:1 (not my usual time)
20:59jwhitlockok still investigating
20:59rjohnsonjwhitlock: oh! you've hit on something
21:00rjohnsonjwhitlock: i return None if the doc has no zone, which is the same as "nothing in the cache"! ugh! didn't think of that
21:01rjohnsonjwhitlock: as you said, i need to cache something that means "there is no zone"
21:01rjohnsoni can't use None
21:01jwhitlockjust a sec, getting the code
21:01jwhitlockyou return self.empty()
21:02rjohnsonand the default for that is None i think
21:04jwhitlockoops needed VERSION=latest make build
21:05rjohnsonyeah, None is interpreted as a cache miss
21:05rjohnsonok, i'll need to change that
21:06rjohnsongood catch jwhitlock!
21:07jwhitlockI'm seeing if I can verify
21:07rjohnsonit turns out i'm the culprit for this celery spike!
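
[Editor's note] The problem identified above is that a cached None cannot be told apart from a cache miss, so unzoned pages are recomputed on every request. A minimal sketch of the sentinel approach follows. It uses a plain dict in place of the real cache and is not the cacheback API or the kuma code; `NearestZoneCache`, `NO_ZONE`, and the zone paths are hypothetical.

```python
NO_ZONE = "__no_zone__"  # hypothetical sentinel meaning "looked it up, there is no zone"


class NearestZoneCache:
    def __init__(self, zones):
        self.zones = zones   # maps a zone root path to a zone name
        self.cache = {}      # stand-in for the real cache backend
        self.lookups = 0     # counts expensive computations

    def _compute(self, path):
        """Walk up the path until a zone root is found; None if unzoned."""
        self.lookups += 1
        while path:
            if path in self.zones:
                return self.zones[path]
            path = path.rpartition("/")[0]
        return None

    def get(self, path):
        cached = self.cache.get(path)  # None here means a cache miss
        if cached is None:
            result = self._compute(path)
            # Store the sentinel instead of None so the negative result is remembered.
            cached = NO_ZONE if result is None else result
            self.cache[path] = cached
        return None if cached == NO_ZONE else cached


cache = NearestZoneCache({"/docs/Firefox": "firefox-zone"})
cache.get("/docs/Web/CSS")
cache.get("/docs/Web/CSS")
print(cache.lookups)  # 1
```

The sentinel is a string so that it would survive serialization in a shared cache; storing None directly would send every unzoned page back through the expensive lookup.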
21:11rjohnsonjwhitlock: i'm going to start working on a pr to fix this right now
21:12jwhitlockok please include fix for
21:12firebotBug 1365960 NEW, Error if a translation's parent is deleted
21:15rjohnsonjwhitlock: will do
21:23jwhitlocksal: ericz: I think we've identified a recent change that would cause an increase in tasks
21:23ericzSounds promising!
21:23saljwhitlock: nice
21:23jwhitlockcan you live with alerts until Friday?
21:24salits at 25854 right now
21:24rjohnsonjwhitlock: should i open a new bug or reference bug 1366044 (for the nearest zone job bug)
21:24firebot NEW, developer-prod mdn_purgeable spiked 30k+ messages
21:25jwhitlockrjohnson: no need for a new bug. I'd use the original bug, maybe mention 1366044 in comment
21:26ericzYeah which bug is original? I'll reference it in mine.
21:27rjohnsoneriz, jwhitlock: i think it was bug 944186
21:27firebot REOPENED, Disable interface for managing custom Zone CSS and migrate all rules to /media
21:27rjohnsonericz: ^^
21:28jwhitlock and I need to make sure bug #s are in commit messages when I review
21:29saljwhitlock: sure, will the messages keep spiking tho? maybe purge after it goes over 20k? your call
21:30saluntil friday that is
21:30jwhitlocksal: there may be some cycles of adding a bunch at once, maybe a three hour cycle. I'd rather avoid purges, if you can ignore the alerts for 24 hours or so
21:31saljwhitlock: i can downtime for 24h then
21:35github[kumascript] stephaniehobson deleted stephaniehobson-empty-embedcompattable at e51dcd7:
21:42ericzjwhitlock|away: btw, mdn stage es is up at the new version now but I need to do a fair bit more work still to get this in puppet, experiment with backups and whatnot so expect some churn. I'll keep it as available as I can though.
22:20jwhitlockericz: thanks for the update. Take your time and get it right for prod!
22:22jwhitlockthe DocumentNearestZoneJob change would have gone to production last Thursday
22:22jwhitlockso this backup might have been steadily creeping up over a week
22:22jwhitlockand, I expect it will clear very quickly when fixed
22:31rjohnsonjwhitlock: of course, my docker env doesn't start now :(, i'm getting this:
22:31jwhitlockVERSION=latest make build
22:32jwhitlockand then
22:32rjohnsoni already did the kuma reset
22:36rjohnsonah, the kuma-base build and push failed on the bleach merge due to mysql --
22:37jwhitlockah that would explain why kuma reset was not enough
22:37rjohnsonyeah, i thought it was strange that i wasn't picking up the latest kuma-base
23:27safwanrjohnson: Good morning. You can manually install the requirements by ssh into the container
19 May 2017
No messages