mozilla :: #mdndev

17 Apr 2017
14:04jwhitlockmoin #mdndev
15:39jwhitlockugh, I thought the Unicode normalization was a quirk with GitHub, but we have the problem on MDN too
15:44rjohnsonjwhitlock: to help me understand, can you describe an example?
15:46jwhitlockwell, I'm trying, but coming up short :)
15:46rjohnson:)
15:46jwhitlockhere's a NFC URL - https://developer.mozilla.org/el/docs/%CE%95%CF%86%CE%B1%CF%81%CE%BC%CE%BF%CE%B3%CE%AD%CF%82
15:46jwhitlockand the NFD URL - https://developer.mozilla.org/el/docs/%CE%95%CF%86%CE%B1%CF%81%CE%BC%CE%BF%CE%B3%CE%B5%CC%81%CF%82
15:47jwhitlockbut they both appear to work
15:48rjohnsonjwhitlock: is it when someone uses a combination of unicode chars (meant to be a single char) in a url, and we don't understand that particular combination as matching the single char (because we only understand one of several combos)?
15:48jwhitlockI'm still digging into it. My scraper had problems with this page, got a 404
15:49jbradberryI get a 404 for the latter link
15:49jwhitlockjbradberry: which browser?
15:49jbradberryFirefox, naturally.
15:49jwhitlockme too
15:49jbradberry(I only use Chrome when I need to test cross-browser functionality)
15:49rjohnsoni get a 404 for the latter link as well
15:50jbradberry52.0.2 for the version.
15:50jwhitlockok it's only safari that does a normalization before requesting - interesting
15:51jbradberryAre you going to need to normalize the normalization? ;P
15:51jwhitlocklooks that way :/
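A minimal sketch of why the two Greek URLs above differ, using only the Python standard library. The variable names are illustrative; the percent-encoded strings are the last path segments of the NFC and NFD URLs quoted at 15:46.

```python
import unicodedata
from urllib.parse import unquote

# Last path segment of each URL quoted above, decoded from UTF-8 percent-encoding.
nfc_slug = unquote("%CE%95%CF%86%CE%B1%CF%81%CE%BC%CE%BF%CE%B3%CE%AD%CF%82")
nfd_slug = unquote("%CE%95%CF%86%CE%B1%CF%81%CE%BC%CE%BF%CE%B3%CE%B5%CC%81%CF%82")

# The two strings render the same but are different code point sequences:
# NFC uses U+03AD, NFD uses U+03B5 followed by combining U+0301.
print(nfc_slug == nfd_slug)          # False
print(len(nfc_slug), len(nfd_slug))  # 9 10

# Normalizing the NFD form to NFC yields the NFC form.
print(unicodedata.normalize("NFC", nfd_slug) == nfc_slug)  # True
```

A server that compares the raw bytes of the request path against the stored slug will treat these as two different pages, which matches the 404 seen in Firefox.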
15:52jbradberryStill on MySQL, or did the Postgres conversion get anywhere?
15:53jwhitlockstill on mysql
15:53jwhitlockmetadave experimented with it, but pgloader / postgresql didn't like some of our unicode data
15:54jwhitlockok, this is three levels away from the thing I was trying to do, so I'm going to take a note to fix it and go back to setting up the A/B test
15:56jwhitlockbut first
15:56jwhitlocknfd URL - https://developer.mozilla.org/bn-BD/docs/Mozilla/%E0%A6%AB%E0%A6%BE%E0%A6%AF%E0%A6%BC%E0%A6%BE%E0%A6%B0%E0%A6%AB%E0%A6%95%E0%A7%8D%E0%A6%B8
15:57jwhitlocknfc URL - https://developer.mozilla.org/bn-BD/docs/Mozilla/%E0%A6%AB%E0%A6%BE%E0%A6%AF%E0%A6%BC%E0%A6%BE%E0%A6%B0%E0%A6%AB%E0%A6%95%E0%A7%8D%E0%A6%B8
15:57jwhitlockhmm nevermind those are the same
16:01jwhitlockah https://developer.mozilla.org/bn-BD/docs/Mozilla/%E0%A6%AB%E0%A6%BE%E0%A7%9F%E0%A6%BE%E0%A6%B0%E0%A6%AB%E0%A6%95%E0%A7%8D%E0%A6%B8
16:02jwhitlockthat's the non-normal form in the MDN database
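A short sketch of why the Bengali NFD and NFC URLs came out identical while the stored form differs, again using only the Python standard library. The stored slug contains U+09DF, which has a canonical decomposition to U+09AF U+09BC and is listed in Unicode's composition exclusions, so neither NFC nor NFD ever produces it. The variable names are illustrative.

```python
import unicodedata
from urllib.parse import unquote

# Last path segment of the URL from the 16:01 message (the form stored by MDN).
stored = unquote(
    "%E0%A6%AB%E0%A6%BE%E0%A7%9F%E0%A6%BE%E0%A6%B0%E0%A6%AB%E0%A6%95%E0%A7%8D%E0%A6%B8"
)
# Last path segment of the URLs from the 15:56 and 15:57 messages.
normalized = unquote(
    "%E0%A6%AB%E0%A6%BE%E0%A6%AF%E0%A6%BC%E0%A6%BE%E0%A6%B0%E0%A6%AB%E0%A6%95%E0%A7%8D%E0%A6%B8"
)

nfc = unicodedata.normalize("NFC", stored)
nfd = unicodedata.normalize("NFD", stored)

print("\u09df" in stored)  # True
print(nfc == nfd)          # True
print(nfc == normalized)   # True
print(nfc == stored)       # False
```

So a client that normalizes before requesting, in either form, can never produce the byte sequence that is in the database for this page.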
16:46* sheppy grumbles at search for "5.1" turning up results with any usage of the two digits.
16:47sheppy(either of them)
16:54jwhitlockhttps://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=site:developer.mozilla.org+5.1+audio
17:49sheppyrjohnson: So... constructors aren't tracked in the JSON. The pages for them on MDN get a "Constructor" tag, and that's used to identify them.
17:58rjohnsonthanks sheppy
17:58sheppyThat said, it might be logical to add a "constructor" key somewhere. I don't know.
18:52jwhitlockrjohnson: I think I forgot to mention it - the new macros code should help with the error messages
18:54jwhitlockmacro_sources gives a dict of case-sensitive names to repository names w/ extensions - https://github.com/mozilla/kuma/blob/master/kuma/wiki/kumascript.py#L290-L308
18:55jwhitlockit's not a simple lookup - you may have to iterate through the keys to match the case in the error statement
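A minimal sketch of the case-insensitive match described above. The function name `find_macro_source` and the sample dict are hypothetical; the only thing taken from the conversation is that the dict maps case-sensitive macro names to file names with extensions.

```python
def find_macro_source(macro_sources, name_in_error):
    """Return (macro_name, source_file) for the first key matching ignoring case."""
    wanted = name_in_error.lower()
    for macro_name, source_file in macro_sources.items():
        if macro_name.lower() == wanted:
            return macro_name, source_file
    return None


# Hypothetical sample data, shaped like the dict described above.
sample_sources = {"CSSRef": "CSSRef.ejs", "jsxref": "jsxref.ejs"}

print(find_macro_source(sample_sources, "cssref"))   # ('CSSRef', 'CSSRef.ejs')
print(find_macro_source(sample_sources, "missing"))  # None
```

If many error messages need resolving, building a second dict keyed by the lowercased names once would avoid repeating the linear scan.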
18:56rjohnsonjwhitlock: thanks! that will be helpful!
18:58* rjohnson goes for lunch
20:48jbradberrySo for the whole unicode url kerfuffle above, is it desired for both to land on the same page, or instead for one to redirect to the other?
21:02jwhitlockjbradberry: if we always stored URLs in the same form, then we could potentially normalize incoming requests and redirect them to the right URL
21:03jwhitlocksince we don't, we can't, and you have to use the same utf8 byte sequence the author used
21:04jwhitlockI'm not 100% sure what the next action is, but I suspect we need a few changes to the slugify functions, and the form cleanup and validation
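A sketch of the normalize-and-redirect idea, under the assumption stated above that every stored URL is already in one form (NFC here). The function name `canonical_path` is hypothetical and this is not kuma code; it only shows the string handling with the Python standard library.

```python
import unicodedata
from urllib.parse import quote, unquote


def canonical_path(request_path):
    """Return the NFC percent-encoded path to redirect to, or None if already NFC."""
    decoded = unquote(request_path)
    normalized = unicodedata.normalize("NFC", decoded)
    if normalized == decoded:
        return None  # already canonical, serve the page directly
    return quote(normalized)  # "/" is left unescaped by default


nfd_path = "/el/docs/%CE%95%CF%86%CE%B1%CF%81%CE%BC%CE%BF%CE%B3%CE%B5%CC%81%CF%82"
print(canonical_path(nfd_path))
# /el/docs/%CE%95%CF%86%CE%B1%CF%81%CE%BC%CE%BF%CE%B3%CE%AD%CF%82
```

With slugs stored in mixed forms this would redirect away from pages whose stored slug is not NFC, which is why it depends on cleaning up the stored data first.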
21:09jbradberryI should also point out, I just noticed, that https://developer.mozilla.org/en-US/docs/Web/HTML is the same page as https://developer.mozilla.org/en-US/docs/Web/html
21:12jwhitlockha and https://developer.mozilla.org/en-US/docs/WEB/HTML and https://developer.mozilla.org/en-US/docs/wEB/hTmL
21:12jwhitlockwow I hate this website
21:13jwhitlockI'm not sure if "case-insensitive URLs" is a designed feature or a side effect of MySQL
21:13jbradberryI suspect side effect.
21:15jbradberryAnyway, back to the original can of worms, perhaps it'd be worthwhile to do a data migration to convert everything to NFC?
21:16jbradberry(which is, I think, the normalization form Safari et al. use)
21:17jwhitlockyes, but not until we fix the forms - need to turn off the hose of bad data before draining the basement
21:17jbradberryfair enough
21:17jwhitlockalso, not sure of the priority - this is not a recent problem
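A rough sketch of the planning step for such a data migration: find slugs that are not in NFC and flag any whose NFC form would collide with another slug. The function name `plan_nfc_migration` and the sample slugs are hypothetical; a real migration would read slugs from the database and also have to deal with redirects and MySQL's collation behavior, which this does not attempt.

```python
import unicodedata


def plan_nfc_migration(slugs):
    """Return (renames, collisions) for converting the given slugs to NFC."""
    taken = set(slugs)  # slugs that exist now or are claimed by a planned rename
    renames = {}        # old slug -> NFC slug, safe to rename
    collisions = []     # old slugs whose NFC form is already taken
    for slug in slugs:
        nfc = unicodedata.normalize("NFC", slug)
        if nfc == slug:
            continue
        if nfc in taken:
            collisions.append(slug)
        else:
            renames[slug] = nfc
            taken.add(nfc)
    return renames, collisions


# Hypothetical sample slugs: one containing U+09DF, one NFD/NFC pair, one ASCII.
sample = ["Mozilla/\u09ab\u09be\u09df", "e\u0301", "\u00e9", "plain"]
renames, collisions = plan_nfc_migration(sample)
print(len(renames), collisions == ["e\u0301"])  # 1 True
```

Collisions need a human decision (merge or keep one page), so the sketch only reports them rather than renaming.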
21:24shobsonLOL @ "hose of bad data"
21:30safwanjwhitlock: Hey, is it a holiday in the US?
21:31jwhitlocksafwan: not a company holiday. Some people, like one of my kids, have the day off
21:31jwhitlockseems to be a common holiday in Europe
21:31safwanOh, School holiday
21:32* safwan remembering the school life and many holidays!
21:32shobsonIn Canada it is mostly the union jobs that have today off. Postal workers, teachers, passport office...
21:32shobsonBut all of Canada got Friday off.
21:34557A68BAU[kumascript] escattone pushed 3 new commits to master: https://git.io/vSFyD
21:34557A68BAUkumascript/master 7ebc5da Joseph Medley: Add Media Session API to GroupData.json.
21:34557A68BAUkumascript/master 86133b7 Joe Medley: Fix typo.
21:34557A68BAUkumascript/master 7cf5b06 Ryan Johnson: Merge pull request #157 from jpmedley/msession...
21:34203A8AK40[kumascript] escattone closed pull request #157: Add Media Session API to GroupData.json. (master...msession) https://git.io/vS9za
21:34safwanHere I got one day extra holiday each year! "Military day" ;)
21:35shobsonhttps://en.wikipedia.org/wiki/Army_Day_(India) ?
21:36jwhitlockok, now it's started to get ridiculous - running two kuma docker-compose environments at once
21:37shobsono.O
21:37jwhitlockit requires editing the ports lines in docker-compose.yml, can't be overridden
21:38jwhitlockshobson: any definite feelings about https://github.com/mozilla/kuma/pull/4180 ?
21:39shobsonjwhitlock: Haven't had another chance to revisit the rabbit hole. You portrayed what I wanted accurately.
21:39jwhitlockok, new commit inbound
21:39shobsonNot sure it's worth making the change, maybe we should just accept the work you've done.
21:39shobsonOh, okay :)
21:44jwhitlockwe can change if it doesn't give the data we want
21:45jwhitlocksorry that sounded wrong - if it doesn't allow us to segment usage as desired
21:45shobsonheh
21:50safwanshobson: https://en.m.wikipedia.org/wiki/Armed_Forces_Day_(Bangladesh)
21:52shobsonOh, I don't know why I thought you were in India, sorry!
21:52safwanNo problem
22:05safwanjwhitlock: Is there any specific reason to make the "Yes" "No" localizable? https://github.com/mozilla/kuma/blob/b32a9c951ac32fb1ed28122d5ed297a45e7eee15/kuma/core/templatetags/jinja_helpers.py#L88
22:06jwhitlocksafwan: if you want to say "yes" and "no" in the page's locale
22:10safwanjwhitlock: Ok. Cool
22:17github[kumascript] jpmedley opened pull request #160: Add two more APIs to GroupData.json (master...credmng) https://git.io/vSFdN
23:56github[kumascript] escattone pushed 1 new commit to add-json-linter-1354243: https://git.io/vSbJ4
23:56githubkumascript/add-json-linter-1354243 3e2bf81 rjohnson: rename L10n-CSS.css and delete CustomSampleCSS.css
18 Apr 2017
No messages
   