mozilla :: #datapipeline

12 Sep 2017
00:31 <joy> RaFromBRC: ashort thanks much. This is for a personal flask app which needs to be authenticated
16:18 <sunahsuh> i'm thinking about throwing the parquet-tools complete jar into s3://telemetry-spark-emr-2/jars so everyone has access to it without having to clone and build it (and run into issues with missing dependencies which require editing pom.xml, ugh) -- any objections?
16:21 <mreid> sgtm. maybe we should put a readme in there too
16:21 <sunahsuh> under /jars?
16:22 <mreid> sure
16:23 <mreid> unless there's already documentation for what should go in there
16:23 <sunahsuh> well, the alternative i was considering was creating a telemetry-assets bucket or somesuch
16:24 <sunahsuh> because it does seem to be a slight deviation from the intent for the telemetry-spark-emr-2 bucket
16:24 <mreid> yeah, I've thought about that too - a place to put artifacts
16:24 <mreid> but I don't wanna bikeshed your useful suggestion out of commission :)
16:25 <sunahsuh> haha no, glad to hear a +1, i'm going to create telemetry-artifacts, and file a bug to put a README at the root of each `telemetry-` bucket stating its purpose :D
16:25 <mreid> excellent
16:25 <mreid> next step: publish t-b-v to that bucket on merge to master :)
16:26 <mreid> and also possibly push the pipeline schemas there too
16:26 <sunahsuh> everything's coming together
16:27 <sunahsuh> https://media.giphy.com/media/8fen5LSZcHQ5O/giphy.gif
16:33 <frank> mreid: how about publish it there on a new release; jobs can then choose which release to use
16:33 <sunahsuh> why not both?
16:33 <mreid> t-b-v.latest.jar + t-b-v.<specific_release>.jar?
16:36 <frank> ooh yeah, that sounds even better
16:58 <mreid> kinda like the EMR release pinning
18:33 <mreid> sunahsuh: d'ya file that bug re: artifacts?
18:33 <mreid> owait
18:33 <mreid> the bug was about readmes for all the buckets. n/m
18:33 <sunahsuh> about t-b-v specifically, no
18:34 <sunahsuh> :)
18:34 <mreid> never mind. reading comprehension fail over here :)
19:09 <joy> where is the histogram viewer?(the one with cdfs)
19:15 <frank> joy: TMO has cdfs, it's an option
19:16 <frank> joy: e.g. https://mzl.la/2wX6OED
19:32 <frank> woah, you can kill jobs from the UI in Spark 2.2
19:42 <mreid> joy: https://gauss.telemetry.mozilla.org
19:42 <joy> thanks
20:53 <tcsc> amiyaguchi: any chance you'll get to look at https://github.com/mozilla/telemetry-batch-view/pull/290 any time soon?
20:54 <amiyaguchi> tcsc: ah, I'll try to get to it soon
20:54 <amiyaguchi> I
20:54 <amiyaguchi> I'm sorry about that delay
20:54 <tcsc> awesome.
20:56 <tcsc> is there a plausible way for me to use https://github.com/mozilla/moztelemetry/pull/6 with it? i tried for a bit last week and (i think i) managed to get something working locally, but it seemed like it would need a version bump, or at least a republish of the snapshot to work
20:57 <tcsc> note that that's landed. it also has a huge benefit if i select the parallelism there and just go with it instead of doing a repartition(100) (or even a coalesce(100) at the end
20:57 <frank> tcsc: I can publish moztelemetry
20:57 <tcsc> oh, could you do that?
20:57 <frank> tcsc: yup, doing it now
20:58 <tcsc> that's awesome. i should have asked here earlier
21:01 <frank> tcsc: should be good to go now
21:41 <tcsc> amiyaguchi: ugh, i think i've mucked up that pr on github somehow when trying to update it. (seems like github isn't a huge fan of my m-c workflow D:). i'll try to work this out, just an fyi in case you take a look soon
21:42 <amiyaguchi> turned the branches into brambles?
21:43 <tcsc> well, i rebased, and the pr now shows all the commits that i rebased over
21:43 <tcsc> which is :|
21:44 <harter> amiyaguchi: heh, never heard that before.
21:48 * tcsc thanks god for git-cherry-pick
13 Sep 2017
No messages