mozilla :: #rust-infra

17 Jul 2017
01:26est31wtf
01:26est31simulacrum: do you know how cargo is being distributed with rust nightlies?
01:36simulacrumWe lock it to a particular commit and it's built on each PR
01:36simulacrumest31: I can possibly clarify further if you clarify the confusion/question :)
01:37est31ok
12:27carols10centsheads up for triage: niko is traveling for the next month and isn't going to be as available as he usually is. he is checking in, so we should ping him, but perhaps be less aggressive with the pings :)
12:31est31if you want to triage a bug, here is one w/o tags: https://github.com/rust-lang/rust/issues/43250
13:43nmatsakisanybody seen this stuff "test process::tests::test_process_output_fail_to_start has been running for over 60 seconds"?
13:44nmatsakiscarols10cents: also, re: "niko is traveling for the next month and isn't going to be as available as he usually is", I appreciate you not saying "is going to be even less available than he usually is" ;)
13:44nmatsakisI'm wondering about gh43271
13:44nmatsakishttps://github.com/rust-lang/rust/pull/43271#issuecomment-315724528
13:44nmatsakislooks like @bors: retry material to me
14:02acrichtonmatsakis: hm yeah that does look spurious, although not one we've ever seen before
14:02nmatsakisacrichto: ok I wasn't sure whether to just @bors retry, or if there was a procedure (at minimum, ccing some tracking issue?)
14:03acrichtonmatsakis: oh sure no worries, I can retry it soon w/ a cc tag
14:03acrichtonmatsakis: in general though yeah we just try to cc some tracking issue whenever we see a `@bors: retry`
14:25carols10centsnmatsakis: <3 ;)
14:32est31acrichto: isn't the next release 1.19?
14:32est31or did I misunderstand sth?
14:33est31regarding https://github.com/rust-lang/rust/pull/43285
14:33acrichtoest31: I even wrote 1.19 in the commit
14:33acrichto(yes)
14:33est31:)
15:33tomprinceaturon: https://bugzilla.mozilla.org/show_bug.cgi?id=1380236#c13 is another cool bit of mozilla's CI infra.
18:23acrichtoTimNN: on a scale of 1-10, how excited would you be to lead the charge to update to llvm 5.0.0?
18:24acrichtomy prediction is like a -4, so no worries if you're busy :)
18:28TimNNacrichto: excited? -100
18:28acrichtolol
18:28TimNNAlthough I do have the Time starting next Weekend
18:29TimNNWhen will 5.0 be released?
18:29acrichto5.0.0 branch will happen this wednesday
18:29acrichtoso I figured it'd be great if we could start to get out ahead of that
18:29acrichtoand ideally be ready to go by the time they cut 5.0.0
18:29TimNNAnd I can probably motivate myself despite the lack of motivation
18:30acrichtooh this isn't like pressing or anything
18:30acrichtoso if you've got higher-priority things please feel free to tackle them
18:30acrichtoI just got the llvm weekly newsletter and figured I'd ask
18:30acrichtowe may be able to solicit community contributions on this as well
18:30acrichtow/ a tracking issue and such
18:30TimNNNo, I can do the upgrade, just not before the weekend
18:30acrichtoheh ok!
18:31TimNNWill emscripten update as well?
18:31acrichtohm good point
18:31acrichtobut tbh
18:31acrichtoI don't know why we need them to update...
18:32acrichtoI think we can all start working in parallel though
18:32acrichtovs blocking on one another
18:32acrichtoI'd be pretty surprised if we actually needed to update emscripten
18:32TimNN
18:32nagisadoesnt LLVM have the asmjs backend now
18:32steveklabnikwasm, no asmjs
18:32steveklabnik(iirc)
18:32nagisathat makes emscripten unnecessary
18:32nagisawasm, sure
18:32nagisasame turd other hand.
18:33nagisaso yeah maybe we can evaluate dropping emscripten now?
18:33steveklabniksoooooooooooo there was an internals thread....
18:34nagisanot like it is a particularly interesting target which would warrant wasting our time every LLVM release
18:34steveklabnikhttps://internals.rust-lang.org/t/moving-webassembly-support-forward/5460/33?u=steveklabnik
18:34nagisa(and also suffer the bugs it brings)
18:50carols10centsnagisa: could you please not call other people's hard work a turd? thanks!
19:44aturonacrichto: hey, so i was talking a bit to aidanhs about the overall infra revamp
19:44aturonthinking that, since we're sort of in "evaluation" mode right now, it'd be good to prototype a minimalistic/incremental alternative
19:44aturonand then we can compare the cost/benefit
19:45aturoni plan to go back over the previous meetings etc and try to gather all of the concerns, so we can do a basic comparison
19:45acrichtoseems reasonable to me yeah
19:45aturonaidanhs has graciously offered to explore a minimalistic approach
19:45aturoni mentioned basically trying to follow the rust-central-station model
19:45aturonone thing that's not immediately clear: how to handle the bastion
19:46aturondo you have thoughts on that from a minimalistic perspective?
19:46acrichtohm
19:46aidanhs(acrichto: briefly so I'm clear, rcs is just a docker container you restart occasionally?)
19:46acrichtoaidanhs: correct yeah
19:46acrichtoboot a stock ec2 image, install docker, run rcs itself via the scripts it mentions
19:46acrichtoe.g. just run it in the background with docker
19:46aidanhsok
19:46acrichtothe bastion is indeed tricky and may not mesh well with docker
19:47acrichtob/c it deals with user accounts and whatnot
19:47* tomprince thinks rcs should be split into multiple containers
19:47acrichtothere's that open source project I think called teleport which may work well here?
19:47aidanhswell also, a crucial part of the bastion is the network isolation
19:47acrichtooh right so in terms of deployment of the instance itself that&#39;s all hand-crafted for rcs
19:47acrichtoin that I spun up some server a long time ago
19:47acrichtoand manually configured a security group and whatnot
19:47tomprincehttp://gravitational.com/teleport/
19:48acrichtoit'd in theory also be great to not do that, but I'm not sure it's critical to not have to do that
19:48aturonacrichto: right, and one thing i was thinking: we'll want to use docker (or something similar) in the end for most of these services, so why not *start* there as an incremental step
19:48acrichtooh yeah definitely
19:48aturonand then we can later work on systematizing instance deployment etc
19:49acrichtoI'm not sure if the bastion is a great fit for docker
19:49acrichtoit'd require some investigation I think
19:49aturon(IOW i think this is all a prerequisite for e.g. terraform)
19:49acrichtoright yeah (or so I'd imagine)
19:49aidanhsI'm leaning -1 on bastion in docker
19:49tomprinceaturon: I'm not sure what the incremental step from where we are is, for just doing something with docker is?
19:49aturonit could be that for the time being, just documenting the instances and having whatever key info is needed in 1password is already a big step toward maintainability
19:49aidanhsok I'll have a think about the pieces that fit together
19:50acrichtoaidanhs: lemme get you something
19:50tomprinceIt seems like that is already how RCS is deployed?
19:50acrichto aidanhs https://github.com/edunham/ansible-rust-infra/tree/master/roles/bastion
19:50acrichtothat's the old ansible config for the bastion
19:50acrichtonever actually deployed in prod
19:50acrichtobut that's an idea of what's running on there
19:50aidanhslol
19:50aturontomprince: yeah, this is more about making other services more RCS-like (or moving them into RCS, i believe)
19:50tomprinceI do think that setting up the bastion is starting at the wrong end of things, for a minimal improvement.
19:50acrichtobasically just this -- https://github.com/edunham/ansible-rust-infra/blob/master/roles/bastion/tasks/main.yaml
19:50aturonacrichto: ^ correct me if i'm wrong
19:51acrichtoaturon: I like the rcs model in terms of leveraging docker for installing and configuring software and whatnot
19:51acrichtoand in terms of deployment it's very "easy" if not codified
19:51acrichtoin that we can do it immediately today
19:51acrichtobut there are improvements to the rcs model we should schedule up if we can
19:52aturonacrichto: right. what i mean is, in terms of the incremental step you had in mind, it's basically moving more things into RCS? (or making them more like RCS)?
19:52acrichtoe.g. rcs should be sharded into a number of docker containers, not just one, etc
19:52tomprinceI don't think the RCS model of stick everything in one container is good.
19:52acrichtooh yeah definitely
19:52acrichtoincrementally let's start moving things to docker
19:52acrichtoand deploying those today
19:52aturontomprince: make sense?
19:52acrichtothe bastion has a hard requirement to be a separate machine
19:53acrichtoit basically can't be merged literally into rcs
19:53tomprinceYeah, although I'd pick heroku over docker for anything that can support it.
19:53acrichtobut we can move it into an rcs direction where the config is written down somewhere
19:53acrichtotomprince: sure but we're taking this one step at a time
19:53acrichtoe.g. we just want the bastion right now better than it is
19:53acrichtoe.g. anyone knows how to rebuild it
19:53acrichtoand we're actually using that rebuilding in prod
19:53acrichtodoing this incrementally will be super important
19:53aidanhstomprince: I think there are pros and cons to docker/heroku
19:54aidanhsstarting with what we have seems reasonable
19:54acrichtoaidanhs: did you have any other particular questions though? or want some more background on anything?
19:55aidanhssplitting out seems relatively clear, but you mentioned moving some things into docker
19:55aidanhsanything you have in mind?
19:55aturontomprince: yeah, i think it's worth talking through -- the goal right now is just to have a couple minimal prototypes of options along the spectrum (e.g. docker on the far minimal side, terraform etc on the maximal side)
19:55aturontomprince: we'll put together a more in-depth comparison of tradeoffs once we have it more concrete
19:55tomprinceacrichto: I agree with taking things one step at a time. I think any service other than the bastion would be a better first step than the bouncer.
19:55acrichtoaidanhs: my thinking right now is that we're, the rust project, generally familiar with configuring docker and whatnot
19:56acrichtoin terms of we're using it in a number of places
19:56acrichtothat plus I find the docker format easy to learn, debug, play with, and understand
19:56aidanhssorry, I mean more - what do you see moving inside docker?
19:56aidanhsI'm sold on using docker at this point in time :)
19:56acrichtooh ok, so for the bastion I'd be thinking basically the config for the server itself
19:56acrichtoe.g. whatever service we decide to expose on the port
19:56acrichtoright now that's just the sshd and the ssh config file probably
19:56acrichtobut again, this is where it's kind of tricky
19:57aidanhshmm
19:57acrichtoI'm not sure if it makes sense to do the bastion with docker
19:57acrichtob/c it's the gateway to everything else
19:57acrichtodocker I think is way more important for other stuff like the nginx proxy etc
19:57aidanhslet's say bastion is great how it is and we're not looking at changing it
19:57aidanhsI guess maybe the irc bot might be a good poc
19:57acrichtooh sure yeah, if we want to skip the bastion, then play or the irc bot
19:58acrichtothose'd be goo starting points
19:58acrichtogood starting points*
19:58acrichtoalthough those are also tricky lol
19:58acrichtob/c you'd need to access the docker daemon inside docker
19:58acrichtoin that in the container you're spawning new containers
19:58acrichtoI've never done that before personally at least
19:58acrichtobut I think it's possible?
19:58aidanhsI've done this kind of thing many times in a previous life
19:58acrichtooh nice!
19:59acrichtothen yeah let's start w/ that maybe? play + irc bot in docker?
19:59acrichto(or just one)
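A common way to get the "container that spawns new containers" behaviour mentioned above is to bind-mount the host's Docker socket into the container, so that the process inside talks to the host daemon and starts sibling containers. The sketch below only builds the `docker run` command line and does not execute it. The image and container names are hypothetical, and nothing in the log confirms that play or the irc bot are actually deployed this way.

```python
import shlex

# Default path of the Docker daemon's Unix socket on Linux hosts.
DOCKER_SOCKET = "/var/run/docker.sock"


def docker_run_args(image, name):
    """Build (but do not run) a `docker run` command that exposes the
    host's Docker daemon to the container via a bind-mounted socket."""
    return [
        "docker", "run",
        "--detach",                # run in the background
        "--name", name,            # hypothetical container name
        "--volume", f"{DOCKER_SOCKET}:{DOCKER_SOCKET}",
        image,                     # hypothetical image name
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected before anyone runs it.
    print(shlex.join(docker_run_args("example/irc-bot", "irc-bot")))
```

Anything with access to that socket effectively controls the host's Docker daemon, so this trades isolation for convenience.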
19:59tomprinceI think the biggest reason it doesn't make sense to tackle the bastion first is that it has different enough requirements from everything else that improving how we deploy it isn't representative enough to give us good information on the rest of our infrastructure.
19:59acrichtooh part of this is also setting up CI for these services
19:59acrichtoso each merge of a PR does a build of the docker container
19:59acrichto(currently set up as a webhook through the docker hub)
19:59tomprinceHow about one of the services that aren't run by infra right now? rfcbot or the like?
20:00aidanhsI figure easier to get buy in for an experiment on something we 'own'
20:00aidanhsacrichto: you're talking about automated builds, right?
20:00acrichtoaidanhs: yeah
20:01aidanhsok
20:01acrichtoe.g. https://hub.docker.com/r/alexcrichton/rust-central-station/
20:01acrichtoit's just under my account for now, but we may want to change that as well
20:01aidanhsyes, your docker hub username appears in a lot of my build scripts :)
20:02acrichtolol
20:02tomprinceaidanhs: Well, the question is, do we not own rfcbot for any good reason, or was there just no team to own it before?
20:03tomprinceI guess I assumed that we'd take ownership of running things like rfcbot and rusty-dash as part of this work.
20:03aidanhstomprince: oh yeah, but I figure it doesn't need to happen right now before we've confirmed how we're going to host it
20:04aidanhsonce there's some kind of 'framework' for hosting infra apps, I think we should gobble up everything we can
20:05aidanhsif only because 'hosting things reliably' kinda feels like something an infra team should do
20:05tomprinceI guess my thinking that picking one externally admined thing as the example would give us the highest payoff for work done, for an example.
20:05aidanhsyou can think of the irc bot as externally admined if you like
20:05aidanhsit runs on my personal server
20:06carols10centsaidanhs: is that server under your desk? ;3
20:07aidanhslol no, but I do aspire to evolve to become more like perf collection ;)
20:07carols10cents:)
20:08aidanhsI have hosted critical things on servers under my desk before though...if it works, it works!
20:10dikaiosunetomprince: rfcbot/rusty-dash were started before there was an infra team, correct
20:10dikaiosunei currently admin the VPS they live on
20:10dikaiosunehaving a "proper" deployment for them would be great, though
20:10dikaiosunewhere "they/them" is actually just a single service
20:11aidanhsdikaiosune: I'm warming to tomprince's idea in the last couple of minutes actually, so I may be in touch
20:11dikaiosunewhat's the idea?
20:11dikaiosunehaven't backlogged
20:12tomprinceDoes rusty-dash not need docker and ircbot need docker? If that is the case, then rusty-dash seems like a better minimal example.
20:12aidanhsdockerise the stuff on your vps and trial it on rust infra in aws
20:12dikaiosuneit doesn't need docker today
20:12shepI'd bet that irc bot should be rewritten to run on Heroku
20:12shepand just make HTTP requests to the playground
20:12aidanhsmost things don't *need* docker, but it makes them easier to throw on servers
20:13tomprincedikaiosune: I mean in the sense of starting docker containers.
20:13tomprinceircbot/play both do, to take advantage of isolation
20:14dikaiosunetomprince: right, the only reason rusty-dash isn't on a PaaS today (like heroku) is because it has both webhook ingestion and a scraper that needs to be a singleton
20:15tomprincePaaSes support having singletons.
20:15dikaiosunesure thing
20:15dikaiosunethey just function best with external coordination
20:15dikaiosuneand there's no notion (yet) of retrying or rate limiting in the rusty-dash scraper
20:15aidanhswhat's testing heroku apps locally like?
20:23carols10centsdefine testing
20:24carols10centslike... with crates.io.... i just run the app
20:24carols10centsand also tell heroku how to run the app
20:24carols10centsand i have a staging heroku app
20:27tomprinceheroku has a thing to turn an app into a docker container.
20:28tomprinceThey also have stuff to turn PRs into running apps and even run tests against those apps now.
20:31carols10centsooo fancy
20:31carols10centsi guess i'm old school
20:32tomprinceI haven't used any of that, I've just seen that it exists.
20:39carols10centsok new topic
20:39carols10centsrust-lang/rust's travis queue is long sometimes
20:40carols10centsand the rust-lang org is all in the same queue
20:40carols10centswhich means things like https://travis-ci.org/rust-lang/rust-www/builds/254546031 sometimes sit in the queue for like 3 hours
20:41carols10centsi've experienced this with crates.io and the book as well
20:42carols10centsso how do people feel about moving those repos into a different org that would be in its own OSS queue?
20:45aidanhsI've noticed that there is a 'customer' field in build json details
20:45aidanhsI wonder if we could ask travis to make that not be 'builds.customer.rust-lang-macos'
20:46aidanhs(that was for an osx build of course)
20:46aidanhsno harm in asking before moving orgs
20:56carols10centsyou think it's only the osx builds making the queue take a long time?
21:08aidanhsI've changed my mind on the 'queue' field observation after checking a few other jobs, but I still think it's worth asking first
21:09acrichtocarols10cents: fwiw we hope to get some more builders in the near-ish future
21:09acrichtowhich will hopefully relieve the pressure on the queue
21:09acrichtoI'd personally prefer to not organize based on infra limits
21:10tomprinceAlthough, it might make sense to split stuff that we expect users to run from stuff that we run?
21:22aidanhsinteresting, libc is in rust-lang-nursery on gh but rust-lang on travis
21:23aidanhsI suppose it makes sense, given we presumably do use parallel builders for libc
21:23aidanhsbut it maybe means moving the repos out of the org would not fix the problem by itself
21:32carols10centsacrichto: i'd personally prefer to not wait 6 hours for a website update :(
21:33acrichtocarols10cents: you saw we're getting more builders?
21:33acrichtothe msg that is
21:33acrichtoas in the problem is being fixed
23:18simulacrumMore builders seems like they won't actually fix the problem since rust-lang/rust will use them up again presumably
23:18simulacrumEither for PR builds or for CI builders etc
23:20aidanhsCI builders are N (~30?) per build on the auto branch, if there are more spare then it'll presumably be first come first served for the other repos
23:21aidanhsin fact, I thought we had a few builders of spare capacity right now, it surprises me that they were all used up
23:22tomprincePRs will also consume builders.
23:24acrichtosimulacrum: right now we use 42/45 builders on each CI run
23:24acrichtomy plan is to instead use 50/60
23:24acrichtoor at most 55/60
23:25aidanhsyes but we'd need the number of PRs in the space of 1.5hrs to consume them all
23:25aidanhsah but pushed commits retrigger as well
23:26acrichtoI just started my data collection for this again
23:26acrichtowhich tracks over time slots we have in use and number of builds queued
23:26acrichtohistorically we've been fine w/ a 5-10 gap
23:27acrichtohttps://docs.google.com/spreadsheets/d/1cAQou4qpzS3PRKnv-AWccDBznZzuO7AAOjvPbs3y2Gs/edit?usp=sharing
23:27acrichtothat's the data last I collected this
23:27acrichtowe clear out the pending queue super quickly for prs and whatnot
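The builder numbers quoted above (42/45 today, a planned 50/60 or at most 55/60, and a historical comfort zone of a 5-10 builder gap) can be compared with a little arithmetic. This is a minimal sketch using only the figures from the conversation. The function name and the "within gap" check are illustrative assumptions, not an actual capacity policy.

```python
def spare_builders(used_per_run, total):
    """Return how many builders stay free while one CI run is in flight."""
    if used_per_run > total:
        raise ValueError("a CI run cannot use more builders than exist")
    return total - used_per_run


# (used per CI run, total builders), as quoted in the log.
plans = {
    "current 42/45": (42, 45),
    "planned 50/60": (50, 60),
    "at most 55/60": (55, 60),
}

if __name__ == "__main__":
    for label, (used, total) in plans.items():
        gap = spare_builders(used, total)
        # "historically we've been fine w/ a 5-10 gap"
        print(f"{label}: {gap} spare, within 5-10 gap: {5 <= gap <= 10}")
```

On these figures the current setup leaves 3 builders for every other repo in the org during a CI run, while both planned configurations land inside the 5-10 range.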
18 Jul 2017
No messages