mozilla :: #rust-infra

14 Jul 2017
00:53acrichtosimulacrum: r? https://github.com/rust-lang/rust/pull/43227
00:55simulacrumacrichto: r=me codewise, though I don't know build-manifest well enough to comment on whether it'll work :)
00:55acrichtolol neither do I know if it'll work
00:55acrichtobut if it doesn't work we don't publish and we just do it all again!
18:23aturonerickt: gonna make the meeting today?
18:24ericktaturon: yep!
18:24aturonerickt: sweet! i'm hoping we can spend much of the time collectively walking through your aws work
18:24ericktAlso trying to get the_nozzle to come too, who's on my team at work
19:05acrichtozomg
19:05acrichtotravis just alerted us to a mac failure!
19:05acrichto@alexcrichton one of your Mac Pros is not responding and I've opened a ticket with our hosting provider about it, in the mean time you're running on the other and so 5 of 10 macOS _slots_ are available right now
19:05acrichtoso now I can convey it to everyone as FYI
19:05acrichtoman this is great
19:05acrichtolike, legitimately
19:13steveklabnikneat!
19:43aidanhson the subject of osx, https://aidanhs.com/osx1.png shows the cumulative time taken by different phases of the xcode8 x86_64 builders
19:44aidanhsi.e. distance between lines shows the time a phase took
19:44aidanhsy axis is seconds
19:44aidanhsx axis is 'nth build since https://github.com/rust-lang/rust/pull/42958'
19:45aidanhs(working on getting a few more builds in time for the meeting, this is just the first graph I've managed to produce)
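A minimal sketch of how to read that graph, assuming each build is recorded as cumulative seconds at the end of each phase. The phase names and numbers below are made up for illustration and are not taken from the real builders. The time a phase took is the gap between its line and the line below it.

```python
def phase_durations(cumulative):
    """Turn (phase, cumulative_seconds) pairs into per-phase durations.

    `cumulative` is ordered by phase; each value is the total elapsed
    time when that phase finished.
    """
    durations = {}
    previous = 0.0
    for phase, end in cumulative:
        durations[phase] = end - previous
        previous = end
    return durations


# Hypothetical build: names and timings are illustrative only.
build = [("checkout", 60.0), ("make all", 2460.0), ("make check", 4860.0)]
print(phase_durations(build))
```

With data in this shape, "make all" and "make check" can be compared directly instead of being read off the gaps between lines.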
19:59acrichtooh nice!
19:59ericktdid something happen to https://rust-lang.org/? I thought it used to redirect to https://www.rust-lang.org
19:59ericktbut now it's not working
19:59acrichtoerickt: afaik it's never worked
19:59ericktoh, we should make that work then :)
20:00ericktwhat's hosting www.rust-lang.org?
20:00ericktgithub pages?
20:01ericktacrichto: ^
20:02acrichtoerickt: easydns redirect
20:02erickthuuuh. http://rust-lang.org/ works
20:03acrichtoyes easydns doesn't support custom ssl certs for that
20:06ericktLet's get fixing that on the agenda
20:07ericktWe could do redirection on our side
20:13aidanhserickt: how do you mean?
20:14aidanhs(as in, what precisely are you referring to by 'our side'?)
20:14* carols10cents whispers dnsimple alias records
20:14ericktaidanhs: we really ought to have https://rust-lang.org be a real thing. otherwise a malicious user could setup an evil dns server that serves http://rust-lang.org
20:15ericktand serves infected artifacts
20:17aidanhsmy question is less about why, and more about your proposed mechanism for fixing, i.e. the 'redirection on our side'
20:17aidanhscarols10cents may have answered my question
20:18carols10centsidk how erickt solves this problem but dnsimple alias records is usually what i go with
20:19carols10centsi don't actually know anything about dns
20:23erickthey, sorry. anyway, we need to point an A record at rust-lang.org, and have a server serving said record
20:24ericktI know how to set this all up on aws
20:24ericktassuming we got dns delegation working
20:43brsoni've started preparing a cargobomb for the new beta
21:07frewsxcvNot going to make it to the meeting. Family stuff is still ongoing
21:08carols10centsfrewsxcv: <3
21:30aturono/
21:30brsoni'm here
21:30simulacrumo/
21:30shepo/
21:30carols10centso/
21:30aidanhso/
21:30aturonit's the friday infra dance party!
21:31acrichtoo/
21:31aturonerickt: tomprince: TimNN: heya
21:31tomprinceo/
21:31TimNNo/
21:32aturonok most folks are here, let's get started
21:32erickto/
21:32* erickt couldn't find the window that had irccloud
21:32aturonso, checking in on the PR graph
21:32* carols10cents #fridayproblems
21:32aturonseems like overall we continue to be pretty steady
21:33aturonnothing of particular note
21:33aturoni don't see any logged spurious failures for the past couple weeks on the spreadsheets
21:33aturonanything new to mention there?
21:34aturonaidanhs: did you have a chance to look at timing graphs?
21:34shepShould that include "ran out of disk space"? I saw one of those today.
21:34simulacrumI think a lack thereof is part of it, the other part is laziness to an extent on my part, I've retried a couple times w/o logging, though mostly for 'known' problems
21:34aturonyeah, that's fine
21:34simulacrumshep: That was accidental -- Travis switched configurations under us
21:34shep
21:34aturonthe spreadsheet tracker was always kinda iffy anyway, and it seems like we usually have some gut feel regardless
21:35simulacrumgut feel is less
21:35aidanhsyeah I posted a link to a perf graph above of a number of osx builds
21:35simulacrumfor, as usual, unknown reasons
21:35aturonaidanhs: oh cool, i missed that!
21:35aidanhs(on mobihttps://aidanhs.com/osx1.png
21:35aidanhsargh mobile mangling of paste
21:36simulacrumhttps://aidanhs.com/osx1.png
21:36aturonso at first glance, this seems too coarse-grained to lend much insight into the timeouts
21:36aturonanybody have more specific thoughts/suggestions?
21:36aidanhsits a bit rough at the moment, but we can drill down into specific times a bit more
21:36simulacrumhm, well, it's somewhat interesting to me that make check (afaict) is taking 2x as long as actually building the compiler
21:37acrichtoI'd personally find this sort of analysis great
21:37acrichtoit's already way finer grained than anything we have already :P
21:37aturonheh
21:37simulacrumI'd be interested if that means we're building stuff in that step that we aren't in make all
21:37acrichtoin the sense that if we had this graph for all platforms on all builds, I'd be super happy
21:37carols10centswhat is the x axis?
21:37aidanhsparticularly interesting is the dist builders (not shown here) which occasionally get very miserable
21:37aturonoh hm, i was assuming the graph was cumulative
21:37simulacrumcarols10cents: # run since data start
21:38aidanhssorry, the graph is cumulative
21:38aturonsimulacrum: ^
21:38simulacrum(so this is 18 builds, I think)
21:38aturonso make all and make check take comparable time
21:38simulacrumah, I see
21:38simulacrumso stacked area?
21:38aturonacrichto: so, in terms of diagnosing timeouts and granularity, can you elaborate your thoughts a bit?
21:39tomprinceIs this a timeout for the overall build? Would splitting things up using travis' new pipeline stuff get around that?
21:39tomprinceEven if it doesn't allow us to speed things up?
21:39acrichtoaturon: just a graph of overall time per builder is typically enough to diagnose things
21:39acrichtoaturon: in the sense that we're really just looking for spikes and regressions
21:39simulacrumtomprince: generally the timeouts indicate underlying problems
21:39aidanhsthere are more builds cooking a graph as we speak, and we can continue generating these and i can work on more fine grained measurements
21:39acrichtoa breakdown of the time is helpful for general optimization but for over-time things I've found it less useful
21:40acrichtoas in, I figured the intent here was to detect regressions in build time
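A rough sketch of the "look for spikes" idea, assuming we already have one total build time per run for a single builder. The threshold factor and the sample numbers are assumptions for illustration, not values anyone in the channel proposed.

```python
from statistics import median


def find_spikes(totals, factor=1.5):
    """Return indices of builds whose total time exceeds factor * median.

    `totals` is a list of total build times in seconds, in build order.
    """
    if not totals:
        return []
    threshold = factor * median(totals)
    return [i for i, t in enumerate(totals) if t > threshold]


# Hypothetical totals for one builder; the fourth build is the outlier.
totals = [5400, 5500, 5350, 9800, 5450]
print(find_spikes(totals))
```

A check against the median flags isolated spikes. A slow regression over many builds would shift the median itself, so it would need a comparison against an older window of builds instead.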
21:40aturonacrichto: ahha ok, i thought it'd be useful to get more detailed
21:40aturonwell so let me put it more sharply:
21:40acrichtooh well if we want a secondary intent of debugging slow builds then the more detailed the better for sure
21:40aturonwe continue to get timeouts on macos
21:40aturondoes a graph like this help?
21:41aturon(if we run it for long enough)
21:41acrichtosort of I think
21:41aturonaidanhs: wanna talk about what additional granularity you have in mind?
21:41acrichtowe could either rule out or confirm "builds are getting slower over time"
21:41acrichtowhich means we'd hit the ceiling
21:41acrichtobut most timeouts (esp on osx) tend to be spurious
21:41aidanhsso it is somewhat useful for timeouts - i mentioned dist builders getting slow, and i noticed other io heavy steps also getting slow on some specific builds
21:41aidanhsindicating io problems
21:42aturonoh btw, we didn't get an answer to tomprince's question
21:42aidanhsadditional granularity like breaking down make all into the stages
21:42aturonaidanhs: ok, that's sorta what i figure
21:42acrichtothat's true yeah actually, breakdown allows seeing which stage spiked during a timeout
21:43tomprinceI think stacked probably makes identifying which part spiked harder, though.
21:43aturonok, so it's sounding like we're thinking: let's keep gathering these, explore stage granularity, and see what insights we garner over time?
21:43acrichto:+1:
21:43aidanhshuh interesting idea re pipeline. it wont speed us up but if it effectively extends our timeout...
21:44simulacrumI kind of prefer to cancel builds if they go over
21:44aturonsimulacrum: to force us to fix the issue?
21:44simulacrumwell, sort of, but also to avoid builds spinning
21:44* tomprince is sad that mozilla seems to have no interest in making their CI infrastructure available to others, since it seems they've got some nice tools for doing this kind of analysis
21:44aturontomprince: hm, got any pointers?
21:44simulacrumthere's been a couple PRs where the PR makes e.g. debug logs generate which cause the build to slow down (but still be successful)
21:45aturon(the rust team tends to be pretty much off to the side doing their own thing)
21:45aturonsimulacrum: gotcha
21:45steveklabnikone small thing that might be of interest to #rust-infra
21:45carols10centssteveklabnik: we're in a meetinggggg
21:45steveklabnikgithub is apparently tracking duplicates now
21:45steveklabnikcarols10cents: oh my b
21:45aturonaidanhs: acrichto: oh one other simple question -- i assume it's easy to "see" a build timeout on this graph just by it going over a certain time --
21:45ericktlol
21:46aturonbut i recall there being some weirdness about the timing
21:46aturonlike, disagreement with apparent wall-clock time
21:46aturonso i'm wondering how easy it is to correlate this with specific builds
21:46aturon(or if you think that's not worth worrying about)
21:46acrichtoyeah the logs on travis are sometimes... odd...
21:46acrichtoI say we do this and see what happens
21:47tomprinceaturon: I've only really noticed in passing, as I've been looking at other stuff, but things like https://treeherder.mozilla.org/perf.html#/graphs?series=[mozilla-inbound,c3ecdb53e181bd111524311860839bd6218c804b,1,2]
21:47acrichtounless we'd like to conclude that we *shouldn't* do this, which seems unlikely
21:47aidanhsyes, ive not yet got round to extracting and analysing that data
21:47aturontomprince: ok cool; i can try to track down the relevant people
21:47aturonacrichto: ok sure, was just exploring improvements as we iterate
21:47aturonbut ok, thanks aidanhs much for putting this together!
21:48aturonit sounds like our spurious failure plan for this week is to keep collecting this data, and watching it as builds time out
21:48aturonno nominated issues this week
21:48simulacrumyep
21:48tomprinceaturon: Well, I think it is fairly deeply tied into the rest of their CI infrastructure.
21:48aturonso on to normal agenda
21:48aturonfirst one is "cargobomb triage procedure"
21:49aturonnot sure who was leading that one?
21:49simulacrumbrson?
21:49brsoni don't know if there's much to discuss
21:49brsonwe need more people to run and triage cargobomb
21:49aturon:)
21:49tomprinceBut that and treeherder and taskcluster all seem like somewhat generic, useful tools, but they aren't usable outside mozilla.
21:49brsoni have an action item to make it more clear and make a call for volunteers
21:50brsonthere are decent instructions here https://github.com/brson/cargobomb
21:50brsonfor running it and doing triage
21:50brsonbut not for getting access to the live systems
21:50aturonbrson: ok, so just to be clear, i think it would make good sense to bring this under the PR triage system, once folks are able to reliably run it
21:50brsoni heard for djc on internals some interest but not sure if they are going to bite
21:51aturonwe also would need to develop a way to integrate this into the procedure, which is not so easy since the runs take multiple days
21:51aturon(so there needs to be some kind of handoff, until we have fully automated job tracking)
21:51aturontomprince: you've been able to run cargobomb yourself, right?
21:51tomprinceYep.
21:51aturondoes someone else want to try to take up the mantle?
21:52aturonand we can keep polishing those directions until we feel they're good for anybody?
21:52brsontomprince: can you remind me where your spreadsheet is?
21:52brsontomprince has a spreadsheet scheduling who's using the machines
21:52tomprinceFor running it, is running ~5 commands and waiting.
21:52aturononce we're at that stage, i can work on sketching an adjustment to the PR triage process
21:52aturonbrson: oh nice!
21:52tomprinceIt should be in the infra folder.
21:52brsonok
21:52ericktaturon / brson there are some things at aws that might help with this
21:52aidanhsI'd be interested in taking a look
21:52aturonaidanhs++
21:53ericktsuch as their new Batch service
21:53tomprincehttps://docs.google.com/spreadsheets/d/1VPi_7ErvvX76fa3VqvQ3YnQmDk3bS7fYnkzvApIWkKo/edit#gid=0
21:53aturontomprince: oh that's perfect
21:53brsoni will write consolidated instructions that include access info, the spreadsheet, and operation instructions
21:53simulacrumerickt: iirc, cargobomb is "single threaded" today so presumably aws can do nothing for us right now
21:53aturonbrson: sounds great
21:53aturonand i'll start sketching the triage process part of it
21:53ericktsimulacrum: true, we'd need a rewrite
21:53tomprinceWell, 2-threaded. ;)
21:53aturonlemme record the action items real quick
21:54aturonok cool
21:54aturonthat'll be good progress on cargobomb for this week
21:55aturonit would be a great step for us to be running that reliably as part of our infra process
21:55aturonso that it's not all on brson's shoulders
21:55aturonnext up: 422 Update is not a fast forward Homu https://github.com/rust-lang/rust/pull/43104 https://github.com/rust-lang/rust/pull/43116
21:55aturon(not sure whose this is)
21:55aidanhsmine
21:55aidanhswe saw it a couple of times
21:56simulacrumServo did as well, iiirc
21:56aidanhsI assumed it only happened when someone manually merged to master and then bors tried
21:56aturonacrichto: you ever seen this before?
21:56aidanhsbut...that doesn't seem to be the case
21:56acrichtonever seen before myself
21:56aidanhsI dug into one case and master wasn't touched
21:56acrichtono idea what caused it
21:57aturonok but it looks like retrying resolves it
21:57aturonso maybe just another spurious to track?
21:57aturonand if it keeps happening we can up priority
21:57aidanhsseems so
21:57aidanhsI'll create a tracking issue
21:57simulacrumhttps://github.com/servo/homu/issues/24
21:57aturonhuh fascinating
21:58simulacrumso old problem
21:58aidanhsyeah I saw that, sadly a bit light on details
21:58aturonok, next up, the big one:
21:58aturonerickt: tell us all about aws!
21:58ericktheyo!
21:58ericktso first off, please meet the_nozzle, who is on my team at work
21:58ericktand does rusty things
21:59ericktso I'm trying to get him to help out
21:59aturono/
21:59ericktanywho
21:59carols10centshi the_nozzle !!!!
21:59tomprince\o
21:59shepohai
21:59ericktso where we're at is we got a bastion running on rust-dev that only I can log into :)
21:59aturonsuccess!
21:59simulacrumSo we can hand off everything and run? :)
22:00ericktand a PR that fixes things that hasn't landed yet: https://github.com/rust-lang/infra/pull/4
22:00ericktacrichto rightly points out that it's hard to actually know what's going on with ^
22:00ericktif you don't have any terraform experience
22:00ericktso anyone on the team play with it yet?
22:00shepI'm still unclear on how to play with it
22:01ericktthe_nozzle: is not a huge fan of terraform
22:01the_nozzle.. be ready for some trauma .
22:01ericktlol
22:01ericktshep the terraform guide is decent: https://www.terraform.io/intro/index.html
22:02aidanhsI think step 1 for me on that PR is having some documentation - I'd prefer for the code not to be the documentation
22:02ericktthe whole idea is that it's a declarative expression of the infrastructure
22:02aturonso just to check, does anyone other than erickt/the_nozzle have experience with terraform?
22:02aidanhsI can imagine it's pretty straightforward for people who know terraform but it'd be nice to cast the net a bit wider
22:02ericktshep: I wrote up docs in https://github.com/rust-lang/infra/tree/master/terraform
22:02ericktit's not super thorough but it's a start
22:03aturonerickt: so i feel like there's a pretty big gap here
22:03erickt:)
22:03aturonin that to most of the rest of us, this is a pretty intimidating wall of configuration goo
22:03ericktI can do a little walk through the code now if ya all want
22:03tomprinceI don't know terraform, but it seems somewhat readable, knowing a bit about AWS.
22:03carols10centswell like erickt could you define "play with" some more
22:04carols10centslike what can i *do* with this
22:04carols10centswhat accounts do i need
22:04carols10centswould i be setting up my own terraform cloud or playing with rust-dev's?
22:04aturontomprince: i agree, BUT being able to *write* it is another matter :)
22:04carols10centswhat do i need to install
22:04ericktcarols10cents: terraform embodies the whole infrastructure-as-code
22:04carols10centswhat do i run
22:04carols10centshow can i tell that it worked
22:04ericktwell first you install terraform
22:05ericktif on a mac, do "brew install terraform"
22:05ericktthen you need to get your aws configuration setup
22:05ericktwhich you can do by installing the aws-cli
22:05ericktand creating an access key from the amazon console
22:05carols10centshow much is this gonna cost me
22:06ericktit can cost you as much as you want :)
22:06aturonlol
22:06ericktamazon will happily take all of your money
22:06carols10centserickt: is $0 possible?
22:06ericktyeah
22:06carols10centserickt: how can i tell how much before i do it
22:07ericktso I'm not quite sure if it'll work without having the amazon profile setup
22:07aidanhserickt: just to give some background to what I'm thinking, if I put a shell script inside a docker container and run it, the resulting is fully defined by the shell script. unfortunately that doesn't mean the shell script is readable!
22:07ericktbut it does have a mode called "terraform plan" that will tell you what it wants to do without doing it
22:07aidanhsit'd be great to have a 1000-mile above view (maybe a diagram) so I can read parts of the config and go "oh that relates to this box on the diagram"
22:08aturonerickt: that's the sample output on the readme, right? i'm personally not sure how to read it, but i am an aws novice
22:08carols10centserickt: the sample terraform plan output just looks like it's printing out the config in a different format?
22:08carols10centslike what am i looking for
22:09ericktoh hey there's a graph output to terraform
22:09ericktaidanhs: I'll make you a view
22:10ericktaturon: and yeah, that's the output from terraform for what it wants to do
22:10aidanhserickt: that would be fantastic, thankyou
22:10aturonacrichto: just a gut check, how much of that sample output is legible to you?
22:11acrichtospecifically just the terraform folder readme?
22:11aturonyeah
22:11acrichtobasically unreadable :(
22:11ericktaturon: so this particular output is just describing an autoscaling group for the bastion
22:11ericktlol
22:11aturonerickt: are the various parameters listed here standard aws things, or terraform things?
22:11tomprinceYeah, autoscaling groups seem overkill for what we need at the moment.
22:12ericktit corresponds to the configuration options you are implicitly creating when you make some resource in aws
22:12aturonso like each of these is a knob in aws?
22:12ericktaws is full of knobs
22:12aturonbasically i'm asking: is the complexity here from aws, or terraform?
22:12brsonit's all pretty recognizable
22:12brsonbut slightly mutated
22:13erickttomprince: we're big fans of autoscaling groups at my work, even for a group of size 1
22:13ericktsince it'll automatically recover if there's a crash
22:13aturonacrichto: i don't know how to square your and brson's reactions :)
22:13brsonwe've never used autoscaling groups before so there's some options here we haven't seen
22:13aturongotcha
22:13brsonwe should use autoscaling groups sometimes
22:13acrichtooh well it could be that I've just never used autoscaling
22:13brsonwe've not used vpc's in a long time either
22:14acrichtobut I personally understand like 2/20 keys
22:14aturonso stepping back for a sec here, we have basically two goals:
22:14brsonbut some of this stuff is recognizable from configurations we've done before
22:14aidanhs(I've just googled vpcs to refresh my memory :( )
22:14aturon1) get our infra to a more principled place (best practices etc)
22:14aturon2) make our infra more maintainable, partly by (1) and partly by making it more accessible
22:14brsontags are typical stuff and it's easy to figure out what these mean
22:14brsontermination policies we use
22:15aturonpresumably if we use AWS at all for these services, there's a baseline AWS knowledge that's expected (and that we can document)
22:15ericktalrighty, if you go to https://goo.gl/AhFakN, you should see "infrastructure.png"
22:15ericktaturon: yep
22:16carols10centserickt: re: "**WARNING**: If you do an errant apply, you could destroy everything."
22:16ericktit's now my responsibility to either teach all y'all this stuff, or recruit folks like the_nozzle to help maintain all these bits
22:16carols10centshow can i tell i'm about to destroy everything?
22:16aturonre: this setup, i guess the question is how much of the complexity is inherent in aws+best practices, vs added by taskcluster
22:16brsonthat graph is kinda terrifying
22:16tomprinces/taskcluster/terraform/ ;)
22:16aturonlol thanks
22:17ericktcarols10cents: yep. so just don't run "terraform destroy", or keep an eye out for output that prefixes a "-"
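A crude sketch of that "watch for a leading -" check, assuming the plan output has been saved as text and that removals are marked with a leading "-" as erickt describes. The sample output and resource names below are hypothetical, not real output from the rust-dev account.

```python
def destructive_lines(plan_output):
    """Return plan lines that start with '-' once indentation is stripped.

    This is a plain text scan, not a parser: anything starting with '-'
    is flagged, including '-/+' (destroy and recreate) entries.
    """
    flagged = []
    for line in plan_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("-"):
            flagged.append(stripped)
    return flagged


# Hypothetical plan output; resource names are made up for illustration.
sample = """\
+ aws_instance.example
- aws_s3_bucket.old_bucket
-/+ aws_security_group.bastion
"""

for line in destructive_lines(sample):
    print("would remove or replace:", line)
```

An empty result is not proof that an apply is safe; it only means nothing in the text matched the "-" prefix.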
22:17aturonerickt: i feel like if we have to rely on people even more specialized than those in this channel, we've failed :-/
22:17ericktthe main reason why I wanted rust_dev is so that it's okay if we mess up it doesn't delete crates.io
22:18carols10centsuh crates.io is on heroku
22:18ericktbrson: it's like looking at an AST in dot form
22:18tomprinceerickt: I certainly understand why one might want autoscaling groups for everything, even if the size=1, but I feel that starting there might make the adoption more difficult.
22:18ericktcarols10cents: the crates are on s3 in rust-prod though
22:18erickttomprince: true!
22:18carols10centsah ok, the crates.io *data*
22:19aturonerickt: so i'm sorta getting the feeling that either we need (1) a lot more docs about what this small setup even does or (2) to simplify it much more if possible
22:19tomprinceAlso, I suspect that a hand drawn sketch would be more understandable than the auto-generated graph.
22:19aidanhshttps://aidanhs.com/Figure2.1_overview.png
22:19aidanhsthat's the kind of thing I'm thinking about, minus the labels probably
22:19erickttomprince: as far as terraform is concerned, defining an instance is just one block, whereas autoscaling groups is 2
22:20aturonerickt: is there something closer to a "hello world" we could be starting with?
22:20erickttomprince: yeah the graph is terrifying for people new to this :)
22:20aturon(also i'm feeling nervous about the_nozzle's "trauma" comment ;-)
22:20ericktaturon: he's just doing way more complicated things, and just hit some terraform bugs
22:20tomprinceerickt: Not just terrifying, but I think it obscures more than it clarifies.
22:21aturonok, so we're getting low on time
22:21aturonwhat i'd like to do with the remaining ten minutes
22:21brsonfwiw, i feel fine about this, but i also don't want to train up on it, and don't want to be the one maintaining it
22:21aturonis figure out a good action item for this week related to the AWS setup, to get us one step closer
22:21ericktheh
22:22ericktso I can put together a little hello world example for y'all
22:22aturonaidanhs: acrichto: brson: carols10cents: what do you think would be the single most helpful thing here, to start coming to grips with the tool?
22:22aidanhsI think we should define the role of this repo more explicitly
22:22brsonit seems vital that there is buy-in from multiple people here
22:22aidanhscurrently it "describes the rust-dev aws infra with code"
22:22aturonbrson: agreed
22:22ericktwe could go simpler without the vpc and the autoscaling group for educational, but there are some good reasons to do that
22:23acrichtoI'm not 100% sure, I think a simpler example will help for sure, but the real meat here won't be one-off configuration like setting up an instance but rather changes over time that happen in small deltas
22:23aturonaidanhs: hm so we set out some explicit goals last time -- bastion + play + one other thing i forget
22:23aidanhsmaybe it should extend to include "a person starting without knowledge of terraform or aws, but knowledge of networking can pick it up"
22:23acrichtoso in some sense it's not critical that a bunch of us understand 100% of what's written down already
22:23tomprinceI think a sketch of the components involved would be helpful.
22:23brsonwe could have a training session where everybody rebuilds the environment
22:23carols10cents to acrichto, erickt can you prepare like... a diff that would add the next thing to this?
22:23ericktyep
22:24ericktcarols10cents: what do you mean by add the next thing?
22:24aturonaidanhs: "a person starting without knowledge of terraform or aws, but knowledge of networking can pick it up" strong +1, though it's a tall order
22:24carols10centsso this is just the bastion, right?
22:24ericktand the core networking for the account
22:24aturonaidanhs: i'm sort of wondering how we can iterate toward that over time
22:24carols10centssay you were going to add the playground
22:24ericktyep
22:24carols10centswhat would that look like?
22:24carols10centswhat would the commit in the repo be?
22:24aidanhswell this is why I think we need to think about the role of the repo right now - maybe "rust-dev works" is fine and we do one-on-one training
22:24carols10centswhat commands would you run to deploy it?
22:25tomprinceerickt: One thing that might make things clearer is to not show all the direct interconnections between the subcomponents.
22:25carols10centswhat would the output of "terraform plan" look like
22:25aturonerickt: also, is there any chance you could do the bastion in the simplest possible way, which maybe doesn't match where we eventually want to go but is easier for people to start with?
22:25ericktcarols10cents: yeah sure, I'll make a tutorial for that
22:25shepOr, an actual change to the playground in the last few days: I had to add an environment variable to the init.d service
22:25acrichtoso one other piece that I'm slightly worried about is it seems like this is very geared towards spinning up instances, but what about configuring what's actually running on the instance itself? how do we manage and/or change that over time?
22:25aidanhsbut longer term we do need to consider what happens when membership of the infra team rotates and we need to train others
22:25aturonerickt: like, e.g. avoiding autoscaling
22:25tomprinceLike, there are a bunch of things that make up the VPC, but all the interconnections between them and other parts seem like noise.
22:25ericktI'd prefer not to do that for the bastion since that's a security feature
22:26aidanhsif the repo is 'self documenting' to begin with then we can skip the whole training thing
22:26aturonaidanhs: yeah i agree, just see that as the goal more than the starting point
22:26ericktbut we can do this on the other side of the bastion
22:26carols10centserickt: not to do what?
22:26tomprinceSecurity that we aren't using isn't actually any security at all.
22:26aturonaidanhs: hm, fair point
22:26aidanhsor even
22:26aidanhsif the terraform-newbies get training, then we write the docs and everyone maintains it from then on
22:26tomprinceWe can always switch to doing a autoscaling group later.
22:27aturonaidanhs: yeah that's sorta what i figured would be needed, but maybe we can do better
22:27aturonok so, action items:
22:27aturonwhich i think are all for erickt ;-)
22:27ericktlol
22:27tomprinceAnd, the bastion is stateless, so we can get the same effect by asking terraform to recreate it, can't we?
22:27ericktI'll just reassign them to the_nozzle
22:27aturon- see if you can make a very very stripped down version of the bastion, which maybe isn't where we want it in the end, but is easier to grok to start
22:27aturon- similarly, show the smallest delta you can imagine to add play
22:27erickttomprince: yeah we can tear down and recreate the bastion
22:27aturon- write more docs :P
22:28aturon(not sure if we can get more specific on that last)
22:28shepI reiterate my comment from last week: I'm willing to help setup play
22:28aturonerickt: are you willing to take those things on?
22:29aturonothers: are there specific docs you'd want? carols10cents already mentioned some of the absolute basics needed for playing around
22:29aidanhs'more docs' might be pretty minimal depending on how simple the examples are!
22:29aidanhs(just some motivation along the lines of simplicity ;) )
22:29ericktyeah I'll work on those things
22:30aturonaidanhs: hah
22:30aturonerickt: yay, thanks! and feel free to ping folks here along the way for early feedback
22:30aturonand with that, we're out of time
22:30erickty'all also should look at https://www.terraform.io/intro/getting-started/install.html
22:30aturonthanks y'all!
22:30shep\o
22:30ericktwhich I found pretty approachable
22:31aidanhs\o
22:31ericktbut then I know about these things :)
22:31aidanhserickt: how do you feel when you look at the aws dashboard out of interest? I tend to feel mild panic :)
22:31carols10centsi'd like to raise a to everyone who made the play deployment this week go seamlessly!
22:31ericktaidanhs: haha
22:32ericktaidanhs: at this point, it's like trying to imagine what'd it'd be like to see rust for the first time
22:32tomprincehttps://github.com/brson/cargobomb/compare/master...tomprince:fs-model and https://github.com/tomprince/cargobomb/compare/fs-model...tomprince:db could use some help
22:33* tomprince started a new job this week, so has less time
22:33brsoni'll review them sometime
22:33ericktbtw here, I need your gpg and ssh keys, preferably from places like https://keybase.io/ and https://github.com/erickt.keys
22:34aidanhserickt: seeing rust isn't so bad, it's when you need to use some syntax that you don't even know exists to solve a lifetime issue that rust becomes tricky ;)
22:34erickthehe
22:35ericktanyone mind if I just merge in my PR?
22:35ericktand then I can work on some simpler examples
22:35tomprinceIf it is what is actually deployed, then it should probably be merged.
22:35shepI do not mind.
22:35brsonerickt: both my keys are linked from brson.github.io
22:36aidanhserickt: sorry not clear, who do you need keys from?
22:36ericktfrom everyone on the team that wants access to the rust-dev account
22:36brsondoes keybase have a way to specify ssh keys?
22:36aidanhshttps://github.com/aidanhs.keys
22:36ericktbrson: I dunno, but I know you are github.com/brson
22:38simulacrumhttps://github.com/Mark-Simulacrum.keys
22:39brsonoh that's neat that github does that for you
22:40ericktyeah
22:41carols10centshttps://github.com/carols10cents.keys
22:42sheperickt: https://keybase.io/jakegoulding https://github.com/shepmaster.keys
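A small sketch of collecting the SSH keys posted above, relying on the https://github.com/USERNAME.keys URLs that people pasted. The helper names are made up for illustration, and the parser assumes one key per line in the usual "type base64 [comment]" layout. `fetch_keys` needs network access and is only defined here, not called.

```python
from urllib.request import urlopen


def keys_url(username):
    """Build the public-keys URL for a GitHub username."""
    return f"https://github.com/{username}.keys"


def parse_keys(text):
    """Split key text into (key_type, key_data) pairs, skipping blank lines."""
    keys = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            keys.append((parts[0], parts[1]))
    return keys


def fetch_keys(username):
    """Download and parse the keys for one user (requires network access)."""
    with urlopen(keys_url(username)) as response:
        return parse_keys(response.read().decode("utf-8"))


# Usernames taken from the URLs posted in the channel.
for user in ["aidanhs", "Mark-Simulacrum", "carols10cents", "shepmaster"]:
    print(keys_url(user))
```

Anyone can read these URLs, so this only saves copy and paste; confirming that a key really belongs to the person is a separate step.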