mozilla :: #build

13 Jul 2017
01:31glandiumand without cache: https://drive.google.com/open?id=0B_wfRHd_Wd-1aVN2aElpSm1NSm8
02:48swuDoes the mach artifact mode support GDB debugging with a symbol table? I tried to enable the debug version based on what is mentioned in https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Artifact_builds but cannot see the symbol table.
02:56glandiumswu: no.
02:56glandiumbut the symbols could actually be downloaded if necessary
02:56glandiumfeel free to file a bug
02:58swuglandium: got it, thank you.
02:59glandiumrillian: looks like objective-c works
06:51bobowenglandium: ping
07:02glandiumbobowen: pong
07:03bobowenglandium: hi, do you think we might be able to move to requiring Win10 SDK for windows builds, now that ted (I think) seems to have sorted out the issues with picking it up?
07:04bobowenI think that was for VS2017 support
07:04glandiumI don't remember what was blocking it
07:04bobowenglandium: it didn't get picked up by mozilla-build / configure
07:05bobowenglandium: but it looks like it does now
07:05bobowenI had to #if out some of the chromium sandbox code we weren't using, but we'd like to use it now
07:06bobowenand in needs win10 SDK v10.0.10586.0 and above
07:06bobowen*it
07:08glandiumiirc I wasn't getting the win10 sdk when installing VS2015 by default
07:08bobowenglandium: no you have to select it on install
07:09glandiumand I don't remember if mach bootstrap ensures you have it
07:09bobowenbut we could update the instructions for that
07:09bobowenI don't think bootstrap installs it
07:10glandiumI don't even remember what bootstrap does about msvc
07:10glandiummaybe it does nothing
07:10bobowenglandium: I think you have to have it installed (I never tried without)
07:10bobowenglandium: certainly our instructions are to install it first
07:11bobowenand mozilla-build of course
07:11bobowenif we want people to find VS2015 then the link will need updating as it points to VS2017
07:12bobowenMS make it very hard to find VS2015 community edition
07:12bobowenif not impossible on their site
07:12glandiumI'm not sure we've fixed all the problems with 2017, though
07:12glandiumanyways, the short answer is "I don't know"
07:13bobowenfair enough, maybe I should bug ted later?
07:13glandiumfile a bug?
07:13bobowenor that :-)
07:14bobowenglandium: thanks
14:45bsmedbergcatlee, ping about bug 1380381: mac/TC builds don't have XUL symbols again
14:46firebothttps://bugzil.la/1380381 NEW, nobody@mozilla.org build symbols missing on macOS/OS X, unhelpful crash signatures like [@ XUL + 0xddb7c]
15:00bsmedbergor coop ^^
15:05rillianglandium: that's awesome.
15:05rillianCan you reproduce the jobserver issue locally?
15:16bsmedbergaobreja|buildduty, perhaps I can ping you about bug 1380381?
15:16firebothttps://bugzil.la/1380381 NEW, nobody@mozilla.org build symbols missing on macOS/OS X, unhelpful crash signatures like [@ XUL + 0xddb7c]
15:32coopbsmedberg: mshal is going to help
15:32bsmedbergty
15:40gpshappy fire drill thursday
15:51gpsmshal: unless you call me off, i'll start looking for the regression build when i get into the office in ~30 minutes
16:58gpsmshal: need any help?
16:59mshalgps: sure, a second pair of eyes would be great. I'm currently trying to see if the dsymutil error actually occurred between the two revisions d.major pointed out, or if it is actually just a random failure
16:59gpsok. i was going to help with that :)
17:00gpsi've been wanting to write a tool that iterates over taskcluster jobs, downloads logs, and allows you to grep, etc
17:00gpsbasically make it easy to find occurrences of things in logs to make it easier to spot regressions like this
17:00mshalthat's what I'm trying now! :)
17:00mshala generalized version would be great
17:11mshalI did find a recent revision that appears not to have the dsymutil failure: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=6c8692cc9c497492ee36bfcc42362db16a434f69&selectedJob=114022875
17:12mshalso far it seems to be random, at like a 4% success rate
17:36gpsmshal: pull down https://hg.mozilla.org/users/gszorc_mozilla.com/firefox/rev/ffc3a7f17bb6
17:37gpsthen run: mach taskcluster-logs walk-logs public/logs/live.log
17:37mshalthat was fast o_O
17:37gps| grep dsymutil for goodness
17:37gpsonly fast because super hacky
17:37gpsit is hardcoded for build logs for macosx64-opt :)
17:37mshalhaha
17:37gpsand it is only grabbing the last 1000 mercurial revisions
17:38gpsyou can hack as appropriate. i'm sure you can hack it to work for your needs
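A minimal sketch of the "grep across build logs" idea gps describes above, assuming the logs have already been downloaded. The function name, the sample revisions, and the sample log lines are all hypothetical; this is not the code in the linked revision, and it does not talk to Taskcluster.

```python
import re


def classify_logs(logs, pattern):
    """Label each (revision, log_text) pair "bad" if any log line matches pattern."""
    regex = re.compile(pattern)
    results = []
    for revision, text in logs:
        hit = any(regex.search(line) for line in text.splitlines())
        results.append((revision, "bad" if hit else "good"))
    return results


# Hypothetical log snippets; a real tool would fetch one log per build task.
sample_logs = [
    ("rev-a", "packaging...\ndsymutil exited with -11\n"),
    ("rev-b", "packaging...\ndone\n"),
]

for revision, verdict in classify_logs(sample_logs, r"dsymutil exited"):
    print(revision, verdict)
```

Printing one "revision verdict" pair per line keeps the output easy to scan for the point where good builds turn bad.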
17:40gpsin fact: https://hg.mozilla.org/users/gszorc_mozilla.com/firefox/rev/d6fd057cac59
17:41mshalthis is much better than my even hackier hack :)
17:42gpsit also didn't hurt that i had this tc index/artifact stuff paged in from the release scraper i hacked together :)
17:43gpscb6056e7c4907996a31ec6b4a7d0bdccf7611ff3 bad
17:43gpsc50a863d1705c70808b42f95e4c75129195fe5e1 good
17:43gps7404c5c961cc7cce97228fa32b40d78a77c0f2dd bad
17:44gpscommits in that order
17:44gpswhich is weird
17:44mshalyeah, I found a bunch of random passes
17:44mshalbut mostly fails
17:45mshalhow hard would it be to bump the memory on the osx-cross builds temporarily?
17:45gpsthose ec2 instances should be beefy
17:45gpsc4.4xlarge no?
17:45mshallooks like it, yeah
17:46gpsi would be shocked if we're hitting memory limits
17:46mshaloh, 30GB
17:49mshalI'm currently trying to run dsymutil manually in the one-click loaner and see what happens
17:50dmajormshal: https://stackoverflow.com/questions/9828618/generatedsymfile-dsymutil-fails-with-exit-code-11 suggests possibly a full disk. Is that easy to check?
17:51gpsthat's plausible given intermittent failure rate
17:52mshalI'm not sure if there's a way to check after the fact on a build from treeherder. On my loaner, df says the disk is 14% full and dsymutil still segfaulted
17:52dmajorok
17:52gpsstrace that sucker
17:52gpsor gdb
17:53gpssounds like you are close if you can repro
17:53mshalbash: gdb: command not found :(
17:54gpsnot sure you'll be able to strace from docker though
17:54mshalthough this is returning 139 instead of -11
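The two numbers mshal mentions are most likely the same crash reported two ways: a POSIX shell reports a process killed by signal N as exit status 128 + N, while Python's subprocess module reports it as -N. A small sketch (the helper name is hypothetical) that normalizes both forms:

```python
import signal


def signal_name(returncode):
    """Map a shell-style (128 + N) or Python-style (-N) status to a signal name.

    Returns None when the status does not look like death by signal.
    A status above 128 can also be an ordinary exit code, so this is a heuristic.
    """
    if returncode < 0:
        signum = -returncode
    elif returncode > 128:
        signum = returncode - 128
    else:
        return None
    try:
        return signal.Signals(signum).name
    except ValueError:
        return None


print(signal_name(139), signal_name(-11))
```

Both 139 and -11 correspond to signal 11, which is SIGSEGV, matching the segfault seen on the loaner.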
17:55gpswe need to create a shared index route for {central, inbound, autoland}
17:55gpsprobing 3 repos for the same hg revision is silly
17:56gpsthis gets back to the issue of "jobs vary across repos"... which is a complicated problem
17:56gpsbut not so much on the "trunk" repos
17:57mshalwell it looks like it only got up to 4% memory usage, still 9G+ free when it failed
18:03gpsi think i found the inflection point where this went from constant to intermittent
18:03mshaloooh, where?
18:03mshal(constant good, I assume?)
18:03gpsyeah
18:04gpshttps://gps.pastebin.mozilla.org/9027019
18:05gpsthese are only revisions indexed against central
18:05mshalahh
18:06gpsnot having a linear repo is so annoying
18:06mshalhuh, that first bad rev is just a backout?
18:07gpsyeah
18:07gpsit looks even weirder on inbound
18:08gpshttps://gps.pastebin.mozilla.org/9027021
18:08gpsthere are some suspicious C++ changes around the inflection point though
18:09mshalwat
18:09gpsand autoland https://gps.pastebin.mozilla.org/9027022
18:11mshalweird, I think that's the same pattern as my brain state the more and more I look at OSX... the occasional bad day, followed by a descent into madness
18:30wcostamshal: https://treeherder.mozilla.org/#/jobs?repo=try&revision=43362462339c99c5ba8e00aef5ee051ad9d77051&selectedJob=113957343
18:32mshalwcosta: looks promising! Can you check with valgrind folks to see if obj:* is the right thing to do? I'm not sure if that would mean it's ignoring too much
18:32mshalwcosta: from https://bugzilla.mozilla.org/show_bug.cgi?id=1338651#c166 it sounded like some debuginfo wasn't being loaded correctly
18:32firebotBug 1338651 REOPENED, wcosta@mozilla.com taskcluster cross-compiled OS X builds show Talos performance regressions vs. buildbot builds
18:37wcostamshal: just a poc, to confirm that is always the same case
18:40wcostamshal: I did strace in open calls, but nothing spotted my eyes, will update the bug
18:40mshalty!
18:40mshalI wonder if there's some symbols package that was installed in the original image, but not in the new image or something
19:09gpsbut docker images are pinned in the firefox repo
19:14mshaloh, the above comment was about the valgrind failures from updating the docker image for the /home/worker patch
19:15mshalI tried using dsymutil from clang-4.0, but that dies too :(
19:16gpsi wonder if this is a sccache bug
19:16gpswe're encountering some kind of cache poisoning
19:16mshalwhat makes you suspect that?
19:16gpstrying to think of what would cause this to be intermittent
19:16dmajorgps: what if the cross builds push succeeded by accident? could this be tested by retriggers?
19:16gpswhat things aren't deterministic
19:16mshalcertainly plausible
19:17gpsdocker workers aren't
19:17gpscached on workers aren't
19:17gpscaches
19:17gpssccache isn't
19:17gpsthe fact that it is still periodically good after a few weeks is really troubling
19:17mshalyeah
19:18gpsthere's state *somewhere* causing this
19:19mshalwant to kick off a few builds with sccache disabled or shall I?
19:19gpsyou can do it. i can't remember where it is defined
19:20mshalk
19:22gpsi think i found it :)
19:22gpshttps://treeherder.mozilla.org/#/jobs?repo=try&revision=46fe88cf1d076f12f7c9f2beeb6b03a9fb4f2867
19:28* mshal retriggers a few
19:29gpsthanks
19:29gpsi was about to do that myself :)
19:29mshalhah
19:52gpsthis is like watching paint dry
19:52* gps departs for quick lunch
19:53mshalsrsly
19:53gpsrustc is so slow
19:53mshaldebugging on a one-click loaner is frustrating, since the instance keeps dying :/
19:53mshalI'm trying to reproduce it locally
19:56gpsdammit
19:56gpsjob w/ sccache disabled still failed
19:57gpsstill running, but https://treeherder.mozilla.org/#/jobs?repo=try&revision=46fe88cf1d076f12f7c9f2beeb6b03a9fb4f2867&selectedJob=114087909 has a failure
19:58* gps really goes to lunch
20:03mshalboo.
20:06mshalwell, at least it's one more thing off the list
20:14gpsmshal: i assume you read bug 1301751?
20:14firebothttps://bugzil.la/1301751 FIXED, ted@mielczarek.org llvm-dsymutil crashing on XUL
20:16mshalyeah, did we undo that now?
20:16gpsi dunno
20:16mshallooks like yes
20:17mshalbug 1305731
20:17firebothttps://bugzil.la/1305731 FIXED, ted@mielczarek.org Revert bug 1301751 once we update to Rust 1.12 on OS X
20:24gpswtf
20:24gpshttps://treeherder.mozilla.org/#/jobs?repo=autoland&revision=c50a863d1705c70808b42f95e4c75129195fe5e1 initially had a good build
20:24gpsretriggered 3 times and all 3 exhibited dsymutil failure!
20:25mshalouch
20:27mshalhooray, it fails locally at least
20:28mshalrunning with the rust debug patch again: https://treeherder.mozilla.org/#/jobs?repo=try&revision=869f55cc5478afebe64d79813377088261ad2dd1
20:31mshalerr, guess not
20:33gpsi suppose i should go to my 1:1 w/ coop :)
20:39coopbacking out the cross-compile builds, but we should ping releng to make sure the mac builders are still up to the task if we switch back
20:40cooper, backing out is an option for
20:41catleethat puts stylo at risk
20:42catleewe would need to update OSX on them first
21:18mshaladding that rust debug hack seems to work again: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4881fd24bfc15e83a72afebbc71b03ba6b7a8bd3
21:18mshalgps: ^
21:21wcostamshal: one-click-loaner dies because of expiration date, on top of task definition, if you increase the value, it must work
21:22mshalwcosta: oh, thanks! I thought it was just the spot dying
21:23wcostamshal: no, we are going to increase it by default, there is no hint as to why it dies
21:23mshaloic
21:33gpsmshal: r+ if you have confidence this fixes it
21:33gpsconsidering it worked a year ago, i'm feeling pretty good about it
21:34gpssucks that we lose some rust debugging info
21:34gpsperfect is the enemy of good, especially since trees are closed
21:35mshalhow bad is it going to be now that we have much more stuff in rust?
21:35gpsi'm not sure
21:35gpsit likely becomes a P1 for us to remove the workaround
21:35glandiummshal: f-, you should be able to replace debug = true with debug = 1
21:36gpsoh, glandium is here. get review from him, since he knows the rust stuff better than me
21:36mshalglandium: huh, why would 1 be preferred to true here?
21:36glandiummshal: true means 2
21:37glandiummshal: that is, debug = true is -C debuginfo=2
21:37rillianglandium: I tried to approve your sccache changes, but it seems I don't have the merge bit.
21:37glandiumyou can also pass debug=2 for the same effect
21:37gpsi'm an owner on the github org
21:37gpswhat permissions do you need ;)
21:38rilliangps: write on mozilla/sccache
21:38glandiummshal: you're removing debug=true and passing an explicit -C debuginfo=1... debug=1 should have the same meaning
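The mapping glandium describes between the Cargo profile `debug` setting and the rustc flag can be sketched as follows. The function names are hypothetical, and the `False` case (level 0) is not stated in the conversation; it is Cargo's documented behaviour.

```python
def debuginfo_level(debug):
    """Translate a Cargo profile `debug` value into a rustc debuginfo level."""
    # Check booleans first: in Python, True == 1, so the order matters.
    if debug is True:
        return 2
    if debug is False:
        return 0
    if debug in (0, 1, 2):
        return debug
    raise ValueError(f"unsupported debug value: {debug!r}")


def rustc_flag(debug):
    """Render the equivalent `-C debuginfo=N` command-line flag."""
    return f"-C debuginfo={debuginfo_level(debug)}"


for value in (True, 2, 1, False):
    print(repr(value), "->", rustc_flag(value))
```

Under this mapping, `debug = true` and `debug = 2` are equivalent, and `debug = 1` matches an explicit `-C debuginfo=1`.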
21:38gpsrillian: what's your github username?
21:38rillianrillian
21:38gpsof course it is. you now have write access
21:39gpsted is the only project admin. so i think it is him or an org owner who can grant write
21:39mshalthis is just the patch from bug 1301751, but I'll give that a shot
21:39firebothttps://bugzil.la/1301751 FIXED, ted@mielczarek.org llvm-dsymutil crashing on XUL
21:39rilliangps: thanks. hopefully he won't mind.
21:40glandiummshal: although, come to think of it... why not keep debuginfo=2 for non-cross builds
21:41mshalcan we do that conditionally in the toml file?
21:42glandiummshal: would need to declare a new profile... so command line is more flexible for that
21:45mshalmeaning we still do https://hg.mozilla.org/try/rev/bc247d14c9c022ff332cbd8d741ee2993397bf38 but just conditionally set the value of debuginfo from rules.mk?
21:46glandiummshal: yes
21:46* mshal tries
21:47glandiummshal: note, you might as well remove the debug= lines from the toml files
21:47glandiumso that we don't assume anything from looking at them and seeing debug=false
21:47mshalok
22:19gpsglandium: regarding https://reviewboard.mozilla.org/r/156922/diff/1#index_header, is cargo aware of MAKEFLAGS so that it knows to no-op during `make -n`?
22:20glandiumgps: mmmmm does make -n execute + commands?
22:21gpsyup
22:21gpshttps://www.gnu.org/software/make/manual/make.html#How-the-MAKE-Variable-Works
22:21gpsand https://www.gnu.org/software/make/manual/make.html#Instead-of-Execution
22:21glandiumhuh... acrichto ^
22:22gpswait
22:22gpslet me parse this
22:22glandiumgps: I tested, it does
22:22acrichtoglandium: gps: ah no cargo doesn't know to do anything on `make -n`
22:22acrichtoalthough maybe make doesn't execute cargo?
22:22glandiumacrichto: only way for make to pass down the fds is to add the + prefix, which does make the command executed unconditionally
22:23acrichtoah that makes sense
22:23gpsin the rare case that we're using `make -n`, i think it would be OK to not pass down fds
22:23gpsso you can make RUN_CARGO conditional on -n in MAKEFLAGS
22:24glandiumgps: true, but generally speaking, cargo and rust should be aware of it
22:24glandiumwe can surely work around it in the meanwhile
22:24gpsyup
22:24gpsi'll leave comment on review
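A rough sketch of the "conditional on -n in MAKEFLAGS" check gps suggests, written in Python for illustration only (in the build system this would be make syntax). It assumes the GNU make convention that single-letter options without arguments are bundled into the first word of MAKEFLAGS with no leading dash; the function name is hypothetical and the check is a heuristic.

```python
def is_dry_run(makeflags):
    """Guess whether MAKEFLAGS indicates `make -n` (heuristic, GNU make only)."""
    words = makeflags.split()
    # A leading space or an empty value means there is no short-option bundle.
    if not words or makeflags[0].isspace():
        return False
    first = words[0]
    # Long options ("--jobserver-fds=3,4") and variable overrides ("FOO=bar")
    # are not the short-option bundle.
    if first.startswith("-") or "=" in first:
        return False
    return "n" in first


for flags in ("n", "kn --jobserver-fds=3,4 -j", " --jobserver-fds=3,4 -j", "w"):
    print(repr(flags), is_dry_run(flags))
```

A wrapper could use such a check to skip the `+` prefix, and therefore the jobserver file descriptors, when make is only printing commands.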
22:27plujonI am attempting to do an artifact build of Fennec, based on https://hg.mozilla.org/releases/mozilla-release, and it is failing (TIER: artifact pre-export export misc libs tools Tried 18 pushheads, no built artifacts found.). Is this a fit channel for disuccsion?
22:27plujondiscussion
22:27rilliangps: there are no recursive-make invocations in any of the force-cargo recipes, so I think we're fine.
22:27gpsplujon: we don't support artifact builds from the release channel
22:28gpsin theory they work - we just don't check for them the last i looked
22:28plujongps: Ah, good to know. So, to build a release, one must start from scratch?
22:29gpsyes
22:31plujongps: Thanks. Maybe I'll try to repackage a release instead of building it..?
22:32gpsyou can hack up python/mozbuild/mozbuild/artifacts.py and add "releases/mozilla-release" to the CANDIDATE_TREES list
22:32gpsthat /may/ work
22:33plujongps: Yeah, tried it. It got farther, but then crapped out.
22:34plujonI _really_ only want to add an extension to Fennec. But I don't know how to repack fennec.apk after unpacking it.
22:37gpsi don't know that much about fennec packaging, sorry
22:52glandiumgps: "I do insist that the --stop-server call also be made unconditionally." not following
22:53gpsglandium: if you + prefix --start-server, you need to + prefix --stop-server as well
22:53glandiumgps: no
22:53gpsotherwise `make -n` orphans sccache
22:53gps?
22:54glandiumah
22:55glandiumactually...
22:58glandiumyumyum make -n -f client.mk runs configure
22:59gpsif you say `make -n` is so broken already, i'm somewhat willing to turn a blind eye to it
22:59gpsideally only for automation and client.mk
22:59gpsi'd hate for `make -C <path> -n` to break
23:01nalexanderplujon: what are you trying to achieve with your Fennec repackage?
23:01nalexanderplujon: you want to ship an extension in Fennec itself?
23:01plujonnalexander: Yes, like https://wiki.mozilla.org/Mobile/Distribution_Files .
23:01nalexanderplujon: you're not going to be able to do that and deliver it "as Mozilla", since you won't be able to sign the package.
23:02nalexanderplujon: okay. Why do you need to do this for release?
23:02nalexanderplujon: in any case, the CANDIDATE_TREES thing should work. Can you tell me what it said?
23:02plujonnalexander: Right; I'll have to rebrand. https://wiki.mozilla.org/Mobile/Distribution_Files implies firefox can be repackaged without any building, which seems ideal.
23:03nalexanderplujon: no, that's not correct.
23:03plujonnalexander: Oh, well, CANDIDATE_TREES didn't work for me. Perhaps I can verify something for you?
23:03nalexanderLet me read that Wiki.
23:03plujon"Like desktop, mobile Firefox supplies a way for partners to repackage a set of customization files into a Firefox build without recompiling"
23:04nalexanderplujon: ah, that's mostly correct. It's not obvious, but yes, you can use the compiled libxul/classes.dex without change. You need to repackage and sign yourself.
23:04nalexanderplujon: can you tell me what happened with the CANDIDATE_TREES?
23:05plujonnalexander: Sure. Let me retrieve the buffer...
23:05plujon1. hg clone https://hg.mozilla.org/releases/mozilla-release
23:06plujon2. mach bootstrap and stuff
23:08plujon3. ./mach configure && ./mach build
23:08glandiumgps: fwiw, +- works, but I didn't go that route
23:09plujonnalexander: http://ix.io/ytZ
23:11nalexanderplujon: I think that the Task Cluster build jobs aren't pushing artifacts to the Task Cluster index for mozilla-release in the same way that they do for Nightly and other configurations.
23:12plujonnalexander: Ah, sounds like Greek to me. I'd probably be happier getting the repackage and signing thing working. But I have yet to figure that out.
23:14nalexanderplujon: yeah, the last thing in the index (the thing that maps source revisions to artifacts for artifact builds) dates from April 2017 (https://hg.mozilla.org/releases/mozilla-release/rev/fef4c3a1cf544e1ced8c477d6ce0e080130cb274)
23:14nalexanderplujon: which is far too old to use for your tree. Why does this need to be based on release?
23:15mshalglandium: can you take a look at the patch if you have a minute?
23:15glandiummshal: already typing a nit.
23:15plujonnalexander: Well, I want to release as well.
23:15plujonI.e., I want a stable, fit-for-users version of Fennec.
23:15mshalhah, thanks :)
23:17nalexandermkaply: it seems like you're the person most likely to know how to do the repackaging plujon wants.
23:20plujonmkaply: FWIW, here is the problem I hit: http://ix.io/sHB
23:21plujon(using a naive unzip and zip)
23:27nalexanderplujon: oh! That&#39;s because the ZIP needs to have special compression types for certain files. Let me find the script that does the right thing.
23:27nalexanderplujon: use http://searchfox.org/mozilla-central/source/python/mozbuild/mozbuild/action/package_fennec_apk.py
23:28nalexanderplujon: following the bits at http://searchfox.org/mozilla-central/source/toolkit/mozapps/installer/upload-files-APK.mk#94
23:28nalexanderplujon: which you'll take from your unzipped APK rather than MOZ_PKG_DIR, etc.
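To illustrate why a naive unzip and zip can fail, here is a minimal sketch of writing an archive where some entries are stored uncompressed and the rest are deflated, using Python's standard zipfile module. The suffix list and file names are hypothetical; the real rules live in package_fennec_apk.py, which this does not reproduce, and signing is not covered.

```python
import os
import tempfile
import zipfile

# Hypothetical: which entries must be stored uncompressed is an assumption here.
STORED_SUFFIXES = (".so", ".png")


def repack(src_dir, out_path, stored_suffixes=STORED_SUFFIXES):
    """Zip src_dir into out_path, choosing a compression method per file."""
    with zipfile.ZipFile(out_path, "w") as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                path = os.path.join(root, name)
                arcname = os.path.relpath(path, src_dir).replace(os.sep, "/")
                if name.endswith(stored_suffixes):
                    method = zipfile.ZIP_STORED
                else:
                    method = zipfile.ZIP_DEFLATED
                zf.write(path, arcname, compress_type=method)


# Usage example with throwaway files standing in for an unpacked APK.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "unpacked")
    os.makedirs(os.path.join(src, "lib"))
    for rel in ("lib/libexample.so", "classes.dex"):
        with open(os.path.join(src, *rel.split("/")), "wb") as fh:
            fh.write(b"x" * 1024)
    out = os.path.join(tmp, "repacked.zip")
    repack(src, out)
    with zipfile.ZipFile(out) as zf:
        methods = {info.filename: info.compress_type for info in zf.infolist()}

print(methods)
```

The per-entry `compress_type` argument is what a plain command-line zip of the whole tree does not give you by default.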
23:30* plujon looking
23:30nalexanderplujon: looks like that script is even built to update an existing APK, so you just need to figure out how to make it glue in new distribution/ files.
23:30nalexanderplujon: that won't do the rebranding, though -- for that you really need an (artifact) build.
23:31plujonOh. Hmm. What is the Distribution_Files thing all about, then?
23:31nalexanderplujon: anyway, I can help (a little) with this if you're still stuck tomorrow. email might be best -- nalexander [AT] mozilla.com
23:31plujonnalexander: Thanks for the help. Farthest I've gotten on this!
23:31nalexanderplujon: it's about customizing certain parts of Fennec, designed to be used by OEMs/partners to set initial bookmarks, toggle preferences, etc.
23:32plujonI'd be happy to partner with Mozilla. But I doubt I qualify.
23:35plujonnalexander: Is it likely that artifact builds will be supported for releases? I'm not sure I grok.
23:36nalexanderplujon: it's not easy to arrange, but I'll try to file the bug about the index not being updated.
23:36nalexanderYou can always just build yourself -- it's not that hard or that long.
23:36plujonI suppose I can build from source. I've got them downloaded now. 4 GB and about ?? minutes (yet to test).
23:58gpswow - dump_syms takes forever to run :/