mozilla :: #jsapi

8 Aug 2017
00:06RyanVM(sorry, couldn't resist)
00:08pbonei often say g'day, both IRL and on IRC.
00:08pboneit's cool.
00:09pboneif I actually hear you say it and pronounce it in the weird way non-aussies do, then that's something else.
00:10pboneit sounds worse than that to me, but yeah, that's the general effect.
00:11pboneShu was asking me about Australian food. I also explained that we don't "throw shrimp on the barbie" we might however "Barbeque some prawns"
00:12pbonebut it's not an every day thing, sometimes we'll do that for Christmas.
00:12pboneThen that had everyone a little surprised because Christmas is good BBQ weather.
00:13RyanVMyeah, that's easy to forget
00:13pboneSo it's about 50/50 whether we have roast meat or cold meat for Christmas lunch. And usually it's lunch rather than dinner, I don't know why.
00:14pboneRoast pork cooked in an outdoor BBQ is popular with my family. But we also do roast chook sometimes.
00:14pboneoh, chook = chicken.
00:14* RyanVM was about to ask
00:14pboneand chicken = chick.
00:14pbone(when they're alive)
00:14pbonewhen it's meat it's just chicken.
10:16sewardjjandem: ping
10:17jandemsewardj: pong
10:18sewardjjandem: shall I back out 1386680 ?
10:18jandemsewardj: maybe we can check if we indeed grow the stack now on sunspider?
10:19sewardjjandem: I can put a check in, but .. can I run sunspider from inside a browser?
10:20jandemsewardj: yeah, click Start now here:
10:21sewardjjandem: ok, thanks. I'll try in a minute.
11:00sewardjjandem: with the initial stack size at 256 (per my patch), we double it 133147 times on SunSpider
11:00sewardjjandem: with it at 512, we double it 456 times
11:00sewardj(and, it looks like, zero times in sunspider itself)
11:00jandemsewardj: do you know if we double it with the original 1024?
11:01sewardjjandem: I print out old/new sizes on doubling. So yes, there is some doubling 512->1024 and 1024->2048
11:01sewardjbut I think that is the browser itself -- those happen before I start sunspider and after it finishes (during shutdown)
11:02* sewardj re-checks
11:03sewardjjandem: yeah, even at an initial size of 512, there is no doubling whilst sunspider (the test itself) runs
11:05jandemsewardj: since the jit RegExpStack is per-thread, I wonder if things are different in the shell, or do we restore the size immediately afterwards?
11:06sewardjjandem: uh, I have no idea how to find out
11:06sewardjjandem: is it easy to get awsy to test the initial size = 512 case? If it turned out to be no perf loss,
11:07sewardjthat would be good, since it still halves the useless memory traffic due to free compared to the original (1024) start size
11:10jandemsewardj: let me check if I can repro the regression on os x
11:10sewardjjandem: thx
11:27jandemsewardj: in the shell, when running sunspider once, we grow from 256 to 512 ~18000 times
11:28jandemsewardj: 512 should be fine, we don't grow it to 1024
11:28jandemsewardj: are you using a 64-bit build btw?
11:28sewardjjandem: yes, I now think that 512 would be a better starting size.
11:28sewardjjandem: yes, linux 64
11:28jandemwe could be smarter and not grow/shrink each time we execute a regex
11:28jandembut that's a pre-existing issue
11:29jandemsewardj: rs=me if you want to change it to 512
11:29sewardjjandem: I do want to change it to 512, yes. Thanks. what does "rs" mean?
11:29sewardj(I have seen it but i don't know what it means)
11:30jandemrubberstamp, usually for simple changes
11:31sewardjjandem: ok, so I'll just make the change and push it immediately, with rs=you, yes?
11:31jandemsewardj: sounds good
11:32sewardjjandem: thanks. Sorry I didn't check the perf effects first :-/
11:32jandemsewardj: no worries. I didn't expect the stack to grow this big
11:34sewardjjandem: btw, I think the stack is word-size independent. AFAICS it's a stack of int32_ts
11:35jandemsewardj: ah interesting
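Note on the RegExpStack numbers above: a minimal sketch, in Python rather than SpiderMonkey's C++, of why the starting size changes the doubling count so sharply. It assumes, following jandem's remark about growing and shrinking on each regex execution, that the stack returns to its initial size after every execution. The function `count_doublings`, its `keep_grown` flag and the demand figures are hypothetical illustrations and are not taken from the engine; sizes are in whatever unit the stack counts in.

```python
def count_doublings(initial_size, peak_demands, keep_grown=False):
    """Count how often a doubling stack has to grow.

    peak_demands holds the peak stack need of each regex execution
    (made-up numbers). With keep_grown=False the stack is reset to
    initial_size before every execution, as assumed in the note above.
    With keep_grown=True it keeps its grown size, which is one reading
    of the "be smarter" idea from the log.
    """
    size = initial_size
    doublings = 0
    for peak in peak_demands:
        if not keep_grown:
            size = initial_size  # shrink back before the next regex
        while size < peak:
            size *= 2            # grow by doubling until the peak fits
            doublings += 1
    return doublings


# Hypothetical workload: most executions need little stack, a few need more.
demands = [100, 300, 300, 600, 1500]
for start in (256, 512, 1024):
    print(start,
          count_doublings(start, demands),
          count_doublings(start, demands, keep_grown=True))
```

With a reset after every execution, any demand just above the starting size costs one doubling each time it recurs, which is the pattern a larger starting size avoids.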
11:37jandemevilpie: we removed so many non-standard features, just some big ones left
11:37jandemevilpie: but for-each is almost dead and legacy generators getting there too \o/
11:37evilpieyeah, I just started looking into enabling the warning for watch/unwatch
11:38evilpieI don't think we really need it anymore
11:38jandemyeah i think the JITs don't always respect it even
11:39jandemjust removing these properties from Object.prototype would be great, "watch" is probably pretty common as method name
11:39jandemIterator is also sad, shu posted a patch for that
11:39evilpie__iterator__ would be really nice
11:39evilpieI kind of wish people on the Firefox side would fix that stuff
11:45jandemdisabling legacy generators would be nice, now if you forget to use function* you don't get any errors
12:24sewardjjandem: who would be a good person to ask about js/src/jit/MIRGraph.h, in particular the threading aspects?
12:24jandemsewardj: i can talk about it
12:25sewardjjandem: I have a MESI (multiprocessor cache coherence) simulator/profiler
12:26sewardjjandem: and it tells me that there are a lot of false-sharing cache misses happening in MDefinitionIterator::MDefinitionIterator
12:27sewardjand InlineListIterator::operator++
12:27sewardjjandem: now, it is somewhat experimental, but I also wouldn't completely say it's useless.
12:27jandemsewardj: most of this data is allocated on the main thread.. then used off-thread.. that would do it I guess?
12:28sewardjjandem: maybe (yes). This is false sharing
12:28sewardjso it's not really that one thread is writing fields and the other is reading them
12:28sewardjInstead, that two different fields in the same cache line are accessed by different threads
12:29sewardjjandem: is that plausible?
12:30jandemsewardj: i don't see how, offhand. All of this data should be used by one thread at a time
12:31bbouvierPSA: stopping AWFY shells for a while, trying to identify what's causing chrome to not build
12:32sewardjjandem: ok. I'll leave it for now.
12:33sewardjjandem: fwiw, js::jit::MakeMRegExpHoistable is shown as the second highest source of MESI protocol misses when running Speedometer
12:34sewardj(for all of Gecko).
12:34jandemsewardj: that makes sense, sadly. It's the first thing we do off-thread after we generate the graph on the main thread
12:34sewardjSo I dunno whether that's true, or my tooling is wrong, or what.
12:35sewardjjandem: does that involve visiting a large number of relatively small objects?
12:35jandemsewardj: yeah all MIR instructions etc. These are LifoAlloc (bump allocator) allocated
12:36sewardjjandem: so .. I think this will only slow down the compiler thread, as it effectively "pages in" the MIR from the main thread
12:37sewardjproviding that the main thread doesn't try to access the MIR once the compiler thread has started working on it
12:37sewardjbut that sounds like it would be a race-y thing, so I assume it doesn't do that
12:37jandemsewardj: correct
12:38sewardjjandem: ok. so then it's an "unavoidable cross-core communication cost"
12:39jandemyeah building the MIR on the background thread is complicated unfortunately
12:44sewardjjandem: at least the performance measurements sync with your understanding of how it works. That's good.
12:44jandemsewardj: yeah the tool seems to work well :)
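Note on false sharing as sewardj describes it (different threads touching different fields that live in the same cache line): a minimal sketch of how one might flag candidate lines from a list of accesses. The 64-byte line size, the function name `false_sharing_candidates` and the access records are assumptions for illustration; this is not how sewardj's MESI simulator works.

```python
from collections import defaultdict

CACHE_LINE_BYTES = 64  # assumption: a common line size, not stated in the log


def false_sharing_candidates(accesses):
    """Return cache lines touched by different threads via different fields.

    accesses is an iterable of (thread, field_name, byte_address) tuples.
    The result maps a cache line index to the sorted (thread, field) pairs
    seen on that line.
    """
    by_line = defaultdict(set)
    for thread, field, address in accesses:
        by_line[address // CACHE_LINE_BYTES].add((thread, field))

    candidates = {}
    for line, pairs in by_line.items():
        # A candidate needs two accesses that differ in thread AND in field.
        if any(t1 != t2 and f1 != f2
               for t1, f1 in pairs for t2, f2 in pairs):
            candidates[line] = sorted(pairs)
    return candidates


# Hypothetical accesses: fields "a" and "b" share line 0, "c" and "d" share line 2.
accesses = [
    ("main", "a", 0),
    ("helper", "b", 8),
    ("main", "c", 128),
    ("main", "d", 136),
]
print(false_sharing_candidates(accesses))
```

Only line 0 is reported here, because line 2 is touched by a single thread. A real tool would also need to know which accesses are writes, since read-only sharing does not invalidate lines.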
12:51charlesWill Ion/aarch64 be officially supported?
13:00lthwoot, SharedArrayBuffer is enabled by default in Chrome 60
13:00* lth does a little dance
13:00lthalso in FF55, so that makes three out of four
13:03joncolth: nice!
13:06bbouvieris it enabled in the tor browser yet?
13:11bbouvierbtw, PSA: AWFY shells have restarted and v8 should be working again
13:15nbpsewardj: If you have a way to identify what else might be writing on a different thread, I would be interested in that.
13:16nbpsewardj: but by design it is not supposed to happen.
13:18sewardjnbp: are you familiar with the kcachegrind GUI? I can send you the profile output, and you can peer at it
13:18sewardjbut you won't be able to see annotated source
13:18nbpsewardj: js::jit::MakeMRegExpHoistable is the first thing which runs when we resume the compilation off-main-thread.
13:19nbpsewardj: So, this is expected as all the cache lines have to migrate from one core to the other.
13:19nbpsewardj: what would be nice, is if we had a way to either tell the CPU to discard pages from the main thread, or to move them to another thread.
13:20nbpsewardj: Yes, I am familiar with kcachegrind gui.
13:20sewardjnbp: when you say "page", you mean "cache line", right? Because that's the unit-of-memory-management here
13:21nbpsewardj: I mean TLB pages, allocated by the LifoAlloc.
13:22nbpsewardj: but I guess we could write a small loop to iterate over all the LifoAlloc memory to get all the memory prefetched on the other thread.
13:25sewardjnbp: that seems to me like a very dangerous game
13:26nbpthe problem is that we don't know which thread is going to be scheduled where.
13:26sewardjnbp: imagine if the compile thread was running on the same core
13:26sewardjnbp: a generally safer game seems to me to reduce the size of transferred data as much as possible
13:27sewardjand also to be sure that it is packed efficiently into cache lines
13:27sewardj(iow, you're not transferring fields that don't need to be transferred)
13:27nbpThis sounds like a stupid idea, but could it be better to lock the main thread while we run IonBuilder off-main-thread, and unlock it as soon as IonBuilder is complete.
13:28nbpjandem: ^ This sounds crazy, but if the number of MESI un-shared is too high, this might be worth having the contention on a lock plus all JSObject manipulated by this compiler thread.
13:29sewardjnbp: I don't think that will help (locking the main thread)
13:29sewardjcurrently what happens is (AIUI)
13:29sewardjmain thread creates Ion structures (in its own cache)
13:29sewardjtells the compiler thread to process it
13:29sewardjcompiler thread runs, takes a bunch of L2 misses to pull in the data from the main thread ('s core)
13:29nbpsewardj: We are generating multiple KB of data, locking the main thread would be beneficial as we would not be trashing the major part of the L1 cache.
13:30sewardjnbp: the point is, the cache misses happen for the compilation thread
13:30sewardjbut not on the main thread
13:30nbpsewardj: but the main thread cache is trashed by the compiler content, and everything would have to be reloaded.
13:30nbpsewardj: including the instruction cache
13:31* sewardj confused
13:32nbphum I guess no because the lock would cause the scheduler to put another thread on the same core :/
13:32nbpso this would only depend on the inability of other threads to not trash as much as IonBuilder.
13:39sewardjnbp: /me seriously doubts it is possible to "beat the hardware at its own game" here
13:41nbpsewardj: another option, would be to allocate one more 32K page and cheat the CPU into using a write-back mode for the copied data, before using it on another thread.
13:41nbp32K "page" (segment) of LifoAlloc
13:41nbpbut this would be hard as all of it are containing pointers :/
13:42nbpI wonder if we can use the write-back mode by copying into its own cache-line.
13:50nbpI guess this is something we could do, if we have enough data to fill the L2.
13:54nbpsewardj: we are not trying to beat the hardware, we are trying to fit in the sweet spot of the hardware.
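Note on nbp's "small loop" idea from 13:22: a minimal sketch of the access pattern only, reading one byte per cache line across a 32K segment so that every line gets touched by the thread running the loop. Python cannot control caching or alignment, so this shows the stride and the line count rather than any real prefetch effect. The 64-byte line size and the name `touch_every_line` are assumptions; the 32K segment size is the one nbp mentions for LifoAlloc.

```python
CACHE_LINE_BYTES = 64        # assumption: common line size
SEGMENT_BYTES = 32 * 1024    # the 32K LifoAlloc segment size mentioned above


def touch_every_line(segment, line_bytes=CACHE_LINE_BYTES):
    """Read one byte from each cache-line-sized stride of segment.

    Returns (lines_touched, checksum). The checksum exists only so the
    reads have a visible result.
    """
    touched = 0
    checksum = 0
    for offset in range(0, len(segment), line_bytes):
        checksum ^= segment[offset]  # one read per line is enough to touch it
        touched += 1
    return touched, checksum


lines, _ = touch_every_line(bytearray(SEGMENT_BYTES))
print(lines)  # number of line-sized strides in one segment
```

Under these assumptions one segment is 512 lines, which gives a sense of how many line transfers the compiler thread pays for per segment either way. sewardj's objection still applies: if both threads happen to share a core, the loop is pure overhead.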
14:14sewardjjandem: I just pushed the stack size "rs" fix to m-i. Will that get merged to m-c in the normal way, given that the bug was already closed?
14:14jandemsewardj: yeah that's fine
14:15sewardjjandem: k, thx
14:27lth(for the tor joke :)
14:30bbouvieri don't actually know if JS is even enabled in the default tor settings
14:48nbpbbouvier: it is, I think they even enabled Ion lately.
14:59jonco!seen anba
14:59firebotanba was last seen 21 days and 22 hours ago, saying 'Does anyone know why it costs so much to null the ArrayIteratorSlotIteratedObject slot of ArrayIterators? At least in micro-benchmarks, nulling the slot is about 25% of the time. For example this
14:59firebot-benchmark improves from 475ms to 350ms with this patch applied' in #jsapi.
15:48nbpglandium: do we have an environment variable for clang arguments, which is not CPPFLAGS?
15:49nbpglandium: I can no longer configure with gcc because we attempt to add -Qunused-arguments, to tell clang to shut-up, which is not supported on gcc.
16:53joncoshu: ping
16:55shujonco: pong
16:55joncoshu: hey
16:55joncoshu: I have a modules patch I need reviewed but I know you're not around much longer
16:55joncoshu: do you have time to look at it or should I find someone else?
16:55shujonco: how big is it?
16:55joncoshu: fairly big tbh
16:56shujonco: link?
16:56joncoshu: bug 1374239
16:56firebot NEW, Store and rethrow module instantiation/evaluation errors
16:57shujonco: that's not terrible
16:57shujonco: i'll look at it today
16:57joncoshu: great, thank you
16:58shujonco: (i've blocked r? requests, so no need to flag me)
16:58joncoheh, ok cool
17:04jimbWow, rr is definitely not hitting breakpoints in children
17:06jimbOh, rr replay --onfork
17:26tcampbellwell, that explains why I was having trouble with rr last week..
17:33jimbYeah, you have to tell it which process you want to replay, and it only replays that one
17:33jimbwhich, in my case, is actually what I really want
17:41sfinkhow do I get speedometer numbers out of awfy for a try push? I scheduled a bunch of jobs. Most failed for mysterious reasons. succeeded, giving "Benchmark: speedometer 1.0 __total__ 75.8926251019"
17:41sfinkI have no idea what that compares to in any of the awfy graphs
17:41sfinkand is that speedometer v1 or v2? (I know it says "1.0". I don't know what to believe anymore.)
17:54jimbtcampbell: Actually, for the record, it's rr replay --onprocess PID
17:57tcampbellthanks. I see that now in the instructions
18:16naveedjorendorff: ping
18:18jorendorffnaveed: pong
18:18jorendorffsorry, sound was off
20:32glandiumnbp: we only add -Qunused-arguments for clang
20:36bbouviersfink, speedometer-misc is the actual speedometer v2, in the list of benchmarks
20:36naveedhey team if you have not updated the q3 goals Sheet please do so today
20:47nbpglandium: in, we add it to the CPPFLAGS, and with Stylo enabled I need to have both clang and gcc, and gcc gets called with the -Qunused-arguments, which causes the configure to fail, unless I comment these lines in
20:55glandiumnbp: so, your gecko build is using gcc and your cargo build is using gcc? that shouldn't be happening
20:58nbpglandium: I had to specify the --with-libclang-path and --with-clang-path options to get things to get past the stylo checks for bindgen.
20:58nbpglandium: before that I had no need for clang.
20:59nbpglandium: and yes, I think cargo build relies on gcc.
20:59glandiumnbp: are you setting CC=clang?
21:00nbpglandium: no CC="ccache gcc"
21:01glandiumthat makes no sense
21:01nbpI agree.
21:01glandiumwhat's the output from configure?
21:02nbpglandium: gcc complains under the MOZ_CONFIG_SANITIZE that it does not know the -Qunused-arguments command line option.
21:05glandiumnbp: if you read that code you commented out, you'll see the flag is only added for clang, which means configure thinks you're using clang. The question is why. So, what is the output of configure?
21:05nbpglandium: no that is added only if clang is present
21:05glandiumnbp: no, those variables mean "the compiler is clang"
21:05nbpglandium: which I guess is somehow implied by --with-clang-path option.
21:06glandiumnbp: no
21:06glandiumSo, what is the output of configure?
21:06glandium3rd time; I won't ask once more
21:12nbpglandium: I can't reproduce it anymore. I don't get it.
21:16nbpglandium: but I got tons of these errors:
21:18glandiumnbp: so you want me to ask a 4th time or what?
21:19nbpglandium: I cannot reproduce it, no need to ask any more.
21:21nbpglandium: but the latest error seems to be related to the fact of having a CONFIG_SITE to a file which contains CC="ccache gcc" and CXX="ccache g++"
21:22nbpglandium: I don't have any copy of the log, as I am compiling within emacs, and forgot to make a copy of the log.
21:57WeirdAl*squeaky wheel* Bug 1384619, would anyone with a symbols (but not debug) build be available to look into it?
21:57firebot NEW, es7-membrane test page breaks depending on Firefox build configuration
23:56sfinkglandium: ping
23:56glandiumsfink: pong
23:56sfinkglandium: I haven't paid attention. What's the current state of toolchain build dependencies?
23:56glandiumsfink: in place
23:56sfinkas in, if I wanted to build the sixgill gcc plugin in a taskcluster job and transmit that to a hazard job, is there a good path for that now?
23:57glandiumas of last week
23:57sfinkoh, cool!
23:57sfinkmy procrastination pays off again
23:57sfinkhow do I use it?
23:58glandiumsfink: look at taskcluster/ci/toolchain/*.yml for toolchain jobs, and look at taskcluster/ci/build/*.yml for the toolchains sections
23:59sfinkexcellent, thank you very much for working on that! (at least, I assume this was you)
9 Aug 2017
No messages