mozilla :: #ionmonkey

10 Aug 2017
10:35jandembbouvier: ping
10:39bbouvierjandem, hi
10:40jandembbouvier: hey sfink said in the nursery strings bug "Can you do (or tell me how to do) an awfy try push? Most of mine seem to fail for reasons I don't understand. I've gotten a speedometer v1 run out of it, but no v2."
10:40jandembbouvier: is this supposed to work?
10:41bbouvierjandem, oh right, i meant to comment in this bug and forgot
10:41bbouvierjandem, it is supposed to work
10:42jandembbouvier: do we have docs somewhere? I haven't used this yet :)
10:43bbouvierjandem, not much, no :/
10:43bbouvierthat's a page we could enhance, at the moment one can only test one benchmark per awfy try run
10:44jandembbouvier: ah that looks great actually
10:44jandembbouvier: so maybe steve used speedometer instead of speedometer-misc?
10:44bbouvierjandem, yes, definitely could be useful for anyone working on the JITs
10:44bbouvierjandem, maybe; i think i pinged him on irc to tell him to use misc, though
10:45bbouvierjandem, if an awfy try run fails, please let me and automation team (bob clary in particular) know
10:48jandembbouvier: ok let me try this :)
12:31nbptcampbell: is there a JIT meeting today, asmentioned in the pad?
12:52nbptcampbell: oh, was this yesterday, or next week?
12:53nbptcampbell: I will move the meeting to next wednesday, I don't think I can do any video because my connection is not even stable enough for IRC :/
12:54bbouviernbp, are you not in the paris office?
12:55nbpbbouvier: no, in normandy, using the wifi of the neighbour.
12:58nbpbbouvier: the troughput is not bad, but it keeps getting disconnecte, which is not ideal for video/audio meetings.
13:12jandembbouvier: hm something seems broken
13:13jandemnbp: isn't it supposed to be next Wednesday?
13:13jandemi thought it was biweekly
13:14nbpjandem: it was flaged as being on the 9, I just updated it to be next week.
13:14nbpjandem: I thought that too, but as I was not in the previous meeting, I figured it might have changed.
13:14nbpjandem: then I realized that we were not Wednesday, and updated the pad.
13:17bbouvierjandem, ha, it failed to download the build
13:17jandemnbp: ah right
13:26tcampbellnbp: yeah, date was just a typo
13:26nbpok :)
13:26tcampbellnbp: I'm really not understanding your interest invariants in the phi analysis patch
13:27nbptcampbell: do you understand the need for the useRemoved flag?
13:27tcampbellnbp: I understand how the original worked and why it can be slow.
13:28tcampbellI see roughly what the new version is doing, but can't convince myself if it is or isn't full of holes
13:28nbptcampbell: this is basically the same, except that instead of looping into inqueries about usage, I do the opposite by pushing the uses to the producer.
13:29nbptcampbell: but we have to handle loops, by doing the worklist on loop header/backedges.
13:29nbptcampbell: in which case this loop mimic what a fix-point would do.
13:30tcampbell(I'm still sick and not at my best, but I'm still not seeing the picture)
13:30tcampbellnbp: we are guaranteed to have structured CFGs in Ion, right?
13:31nbptcampbell: structured, as RPO / PO being well defined? yes.
13:31tcampbellWe'll, and loop headers being simple
13:31nbpthe only weirdness into that is the OSR entry block which is n2.
13:32nbpThis is one of the effort made in IonBuilder, to make a RPO set of basic blocks.
13:34tcampbellnbp: is the BPPrevRemoved flag needed at all in the new version?
13:38nbptcampbell: yes, it is needed for the following case:
13:39nbptcampbell: var a, b; for () { if () { call(a); } else { a = b ; } } or swap the if-block with the else-block.
13:40nbptcampbell: you need to find out that b has removed uses, when removing the branch with the call.
13:42nbptcampbell: BPPredRemoved is a way to be able to know that a block is going to be removed when propagating the uses-removed flags.
13:44nbptcampbell: when we add the BPPredRemoved, we do the propagation of the uses-removed flag, but when we cross the back-edge we have to pay attention to the BPPredRemoved flag, to do what would have been done, if we had known that b had removed uses.
13:45tcampbellnbp: In your use of BPPredRemoved, we use as a guard, and then look at all fan-in and check the status of those blocks. Could you not do the same thing without the BPPredRemoved guard? You are already traversing the operand list in the code below based on phi->block status
13:45bbouvierjandem, this is actually a quirk of AWFY: the requested revision has to be the top one
13:45bbouvierjandem, so in our example: the top one is 479b80d39f53, not 0bcf65f05079
13:48nbptcampbell: you want to propagate the uses-removed flag beyond the BPPredRemoved flag?
13:48jandembbouvier: interesting, thanks! I just took the one steve posted in the bug, else i'd have picked the top one :)
13:48jandembbouvier: thanks, let me try again with that
13:49nbptcampbell: so you want to push all the used-removed to non-removed Phi?
13:49tcampbellnbp: I'm thrown off by the single BPPredRemoved flag if any Pred block is marked-for-removal, and then you iterate over the Pred's looking if they are marked-for-removal without knowing which indicated
13:50nbptcampbell: I think you are right.
13:51nbptcampbell: for () { if() {} else {} if() {} else {} }
13:51tcampbellnbp: I don't know if this is a) unneeded, b) an edge case we miss, c) very subtle and needs documentation
13:52nbpif we flag one Phi has having predecessor being removed, we have to continue iterating over the operands of this Phi.
13:52nbpat least until we reach a block which we have not yet visited.
13:53nbp^^ and iterating over the operands of the Phis after it.
13:53* tcampbell back in 5-10min
13:54nbptcampbell: guess, an easy solution would be to forward the BPPredRemoved flag as well, like is already done for the isUsesRemoved.
14:11tcampbellnbp: would keeping the original approach, but adding the BPUsed flag help enough? or does it still fall apart when there are too many MPhis that are not used?
14:18nbptcampbell: the original approach is going in reverse, which is the costly factor, because we are not caching the used-removed flags.
14:18tcampbellhmm.. right
14:20tcampbellnbp: is it alright if I close out this review since at the very least some of this subtlety needs better comments? (happy to keep discussing)
14:20nbptcampbell: sure
14:20tcampbellnbp: sorry it took so long =\
14:22tcampbellI guess all the compiler transformation stuff had swapped out of my head. I'm always worried about edge cases on this dataflow stuff
14:26nbptcampbell: hum there is also gvn using this code, so it might actually be safer to keep the old algo :( and some caching to it.
14:27tcampbellnbp: bah
14:28tcampbellI should probably study our GVN pass one day
14:28nbptcampbell: adding caching to it promess to be a wonder of memory allocations :(
14:29tcampbellnbp: does keeping the BPUsed flag do any help with the worklist, or is it entirely in the wrong order like you mentioned above?
14:30nbptcampbell: I guess we could iterate the worklist in reverse order, and flag any phi operands as we go.
14:30* nbp thinking aloud
14:30* tcampbell understands
14:31tcampbellcorrectness wise, it all sounds fine
14:31nbpthe instead of only removing it from the work list, we would do the flag propagation that way,\.
14:33nbpI will try that.
14:36tcampbellnbp: is there a reason we don't use dominance frontiers to build our SSA in the first place?
14:36tcampbell(only tangentially related)
14:36tcampbelldo we need very specific Phis to exist for ResumePoints to behave?
14:36nbptcampbell: we do it after, but I wanted to avoid doing it twice as branch pruning is removing tons of blocks.
14:37nbpGVN reruns the dominator computation at every iteration.
14:37nbpno we don't need Phi for resume points.
14:38nbptcampbell: I guess the main reason is that we had no dominance information from the bytecode.
14:38nbptcampbell: since we are generating the CFG during baseline compilation, we might as well do it then.
14:39nbptcampbell: oh, I think I got a nicer idea.
14:39tcampbellnbp: I just remember getting good improvements in a past project when we updated to using the dominance frontier SSA algorithm rather than generating naive Phis and pruning later
14:40tcampbellI notice that both V8 and JSC use the frontiers algorithms for their SSA
14:40nbptcampbell: instead of having the loop to query whether the uses of a Phi have uses them-selfs. we could just add a flag similar to the BPUses, but computed without the BPPredRemoved boundary
14:41nbptcampbell: we have dominators.
14:41nbptcampbell: they are just being computed later.
14:42tcampbellnbp: yeah, I understand. The original intent of the algorithm was to avoid generating PHis you will quickly throw away. Avoids memory churn
14:43nbptcampbell: ok, what if we unconditionally flag all Phi which have uses or indirect uses in the graph?
14:44nbptcampbell: iterating over operands is faster than iterating over the use list.
14:45* tcampbell thinking
14:45nbptcampbell: Thus, as a first pass, even before Branch Pruning (and GVN too), we would iterate over the graph looking for Phi uses.
14:46nbptcampbell: and flag all these Phi with "Used" flags.
14:47nbptcampbell: that reminds me I wanted to split our uses into live/dead uses.
14:47tcampbellinteresting... when does the ImplicitlyUsed flag get set on MPhis?
14:47tcampbellduring IonBuilder?
14:48nbphaha it means literaly the same as UsesRemoved, and when we make a call we flag all arguments as having implicit uses.
14:48nbpyes, when we inline calls.
14:49nbpI have not get around merging the 2 flags yet
14:49nbpI should do that at one point
14:52nbpthe problem of the test case is that we have tons of resume points, I wonder if we should not move the hasUses flag into IonBuilder.
14:52nbpI guess a better design would be to distinguish the dead uses from the live uses.
14:53tcampbellI'm not seeing how IonBuilder would help
14:53nbphum maybe we can segregate the uses list, adding live uses at the beginning and dead uses at the end.
14:53nbphasLiveUses would then just check the first element of the use list.
14:53tcampbellnbp: does that give you trouble with MPhi predIndex ordering?
14:56nbptcampbell: This is not an uncommon pattern.
14:58nbpSo, if we were to add all resume points to the end of the use list, while all other we add then to the beginning, we could easily spearate the 2. When looking for resume point uses, this is for asking the isObseravble question, and this would avoid the isResumePoint/isDefinition question.
15:00nbpResumePointUsesIterator / DefinitionUsesIterator / and UsesIterator could be used in 2 loops UsesIterator u(def); for(; u && u.isDefinition(); u++) {} for (; u; u++) {}
15:01nbpThus we can remove one branch from all these loops. :)
15:01nbpI will file a bug for that and look at it after the JSBC
15:03tcampbelloh.. yeah, I see. that makes sense
15:04tcampbellwas confusing myself on uses vs operands
15:07nbpone issue might be instructions which are recovered-on-bailout, while they are instruction, they are actually similar to a resume point.
15:08nbp So we might have to store LiveDef / ResumePoint / RecoveredDef
15:09nbpor LiveDef / RecoveredDef / Rp, knowing that if there is no Rp we might still have the liveto recovered transition.
15:12tcampbellyep. bad idea time: If the traversal is really important, we could allocate sentinel uses in the list to seperated
15:12tcampbellwith consumer == null, or whatever
15:14nbptcampbell: sentinell would not be cheap, as they would basically cost memory for each MIR instruction.
15:15tcampbellyeah, I can't imagine it ever being worth it
15:27nbpso we have an unused flag for Phi instructions, maybe we should just split the phase in 2, and rely on that for flagging phi as having removed uses, if they don't have a the isUnused flag.
15:28tcampbellwhat is the definition of the unused flag? the MPhi has no uses?
15:28nbpjandem: union constructor seems to pass on Try.
15:29nbptcampbell: yes, i-e the opposite of what I am looking for.
15:29tcampbellnbp: why does that MPhi exist then?
15:29nbptcampbell: but the pass in questionworks-around by flagging all Phi ahead of time as being unused :/
15:30nbptcampbell: because our bytecode is a stack-based interpreter.
15:30nbptcampbell: and we don't know ahead of time if a stack slot is going to be used or not.
15:31jandemnbp: nice
15:31nbptcampbell: and baseline follow the same stack layout as the interpreter
15:31nbptcampbell: and Ion allow it-self to give Magic(OptmizedOut) values to baseline, when some phi are unused.
11 Aug 2017
No messages
Last message: 11 days and 21 hours ago