mozilla :: #e10s

18 Apr 2017
13:02posidrondoes anyone know who works on shmem ?
13:13jesupping - have a content process hung in sandbox code (m-c build, local, caught in GDB)
13:22gaborjesup: bobowen might want to hear about it... do you have a stack?
13:23gabormconley: ping
13:23gabormconley: is there a chance you can take a look at the probe patch today?
13:24* gabor will be on PTO from tomorrow and e10s-multi needs this probe for the beta exp.
13:36jesupUTCToLocalStandardOffsetSeconds () -> __GI_mktime() -> __tzfile_read("/etc/localtime"...)
13:36bobowenjesup: I take it this is Linux ... calling gcp ^
13:37jesupgdb works better in Linux :-)
13:37jesupgcp: ping ^
13:37jesupNoticed it was taking forever to switch to a tab, so I gdb'd it
13:38gcpI think you crashed elsewhere
13:38jesupjust updated to latest m-c (local build)
13:38jesupMaster process and other content processes seem fine
13:38gcpbroker in master process isn't responding
13:38jesupI'm in gdb - what should I search for in other threads?
13:39jesupah. I could gdb the master if you like.... (though I'm typing in it here ;-). If I do, what should I look for?
13:40gcpand a thread
13:40gcp SprintfLiteral(threadName, "FS Broker %d", mChildPid);
13:58mconleygabor: hey - yeah, will do
13:58mconleygabor: once I break out of these meetings. :)
14:00gabormconley: hah :) thanks a lot, and embrace the pa...*cough*... meetings
15:59bsmedbergchutten, rvitillo, ehsan_ mconley filed bug 1357457
15:59firebot NEW, Create a variant of INPUT_EVENT_RESPONSE_MS which coalesces overlapping "hangs"
16:06gaborelan: bug 1352961
16:06firebot NEW, Add a measure for the delay before a new tab starts processing the first URL
16:27mrbkapgabor: can we try to uplift bug 1324428 to (now) beta?
16:27firebot FIXED, [e10s-multi] Adding back a simplified pre-allocated process mechanism
16:28* mrbkap didn't realize that hadn't been on Aurora.
16:28gabormrbkap: I could not even land it yet on nightly...
16:29mrbkapgabor: hm? Isn't it landed but preffed off?
16:29gabormrbkap: well... some fixed are not landed yet, I could uplift those without the pref though
16:30gabormrbkap: do we want to turn it on for users during the experiment though, without testing it first on nightly even?
16:30mrbkapgabor: ah, I see.
16:31gabormrbkap: there was a security issue that held it up, other than the few failing tests...
16:31mrbkapgabor: no, but the changes to ContentParent.cpp and ProcessSelector.js would be nice to have.
16:31gabormrbkap: hmm...
16:31mrbkapgabor: I could pull that stuff into a separate patch to uplift.
16:32mrbkapgabor: since that is (AFAICT) safe
16:32gabormrbkap: sounds good to me
16:32mrbkapgabor: k
16:46mrbkapgabor: still here?\
16:47gabormrbkap: yepp
16:47mrbkapgabor: don't we need bug 1346288 on beta?
16:47firebot FIXED, Disable e10s-multi for SDK addon users
16:49gabormrbkap: damn :( I thought I uplifted that long time ago :(
16:54gabormrbkap: I remember you changing some of this, are some of your uplift attempts blocked by this?
16:54mrbkapgabor: well, the beta experiment code won't work without that uplfited.
16:55mrbkapgabor: since it relies on the *BlockedBy* prefs to be updated.
16:55mrbkapgabor: but the experiment code does apply on top of changes from that bug.
16:55gabormrbkap: sure... but if we uplift this first then it won't be an issue I hope
16:56gaboranyway, I file the request
16:56gaborsorry about this...
16:59mrbkapgabor: but I think this bug depends on the process manager work too.
17:00gabormrbkap: the preallocated pm?
17:00mrbkapgabor: Yeah.
17:00mrbkapthe same pieces I was asking about: the ContentParent.cpp bits
17:01gabormrbkap: I don't think so, this one is landed the other one is not...
17:02mrbkapgabor: ?
17:02firebotBug 1324428 FIXED, [e10s-multi] Adding back a simplified pre-allocated process mechanism
17:06gabormrbkap: right... now I see why I was confused earlier, I thought we were talking about bug 1341008
17:06firebot NEW, [e10s-multi] Use the preallocated process manager by default
17:06mrbkapgabor: ah, no
17:07gabormrbkap: so... the same comment says: status-firefox54: --- fixed
17:08gabormrbkap: that's why I never requested any uplift for this one
17:08mrbkapMaybe all I need is the addon bug, then.
17:10gaboralright, I'm filing the request for that then check if this patch is really on 54
17:17mrbkapgabor: hm, the patch to ProcessSelector might need merge help, fyi.
17:17mrbkapgabor: depending on hg's algorithm.
17:17mrbkapgabor: bug 1344711 changed it
17:17firebot FIXED, Add an eslint rule to report an error when a get*Pref call is the only instruction in a try block
17:21gabormrbkap: hg's algorithm... nice catch btw!
17:23elanmrbkap, felipe: RelMan has moved the go to build for Beta 1 for us
17:23elanbut they can't do a Beta 1 and a Beta 2 this week
17:24gaborbsmedberg: Hi, do you think you will have time today for a data review for bug 1352961? Or if not, could you suggest any peer who might have time for it? I should land it as soon as possible...
17:24firebot NEW, Add a measure for the delay before a new tab starts processing the first URL
17:25bsmedberggabor, I am behind but hope to get through reviews today. You can also ask francois.
17:25elanmrbkap: I heard that we would land the system add-on changes directly to beta
17:26mrbkapelan: I was hoping to land to nightly & beta.
17:26elanso that means if that can happen today, we would still have a fighting chance for beta 1
17:26mrbkapbut yeah, that sounds right.
17:26elanok, that's good for me to know
17:26mrbkapelan: we also need approval for bug 1346288 to land on beta.
17:26firebot FIXED, Disable e10s-multi for SDK addon users
17:26elanso landing in Nightly today and getting the beta Uplift request tonight or first thing tomorrow
17:27elanso that it would be part of the build tomorrow our early evening
17:27elanwe'd possibly be in the clear
17:27elanmrbkap: ^
17:27mrbkapelan: great.
17:27gaborbsmedberg: that sounds good thanks, I'll ask francois tomorrow if you wouldn't get there by then
17:31felipemrbkap: ping
17:31mrbkapfelipe: pong
17:34felipemrbkap: should there be a change that defines the pref "dom.ipc.processCount.web"? I see the patch "Use a centralized function to tell if e10s-multi is on" but I'm having trouble seeing that that's all the places that need changes
17:35mrbkapfelipe: I wasn't sure if I should define it.
17:35felipewhat will ensure that Nightly users continue to use 4?
17:37mrbkapI knew I was forgetting something.
17:38mrbkapfelipe: My original version of the patch did the right thing and then I changed it :(
17:38* mrbkap fixes.
17:38felipemrbkap: I see that there's still a lot of references to "dom.ipc.processCount" in tests.. Are they fine to remain like that
17:39mrbkapfelipe: yes because setting dom.lipc.processCount overrides the .web version no matter what.
17:39felipeah ok
17:40felipeI guess you'll need to check if that has a user-value too to call that an "opt-out"
17:42mrbkapfelipe: so, here's the idea:
17:43mrbkapfelipe: if dom.ipc.processCount is user-set, that overrides everything.
17:43mrbkapfelipe: if not, and dom.ipc.processCount.web exists, then we use that (part of the experiment)
17:43mrbkapfelipe: otherwise, we fall back to dom.ipc.processCount's value
17:44mrbkapfelipe: which will default to 1 on beta and 4 on nightly.
17:45felipeah, I just saw it being checked in the optedIn() function
17:45* felipe had missed that
17:46felipemrbkap: so .web is only set by the experiment, and then we don't need to define in all.js?
17:46mrbkapfelipe: right.
17:46mrbkapfelipe: So, I shouldn't ever check extensions.e10sMultiBlocksEnabling right?
17:46felipeit's like the .2 for the e10s-single experiment.. I understand it now
17:47felipemrbkap: right. If it's false then e10sMultiBlockedByAddons will never be set anyway
17:47froydnjI have a content process hanging for a full second or so trying to send spellchecking messages
17:47froydnjbut no other process appears to be processing the spellchecking message
17:47froydnjis that expected?
17:47felipemrbkap: also,
17:47mrbkapfroydnj: kan-ru is working on fixing that.
17:48mrbkapfroydnj: I assume you recently gave the window focus again?
17:48felipemrbkap: you either want to replace this, or add Services.appinfo.maxWebProcessCount as another thing being recorded
17:48froydnjmrbkap: which part? the hanging? or the bit not showing up in the profiler?
17:48mrbkapfroydnj: the hanging
17:48mrbkapfelipe: oh
17:48* mrbkap does that
17:48mrbkapfelipe: so far I have
17:49felipemrbkap: what's this aCheckEnabledPrefs?
17:49mrbkapfelipe: do you know if that does the right thing with undefined pref values?
17:49* felipe didn't understand
17:50mrbkapfelipe: I'm trying to avoid duplicating the code that checks dom.ipc.processCount.web and if that doesn't exist then checking dom.ipc.processCount.
17:51mrbkapfelipe: right now ContentParent::GetMaxProcessCount calls mozilla::GetMaxWebProcessCount.
17:51mrbkapfelipe: I want to let GetMaxWebProcessCount call back into ContentParent in the default case.
17:51mrbkapfelipe: but need to avoid infinite recursion.
17:51mrbkapfelipe: so the easiest way to do that is to add a flag
17:54felipemrbkap: hmm, and why this wasn't part of the patch before?
17:54felipe(trying to understand if it was due to something we talked about here)
17:55mrbkapfelipe: your question about making sure that nightly users continue to get 4 processes made me realize.
17:55mrbkapfelipe: the patch in the bug only returns dom.ipc.processCount if it's user-set.
17:55felipeah! ok
17:56felipemakes sense
17:56mrbkapfelipe: so nightly users would get 1
17:56mrbkapfelipe: Now, we'll check ...web and fall back to the default .processCount value.
17:56felipemrbkap: also, I just noticed that, contrary to what I commented, the existing code _does_ check both e10sBlocksEnabling and e10sBlockedByAddons.. I'm trying to dig history here to see if it was a mistake
17:56mrbkapfelipe: I swear this is the most complicated patch I've written at Mozilla.
17:57felipeyeah, it looked so simple on the beginning
17:57felipebut, so many conditions
17:58mrbkapfelipe: why is processCount marked as requiresRestart in TelemetryEnvironment.jsm?
18:01felipemrbkap: apparently it required a restart in the past?
18:01firebotBug 1185061 FIXED, Report dom.ipc.processCount to Telemetry
18:01felipeor so people thought
18:02felipe(thank god for searchfox's "show earliest version / show latest version")
18:05mrbkapfelipe: "oops" :)
18:06mrbkapfelipe: I don't understand the e10sMultiBlocksEnabling pref :(
18:07felipemrbkap: sorry to go back on the suggestion there.. I'm pretty sure what I said is ok, but to be safe here, leave that check for e10sMultiBlocksEnabling... But change it to "&&"
18:09mrbkapfelipe: So, here's the interdiff as a result of our conversation...
18:12mrbkapfelipe: I'll have new patches in the bug in ~5 minutes.
18:12felipemrbkap: alright.. I was expecting the TelemetryEnvironment patch to record the final value as computed by your Services.appinfo.maxWebProcessCount
18:12felipeLike it's stored here:
18:13mrbkapfelipe: oh, I'll add that in addition.
18:13felipeSo we don't need to run the logic when doing that analysis.. That will tell the final answer "how many content processes are being used"
18:13felipes/that analysis/data analysis/
18:13felipeI'll wait for the changes on the bug
18:13felipebut it looks good
18:46felipemrbkap: hey, looking at the new patch that changed GetMaxWebProcessCount
18:46felipethe past version was so much easier to understand without the recursion.. I believe it wouldn't be hard to adapt the previous code to just check hasUserPrefValue on dom.ipc.processCount and do the logic right there, even if there's some duplication
18:50felipemrbkap: something like this:
19:04mrbkapfelipe: ah ok
19:06mrbkapfelipe: but with && instead of || for the two addon-related prefs, no?
19:06felipeah, yes, I just copied from the previous version
19:07felipedoes that cover everything?
19:07felipeIt turned out a simpler than I expected, so I'm not sure if it's missing anything
19:09mrbkapfelipe: I think it does, yes.
19:22mrbkapfelipe: I don't understand your last comment about the telemetry ping split.
19:23felipemrbkap: all those watched prefs trigger a "stop accumulating data on this ping, and start a fresh new one with the new pref values"..
19:24felipesince the e10srollout will be changing that pref, it's just to avoid it
19:24mrbkapfelipe: ok
19:24felipeunless you really want to record it
19:24mrbkapfelipe: and just to be clear, it's just the .web version that we shouldn't record.
19:25feliperight.. (I guess the other one could be dropped too, but since it's already there..)
19:25felipethe other one has a meaning if it has a different value (the user opted in)
19:25felipethe ".web" one doesn't, it's just stored state
19:25mrbkapfelipe: new changes are up.
19:26mrbkapfelipe: I think without the cohort per bucket it was useful.
19:26mrbkapfelipe: but now it's redundant with the cohort.
19:34felipemrbkap: done!
21:55mrbkapmconley: ping?
21:55mconleymrbkap: pong
21:56mrbkapmconley: Hey, did you mean to needinfo yourself on bug 1350324? I was going to update the assignee and I can clear it if that was a click-o.
21:56firebot FIXED, Tabs permanently go into 'busy' state (grey spinner on the tab in the tabstrip) when dragged to anot
21:56mrbkapmconley: and thanks so much for finding that!
21:56mconleymrbkap: ah yeah, I meant to set that needinfo. I want to verify in tomorrow's Nightly
21:57mconleymrbkap: *blushes*
22:11billmdvander: ping
22:49dvanderbillm: pong
22:49billmdvander: hey, I was wondering why we do EnsureGPUReady all the time before sending messages to the GPU process. is that necessary?
22:50billmdvander: I'm kinda hoping it's not. it seems like we open the channel before starting to launch the process.
22:50dvanderbillm: we have to wait for an ACK from the gpu process before we start painting, pretty much any message indicates we want to paint though maybe some aren't
22:51dvanderbillm: what message areyou looking at?
22:51billmdvander: oh, so it's a GPU-specific thing?
22:51billmdvander: I'm hoping to do this for content processes, so we can launch them async.
22:52billmdvander: bug 1348361
22:52firebot ASSIGNED, sync IPC when launching a child process
22:54dvanderbillm: it might not be strictly necessary, i'd have to audit it. but something has to be sync somewhere and that was the easiest way to do it, while letting the actual process launch be async
22:54dvandermaybe it's enough that PLayerTransaction's constructor is sync
22:55billmdvander: ok, I wasn't really specifically asking about the GPU stuff. I was just wondering if there was something I hadn't thought of. thanks.
22:56dvanderbillm: ah okay. i dont think so then
19 Apr 2017
No messages
Last message: 181 days and 22 hours ago