Closed
Bug 1418201
Opened 7 years ago
Closed 7 years ago
Crash: ShmSegmentsWriter failed to allocate chunk #0, Could not create content compositor bridge
Categories
(Core :: Graphics: WebRender, defect, P3)
RESOLVED
DUPLICATE
of bug 1432375
People
(Reporter: jan, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: crash, nightly-community, Whiteboard: [wr-reserve] [gfx-noted])
Attachments
(1 file: 402.01 KB, image/png)
I am unsure whether this is clearly a WebRender bug. WebRender appears in one crash stack, and the patch from WebRender bug 1403539 touched on this topic in some way. However, I suspect OMTP is the cause ("Using observer service off the main thread!"); maybe it doesn't get along with the GPU process? I will disable OMTP now and test this further after some sleep. Is the OMTP pref ultimately a legacy feature that I should avoid? Nothing should disturb WebRender.
Nightly 59 x64 20171116220410 de_DE a3f183201f7f183c263d554bfb15fbf0b0ed2ea4 @ Debian Testing (KDE, Radeon RX480)
main profile: gpu process, layers force accel, webrender, blob-images, image.mem.shared, omtp, stylo-chrome
STR (Only for me. Didn't work in a fresh profile. :/ )
1. Open History Sidebar (Ctrl + H > Recent History)
2. Grab the scrollbar and scroll your very long history very fast up and down until the UI breaks
3. the Sidebar gets white, practically every text is gone, toolbar icons are gone (back, reload etc.)
bp-4d883526-3692-4224-bb5d-5b5b20171117 17.11.17 05:41 [@ InvalidArrayIndex_CRASH | mozilla::dom::ContentChild::RecvReinitRendering ]
> Process Type content (web)
> MOZ_CRASH Reason ElementAt(aIndex = 0, aLength = 0)
> GraphicsCriticalError |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=161.271)
bp-6370bffa-c622-4ee6-a4c5-a929d0171117 17.11.17 05:41 [@ @0x4084a1 ]
> Process Type gpu (web)
> MOZ_CRASH Reason called `Option::unwrap()` on a `None` value <------- I saw this message 4 days ago on Socorro in bug 1416602
The patch from webrender bug 1403539 was about this topic in some way:
> GraphicsCriticalError |[0][GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=5408.58) (etc.)
bp-587b05bc-77ca-45f6-8ae1-e16630171117 17.11.17 05:41 [@ nsObserverService::RemoveObserver ] (bug 1276919?)
contains WebRenderBridgeChild:
> 0 libxul.so nsObserverService::RemoveObserver(nsIObserver*, char const*) [clone .cold.123]
> 1 libxul.so ExpirationTrackerImpl<mozilla::layers::ActiveResource, 3u, detail::PlaceholderLock, detail::PlaceholderAutoLock>::~ExpirationTrackerImpl xpcom/ds/nsExpirationTracker.h:405
> 2 libxul.so mozilla::layers::ActiveResourceTracker::~ActiveResourceTracker xpcom/ds/nsExpirationTracker.h:533
> 3 libxul.so mozilla::layers::WebRenderBridgeChild::~WebRenderBridgeChild mfbt/UniquePtr.h:528
> 4 libxul.so mozilla::layers::WebRenderBridgeChild::Release gfx/layers/wr/WebRenderBridgeChild.cpp:44
> 5 libxul.so mozilla::detail::RunnableMethodImpl<RefPtr<mozilla::MediaFormatReader>, void (mozilla::MediaFormatReader::*)(already_AddRefed<mozilla::layers::KnowsCompositor>), true, (mozilla::RunnableKind)0u, already_AddRefed<mozilla::layers::KnowsCompositor>&&>::Run xpcom/threads/nsThreadUtils.h:1142
> 6 libxul.so mozilla::AutoTaskDispatcher::TaskGroupRunnable::Run xpcom/threads/TaskDispatcher.h:214
about:support
> (#0) Error ShmSegmentsWriter failed to allocate chunk #0
> (#122) Error Receive IPC close with reason=AbnormalShutdown
> (#123) Error Could not create content compositor bridge: 0x80610002 <- I haven't seen this in any of the three reports
> (#124) Error Receive IPC close with reason=AbnormalShutdown
> (#125) Error Receive IPC close with reason=AbnormalShutdown
> (#126) Error Receive IPC close with reason=AbnormalShutdown
> (#127) Error Receive IPC close with reason=AbnormalShutdown
> (#128) Error Receive IPC close with reason=AbnormalShutdown
> (#129) Error Receive IPC close with reason=AbnormalShutdown
> (#130) Error Receive IPC close with reason=AbnormalShutdown
> (#131) Error Receive IPC close with reason=AbnormalShutdown
> (#132) Error Receive IPC close with reason=AbnormalShutdown
> (#133) Error Receive IPC close with reason=AbnormalShutdown
> (#134) Error Receive IPC close with reason=AbnormalShutdown
> (#135) Error Receive IPC close with reason=AbnormalShutdown
> (#136) Error Receive IPC close with reason=AbnormalShutdown
Tried to reproduce my STR, got:
bp-c15c8410-d865-4b4e-9c09-be73d0171117 17.11.17 05:58 [@ InvalidArrayIndex_CRASH | mozilla::dom::ContentChild::RecvReinitRendering ]
> Receive IPC close with reason=AbnormalShutdown
> ElementAt(aIndex = 0, aLength = 0)
bp-baaf1cb8-801f-4596-a7e7-d0a530171117 17.11.17 05:58 [@ nsObserverService::RemoveObserver ]
> MOZ_CRASH(Using observer service off the main thread!)
bp-1d63166f-5ab4-4234-ac9e-3e1d50171117 17.11.17 05:58 [@ InvalidArrayIndex_CRASH | mozilla::dom::ContentChild::RecvReinitRendering ]
> Receive IPC close with reason=AbnormalShutdown
> ElementAt(aIndex = 0, aLength = 0)
bp-2cbee981-9ffa-461a-bb14-891ec0171117 17.11.17 05:58 [@ InvalidArrayIndex_CRASH | mozilla::dom::ContentChild::RecvReinitRendering ]
> ShmSegmentsWriter failed to allocate chunk
Updated•7 years ago
Flags: needinfo?(vliu)
Whiteboard: [gfx-noted]
Updated•7 years ago
Whiteboard: [gfx-noted] → [wr-mvp] [triage] [gfx-noted]
OMTP should not come into play here.
| Reporter | ||
Comment 2•7 years ago
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #0)
> 3. the Sidebar gets white, practically every text is gone, toolbar icons are gone (back, reload etc.)
= Some things disappear immediately; others disappear only when I hover over them.
Updated•7 years ago
Whiteboard: [wr-mvp] [triage] [gfx-noted] → [wr-mvp] [gfx-noted]
| Reporter | ||
Updated•7 years ago
Crash Signature: [@ InvalidArrayIndex_CRASH | mozilla::dom::ContentChild::RecvReinitRendering ]
[@ @0x4084a1 ]
[@ nsObserverService::RemoveObserver ]
See Also: → 1276919
| Reporter | ||
Updated•7 years ago
Crash Signature: [@ InvalidArrayIndex_CRASH | mozilla::dom::ContentChild::RecvReinitRendering ]
[@ @0x4084a1 ]
[@ nsObserverService::RemoveObserver ] → [@ InvalidArrayIndex_CRASH | mozilla::dom::ContentChild::RecvReinitRendering ]
[@ @0x4084a1 ]
| Reporter | ||
Updated•7 years ago
Blocks: wr-stability
Comment 3•7 years ago
It seems that both of the crash signatures above point to something going wrong in the GPU process. Hi David, do you think it is fine to run the GPU process on Linux? Thanks.
Flags: needinfo?(dvander)
| Reporter | ||
Comment 4•7 years ago
So far, the GPU process has run fine and has been essential for stable use of WebRender.
* So far, the only WebRender GPU process bug is bug 1406230.
* If there is a fallback to OpenGL compositing, bug 1415020 can occur (it has clear STR). Such a fallback should be avoided.
* Bug 1415609 was a regression involving blob-images and the GPU process, and it has been fixed.
* Maybe this would also help with proprietary Nvidia driver issues reported by other users?
Updated•7 years ago
Priority: P2 → P3
Whiteboard: [wr-mvp] [gfx-noted] → [wr-reserve] [gfx-noted]
| Reporter | ||
Comment 5•7 years ago
Nightly 59 x64 20171123220110 de_DE 0bb0f14672fdda31c19aea1ed829e050d693b9af @ Debian Testing (KDE, Radeon RX480)
main profile: gpu process, layers force accel, webrender, blob-images, image.mem.shared, omtp, stylo-chrome
(STR from comment 0)
I just pressed Ctrl+H and scrolled my recent history down and up (very fast) by scrollbar.
Report ID Date Submitted
bp-df09dde8-a4b2-4731-986a-9a7bc0171124 24.11.17 04:50 [@ EMPTY: no crashing thread identified; ERROR_NO_THREAD_LIST ]
> MOZ_CRASH Reason ElementAt(aIndex = 1, aLength = 1)
bp-76ece4ec-2caa-4d4c-97f0-730e70171124 24.11.17 04:49 [@ EMPTY: no crashing thread identified; ERROR_NO_THREAD_LIST ]
> MOZ_CRASH Reason MOZ_RELEASE_ASSERT(result.mFd.fd != -1) (DuplicateDescriptor failed)
bp-4a603da7-eeef-40ba-9cdc-07d650171124 24.11.17 04:46 [@ nsObserverService::RemoveObserver ] = bug 1419255
> Process Type content (web)
> MOZ_CRASH Reason MOZ_CRASH(Using observer service off the main thread!)
bp-ce98c36d-88ce-4183-ad0b-6a7270171124 24.11.17 04:46 [@ mozalloc_abort | abort | webrender::resource_cache::{{impl}}::get_font_data ] = bug 1413571
> MOZ_CRASH Reason called `Option::unwrap()` on a `None` value
bp-315b3617-1532-4e8e-b9c5-1ce670171124 24.11.17 04:45 [@ EMPTY: no crashing thread identified; ERROR_NO_THREAD_LIST ]
-- reenabled the gpu process here --
bp-1b22b626-05c2-4155-bdbb-7b5ab0171124 24.11.17 04:43 [@ EMPTY: no crashing thread identified; ERROR_NO_THREAD_LIST ]
> MOZ_CRASH Reason ElementAt(aIndex = 1, aLength = 1)
bp-0175e9a8-0acc-4b8d-8c49-87eed0171124 24.11.17 04:36 [@ EMPTY: no crashing thread identified; ERROR_NO_THREAD_LIST ]
> MOZ_CRASH Reason ElementAt(aIndex = 1, aLength = 1)
bp-3af8eb96-2865-4c01-a86a-0c3c60171124 24.11.17 04:35 [@ EMPTY: no crashing thread identified; ERROR_NO_THREAD_LIST ]
> MOZ_CRASH Reason ElementAt(aIndex = 1, aLength = 1)
-- disabled the gpu process here --
bp-7202a123-c198-4b89-8584-966540171124 24.11.17 04:32 [@ nsObserverService::RemoveObserver ] = bug 1419255
> MOZ_CRASH Reason MOZ_CRASH(Using observer service off the main thread!)
bp-de426b3b-9d81-4ea3-8f17-1d5730171124 24.11.17 04:32 [@ mozilla::ProfilerParent::CreateForProcess ]
bp-a0035082-75a1-42f3-8f20-166e10171124 24.11.17 04:31 [@ InvalidArrayIndex_CRASH | mozilla::dom::ContentChild::RecvReinitRendering ]
> Process Type content (web)
> MOZ_CRASH Reason ElementAt(aIndex = 0, aLength = 0)
--------------------------------------
This seems to be independent of the GPU process.
[@ InvalidArrayIndex_CRASH | mozilla::dom::ContentChild::RecvReinitRendering ]
> MOZ_CRASH Reason ElementAt(aIndex = 0, aLength = 0)
can sometimes show up instead as
[@ EMPTY: no crashing thread identified; ERROR_NO_THREAD_LIST ]
> MOZ_CRASH Reason ElementAt(aIndex = 1, aLength = 1)
--------------------------------------
After following my STR *without* gpu-process + webrender + blob-images, "Failed to lock new back buffer." can appear on about:support (without any crash).
https://dxr.mozilla.org/mozilla-central/source/gfx/layers/client/ContentClient.cpp#263
If we get to ReinitRendering, then that means the GPU process already died once for some reason. In your case, it sounds like the incorrect number of elements is getting passed to the "namespaces" array. I'm not sure how that could happen. Maybe CreateContentBridges failed.
Flags: needinfo?(dvander)
Updated•7 years ago
Flags: needinfo?(vliu)
Comment 7•7 years ago
(In reply to David Anderson [:dvander] from comment #6)
> If we get to ReinitRendering, then that means the GPU process already died
> once for some reason. In your case, it sounds like the incorrect number of
> elements is getting passed to the "namespaces" array. I'm not sure how that
> could happen. Maybe CreateContentBridges failed.
If the parent ever crashes, then it should have annotated the logs with IpcCreateEndpointsNsresult and/or IpcCreateTransportDupErrno. I didn't see any parent crashes with those annotations however. My best stab in the dark is that we are hitting the file handle limit as part of the reinit process. Maybe we need to throttle our recovery. It wouldn't be hard to add a MOZ_RELEASE/DIAGNOSTIC_ASSERT if and only if gfxVars::UseWebRender (or some other pref just in case people hit it a lot) to confirm the error code. I will do this in another bug.
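The file-handle hypothesis can be checked from outside the browser on Linux. The following is a minimal sketch, not part of any Firefox tooling: it counts the entries under `/proc/<pid>/fd` and reads the soft "Max open files" limit from `/proc/<pid>/limits`. The helper names `count_open_fds` and `soft_fd_limit` are made up for illustration, and the PID is assumed to belong to the process you want to watch (for example the parent or GPU process).

```python
import os


def count_open_fds(pid):
    """Count the file descriptors currently open in process `pid` (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def soft_fd_limit(pid):
    """Return the soft 'Max open files' limit of `pid`, or None if unlimited or not found."""
    with open(f"/proc/{pid}/limits") as limits:
        for line in limits:
            if line.startswith("Max open files"):
                # Columns: "Max open files" <soft> <hard> <units>
                soft = line.split()[3]
                return None if soft == "unlimited" else int(soft)
    return None


if __name__ == "__main__":
    pid = os.getpid()  # hypothetical: replace with the PID of the suspected process
    print(pid, count_open_fds(pid), soft_fd_limit(pid))
```

Sampling these two numbers while repeating the STR would show whether the descriptor count climbs toward the limit before the crash.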
Comment 8•7 years ago
I completely missed the critical log (that was even mentioned in the description -- clearly the first bug I've looked at this week ;)). That shows 0x80610002 or NS_ERROR_DUPLICATE_HANDLE. So yeah, I'd say we ran out of handles again.
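For readers who want to see how a raw code like 0x80610002 breaks down, here is a small sketch of the nsresult bit layout as I understand it: the top bit is the severity, the following bits hold the module number plus a base offset, and the low 16 bits are the module-specific code. The offset value 0x45, the 0x1FFF mask, and the function name `decode_nsresult` are assumptions for illustration; the authoritative definitions live in the XPCOM error headers.

```python
MODULE_BASE_OFFSET = 0x45  # assumed value of NS_ERROR_MODULE_BASE_OFFSET


def decode_nsresult(value):
    """Split a 32-bit nsresult into (failed, module, code)."""
    failed = bool(value >> 31)                                  # severity bit
    module = ((value >> 16) - MODULE_BASE_OFFSET) & 0x1FFF      # module number
    code = value & 0xFFFF                                       # module-specific code
    return failed, module, code


print(decode_nsresult(0x80610002))
```

Under these assumptions, 0x80610002 decodes to a failure with module number 28 and code 2.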
| Reporter | ||
Comment 9•7 years ago
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #0)
> STR (Only for me. Didn't work in a fresh profile. :/ )
> 1. Open History Sidebar (Ctrl + H > Recent History)
> 2. Grab the scrollbar and scroll your very long history very fast up and down until the UI breaks
> 3. the Sidebar gets white, practically every text is gone, toolbar icons are gone (back, reload etc.)
Today: I grab the scrollbar and scroll my very very long history sidebar very fast up and down and get a crash.
-----
mozregression --launch 2018-02-07 --profile ~/main-profile-copy --profile-persistence reuse --pref gfx.webrender.all:true image.mem.shared:2
-> Crash.
try build from bug 1432375 comment 16:
mozregression --repo try --launch 4b4caf3bf0a51b7e05370867c7d6494ea71000aa --profile ~/main-profile-copy --profile-persistence reuse --pref gfx.webrender.all:true image.mem.shared:2
-> No crash.
Tested multiple times.
-----
(Andrew Osmond [:aosmond] from comment 8)
> So yeah, I'd say we ran out of handles again.
(Andrew Osmond [:aosmond] from bug 1432375 comment 14)
> Instead shared surfaces caused us to consume too many file handles at once for shared memory due to how it was implemented.
You knew it all the time! :D
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → DUPLICATE
Comment 10•7 years ago
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #9)
>
> You knew it all the time! :D
Indeed, although I apparently lacked the humility to recognize I might have caused the problem in the first place ;). Thanks for your persistence!
| Reporter | ||
Updated•7 years ago
status-firefox57:
unaffected → ---
status-firefox59:
affected → ---