Closed Bug 1489189 Opened 7 years ago Closed 7 years ago

Perma tresize | application crashed [unknown top frame]

Categories

(Testing :: Talos, defect, P1)

Version 3
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: intermittent-bug-filer, Assigned: nical)

References

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell disable-recommended])

Filed by: ccoroiu [at] mozilla.com https://treeherder.mozilla.org/logviewer.html#?job_id=197807761&repo=mozilla-central https://queue.taskcluster.net/v1/task/RBcssKhTQhCaTZ9I-XJIYw/runs/0/artifacts/public/logs/live_backing.log 10:55:30 INFO - TEST-INFO | started process 9012 (C:\Users\task_1536230942\build\application\firefox\firefox -profile c:\users\task_1536230942\appdata\local\temp\tmpcdkdwh\profile http://localhost:49818/startup_test/tresize/addon/content/tresize-test.html) 10:56:35 INFO - PID 9012 | [Parent 9012, Gecko_IOThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346 10:58:00 INFO - Timeout waiting for test completion; killing browser... 10:58:00 INFO - mozcrash kill_pid(): wait failed (-1) terminating pid 9012: error 5 10:58:00 INFO - Terminating psutil.Process(pid=9012L, name='firefox.exe', started='10:55:30') 10:58:00 INFO - mozcrash Downloading symbols from: https://queue.taskcluster.net/v1/task/Ui0vVI6WSFGula21pFjucQ/artifacts/public/build/target.crashreporter-symbols.zip 10:58:03 INFO - mozcrash Copy/paste: C:\Users\task_1536230942\build\win32-minidump_stackwalk.exe c:\users\task_1536230942\appdata\local\temp\tmpcdkdwh\profile\minidumps\f1d1b8b8-ed64-4011-a235-347df2afcdf5.dmp c:\users\task_1536230942\appdata\local\temp\tmp4p1foi 10:58:05 INFO - mozcrash Saved minidump as C:\Users\task_1536230942\build\blobber_upload_dir\f1d1b8b8-ed64-4011-a235-347df2afcdf5.dmp 10:58:05 INFO - PROCESS-CRASH | tresize | application crashed [unknown top frame] 10:58:05 INFO - Crash dump filename: c:\users\task_1536230942\appdata\local\temp\tmpcdkdwh\profile\minidumps\f1d1b8b8-ed64-4011-a235-347df2afcdf5.dmp 10:58:05 INFO - Operating system: Windows NT 10:58:05 INFO - 10.0.15063 10:58:05 INFO - CPU: amd64 10:58:05 INFO - family 6 model 94 stepping 3 10:58:05 INFO - 8 CPUs 10:58:05 INFO - GPU: UNKNOWN 10:58:05 INFO - No crash 10:58:05 INFO - Process uptime: 150 seconds 10:58:05 INFO - Thread 0 10:58:05 INFO - 0 ntdll.dll!NtWaitForMultipleObjects + 0x14 10:58:05 INFO - rax = 0x00000000000000e5 rdx = 0x0000027d8dd00000 10:58:05 INFO - rcx = 0x0000000000002000 rbx = 0x0000000000000001 10:58:05 INFO - rsi = 0x0000000000000000 rdi = 0x0000000000000001 10:58:05 INFO - rbp = 0x0000027d8e2bccc0 rsp = 0x000000f1327fe718 10:58:05 INFO - r8 = 0x0000000000020000 r9 = 0xe5e5e5e5e5e5e5e5 10:58:05 INFO - r10 = 0x0000027da1701328 r11 = 0x0000027da05b7000 10:58:05 INFO - r12 = 0x00000000ffffffff r13 = 0x000000f1327fea60 10:58:05 INFO - r14 = 0x0000000000000000 r15 = 0x0000000000000000 10:58:05 INFO - rip = 0x00007ff906df5f04 10:58:05 INFO - Found by: given as instruction pointer in context 10:58:05 INFO - 1 KERNELBASE.dll!SbCleanupTrace + 0x18 10:58:05 INFO - rbx = 0x0000000000000001 rbp = 0x0000027d8e2bccc0 10:58:05 INFO - rsp = 0x000000f1327fe720 r12 = 0x00000000ffffffff 10:58:05 INFO - r13 = 0x000000f1327fea60 r14 = 0x0000000000000000 10:58:05 INFO - r15 = 0x0000000000000000 rip = 0x00007ff90361dd20 10:58:05 INFO - Found by: call frame info 10:58:05 INFO - 2 mozglue.dll!BaseAllocator::malloc(unsigned __int64) [mozjemalloc.cpp:0c947d96e8f3c9f04979975c69aa7d0bcb8a4266 : 4163 + 0x23] 10:58:05 INFO - rbx = 0x0000000000000001 rbp = 0x0000027d8e2bccc0 10:58:05 INFO - rsp = 0x000000f1327fe750 r12 = 0x00000000ffffffff 10:58:05 INFO - r13 = 0x000000f1327fea60 r14 = 0x0000000000000000 10:58:05 INFO - r15 = 0x0000000000000000 rip = 0x00007ff8fceef426 10:58:05 INFO - Found by: call frame info 10:58:05 INFO - Thread 1 10:58:05 INFO - 0 ntdll.dll!NtRemoveIoCompletion + 0x14 10:58:05 INFO - rax = 0x00000000000000e5 rdx = 0x0000027d8dd00000 10:58:05 INFO - rcx = 0x0000000000004000 rbx = 0x000000f132fffb20 10:58:05 INFO - rsi = 0x000000f132fffb20 rdi = 0x000000f132fffb2c 10:58:05 INFO - rbp = 0x0000027da037cda0 rsp = 0x000000f132fffa58 10:58:05 INFO - r8 = 0x000000000000e000 r9 = 0xe5e5e5e5e5e5e5e5 10:58:05 INFO - r10 = 0x0000027da1701328 r11 = 0x0000027da1455000 10:58:05 INFO - r12 = 0x0000027d8de0b3a0 r13 = 0x0000000000000260 10:58:05 INFO - r14 = 0x000000f132fffb2c r15 = 0x000000f132fffb30 10:58:05 INFO - rip = 0x00007ff906df54d4 10:58:05 INFO - Found by: given as instruction pointer in context 10:58:05 INFO - 1 KERNELBASE.dll!LdrResGetRCConfig + 0x3ff 10:58:05 INFO - rbx = 0x000000f132fffb20 rbp = 0x0000027da037cda0 10:58:05 INFO - rsp = 0x000000f132fffa60 r12 = 0x0000027d8de0b3a0 10:58:05 INFO - r13 = 0x0000000000000260 r14 = 0x000000f132fffb2c 10:58:05 INFO - r15 = 0x000000f132fffb30 rip = 0x00007ff903629aff 10:58:05 INFO - Found by: call frame info 10:58:05 INFO - Thread 2 10:58:05 INFO - 0 ntdll.dll!NtWaitForSingleObject + 0x14 10:58:05 INFO - rax = 0x0000000000000000 rdx = 0x0000027d8de35060 10:58:05 INFO - rcx = 0x00007ff8cea605e0 rbx = 0x0000000000000000 10:58:05 INFO - rsi = 0x00000000ffffffff rdi = 0x00000000000002ac 10:58:05 INFO - rbp = 0x0000000000000000 rsp = 0x000000f1333ff638 10:58:05 INFO - r8 = 0x0000000000000000 r9 = 0x0000000000000000 10:58:05 INFO - r10 = 0x0000000000000000 r11 = 0x0000000000000000 10:58:05 INFO - r12 = 0x00007ff905a92080 r13 = 0x0000000000000000 10:58:05 INFO - r14 = 0x0000000000000000 r15 = 0x00007ff905a91e60 10:58:05 INFO - rip = 0x00007ff906df5434 10:58:05 INFO - Found by: given as instruction pointer in context 10:58:05 INFO - 1 KERNELBASE.dll!RtlpHpLargeAlloc + 0x1ab 10:58:05 INFO - rbx = 0x0000000000000000 rbp = 0x0000000000000000 10:58:05 INFO - rsp = 0x000000f1333ff640 r12 = 0x00007ff905a92080 10:58:05 INFO - r13 = 0x0000000000000000 r14 = 0x0000000000000000 10:58:05 INFO - r15 = 0x00007ff905a91e60 rip = 0x00007ff90360988f 10:58:05 INFO - Found by: call frame info 10:58:05 INFO - Thread 3 10:58:05 INFO - 0 ntdll.dll!NtRemoveIoCompletion + 0x14 10:58:05 INFO - rax = 0x0000027d986bd580 rdx = 0x0000000000000000 10:58:05 INFO - rcx = 0x0000027d986b8260 rbx = 0x000000f1335ff718 10:58:05 INFO - rsi = 0x0000027d8de0d4c0 rdi = 0x000000f1335ff710 10:58:05 INFO - rbp = 0x0000000000000000 rsp = 0x000000f1335ff668 10:58:05 INFO - r8 = 0x000000f1335ff668 r9 = 0x0000000000000000 10:58:05 INFO - r10 = 0x0000000000000000 r11 = 0x0000000000000246 10:58:05 INFO - r12 = 0x0000000000000000 r13 = 0x0000000000000000 10:58:05 INFO - r14 = 0x000000f1335ff770 r15 = 0x000000f1335ff778 10:58:05 INFO - rip = 0x00007ff906df54d4 10:58:05 INFO - Found by: given as instruction pointer in context 10:58:05 INFO - 1 KERNELBASE.dll!LdrResGetRCConfig + 0x3ff 10:58:05 INFO - rbx = 0x000000f1335ff718 rbp = 0x0000000000000000 10:58:05 INFO - rsp = 0x000000f1335ff670 r12 = 0x0000000000000000 10:58:05 INFO - r13 = 0x0000000000000000 r14 = 0x000000f1335ff770 10:58:05 INFO - r15 = 0x000000f1335ff778 rip = 0x00007ff903629aff 10:58:05 INFO - Found by: call frame info 10:58:05 INFO - 2 0x3e8 10:58:05 INFO - rbx = 0x000000f1335ff718 rbp = 0x0000000000000000 10:58:05 INFO - rsp = 0x000000f1335ff780 r12 = 0x0000000000000000 10:58:05 INFO - r13 = 0x0000000000000000 r14 = 0x000000f1335ff770 10:58:05 INFO - r15 = 0x000000f1335ff778 rip = 0x00000000000003e8 10:58:05 INFO - Found by: call frame info 10:58:05 INFO - 3 KERNELBASE.dll!ResCKeOpenRuntimeView + 0x104 10:58:05 INFO - rsp = 0x000000f1335ff840 rip = 0x00007ff90362ab9c 10:58:05 INFO - Found by: stack scanning 10:58:05 INFO - 4 0x7ff8cb770000 10:58:05 INFO - rsp = 0x000000f1335ffab0 rip = 0x00007ff8cb770000 10:58:05 INFO - Found by: call frame info 10:58:05 INFO - 5 ntdll.dll!RtlUserThreadStart + 0x21 10:58:05 INFO - rsp = 0x000000f1335ffb00 rip = 0x00007ff906dc0d61 10:58:05 INFO - Found by: stack scanning 10:58:05 INFO - Thread 4 10:58:05 INFO - 0 ntdll.dll!NtWaitForMultipleObjects + 0x14 10:58:05 INFO - rax = 0x0000000000000000 rdx = 0x0000000000000001 10:58:05 INFO - rcx = 0x0000000000000203 rbx = 0x0000000000000002 10:58:05 INFO - rsi = 0x0000000000000000 rdi = 0x0000000000000002 10:58:05 INFO - rbp = 0x00000000ffffffff rsp = 0x000000f1323cec88 10:58:05 INFO - r8 = 0x0000000000000000 r9 = 0x00007ff8d00bc4ee 10:58:05 INFO - r10 = 0x0000000000000015 r11 = 0x0000000000000028 10:58:05 INFO - r12 = 0x00000000ffffffff r13 = 0x000000f1323cf020 10:58:05 INFO - r14 = 0x0000000000000000 r15 = 0x0000000000000000 10:58:05 INFO - rip = 0x00007ff906df5f04 10:58:05 INFO - Found by: given as instruction pointer in context
:rwood could you please take a look?
Flags: needinfo?(rwood)
Summary: Intermittent tresize | application crashed [unknown top frame] → Perma tresize | application crashed [unknown top frame]
Looks like a pretty consistent crash during the 2nd cycle of tresize on Win 10x64 QuantumRender. :jimm, I see you listed as the contact for talos tresize, would you please be able to take a look at this crash (or fw it on to someone)? Thank you!
Flags: needinfo?(rwood) → needinfo?(jmathies)
I think this is a regression from bug 1445570, but the underlying issue is a bug in WebRender. When WebRenderBridgeParent creates a transaction to generate a frame it also records that there is a pending frame due. This code in update_document can stop a frame from being generated without any notification: https://searchfox.org/mozilla-central/source/gfx/webrender/src/render_backend.rs#969 The call below to self.notifier.new_frame_ready is supposed to notify us if a frame got skipped, but doesn't, since render_frame is false and we don't call it. Note that the composite_needed parameter to new_frame_ready will always be true, since it's the same variable as we used for the if condition. I think we need an extra variable to track if we need to call new_frame_ready, which gets set to the value of render_frame, before we take the empty document into account.
Flags: needinfo?(nical.bugzilla)
I think this worked in the past because we delivered MozAfterPaint after 100ms, so the JS test code thought things were still working and it progressed the test, even though the compositor side was locked.
Priority: P5 → P1
Blocks: stage-wr-trains
No longer blocks: stage-wr-nightly
This looks like a race condition, where we trigger a document update in WR before we get a display list. Once it happens, we won't be able to composite at all, forever (except for resizes, since they force a composite, and skip the pending frame count check).
Bug 1489471 seemed to be caused by same webrender change.
See Also: → 1489471
snip from irc. ------------------------------------------------ <sotaro> mattwoodrow, in this case wr_notifier_nop_frame_done() seemed to be called to avoid too may pending frame. Did it called? <mattwoodrow> sotaro: It won’t be called ever, because of this code: https://searchfox.org/mozilla-central/source/gfx/webrender/src/render_backend.rs#1027 <mattwoodrow> We only call new_frame_ready if render_frame is true, and we don’t call nop_frame_done if render_frame is true <sotaro> mattwoodrow, when the problem happen, did WebRenderBridgeParent::MaybeGenerateFrame() get called? WebRender transaction should have render_frame == true. <sotaro> mattwoodrow, the MaybeGenerateFrame() is triggered by CompositorBridgeParent::ScheduleRenderOnCompositorThread() <sotaro> mattwoodrow, it seems like the problem might be caused by recent webrender change. <sotaro> mattwoodrow, like https://github.com/servo/webrender/pull/2998 <sotaro> mattwoodrow, fastTxn.GenerateFrame() in WebRenderBridgeParent::MaybeGenerateFrame() expects that update_document() is called with render_frame = true.
Flags: needinfo?(nical.bugzilla)
This should now be fixed by bug 1489471
Assignee: nobody → nical.bugzilla
Status: NEW → RESOLVED
Closed: 7 years ago
Depends on: 1489471
Resolution: --- → FIXED
Flags: needinfo?(jmathies)
You need to log in before you can comment on or make changes to this bug.