1649696 - Linux sandbox disables indirect branch prediction on x64/x86, resulting in a significant slowdown

Reporter

Description

5 years ago

There's a long saga at bug 1649109, the upshot of which is that seccomp by default disables indirect branch prediction as part of Spectre mitigation, resulting in a significant drop in performance for programs that use indirect calls. (This would mean every virtual and indirect call in C++; in my context it also means all indirect calls in WebAssembly.)

We can undo this slowdown by adding SECCOMP_FILTER_FLAG_SPEC_ALLOW to the seccomp flags (see bug 1649109 comment 17 for a small patch). Presumably we need to discuss whether we should do that, as this disables some Spectre mitigations. I observe that neither Windows nor macOS appears to have this mitigation enabled, and that Firefox therefore runs without the mitigation on those platforms. There's probably no good reason why Linux should be different, but that's just my own perspective.

Luke Wagner [:luke]

Comment 1

5 years ago

Once we have Fission, I think we should definitely disable these mitigations (chromium seems to). Until then, I expect they buy us some modest (theoretical, at this point) mitigation value so my inclination is that we shouldn't take action to override this default behavior.

Luke Wagner [:luke]

Comment 2

5 years ago

(FWIW, I filed bug 1649853 to ask whether, by any chance, similar mitigations are being enabled on Windows.)

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Assignee

Comment 3

5 years ago

Chromium turns off all mitigations that were conditional on seccomp but also turns on STIBP explicitly; i.e., they turn off everything except STIBP (which currently means turning off just SSBD), and require site isolation even for that. Their bug for this is https://crbug.com/1029470.

As I understand it, the issue that STIBP protects against is that the branch prediction state is per physical core, and our processes could spy on unrelated tasks that happen to be on the core's other thread. Fission wouldn't help with that, and re-enabling SAB (with real parallelism) would increase the potential for exploitation if I understand correctly.

What we'd need would be a way to prevent sharing cores with other processes, and I think macOS may have something like that, but it didn't exist on Linux as of https://crbug.com/1029470#c32.

(For future reference, if we use SECCOMP_FILTER_FLAG_SPEC_ALLOW, we'll need a fallback if the kernel doesn't support it: the flag was added in 4.17, but the seccomp system call was added in 3.17, almost 4 years earlier. Also, if we need to use a prctl in some capacity, those are per-thread, so that would need to be applied in early startup and preferably before exec (there are some issues with injected libraries creating threads before main), or else we'd need to repurpose this exciting code that currently exists to support pre-3.17 kernels and that I was hoping to get rid of someday.)

See Also: → https://bugs.chromium.org/p/chromium/issues/detail?id=1029470

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Assignee

Comment 4

5 years ago

So this is weird: I can reproduce the result from bug 1649109 comment #0 (using sandbox vs. no sandbox rather than the JS shell), on an i9-7940X with kernel 5.7.6, but it still reproduces even if I boot with nosmt=force or spectre_v2_user=off, both of which disable the use of STIBP and the latter also disables IBPB, at least according to the documentation and what's reported in sysfs. So either the cause isn't STIBP, or there's a kernel bug such that it's still enabled with seccomp even when it shouldn't be.

However, setting SECCOMP_FILTER_FLAG_SPEC_ALLOW does remove the performance difference. I haven't tried manually turning on individual mitigations via prctl yet.

Luke Wagner [:luke]

Comment 5

5 years ago

IIUC, cross-process-cross-hyperthread variant 2 attacks are in the same theoretically-possible-but-practically-extremely-hard category as, e.g., cross-process cache eviction attacks (which browsers also do nothing about). But it's useful to know Chrome seems to have STIBP enabled. It sounds like, after Fission ships, we'll need to look into this a bit more to see if that's still actually the case and how practical STIBP attacks actually are.

Gian-Carlo Pascutto [:gcp]

Updated

5 years ago

Severity: -- → S2

Priority: -- → P2

Gian-Carlo Pascutto [:gcp]

Comment 6

5 years ago

Presumably we need to discuss whether we should do that

I'm going to mark this P5 until there's a decision what to do from the JS side. If you know what you want, we can try to implement it on the sandboxing side.

Flags: needinfo?(luke)

Priority: P2 → P5

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Assignee

Comment 7

5 years ago

(In reply to Jed Davis [:jld] ⟨⏰|UTC-6⟩ ⟦he/him⟧ from comment #3)

What we'd need would be a way to prevent sharing cores with other processes, and I think macOS may have something like that

It does and we use it; see bug 1546544.

Comment 8

5 years ago

The decision to do anything is a ways off (I don't think we should do anything before Fission), so I'll clear the ni? for now and we can keep this on the backlog.

Flags: needinfo?(luke)

Lars T Hansen [:lth]

Reporter

Updated

5 years ago

Depends on: fission

Chris Peterson [:cpeterson]

Updated

5 years ago

Fission Milestone: --- → Future

Lars T Hansen [:lth]

Reporter

Updated

4 years ago

Depends on: 1707955

RaresB

Updated

4 years ago

QA Whiteboard: qa-not-actionable

Gian-Carlo Pascutto [:gcp]

Comment 9

3 years ago

Luke, we shipped Fission, and this bug is in the rather weird S2 but P5 state. Can you provide an update?

Flags: needinfo?(mail)

Gian-Carlo Pascutto [:gcp]

Updated

3 years ago

Flags: needinfo?(mail) → needinfo?(sdetar)

Steven DeTar [:sdetar]

Comment 10

3 years ago

Jan, do you have any thoughts on this bug?

Flags: needinfo?(sdetar) → needinfo?(jdemooij)

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Assignee

Comment 11

3 years ago

Linux appears to have changed its default in commit 2f46993d83ff4abb310ef7b4beced56ba96f0d9d, which shipped in 5.16, and no longer enables SSBD or STIBP automatically when seccomp is used (but an administrator or distro can still choose that mode, and per-thread opt-in via prctl is always possible). Empirically, I can no longer reproduce the performance difference from bug 1649109.

The Linux commit message has a lengthy discussion of their rationale. I'm not completely following the comments there about pid namespaces; we don't currently use them (bug 1151624), but we do block most of the things that a pid could be used for, in particular reading /proc/{victim_pid}/maps to get the target's address space layout, so we might already have reasonable mitigations.

Jan de Mooij [:jandem]

Comment 12

3 years ago

(In reply to Jed Davis [:jld] ⟨⏰|UTC-8⟩ ⟦he/him⟧ from comment #11)

Linux appears to have changed its default in commit 2f46993d83ff4abb310ef7b4beced56ba96f0d9d, which shipped in 5.16, and no longer enables SSBD or STIBP automatically when seccomp is used (but an administrator or distro can still choose that mode, and per-thread opt-in via prctl is always possible). Empirically, I can no longer reproduce the performance difference from bug 1649109.

Great, thanks for looking into this!

I think for now we should keep this bug open and blocked on bug 1707955. Maybe once that bug is fixed we can disable these mitigations on older kernels as well. For now it seems reasonable to rely on the kernel's default, also because we're shipping Fission and we don't have similar mitigations on other platforms as far as I know.

Flags: needinfo?(jdemooij)

Gian-Carlo Pascutto [:gcp]

Comment 13

3 years ago

Lowering severity because of comment 11.

Severity: S2 → S3

Ted Campbell [:tcampbell]

Updated

3 years ago

Blocks: speedometer3

Olli Pettay [:smaug][bugs@pettay.fi]

Updated

3 years ago

No longer blocks: speedometer3

Ryan Hunt [:rhunt]

Comment 14

2 years ago

Now that bug 1837602 is resolved, do you think it makes sense to fix this for older kernels?

Flags: needinfo?(jdemooij)

Jan de Mooij [:jandem]

Comment 15

2 years ago

(In reply to Ryan Hunt [:rhunt] from comment #14)

Now that bug 1837602 is resolved, do you think it makes sense to fix this for older kernels?

Yeah, I think it makes sense to fix this performance cliff with older kernels now that we've disabled Spectre mitigations. Ubuntu 22.04 LTS still ships 5.15 AFAICT so this still affects a lot of users.

jld, what do you think?

Flags: needinfo?(jdemooij) → needinfo?(jld)

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Assignee

Comment 16

2 years ago

(In reply to Jan de Mooij [:jandem] from comment #15)

Ubuntu 22.04 LTS still ships 5.15 AFAICT so this still affects a lot of users.

jld, what do you think?

I did a little investigating, and unfortunately the key word there is “Ubuntu”: they moved everyone to using Firefox as a Snap package, and Snap applies its own seccomp-bpf sandbox, which doesn't use SECCOMP_FILTER_FLAG_SPEC_ALLOW. This sets the speculations STORE_BYPASS and INDIRECT_BRANCH to “force disabled” status, meaning that the mitigations are applied and can't be turned back off by that process or its descendants. (This is also the case for Flatpak.)

However, Ubuntu LTS does upgrade kernels to newer branches, for hardware support, at least until the next LTS is released. In the past the situation was somewhat complicated and it was difficult to map from Ubuntu version to kernel version, but Ubuntu's current documentation about this indicates that desktop installs (but not server) now default to switching to new kernel versions as part of regular OS updates. So, a typical 22.04 LTS desktop system should be on 6.2.x by now, even if it was originally installed from the first 22.04.0 release, if I understand correctly.

But this might be an issue for Ubuntu 20.04, which is on kernel 5.15, and looks like it will remain there for the rest of its release cycle (until 2025), and defaults to using a normal .deb package (not Snap) for Firefox.

Also, I'm not sure how important this is, and I've only tested it in a VM so far, but it looks like the mitigations now slow down only the call external microbenchmark from the original bug report and not call internal; I don't know if there have been optimizations on the SpiderMonkey side that could have caused that.
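The "force disabled" status mentioned above can be inspected from inside a process with prctl(PR_GET_SPECULATION_CTRL). What follows is only a rough sketch in Python via ctypes, not code from the Firefox sandbox; the constants are taken from <linux/prctl.h>, while the helper names `describe` and `query` are made up for illustration.

```python
import ctypes
import os

# Constants from <linux/prctl.h>.
PR_GET_SPECULATION_CTRL = 52
PR_SPEC_STORE_BYPASS = 0      # speculative store bypass (SSBD mitigation)
PR_SPEC_INDIRECT_BRANCH = 1   # indirect branch speculation (STIBP/IBPB)

# Bits in the value returned by PR_GET_SPECULATION_CTRL.
PR_SPEC_PRCTL = 1 << 0          # per-thread control is available
PR_SPEC_ENABLE = 1 << 1         # speculation enabled, mitigation off
PR_SPEC_DISABLE = 1 << 2        # speculation disabled, mitigation on
PR_SPEC_FORCE_DISABLE = 1 << 3  # like DISABLE, but cannot be undone
PR_SPEC_DISABLE_NOEXEC = 1 << 4 # like DISABLE, cleared on execve


def describe(status):
    """Translate a PR_GET_SPECULATION_CTRL result into a short label.

    "enabled" refers to the speculation feature (mitigation off);
    "disabled" means the mitigation is applied.
    """
    if status == 0:
        return "not affected"
    if status & PR_SPEC_FORCE_DISABLE:
        return "force disabled"
    if status & PR_SPEC_DISABLE_NOEXEC:
        return "disabled until exec"
    if status & PR_SPEC_DISABLE:
        return "disabled"
    if status & PR_SPEC_ENABLE:
        return "enabled"
    return "unknown"


def query(which):
    """Ask the kernel for the calling thread's state (hypothetical helper)."""
    libc = ctypes.CDLL(None, use_errno=True)
    zero = ctypes.c_ulong(0)
    res = libc.prctl(PR_GET_SPECULATION_CTRL, ctypes.c_ulong(which),
                     zero, zero, zero)
    if res < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return res
```

If the description above is right, `describe(query(PR_SPEC_STORE_BYPASS))` run inside a Snap- or Flatpak-confined process on an affected kernel would be expected to report "force disabled"; I haven't verified that on every configuration.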

Flags: needinfo?(jld)

Ryan Hunt [:rhunt]

Comment 17

2 years ago

(In reply to Jed Davis [:jld] ⟨⏰|UTC-8⟩ ⟦he/him⟧ from comment #16)

(In reply to Jan de Mooij [:jandem] from comment #15)

Ubuntu 22.04 LTS still ships 5.15 AFAICT so this still affects a lot of users.

jld, what do you think?

I did a little investigating, and unfortunately the key word there is “Ubuntu”: they moved everyone to using Firefox as a Snap package, and Snap applies its own seccomp-bpf sandbox, which doesn't use SECCOMP_FILTER_FLAG_SPEC_ALLOW. This sets the speculations STORE_BYPASS and INDIRECT_BRANCH to “force disabled” status, meaning that the mitigations are applied and can't be turned back off by that process or its descendants. (This is also the case for Flatpak.)

However, Ubuntu LTS does upgrade kernels to newer branches, for hardware support, at least until the next LTS is released. In the past the situation was somewhat complicated and it was difficult to map from Ubuntu version to kernel version, but Ubuntu's current documentation about this indicates that desktop installs (but not server) now default to switching to new kernel versions as part of regular OS updates. So, a typical 22.04 LTS desktop system should be on 6.2.x by now, even if it was originally installed from the first 22.04.0 release, if I understand correctly.

But this might be an issue for Ubuntu 20.04, which is on kernel 5.15, and looks like it will remain there for the rest of its release cycle (until 2025), and defaults to using a normal .deb package (not Snap) for Firefox.

Also, I'm not sure how important this is, and I've only tested it in a VM so far, but it looks like the mitigations now slow down only the call external microbenchmark from the original bug report and not call internal; I don't know if there have been optimizations on the SpiderMonkey side that could have caused that.

Thanks for looking into this!

Interesting that you're seeing the perf difference primarily on the call external benchmark. Both of them use indirect calls, so I'd expect both to see the slowdown. We did land an optimization a while ago that splits a WebAssembly indirect call into a branch going to one of two different machine-level indirect calls (one restores state for external calls, the other doesn't). That would probably be the cause, but the reason why isn't obvious to me.

I do see that Debian 11 (the previous stable release from 2021) is listed as using the 5.10 kernel series. Not sure if that's significant or not though. Fedora seems to follow the kernel releases pretty closely, so they seem less likely to be behind.

Lars had a pretty simple patch in bug 1649109 comment 17 that he reported would fix the issue. Do you think it's worth trying to land that? If it's not a simple fix, I understand it might not be worth it as data seems to show this has become less severe.

Flags: needinfo?(jld)

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Assignee

Comment 18

2 years ago

Edited

(In reply to Ryan Hunt [:rhunt] from comment #17)

Lars had a pretty simple patch in bug 1649109 comment 17 that he reported would fix the issue. Do you think it's worth trying to land that? If it's not a simple fix, I understand it might not be worth it as data seems to show this has become less severe.

It would need to check for the relevant error (probably EINVAL) and fall back to calling seccomp without that flag, in order not to break kernels from 3.17 until 4.17, but otherwise that could be done.

(Leaving ni? to look at this more next week.)

(Edit, 2024-01-10: fixed kernel version numbers.)
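The fallback described in this comment might look roughly like the sketch below. It is written in Python purely to show the control flow and is not the patch that landed. The flag value comes from <linux/seccomp.h>; `install_filter` and the two fake kernels are hypothetical, and `seccomp_set_filter(flags)` stands in for whatever code performs the real seccomp(SECCOMP_SET_MODE_FILTER, flags, prog) call and raises OSError on failure.

```python
import errno

# Value from <linux/seccomp.h>; the flag is understood by Linux 4.17 and later.
SECCOMP_FILTER_FLAG_SPEC_ALLOW = 1 << 2


def install_filter(seccomp_set_filter, base_flags=0):
    """Install a filter with SPEC_ALLOW, retrying without it on EINVAL.

    Returns the flags that the (real or fake) kernel accepted.
    """
    flags = base_flags | SECCOMP_FILTER_FLAG_SPEC_ALLOW
    try:
        seccomp_set_filter(flags)
        return flags
    except OSError as e:
        if e.errno != errno.EINVAL:
            raise
    # Assumption: EINVAL here means the kernel does not know the flag,
    # so retry with the original flags only. Any other cause of EINVAL
    # will simply fail again and propagate to the caller.
    seccomp_set_filter(base_flags)
    return base_flags


def fake_old_kernel(flags):
    """Stand-in for a 3.17..4.16 kernel: rejects the unknown flag."""
    if flags & SECCOMP_FILTER_FLAG_SPEC_ALLOW:
        raise OSError(errno.EINVAL, "Invalid argument")


def fake_new_kernel(flags):
    """Stand-in for a 4.17+ kernel: accepts the flag."""
    return None
```

Passing the syscall in as a callable keeps the retry logic testable without actually installing a filter in the current process.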

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Assignee

Comment 19

2 years ago

Attached file Bug 1649696 - Use `SECCOMP_FILTER_FLAG_SPEC_ALLOW` where available. — Details

Linux 4.17 applied some Spectre mitigations (SSBD and STIBP) by default
when seccomp-bpf is used, but later Linux 5.16 turned them off with
the rationale that, essentially: the attacks aren't really practical,
there are similar or worse attacks that it doesn't stop, and the
performance cost is significant (STIBP seems to make indirect
branches unpredictable). Given that we support Linux distributions in
the affected range (e.g., Ubuntu 20.04 LTS, but not 22.04), and the
performance hit is very noticeable at least in microbenchmarks, this
patch opts out of the mitigations.

Note that, if a seccomp policy is applied by some external sandbox which
doesn't use this opt-out (e.g., if using Snap or Flatpak), and the
kernel is in that range, these Spectre mitigations will still be applied
and we can't turn them off.
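For diagnostics, the "affected range" in the message above (4.17 up to but not including 5.16) can be checked against a kernel release string. This is a hedged sketch only: the patch itself does not do a version check, distro kernels may carry backports, and an administrator can override the default with boot parameters. Both function names are hypothetical.

```python
import platform
import re


def parse_release(release):
    """Extract (major, minor) from a string like '5.15.0-91-generic'."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if not m:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(m.group(1)), int(m.group(2))


def seccomp_implies_mitigations(release):
    """True if an upstream kernel of this version applies SSBD/STIBP by
    default to seccomp tasks (assumes upstream defaults, no backports)."""
    return (4, 17) <= parse_release(release) < (5, 16)
```

On a running system one could call `seccomp_implies_mitigations(platform.release())`.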

Phabricator Automation

Updated

2 years ago

Assignee: nobody → jld

Status: NEW → ASSIGNED

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Assignee

Comment 20

2 years ago

So, I did look at this some more; see above. Testing it (again) on Ubuntu 20.04 in a VM on a Threadripper and looking more closely at the numbers, I saw a 1.5x slowdown in call internal and 3x in call external; the original bug report on an Intel CPU, and what I recall also seeing on an Intel CPU when I tested a while ago, was about 2x on both. I don't know how interesting that is, but I thought I'd mention it for the record.

Flags: needinfo?(jld)

Pulsebot

Comment 21

2 years ago

Pushed by jedavis@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/ebdd80f675b6 Use `SECCOMP_FILTER_FLAG_SPEC_ALLOW` where available. r=gcp

Natalia Csoregi [:nataliaCs]

Comment 22

2 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/ebdd80f675b6

Status: ASSIGNED → RESOLVED

Closed: 2 years ago

status-firefox123: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 123 Branch

Andrej (:aglavic)

Comment 23

2 years ago

(In reply to Pulsebot from comment #21)

Pushed by jedavis@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ebdd80f675b6
Use SECCOMP_FILTER_FLAG_SPEC_ALLOW where available. r=gcp

We're seeing a number of performance improvements in CI!

== Change summary for alert #41031 (as of Tue, 16 Jan 2024 18:31:43 GMT) ==

Improvements:

Ratio	Test	Platform	Options	Absolute values (old vs new)
11%	perf_reftest_singletons many-custom-props.html	linux1804-64-shippable-qr	e10s fission stylo webrender	342.56 -> 304.58
11%	perf_reftest_singletons slow-selector-2.html	linux1804-64-shippable-qr	e10s fission stylo webrender	0.90 -> 0.80
11%	perf_reftest_singletons slow-selector-1.html	linux1804-64-shippable-qr	e10s fission stylo webrender	0.90 -> 0.80
11%	perf_reftest coalesce-1.html	linux1804-64-shippable-qr	e10s fission stylo webrender	84.17 -> 75.26
10%	perf_reftest_singletons display-none-1.html	linux1804-64-shippable-qr	e10s fission stylo webrender	0.59 -> 0.53
...	...	...	...	...
2%	offscreencanvas_webcodecs_main_2d_h264 offscreencanvas_webcodecs_main_2d_h264 Mean time across 100 frames:	linux1804-64-qr	e10s fission stylo webgl-ipc webrender	12.41 -> 12.15

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=41031

Andra Esanu (needinfo me)

Updated

2 years ago

Keywords: perf-alert

Acasandrei Beatrice (needinfo me)

Comment 25

2 years ago

(In reply to Pulsebot from comment #21)

Pushed by jedavis@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ebdd80f675b6
Use SECCOMP_FILTER_FLAG_SPEC_ALLOW where available. r=gcp

== Change summary for alert #41033 (as of Tue, 16 Jan 2024 13:35:24 GMT) ==

Improvements:

Ratio	Test	Platform	Options	Absolute values (old vs new)	Performance Profiles
11%	youtube FirstVisualChange	linux1804-64-shippable-qr	fission warm webrender	179.97 -> 159.61	Before/After
9%	reddit-billgates-ama.members ContentfulSpeedIndex	linux1804-64-shippable-qr	cold fission webrender	193.45 -> 175.15	Before/After
9%	addkAR1 time_duration	linux1804-64-shippable-qr	fission webrender	50,859.48 -> 46,530.08	Before/After
8%	youtube PerceptualSpeedIndex	linux1804-64-shippable-qr	fission warm webrender	1,067.16 -> 982.94	Before/After
7%	youtube loadtime	linux1804-64-shippable-qr	fission warm webrender	1,274.17 -> 1,184.06	Before/After
...	...	...	...	...	...
2%	nytimes SpeedIndex	linux1804-64-shippable-qr	fission warm webrender	1,093.93 -> 1,070.30	Before/After

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=41033

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Assignee

Comment 26

2 years ago

(In reply to Jed Davis [:jld] ⟨⏰|UTC-8⟩ ⟦he/him⟧ from comment #4)

So this is weird: I can reproduce the result from bug 1649109 […] but it still reproduces even if I boot with nosmt=force or spectre_v2_user=off, both of which disable the use of STIBP and the latter also disables IBPB […].

However, setting SECCOMP_FILTER_FLAG_SPEC_ALLOW does remove the performance difference. I haven't tried manually turning on individual mitigations via prctl yet.

I don't think I ever tested them separately, and I just did. STIBP (PR_SPEC_INDIRECT_BRANCH) actually doesn't change the microbenchmarks of indirect branches; it's SSBD (PR_SPEC_STORE_BYPASS) that reproduces the 2x/3x slowdowns.

The first part of that, counterintuitive as it sounds, makes sense: what I can easily test on right now is an AMD CPU that supports STIBP always-on mode, as well as an Intel CPU that supports Enhanced IBRS (a superset of always-on STIBP, essentially), so the PR_SPEC_INDIRECT_BRANCH mitigation is a no-op because it's already applied. On older CPUs that might cause a performance hit, however.

The second part is a little weirder, but maybe if I saw the generated code for the benchmark it might make more sense; nonetheless, speculative store bypass seems to be of concern only for same-process attacks (like jitcode in a non-Fission process) if I understand correctly, so I don't think we need to care about it.

I don't yet know which of these two is responsible for the performance effects reported in the last few comments, and I could try to find out, but I'm not sure if it matters given that we've already considered using STIBP and decided not to (bug 1841843).
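Turning on one mitigation at a time, as in the test described above, can be done with prctl(PR_SET_SPECULATION_CTRL) before starting the benchmark. The sketch below is one possible way to do that from Python via ctypes and is not necessarily how the measurements in this comment were taken. The constants come from <linux/prctl.h>; `MITIGATIONS`, `libc_prctl`, and `enable_mitigation` are hypothetical names.

```python
import ctypes
import os

# Constants from <linux/prctl.h>.
PR_SET_SPECULATION_CTRL = 53
PR_SPEC_DISABLE = 1 << 2  # disable the speculation feature, i.e. mitigation on

# Short labels for the two controls discussed in this bug.
MITIGATIONS = {
    "ssbd": 0,   # PR_SPEC_STORE_BYPASS
    "stibp": 1,  # PR_SPEC_INDIRECT_BRANCH
}


def libc_prctl(option, arg2, arg3):
    """Thin wrapper over prctl(2); raises OSError if the kernel refuses."""
    libc = ctypes.CDLL(None, use_errno=True)
    zero = ctypes.c_ulong(0)
    res = libc.prctl(option, ctypes.c_ulong(arg2), ctypes.c_ulong(arg3),
                     zero, zero)
    if res != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def enable_mitigation(name, prctl=libc_prctl):
    """Apply a single mitigation to the calling thread.

    The state set with PR_SPEC_DISABLE is inherited across fork and
    execve, so a benchmark started afterwards runs with it applied.
    """
    prctl(PR_SET_SPECULATION_CTRL, MITIGATIONS[name], PR_SPEC_DISABLE)
```

A wrapper script could call `enable_mitigation("ssbd")` and then exec the benchmark binary; the call can fail (for example with ENXIO) when the kernel is not in a mode that allows per-thread control.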

Greg Mierzwinski [:sparky]

Updated

10 months ago