Open Bug 1541116 Opened 6 years ago Updated 1 year ago

Consider an ASAN-nightly like effort for the JIT

Categories

(Core :: JavaScript Engine, task, P3)

Tracking

()

Tracking Status
firefox68 --- affected

People

(Reporter: Alex_Gaynor, Unassigned)

References

(Blocks 1 open bug)

Details

Recording an issue for an idea we briefly brainstormed on earlier today.

JIT bugs reported through crash-stats are often unactionable. The idea here would be to enable more checks in the JIT (e.g. emitting assertion code for some checks the optimizers removed), so that failures are easier to debug. We could do this with low probability to avoid regressing performance, but "make it up on volume" by rolling it out to larger release channels.
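As a rough illustration of the "low probability, make it up on volume" argument, here is a minimal sketch. It assumes each compilation independently decides whether to emit the extra assertion code; the function names and the probability values are hypothetical and do not come from the JIT.

```python
import random


def emit_checks(rng: random.Random, probability: float) -> bool:
    """Decide, for one compilation, whether to emit the extra assertions.

    Hypothetical helper: a real JIT would make this choice with its own
    random source and configuration.
    """
    return rng.random() < probability


def chance_of_at_least_one_hit(probability: float, compilations: int) -> float:
    """Chance that at least one of `compilations` buggy compilations
    happened to carry the extra assertions, assuming independent sampling."""
    return 1.0 - (1.0 - probability) ** compilations


if __name__ == "__main__":
    rng = random.Random(0)
    sampled = sum(emit_checks(rng, 0.001) for _ in range(100_000))
    print("compilations with checks:", sampled)
    for n in (1, 1_000, 10_000):
        print(n, chance_of_at_least_one_hit(0.001, n))
```

With a sampling rate of 1 in 1000, a single compilation almost never carries the checks, but across ten thousand compilations of the same buggy path it becomes very likely that at least one of them does.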

The problem we face with JIT bugs reported by crash-stats is the lack of context. An assertion message will tell you which kind of error we hit, but while this might be good for categorizing crashes, it would not help identify the source of the issue.

You can report as often as you want that you are dividing by zero or computing something out of bounds; without context we would still be lost, with no idea of what is being executed or how we got there.

Today the only context information we have comes from the code surrounding the program counter. This is hard to interpret, as we have thousands of places which generate JIT code. With luck we can identify idiomatic code (calling convention, register usage patterns, common operations), but these tend to disappear as we optimize the code more.

I once tried to train an automatic reverse-engineering machine, whose goal was to recover Trampoline / RegExp / Baseline / CacheIR / CodeGenerator / Rabaldr / Cranelift stack traces from the code surrounding the program counter. This experiment was able to report out-of-context stack traces for every byte, but hardly managed to produce anything consistent for sequences of bytes. The intent was that ultimately we could plug this tool into crash-stats post-processing.

I discussed this with Calixte, and he suggested some ideas to improve this tool. The idea would be to try to identify sequences which match spread bit patterns. For example, a stack trace could be more likely if we observe the following bit sequence: 110........110.........000........1 (0: expect 0; 1: expect 1; .: expect either 0 or 1). I have not yet tried this approach, but it sounds promising.
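To make the pattern notation concrete, here is a minimal sketch of matching such a spread bit pattern with a mask and an expected value. It is not taken from seqrec; the function names are hypothetical, and it only shows how the 0 / 1 / . notation could be checked against a byte buffer at every bit offset.

```python
from __future__ import annotations


def compile_pattern(pattern: str) -> tuple[int, int, int]:
    """Turn a string of '0', '1' and '.' into (mask, value, width).

    A mask bit is set where the pattern fixes the bit; the value holds
    the expected bit at those fixed positions.
    """
    mask = 0
    value = 0
    for ch in pattern:
        mask <<= 1
        value <<= 1
        if ch == "1":
            mask |= 1
            value |= 1
        elif ch == "0":
            mask |= 1
        elif ch != ".":
            raise ValueError(f"unexpected pattern character: {ch!r}")
    return mask, value, len(pattern)


def find_matches(data: bytes, pattern: str) -> list[int]:
    """Return every bit offset (most significant bit first) where the
    pattern matches inside data."""
    mask, value, width = compile_pattern(pattern)
    total_bits = len(data) * 8
    bits = int.from_bytes(data, "big")
    offsets = []
    for offset in range(total_bits - width + 1):
        shift = total_bits - width - offset
        window = (bits >> shift) & ((1 << width) - 1)
        if window & mask == value:
            offsets.append(offset)
    return offsets


if __name__ == "__main__":
    # The byte 0b11010110 contains "1?1" at bit offsets 1 and 3.
    print(find_matches(bytes([0b11010110]), "1.1"))
```

The scan is linear in the number of bits and compares one window per offset, which is enough to experiment with; scoring how likely a stack trace is given several such matches is a separate question that this sketch does not address.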

(In reply to Nicolas B. Pierron [:nbp] from comment #1)

I once tried to train an automatic reverse-engineering machine

The project is available at https://github.com/nbp/seqrec , feel free to fork.

(In reply to Nicolas B. Pierron [:nbp] from comment #1)

The problem we face with JIT bugs reported by crash-stats is the lack of context. An assertion message will tell you which kind of error we hit, but while this might be good for categorizing crashes, it would not help identify the source of the issue.

It would help because:

  • If we see the same issue happening often on foo.com/bar, we can go test that website ourselves with the assertions turned on unconditionally. In my experience (fixing top crashes based on URL data) there's a good chance we will catch it if there's a real bug. It's not that different from release asserts in C++.

  • We will know where the bugs are, let's say in range analysis or TI. Then we can try to narrow it down or do more targeted fuzzing/refactoring of the affected code.

Now that we have multiple Ion tiers (bug 1382650), we can probably get away with a sampling-based approach for the lowest tier.

Priority: -- → P3
Severity: normal → S3
Blocks: sm-security