
I know that modern CPUs do out-of-order (OoO) execution and have advanced branch predictors that can mispredict. How does a debugger deal with that? If the CPU mispredicts a branch, how does the debugger know? I don't know whether debuggers execute instructions in a simulated environment or something like that.

1 Answer


Debuggers don't have to deal with it; those effects aren't architecturally visible, so everything (including debug breakpoints triggering) happens as if instructions had executed one at a time, in program order. Anything else would break single-threaded code; CPUs don't just arbitrarily shuffle your program!

CPUs support precise exceptions, so they can always recover the correct, consistent architectural state whenever they hit a breakpoint or an unexpected fault.

See also Modern Microprocessors: A 90-Minute Guide!


If you want to know how often the CPU mispredicted branches, you need to use its own hardware performance counters, which can see and record the internals of execution. (Software programs these counters and can later read back the count, or have them record an event or fire an interrupt when a counter overflows.) For example, Linux perf stat counts branches and branch-misses by default.
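(Not from the original answer, just an illustration.) Here's a minimal Linux sketch of programming one such counter directly through the perf_event_open system call and reading back a branch-miss count around a toy loop. The workload and the trimmed error handling are arbitrary choices for the sketch; perf stat does this kind of setup for you from the command line.

    #include <linux/perf_event.h>
    #include <sys/syscall.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        struct perf_event_attr attr;
        memset(&attr, 0, sizeof(attr));
        attr.size = sizeof(attr);
        attr.type = PERF_TYPE_HARDWARE;
        attr.config = PERF_COUNT_HW_BRANCH_MISSES;  /* generic branch-miss event */
        attr.disabled = 1;                          /* start the counter stopped */
        attr.exclude_kernel = 1;
        attr.exclude_hv = 1;

        /* pid = 0: this process; cpu = -1: any CPU; no group, no flags */
        int fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
        if (fd < 0) { perror("perf_event_open"); return 1; }

        ioctl(fd, PERF_EVENT_IOC_RESET, 0);
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

        /* Toy workload: a branch on random data that the predictor can't learn */
        volatile uint64_t sum = 0;
        for (int i = 0; i < 1000000; i++)
            if (rand() & 1)
                sum += i;

        ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

        uint64_t misses = 0;
        read(fd, &misses, sizeof(misses));          /* read back the miss count */
        printf("branch misses: %llu\n", (unsigned long long)misses);
        return 0;
    }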

(On Skylake, for example, that generic event probably maps to br_misp_retired.all_branches, which counts how many branch instructions eventually retired that were at some point mispredicted. So it doesn't count cases where the CPU detected mis-prediction of a branch that was itself only reached in the shadow of some other misprediction (of a branch or of a fault), because such a branch wouldn't make it to retirement. Events like int_misc.clear_resteer_cycles or int_misc.recovery_cycles can count front-end cycles lost to such things.)


For more about OoO exec, see the Modern Microprocessors guide linked above.


Comments

I read in a book called Coders at Work that Jamie Zawinski once faced a problem in GDB due to branch prediction, because it was on a machine with speculative execution and GDB only supported branch-always-taken. I wanted to know how modern debuggers fixed that. You may assume that I know computer architecture well (OoO exec, multi-core, speculative execution, etc.)
@AhmedEhab: That doesn't make a lot of sense to me. Possibly there was some ancient machine without precise exceptions, where debug traps could happen spuriously if you had a breakpoint that wasn't along the true path of execution? If you know computer architecture, it should be obvious that precise exceptions make it a non-issue that debuggers don't have to worry about. (Neither do OS implementations of the APIs debuggers use, like ptrace). Just like page-faults, debug traps only happen when they would have if the machine executed 1 instruction at a time in program order.
@AhmedEhab: Any side-effects can't be made visible until the CPU knows that they're on the correct path of execution. That includes storing to memory (handled by the store buffer) and traps to exception handlers (handled by in-order retirement for precise exceptions). Just like the kernel's page-fault handler can't run with an address resulting from mis-speculation, its debug-exception handler can't run with the wrong program counter or other architectural state due to mis-speculation. A CPU won't act on an int3 instruction unless/until it reaches retirement, i.e. when it's non-speculative.
@DanielNitzan: Thanks for the followup link; the comments under my answer there include an example: resuming correctly after a page-fault wouldn't be possible if a later inc eax might or might not have already updated the architectural state. It could end up getting done twice if it committed before an earlier faulting store, and then got decoded and executed again after return from a page-fault handler. The same thing could happen with other exceptions that we don't normally care about resuming from, like integer-division faults, but debuggers could see an architectural state with later insns already done.
@DanielNitzan: Actual HW or SW breakpoints would presumably stop any later instructions from reaching the back-end, since they'll trap in the front-end. And single-stepping is similar: the front-end will only decode and issue one instruction, on ISAs like x86 that have HW support for single-stepping.
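(Not part of the original comment thread, but to make the single-stepping mechanism concrete, here's a minimal Linux/x86-64 sketch using ptrace: each PTRACE_SINGLESTEP sets the trap flag, so the CPU delivers a debug exception only after exactly one instruction has retired, in program order. The /bin/true target and the step count of 5 are arbitrary choices for the sketch.)

    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <sys/user.h>
    #include <unistd.h>
    #include <signal.h>
    #include <stdio.h>

    int main(void) {
        pid_t child = fork();
        if (child == 0) {
            ptrace(PTRACE_TRACEME, 0, NULL, NULL);  /* let the parent trace us */
            execl("/bin/true", "true", (char *)NULL);
            _exit(127);
        }

        int status;
        waitpid(child, &status, 0);                 /* child stops at exec */

        /* Step a few instructions: each debug trap arrives only after one
           instruction has retired, so RIP always reflects program order. */
        for (int i = 0; i < 5 && WIFSTOPPED(status); i++) {
            struct user_regs_struct regs;
            ptrace(PTRACE_GETREGS, child, NULL, &regs);
            printf("rip = %llx\n", (unsigned long long)regs.rip);
            ptrace(PTRACE_SINGLESTEP, child, NULL, NULL);
            waitpid(child, &status, 0);
        }

        kill(child, SIGKILL);
        waitpid(child, NULL, 0);
        return 0;
    }

This is the kind of interface a debugger like GDB sits on top of; the CPU's speculation and out-of-order machinery never show through it.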
