
I know that modern CPUs do out-of-order (OoO) execution and have advanced branch predictors that can mispredict. How does a debugger deal with that? If the CPU mispredicts a branch, how does the debugger know? I don't know whether debuggers execute instructions in a simulated environment or something like that.

1 Answer


Debuggers don't have to deal with it; those effects aren't architecturally visible, so everything (including debug breakpoints triggering) happens as if instructions had executed one at a time, in program order. Anything else would break single-threaded code; CPUs don't just arbitrarily shuffle your program!

CPUs support precise exceptions, so they can always recover the correct, consistent architectural state whenever they hit a breakpoint or an unexpected fault.

See also Modern Microprocessors: A 90-Minute Guide!


If you want to know how often the CPU mispredicted branches, you need to use its own hardware performance counters, which can see and record the internals of execution. (Software programs these counters and can later read back the count, or have them record an event or fire an interrupt when a counter overflows.) For example, Linux perf stat counts branches and branch-misses by default.
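(Not from the original answer, just an illustration.) Here's a minimal Linux sketch of programming one such counter directly through the perf_event_open system call and reading back a branch-miss count around a toy loop. The workload and the trimmed error handling are arbitrary choices for the sketch; perf stat does this kind of setup for you from the command line.

    #include <linux/perf_event.h>
    #include <sys/syscall.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        struct perf_event_attr attr;
        memset(&attr, 0, sizeof(attr));
        attr.size = sizeof(attr);
        attr.type = PERF_TYPE_HARDWARE;
        attr.config = PERF_COUNT_HW_BRANCH_MISSES;  /* generic branch-miss event */
        attr.disabled = 1;                          /* start the counter stopped */
        attr.exclude_kernel = 1;
        attr.exclude_hv = 1;

        /* pid = 0: this process; cpu = -1: any CPU; no group, no flags */
        int fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
        if (fd < 0) { perror("perf_event_open"); return 1; }

        ioctl(fd, PERF_EVENT_IOC_RESET, 0);
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

        /* Toy workload: a branch on random data that the predictor can't learn */
        volatile uint64_t sum = 0;
        for (int i = 0; i < 1000000; i++)
            if (rand() & 1)
                sum += i;

        ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

        uint64_t misses = 0;
        read(fd, &misses, sizeof(misses));          /* read back the miss count */
        printf("branch misses: %llu\n", (unsigned long long)misses);
        return 0;
    }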

(On Skylake, for example, that generic event probably maps to br_misp_retired.all_branches, which counts how many branch instructions eventually retired that were at some point mispredicted. So it doesn't count cases where the CPU detected mis-prediction of a branch that was itself only reached in the shadow of some other misprediction (of a branch or of a fault), because such a branch wouldn't make it to retirement. Events like int_misc.clear_resteer_cycles or int_misc.recovery_cycles can count front-end cycles lost to such things.)


For more about OoO exec, see the Modern Microprocessors guide linked above.


Comments

I read in a book called Coders at Work that Jamie Zawinski once faced a problem in GDB due to branch prediction, because it was on a machine with speculative execution and GDB only supported branch-always-taken. I wanted to know how modern debuggers fixed that. You may assume that I know computer architecture well (OoO exec, multi-core, speculative execution, etc.)
@AhmedEhab: That doesn't make a lot of sense to me. Possibly there was some ancient machine without precise exceptions, where debug traps could happen spuriously if you had a breakpoint that wasn't along the true path of execution? If you know computer architecture, it should be obvious that precise exceptions make it a non-issue that debuggers don't have to worry about. (Neither do OS implementations of the APIs debuggers use, like ptrace). Just like page-faults, debug traps only happen when they would have if the machine executed 1 instruction at a time in program order.
@AhmedEhab: Any side-effects can't be made visible until the CPU knows that they're on the correct path of execution. That includes storing to memory (handled by the store buffer) and traps to exception handlers (handled by in-order retirement for precise exceptions). Just like the kernel's page-fault handler can't run with an address resulting from mis-speculation, its debug-exception handler can't run with the wrong program counter or other architectural state due to mis-speculation. A CPU won't act on an int3 instruction unless/until it reaches retirement, i.e. when it's non-speculative.
@DanielNitzan: Thanks for the followup link; the comments under my answer there include an example: resuming correctly after a page-fault wouldn't be possible if a later inc eax might or might not have already updated the architectural state. It could end up getting done twice if it committed before an earlier faulting store, and then got decoded and executed again after return from a page-fault handler. The same thing could happen with other exceptions that we don't normally care about resuming from, like integer-division faults, but debuggers could see an architectural state with later insns already done.
@DanielNitzan: Actual HW or SW breakpoints would presumably stop any later instructions from reaching the back-end, since they'll trap in the front-end. And single-stepping is similar: the front-end will only decode and issue one instruction, on ISAs like x86 that have HW support for single-stepping.
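(Not part of the original comment thread, but to make the single-stepping mechanism concrete, here's a minimal Linux/x86-64 sketch using ptrace: each PTRACE_SINGLESTEP sets the trap flag, so the CPU delivers a debug exception only after exactly one instruction has retired, in program order. The /bin/true target and the step count of 5 are arbitrary choices for the sketch.)

    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <sys/user.h>
    #include <unistd.h>
    #include <signal.h>
    #include <stdio.h>

    int main(void) {
        pid_t child = fork();
        if (child == 0) {
            ptrace(PTRACE_TRACEME, 0, NULL, NULL);  /* let the parent trace us */
            execl("/bin/true", "true", (char *)NULL);
            _exit(127);
        }

        int status;
        waitpid(child, &status, 0);                 /* child stops at exec */

        /* Step a few instructions: each debug trap arrives only after one
           instruction has retired, so RIP always reflects program order. */
        for (int i = 0; i < 5 && WIFSTOPPED(status); i++) {
            struct user_regs_struct regs;
            ptrace(PTRACE_GETREGS, child, NULL, &regs);
            printf("rip = %llx\n", (unsigned long long)regs.rip);
            ptrace(PTRACE_SINGLESTEP, child, NULL, NULL);
            waitpid(child, &status, 0);
        }

        kill(child, SIGKILL);
        waitpid(child, NULL, 0);
        return 0;
    }

This is the kind of interface a debugger like GDB sits on top of; the CPU's speculation and out-of-order machinery never show through it.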
