I am trying to understand the meaning of various Intel performance monitoring counters, and I also want to measure load stalls using the counters available under RESOURCE_STALLS.

The following are approximate per-second values of all RESOURCE_STALLS counters for a program running on my system (i.e., INTEL_BROADWELL_XEON):

RESOURCE_STALLS.ANY = 522266857
RESOURCE_STALLS.SB  = 249785706
RESOURCE_STALLS.ROB = 78120602
RESOURCE_STALLS.RS  = 53729085

Questions:

Does RESOURCE_STALLS.SB count store stall cycles?

How can I measure load stalls?

Can we subtract the sum of RESOURCE_STALLS.ROB, RESOURCE_STALLS.SB, and RESOURCE_STALLS.RS from RESOURCE_STALLS.ANY to get the approximate number of cycles spent in load stalls?

Thanks,
TS

2
  • 1
    Stores being slow to commit is only a problem because the store buffer can fill up. Loads being slow to return data is a big problem because later uops usually depend on their results. (So you get the RS filling up, and sometimes the ROB.) The other thing that makes stores special and more worth tracking separately is that stores live in the store buffer after retiring from the ROB (such stores are called "graduated", and only at that point can they be considered for commit to L1d). Commented Oct 12, 2022 at 8:14
  • 1
    I highly doubt you can subtract the sum of other stalls from ANY and get something meaningful. The ROB and/or RS can be full at the same time as the store buffer, and loads not being able to issue from the front-end because you're out of load-buffer entries could happen without the CPU being stalled (which IIRC means no uops dispatched for execution that cycle.) Commented Oct 12, 2022 at 8:19
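
For reference, the subtraction proposed in the question can be worked through with the quoted per-second values. This is only a sketch of the arithmetic: the helper names `stall_residual` and `shares_of_any` are made up for illustration, and, as the comments point out, the sub-events can be active in the same cycle, so the residual should not be read as a load-stall count.

```python
# Per-second counter values quoted in the question (Broadwell Xeon).
counters = {
    "RESOURCE_STALLS.ANY": 522_266_857,
    "RESOURCE_STALLS.SB": 249_785_706,
    "RESOURCE_STALLS.ROB": 78_120_602,
    "RESOURCE_STALLS.RS": 53_729_085,
}

ANY = "RESOURCE_STALLS.ANY"


def stall_residual(c):
    """Return ANY minus the sum of the SB, ROB and RS sub-events.

    Hypothetical helper: it performs the subtraction asked about, nothing more.
    If sub-events overlap in a cycle, that cycle is subtracted more than once.
    """
    sub_total = sum(v for k, v in c.items() if k != ANY)
    return c[ANY] - sub_total


def shares_of_any(c):
    """Return each sub-event as a fraction of RESOURCE_STALLS.ANY."""
    return {k: v / c[ANY] for k, v in c.items() if k != ANY}


residual = stall_residual(counters)
shares = shares_of_any(counters)

print(f"ANY - (SB + ROB + RS) = {residual}")
for name, share in shares.items():
    print(f"{name}: {share:.1%} of ANY")
```

With these numbers the residual is 140631464 cycles per second, and RESOURCE_STALLS.SB alone is a little under half of RESOURCE_STALLS.ANY. Whether any of that residual corresponds to loads is exactly what the comments call into question.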
