2 votes
0 answers
280 views

So there is this original question that I assume most C++ developers are familiar with: Why is processing a sorted array faster than processing an unsorted array? Answer: branch prediction. Then I tried ...
OopsUser
  • 4,894
3 votes
0 answers
109 views

I have a performance-critical C++ code base, and I want to improve (or at least measure if it's worth improving) the likelihood that clang assigns to branches, and in general understand what it's ...
meisel
  • 2,605
4 votes
1 answer
114 views

Sometimes we purposefully leave NOPs in a function for later runtime patching. Instead of: .nops 16 Why not: jmp 0f .nops 14 0: Or, if the amount that you need to patch in, varies up to a maximum: ....
Joseph Garvin
2 votes
1 answer
90 views

I have the following code in nanoMIPS: loop: lw $t1, A($t0) lw $t2, B($t0) sub $t3, $t1, $t2 beq $t3, $r0, else sw $t2, A($t0) b end The exercise asks me to implement the not-taken branch prediction ...
papitas
  • 21
0 votes
1 answer
241 views

When learning about the basic 5-stage pipeline processor that does in-order execution, the number of cycles wasted per branch misprediction is a constant when the pipeline is flushed. But what ...
Gehaktmolen
0 votes
0 answers
34 views

There are many questions on finding a GUID in a list, etc., but I could not find any on just determining whether a message was seen before or not. I have an API which receives requests with a ...
pooya13
  • 2,859
0 votes
1 answer
242 views

I have performance-critical code which calculates an inter-atomic force field. It is controlled by variables like bPBC, shifts, doBonds, doPiSigma, doPiPiI, which can be switched on and off by the user, which ...
Prokop Hapala
0 votes
0 answers
52 views

https://developers.google.com/admob/ios/privacy class ViewController: UIViewController { // Use a boolean to initialize the Google Mobile Ads SDK and load ads once. private var ...
Gargo
  • 1,380
0 votes
1 answer
129 views

I am currently implementing selection sort, using the code below and the driver file below to test it. I am trying micro-optimizations to see what speeds it up. public static void ...
Ooh Ben
  • 11
1 vote
4 answers
397 views

I know a little something about branch prediction. This happens at the CPU and has nothing to do with compilation. Although you might be able to tell the compiler if one branch is more likely than the ...
Joel
  • 1,777
1 vote
3 answers
232 views

I have the following logic: struct Range { int start; int end; }; bool prev = false; Range range; std::vector<Range> result; for (int i = 0; i < n; i++) { bool curr = ...; // this is ...
ra1nsq
  • 11
1 vote
1 answer
416 views

EDIT x 2: Added a more comprehensive function which returns an abstract register class: the function outputs a register full of floats. I don't care about the actual length - SSE, AVX... - because Google ...
stuckoverlow
2 votes
1 answer
1k views

On x86-64 (whatever the microarchitecture) and ARM64 devices, how many clock cycles does a mispredicted conditional branch cost? And I suppose I should also ask what the figure is for a successfully ...
Cecil Ward
1 vote
0 answers
184 views

Modern CPUs since at least the 486 ¹) have a tightly-pipelined design, so conditional branches can cause "stalls" in which the pipeline has to be flushed and the code restarted on a ...
Coder
  • 247
1 vote
0 answers
280 views

For branch prediction, the BHT (branch history table) is indexed by the branch's virtual address. An aliasing problem happens when two or more branches hash to the same entry in the BHT, ...
Changbin Du
3 votes
0 answers
180 views

I am currently looking for answers to why gcc generates strange instructions like "rep ret" in the generated assembly code. I came across a question on Stack Overflow where someone raised a ...
Michael Coleman
0 votes
0 answers
36 views

If the CPU is already speculatively executing the path of a branch A, will it continue to speculatively execute the next branch B, or wait until branch A retires? if (A) { /* body of branch A */ if(B) { ...
Changbin Du
3 votes
0 answers
160 views

I have an AVL tree and I need to traverse it in ascending and descending order. I implemented a simple algorithm, where knowing the tree size in advance, I allocate an array and assign 0 to a counter, ...
Serge Rogatch
3 votes
1 answer
234 views

I know that most modern processors maintain a branch prediction table (BPT). I have read the gdb documentation, but I could not find any command that gives the desired results. Based on this, I have ...
Taimoor Zaeem
-1 votes
1 answer
478 views

In the Go standard package file src/sync/once.go, a recent revision changed the snippet if atomic.LoadUint32(&o.done) == 1 { return } //otherwise ... to: //if atomic.LoadUint32(&o.done) == ...
agnes
  • 21
0 votes
1 answer
533 views

I know that modern CPUs do OoO execution and have advanced branch predictors that may fail. How does the debugger deal with that? So, if the CPU fails in predicting a branch, how does the debugger know ...
Ahmed Ehab
0 votes
3 answers
545 views

Here is some C++ pseudo-code as an example: bool importantFlag = false; for (SomeObject obj : arr) { if (obj.someBool) { importantFlag = true; } obj.doSomethingUnrelated(); } ...
Greg
  • 63
5 votes
0 answers
150 views

Consider this code: .globl _non_tail, _tail .text .code32 _non_tail: lcall $0x33, $_non_tail.heavensgate ret .code64 _non_tail.heavensgate: # do stuff. there's 12 bytes on the stack ...
Joseph Sible-Reinstate Monica
0 votes
0 answers
330 views

So I have this code snippet in C int unit_test_case08(int a, int b) { int success = 1336; if(a != b) { success = 1337; } else { success = -1; } return ...
BBBBBBBBBBBBBBBBBBBBBBBBB
0 votes
0 answers
49 views

I executed the code from this famous topic, Why is processing a sorted array faster than processing an unsorted array?, on macOS Mojave: //file test.cpp #include <algorithm> #include <ctime>...
Mikhail Genkin
