3 votes
0 answers
109 views

I have a performance-critical C++ code base, and I want to improve (or at least measure if it's worth improving) the likelihood that clang assigns to branches, and in general understand what it's ...
2 votes
0 answers
61 views

Preparing to make Estrin's method vectorisable, I changed from normal linear indexing of the coefficients to bit-reversed indexing and restricted it to strictly powers of 2. Neither MSVC nor ICX can see how to ...
0 votes
0 answers
129 views

The tiered steps provided by Oracle are: It seems to me that... It'd be a reasonable assumption to think that optimizations should occur with methods in isolation (detached from their call-site context),...
2 votes
0 answers
96 views

I am currently reading through the F# core library source code and stumbled upon a common pattern which made me wonder a little about its performance, and could not find anything about it by a ...
1 vote
0 answers
114 views

I compile my binary like this: go build -gcflags '-N -l', but when I run it in dlv I get Warning: debugging optimized function. My guess is that it's a different host than the build host and the ...
1 vote
0 answers
81 views

I would like to do some (micro)benchmarking in Swift. I have been using package-benchmark for this. It comes with a blackHole helper function that forces the compiler to assume that a variable is read ...
2 votes
0 answers
122 views

This question provides a kind of alternative approach to the issue described in Inefficient Loop Unrolling. But note, this question is absolutely independent of the other one. We have a primitive switch/...
0 votes
0 answers
95 views

Consider the following minimal example: subroutine f(x,a,b,c) integer::x,i,j,k integer,optional::a,b,c ! depending on present arguments, some expressions and variables are redundant i=...
0 votes
0 answers
96 views

Assume the function below. There are two extents that have been picked to be 3 and 17. We wish to vectorize SomeWork (it is in the translation unit and simple). The naive approach I take is to flatten ...
1 vote
0 answers
311 views

I'm testing a project on a signal processing processor (ADSP 21489). I'm using VisualDSP++ 5.0 from Analog Devices as the development software. I use the DMA buffers to send data to a CODEC. The ...
0 votes
0 answers
150 views

I'm learning LLVM IR and noticed some seemingly redundant instructions in the generated code. For example, in the following LLVM IR: define i32 @main() #0 { %1 = alloca i32, align 4 store i32 0, ...
0 votes
0 answers
84 views

If I declare a struct like this: public record struct MyStruct(bool MyValue); and then look at the decompiled source, it looks like this: public struct MyStruct : IEquatable<MyStruct> { [...
1 vote
0 answers
88 views

I'm wondering if the Java compiler will optimize the execution of a for-loop if there are no changes within the for-loop body. For instance, let's suppose I have the following code: double res = 0; for (int i ...
1 vote
0 answers
76 views

In C++, is it possible to somehow pass a constant argument through multiple functions and still use consteval functions for them? The code below does not compile to FieldAsInt because the 'key' ...
2 votes
0 answers
188 views

I'm reading about how to translate out of SSA form from different sources and I'm having a hard time understanding the necessity of the different techniques. First of all, looking at Cytron's SSA paper at section 7 ...
1 vote
0 answers
67 views

Context: I'm trying to write a piece of code in inline assembly, which processes all elements of a "small" array (say ~10 elements) as a fully-unrolled loop. I want to avoid falling into the ...
1 vote
0 answers
144 views

I want to try optimizing this line of code: for i in 0..len { slots[values[i].slot].count += 1; } (both of these lists are extremely long and the same size). I have already optimized it like this (...
3 votes
0 answers
153 views

In general, the program extern int x, y; int main() { return x + N > y; } is optimized into something akin to x + N-1 >= y for some given N. Example below. Am I reading the assembly right? ...
0 votes
0 answers
175 views

My architecture is MicroBlaze, and I'm developing on the Xilinx ZCU102. The C++ version is 17. I built the code on Vitis 2022.2 version. My original code had a .text section of only 60,000 bytes. ...
2 votes
0 answers
288 views

Update 2: You can find the original code below in the GitHub link, if needed. You can also find the complete, exact changes I made to reproduce the problem, along with program logs. But they are in ...
3 votes
0 answers
116 views

GCC/Clang's __attribute__((const)) is a close (but not entirely exact) analogue of MSVC's __declspec(noalias): both express that "a function call doesn't modify or reference visible global state...
1 vote
0 answers
23 views

While debugging in RemixIDE with 'optimize=true', I noticed that the Solidity code address(0xAAAA) generates the following opcodes, which only compute 20 bytes of 1s: PUSH1 0x1 PUSH1 0x1 PUSH1 0xa0 SHL ...
0 votes
0 answers
147 views

I'm currently working on optimizing a kernel, and one of the most time-consuming loops, despite optimization efforts, still accounts for 80% of the benchmark's execution time. The loop's performance ...
4 votes
0 answers
377 views

I'm creating a particle-life simulation in Rust and I'm using Nannou for rendering graphics. Everything seems to work when I run "cargo run" but when I tried doing a "cargo run --...
1 vote
0 answers
125 views

I use tree-sitter to write the parser for comment input. This is the code I wrote for single-line comment parsing: seq( "//", optional(seq($.comment_prefix, optional(/[ ]*/))), ...
