(Update: Problem solved. It all came down to a stupid typo on my part, causing me to write to the wrong part of memory, which in turn caused some pointer to point to someplace that was off limits.)
So, I'm taking a course that involves some programming, and we've essentially been thrown in the deep end of the C pool. I've programmed in other languages before, so it's not all new, but I don't have a solid set of tools to debug my code when the proverbial shit hits the fan.
I had, essentially, the following
    int nParticles = 32;
    int nSteps = 10000;
    double u[nParticles], v[nParticles];

    for (i = 0; i < nSteps; i++) {
        ...
        for (j = 0; j < nParticles; j++) {
            u[j] = 0.001 * v[j];
        }
        ...
    }
as one part of a bigger program, and I was getting a segmentation fault. To pinpoint the problem, I added a bunch of
    printf("foo\n");
and eventually it turned out that I got to step i = 209 and particle j = 31 before the segmentation fault occurred.
After a bit of googling, I realised there's a tool called gdb, and with the extra printfs in there, running bt in gdb tells me that it's now printf that's segfaulting. Keep in mind, though, that I got segfaults before adding the diagnostic printfs as well.
This doesn't make much sense to me. How do I proceed from here?
Update:
valgrind gives me the following
==18267== Invalid read of size 8
==18267== at 0x400EA6: main (in [path redacted])
==18267== Address 0x7ff001000 is not stack'd, malloc'd or (recently) free'd
==18267==
==18267==
==18267== Process terminating with default action of signal 11 (SIGSEGV)
==18267== Access not within mapped region at address 0x7FF001000
==18267== at 0x400EA6: main (in [path redacted])
==18267== If you believe this happened as a result of a stack
==18267== overflow in your program's main thread (unlikely but
==18267== possible), you can try to increase the size of the
==18267== main thread stack using the --main-stacksize= flag.
==18267== The main thread stack size used in this run was 10485760.
==18267==
==18267== HEAP SUMMARY:
==18267== in use at exit: 1,136 bytes in 2 blocks
==18267== total heap usage: 2 allocs, 0 frees, 1,136 bytes allocated
==18267==
==18267== LEAK SUMMARY:
==18267== definitely lost: 0 bytes in 0 blocks
==18267== indirectly lost: 0 bytes in 0 blocks
==18267== possibly lost: 0 bytes in 0 blocks
==18267== still reachable: 1,136 bytes in 2 blocks
==18267== suppressed: 0 bytes in 0 blocks
==18267== Rerun with --leak-check=full to see details of leaked memory
==18267==
==18267== For counts of detected and suppressed errors, rerun with: -v
==18267== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 6 from 6)
Segmentation fault (core dumped)
I don't know what that means.
Update:
I tried commenting out the array assignment that initially caused the segfault. When I do that, but leave most of the diagnostic printfs in, I get a segfault at i = 207 instead.
Update:
Problem solved. In the outer loop (where i is the counter, representing time steps), I had a couple of inner loops (all of which reused j as a counter, iterating over a bunch of particles). In one of the inner loops (not the one that segfaulted, though), I was accidentally assigning values to E[i], where E is an array of size nParticles, so I was running way out of bounds. Fixing this stops the segfault from happening.
So, it all came down to a silly silly typo on my part.
Update:
I spoke to my brother, and he explained the problem in a way that at least satisfies my limited understanding of the situation.
By accidentally writing things in E way beyond the end of that array, I probably overwrote other data sitting next to it on the stack, including stuff associated with my other arrays, and then when I go to access those other arrays, I try to access memory that's not mine, and I get a segfault.
Thank you all so much for helping me out and putting up with my lack of knowledge!