1

I have a variable which is read from my main loop, and is both read and written from an interrupt handler. The interrupt can change the value at any time, so clearly it needs to be volatile in the main loop. But when the interrupt is running, it won't change unexpectedly (both because the interrupt is the only thing that changes it, and because the interrupt cannot be interrupted itself). It's important that the interrupt is fast, and making the variable non-volatile helps with this. In C, what is the best way to handle this situation?

At the moment, the variable is volatile. This works, but the IR handler is painfully slow, probably because quite a few useful compiler optimisations are disabled by the volatile keyword.

I could cast the variable to a non-volatile one in the IR handler. But I think that is officially undefined behaviour. (C standard Annex J.2: An attempt is made to refer to an object defined with a volatile-qualified type through use of an lvalue with non-volatile-qualified type). And I'm not sure exactly what syntax I should use.

I could copy the value from the volatile variable to a local one, then copy back at the end of the IR handler. This feels a bit clunky, especially as there are actually quite a few of these variables.

This feels like a common enough thing that there should be an established good practise, I just don't know what it is.


Example main loop:

 union {float value[8]; uint8_t bytes[32];} uart_out;     

 while (1){
 if (state.new_results) {
    state.new_results = false;
    uart_out.value[0] = cabs(state.v_ref);
    uart_out.value[1] = cabs(state.v_sens);
    uart_out.value[2] = creal(state.v_in);
    uart_out.value[3] = cimag(state.v_in);
    uart_out.value[4] = creal(state.z_ratio);
    uart_out.value[5] = cimag(state.z_ratio);
    uart_out.value[6] = creal(state.z_ratio_filt);
    uart_out.value[7] = cimag(state.z_ratio_filt);
    HAL_UART_Transmit_IT(&huart1, uart_out.bytes , 32);
}

example IR header

#include <complex.h>
struct State_s {
    volatile double complex v_in;
    volatile double complex v_ref;
    volatile double complex v_sens;
    volatile double complex z_ratio;
    volatile double complex z_ratio_filt;
    volatile bool new_results;
};
extern struct State_s state;

example IR handler fragment

state.v_in = new_x + new_y*I; // new_? variables come from ADC via DMA.
// The next line takes 100us if everything is volatile! using -O3.
// z1 and z2 are global complex doubles which change only rarely
state.z_ratio = -state.v_sens / (state.v_in/z1 + state.v_ref/z2);
// More code sets new values for all other complex doubles
// in the struct, based on the new z_ratio. As above they read
// other values from the struct when doing calculations, and it
// is that many-reads-of-volatiles which is probably making it slow

// Write a new value to a buffer which will go to DAC by DMA
state.new_results = true;
19
  • 4
    You'll likely want the ISR to think it can change unexpectedly, or otherwise it might assume "ah that one again, it is already value 5 because I wrote that the last time, no need to update it". It seems that your actual problem is that you do a lot of arithmetic on the variable all over the ISR? Instead of working with a local temporary variable and then when done update the volatile shared one. Is that the case? Please post the code of the ISR and the use in main(). Commented Apr 2 at 13:14
  • 2
    "This feels like a common enough thing that there should be an established good practise" Yes and that is: do not needlessly access volatile-qualified variables (like registers) over and over. Prepare what you want to write, then write it all at once. An access to a volatile variable should ideally just be a single read or a single write, no arithmetic or complex expressions involved. Without seeing your code it's anyone's guess if this is the problem or not though. Commented Apr 2 at 13:16
  • 1
    Also out of curiosity, which core is this for? Needless to say it needs a double precision FPU on-chip or otherwise the project was doomed from the beginning. Maybe a stupid question, but I'm just checking so that you aren't coding this using some bloody Arduino :) Commented Apr 2 at 14:04
  • 2
    Double precision arithmetic will be slow as sin then, as it will get executed using software libs and not necessarily on the FPU. Check the disassembly (and weep, possibly), are there actual floating point instructions or just inline functions getting called? The project should be switched to float if you remain on Cortex M4. Commented Apr 2 at 14:15
  • 1
    "floats became the limiting factor on performance" How so? They can never be slower than doubles. They can only be strange and possibly irrelevant if you have hw support for double anyway. Commented Apr 2 at 14:26

4 Answers 4

6

Ok so there are many things here which need improvement and apart from being slow, this looks like it have lots of race condition bugs all over it too.

  • The ADC interrupt should not result in something that's immediately placed into some double complex mathematical model. Rather, the ISR should notify the ADC driver that a new value is available. Which is just a volatile uint32_t or similar, together with a volatile bool new_value.
  • Only the ADC driver communicates with the ADC interrupt. You may have to temporarily disable the interrupt during variable reads to prevent race conditions, since atomic access isn't guaranteed in this case.
  • Your application logic calls the ADC driver and asks for a new value. Then it gets a plain old uint32_t which is a copy of the last stored volatile value.
  • Now your application logic can start chewing through all manner of slow math, complex numbers etc etc.

Unrelated to that need of complete re-design, making individual struct members const and/or volatile is often questionable and it should usually be the whole struct that got the const or volatile qualifier.

General good practice in embedded real-time systems is also not to hardcopy big chunks of data ever, but especially not from inside an ISR. So rather than passing around this big struct by value all over the place and making copies of it during run-time, you could use two or more pointers to struct and just swapped where they point at. So that the ISR works with one data set and the main program with another, aka double/triple buffering. This also minimizes the time where your race condition protection is up, since swapping pointers goes much faster than swapping whole chunks of memory.

Sign up to request clarification or add additional context in comments.

3 Comments

Just going to add that this is a prototype, and I know it's messy. I didn't think there were any race conditions though. The reason this happens in an IR is that after doing the maths it updates a DAC, and timing for ADC -> maths -> decision -> DAC is very tight. It is at least a DMA interrupt rather than a ADC interrupt, so I'm batching through a few hundred samples at a time, and do have a few hundred us before it will fire again.
@JackB Sounds like you should be using fixed point arithmetic. And if it's a DMA interrupt then it might be ok if there's a real-time guarantee that the buffer is emptied before next interrupt. But that could also mean that the caller code should rather be grabbing the data out of the DMA buffer. And then either work with buffer swapping using a different DMA buffer each time. Or if that's not possible, well then you might just have found a case where a hardcopy would make sense.
The DMA is double buffered, and I've been watching with a scope on a GPIO to make sure the IR handler is done (with some margin for safety) before the DMA gets back to the first buffer again. 32-bit fixed point arithmetic is separately on my todo list.
3

What makes an access volatile is the qualifier on the lvalue, not in the original object definition. C 2024 5.2.2.4 says:

An access to an object through the use of an lvalue of volatile-qualified type is a volatile access.

Further, an unqualified object may be accessed through a qualified lvalue. So, you can define the object without volatile and use it in the interrupt routine:

unsigned foo;

void InterruptRoutine(/* ... */)
{
    // Use foo.
}

and use it with volatile elsewhere:

#define vfoo (* (volatile unsigned *) &foo)

void RegularRoutine(/*...*/)
{
    // Use vfoo.
}

One might even use foo for the macro name:

#define foo (* (volatile unsigned *) &foo)

That works since a macro name that appears in its own replacement list is not replaced again, and it helps guard against inadvertent use of foo in the regular routines.

(This answer only asserts those will work according to C standard. This answer does not speak to whether it is a good idea to do these things.)

6 Comments

"What makes an access volatile is the qualifier on the lvalue, not in the original object definition" This is new in C23 though and there will be many years and TC before C23 reaches embedded systems (if ever). Though most embedded compilers I know of do "de facto" treat a volatile lvalue access the same as an access of a volatile qualified variable. (There's various trash compilers where volatile is just plain broken, however.)
This sounds quite good, but as Lundin says, my MCU manufacturer recommends GCC and -std=gnu11 and I'm not sure how easy it would be to move to a newer standard. If this is documented behaviour in gcc somewhere that would work though.
@JackB gcc supports C23 and answers on SO do assume the latest standard unless you explicitly ask for an older one. The problem is that C23 is full of defects and poorly considered mechanisms, to the point where it must be avoided in production code. And another problem is that gcc past version 8 or so is a complete and utter bugfest, to the point where it must be avoided in production. And back in last stable version 7.x somewhere you only have up to C11 support.
@Lundin It's new in C23, but in practice, implementations have always interpreted it that way (DR 476, N1956), so nothing new, really!
@IanAbbott Rather: volatile has always been implemented in buggy ways so any clarification is welcome. See for example this. I don't recall the relevant Bugzilla thread.
|
1

The interrupt handler fragment you showed writes to a volatile variable and then immediately read back from it (and others) in order to compute a value to be written to yet another volatile.

// Writes to state.v_in.
state.v_in = new_x + new_y*I;
// Reads back from state.v_in (and others):
state.z_ratio = -state.v_sens / (state.v_in/z1 + state.v_ref/z2);

That's costly, of course, since the volatile qualifier requires the values be reloaded from memory every time.

Try computing all of the updated state values in non-volatile variables, and then write to the volatile ones just once at the of the handler:

double complex updated_v_in = new_x + new_y*I;
double complex updated_z_ratio =
    -updated_v_sens / (updated_v_in/z1 + updated_v_ref/z2);
// ...
state.v_in = updated_v_in;
state.z_ratio = updated_z_ratio;
// ...
return;

Another way to do this would be to define the State_s without the volatile qualifiers. Have the mainline code access the state through a volatile pointer

Comments

0

I see OP's approach as weak, unreliable and so suggest a larger change.

Consider an architecture shift

With "I oversample and average", all the ISR should do is get the ADC value, sum the results, get out.

Defer significant data processing to the main loop, especially any floating point math.

// Concept pseudo code

volatile struct sum_count {
  uint32_t sum;
  uint16_t count;
} adc;

// Get in, do little, get out
void ISR_ADC(void) {
  uint16_t sample = ADC_read();
  ADC_Setup_for_next_sample(); 
  adc.sum += sample;
  adc.count++;
}

// Called by the main loop.
sum_count ADC_GetSample(void) {
  sum_count sc;

  // Keep this quick.
  state = save_interrupt_state();
  disable_isr();  // Or disable just the ADC ISR.
  sc = adc;
  adc.sum = 0;
  adc.count = 0;
  restore_interrupt_state(state);

  // Now we can take our time.

  // Maybe do more complex data processing here and instead return another type.

  return sc;
}
 

  

1 Comment

The reason this all happens in the IR handler is because the calculation is used to set the output of a DAC (or rather, refill a DMA buffer for a DAC), and it is time critical. I've updated the example to make that clearer. I guess I made the minimal example too minimal...

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.