3
    typedef union
    {
        unsigned long anyvariable;
        float         output;
    } map;

    return (*(map*)&fraction32).output;

I copied this method of accessing the members of a union that has no physical existence (no object of the union type is ever declared) from a very experienced programmer on the Microchip forum. fraction32 is a previously defined variable; anyvariable is a placeholder.

Could someone explain how this can work when the union does not have an instance of the type? How does the actual variable, fraction32, get mapped into the union?

6
  • It's a way to do type punning through a union. (map*) is a cast to convert fraction32, which is presumably a long, to a float. I'm refraining from answering because it's right on the edge of undefined behavior, and there are better contributors here to evaluate that. The current top answer to this question states this is a violation of strict aliasing. Commented yesterday
  • @AndrewHenle: That answer does not apply. Accessing an unsigned long through a float type would violate C’s aliasing rules. However, the code in this question accesses an unsigned long through a union, and it is a union that contains an unsigned long member, and that is specifically allowed in C’s aliasing rules. Commented yesterday
  • @Cosmo Little, Hmm, I'd consider return ((map){fraction32}).output; instead. (use a compound literal) Commented yesterday
  • Regarding "defined as a type": that's not what typedef does. Union types are types, regardless of typedef. What typedef does is define an alias for a type's name, including if the type does not have a name of its own. Commented yesterday
  • "fraction32 is a previously defined variable" Previously defined as what? The question cannot be answered without knowing the type of that variable. Commented 23 hours ago

2 Answers

7

How does the actual variable, fraction32, get mapped into the union?

fraction32 is (presumably; its definition is not shown in the question) an object in memory of type unsigned long, and unsigned long is hopefully the same size as float in this C implementation.

&fraction32 takes the address of that object. The result is a pointer of type unsigned long *.

(map *) &fraction32 converts that pointer to a pointer of type map *. In most C implementations, the resulting pointer points to the same memory. However, this is not required by the C standard,1 so the code is not strictly conforming.

Since the type of (map *) &fraction32 is map *, * (map *) &fraction32 says to access that memory as if it were a map object. Thus, if the pointer does point to the memory of fraction32, the bytes in that memory are interpreted as if they were a map object.2

(* (map *) &fraction32).output says to take the output member of that union. In C, reading any member of a union reinterprets the bytes of the union using the type of the member. So this reinterprets the bytes, originally of fraction32, as if they were a float.
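The steps above can be spelled out one at a time. This is only a sketch of the question's expression, not a recommendation: the function name pun_via_pointer_cast is hypothetical, uint32_t stands in for the question's unsigned long so that the sizes match float on hosts where unsigned long is 64 bits, and float is assumed to be IEEE 754 binary32. As noted above, the pointer conversion is not strictly conforming, even though it behaves as described on typical implementations.

```c
#include <stdint.h>

/* uint32_t replaces the question's unsigned long (an assumption made
   here so that both members have the same size as float). */
typedef union {
    uint32_t anyvariable;
    float    output;
} map;

_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helper: the question's one-liner, split into steps. */
float pun_via_pointer_cast(uint32_t fraction32)
{
    uint32_t *p = &fraction32;  /* address of the object, type uint32_t *   */
    map *m = (map *)p;          /* same address, now treated as a map *     */
    return (*m).output;         /* read the bytes through the float member  */
}
```

With these assumptions, pun_via_pointer_cast(0x3F800000u) yields 1.0f, because 0x3F800000 is the binary32 encoding of 1.0.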

Nonetheless, a better way of reinterpreting an unsigned long as a float in C is to use a compound literal:

    return (map) { fraction32 }.output;
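A compound literal creates an unnamed map object whose first member is initialized from the value in braces, so no pointer conversion is involved. The sketch below shows it next to the equivalent version with a named union object. The function names are hypothetical, uint32_t is assumed in place of unsigned long so that the sizes match on hosts with a 64-bit unsigned long, and float is assumed to be IEEE 754 binary32.

```c
#include <stdint.h>

/* uint32_t replaces the question's unsigned long (assumption). */
typedef union {
    uint32_t anyvariable;
    float    output;
} map;

/* Hypothetical helper: builds a temporary, unnamed map whose first
   member (anyvariable) holds fraction32, then reads the float member. */
float pun_via_compound_literal(uint32_t fraction32)
{
    return ((map){ fraction32 }).output;
}

/* Hypothetical helper: the same idea with an ordinary named object. */
float pun_via_named_union(uint32_t fraction32)
{
    map m;
    m.anyvariable = fraction32;  /* store through one member    */
    return m.output;             /* read through the other one  */
}
```

Both functions return 2.0f for the argument 0x40000000u under the stated assumptions. The named-object form may be easier to read for anyone unfamiliar with compound literals.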

Footnotes

1 There are two problems with converting an unsigned long * to map *. First, the unsigned long might not have the alignment required for a map (though this is unlikely in typical C implementations). Second, the C standard does not specify what value results from this conversion, except that it can be converted back to the original type to produce the original pointer or an equivalent one. There is an argument that the compiler must generate code that produces the desired value, because an unsigned long * in general could be a pointer to a member of a map, and converting a pointer to a member of a union to a pointer to the union type is specified to produce a pointer to the union. But that argument does not cover the case where the compiler can see that the unsigned long * does not point to an actual member of a union.

2 Some people might say that this access to the union is undefined behavior due to C’s aliasing rules. However, accessing the unsigned long named fraction32 using map conforms to the aliasing rules. C 2024 6.5 says that an object may be accessed by “an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),” and map is a union type that includes unsigned long among its members.3 If the code had been * (float *) &fraction32, then the aliasing rules would be violated.

3 It should be noted that the C standard is unclear on whether a.b is an access to the union a, after which b is extracted, or a direct access to the member b. If it is the former, the aliasing rules are satisfied. If it is the latter, the aliasing rules are violated. However, the latter would also mean that most reads of union members violate the aliasing rules, and that is clearly not the intent of the standard, since footnote 93 in C 2024 6.5.3.4 indicates that reinterpreting the bytes of a union using any member type is intended.


10 Comments

The cast to map * does have the potential to violate any implementation-specific alignment requirements for float.
I'm not following "the latter would also mean most reads of union members violate aliasing rules". If you have an object of union type, the SAR does not forbid accessing it via an lvalue of its own type. I don't see how the result being the left operand of a member-selection operator factors in, or how that presents any aliasing issues for the ensuing member-selection operation.
Suppose u.f is an access to the member f, rather than an access to the union followed by an extraction of f. Consider typedef union { int i; float f; } U; U u = { 0 }; return u.f;. The effective type of the memory of u is U, since it was declared as U. Then u.f accesses an object with declared type U with type float, which is not a case that conforms to the aliasing rules.
I agree that that is surely not the intended interpretation of the spec. I disagree that considering u.f to comprise an access to u followed by an access to its member f demands that reading. When structures and (especially) unions are involved, you cannot say "the effective type of the memory". The spec says the effective type of an object, and objects of different types can overlap in memory.
There remain problems with this. Suppose we do float *p = &u.f;. Then return *p; clearly must be a direct access to a member, as the union type is not involved in the *p expression. We could even pass p to an external function that has no knowledge of U and would expect that function to be able to read *p. That implies accesses to union members directly conform to the aliasing rules. But that requires that the memory of a union have multiple declared types; it must have the declared type of each of its members…
… But then a function that receives parameters int *i and float *f cannot be optimized with the assumption that accesses to i[j] and f[k] do not overlap. And that is the whole point of the aliasing rules, to allow compilers to optimize based on the assumption that different types access different memory. So the C standard’s specifications in this regard are deficient. (Which has been discussed previously on Stack Overflow.)
Okay, we can regard the memory of a union as containing several overlapping objects. The standard does say that. But it has the same problem that a function may receive an int *i and a float *f that point to the same memory, and using either conforms to the aliasing rules (since there is an object with declared type the same as the lvalue type), so the compiler cannot optimize based on the assumption that int *i and float *f point to nonoverlapping memory, and then the reason for the aliasing rules is broken.
I take you to mean that two function parameters may point to different members of the same union. Yes, and the SAR does not then forbid accessing either or both, and a compiler that applies type-based aliasing analysis to optimize might thereby produce behavior inconsistent with spec. I've heard people describe this as unions being fundamentally incompatible with optimization. This is addressed in C++ by the concept of the "active" member of a union. In practice, compilers tend to ignore the problem, at least where no union type definition that could account for the aliasing is in scope.
fraction32 is an unsigned long containing the hex value of the float. This came about because I had to change a 64-bit double stored in an unsigned long long into a 32-bit float. (The XC8 compiler supports long long, but not 64-bit double.)
I have never heard of a compound literal. However, I think that I will stick with the simple way of using a union without typedef, i.e. setting one variable to the required value and reading the other. It might use a few bytes more memory, but it is a lot easier to understand. Thanks for all your learned comments.
3

Could someone explain how this can work when the union does not have an instance of the type? How does the actual variable, fraction32, get mapped into the union?

The code reinterprets the first sizeof(float) bytes of the (presumably) unsigned long input as if they were the bytes of a float. Involving the union is a bit of formalism aimed at ensuring that the code has the intended behavior according to the C language specification -- semantically, the bytes are reinterpreted as if they were the bytes of a union, and one of that union's members is read out, but that amounts to the same thing.

You can understand your original as doing the same as this:

    /* requires #include <string.h> for memcpy */
    float result;
    memcpy(&result, &fraction32, sizeof result);  /* copy the bytes of fraction32 */
    return result;

Indeed, a decent modern compiler is likely to produce the same optimized machine code for that as for your version. As such, I would absolutely recommend at least evaluating the memcpy() version. It is clearer, and it does not skirt the edge of behavior defined by the language spec.
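For evaluation, the memcpy() approach can be wrapped in a small self-contained function. This is a minimal sketch: the names bits_to_float and float_to_bits are hypothetical, uint32_t is assumed in place of unsigned long so that source and destination have the same size on every host, and float is assumed to be IEEE 754 binary32.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helper: reinterpret a 32-bit pattern as a float by
   copying its object representation, byte for byte. */
float bits_to_float(uint32_t fraction32)
{
    float result;
    memcpy(&result, &fraction32, sizeof result);
    return result;
}

/* Hypothetical helper: the inverse direction, float to bit pattern. */
uint32_t float_to_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Under these assumptions, bits_to_float(0x3F800000u) yields 1.0f and float_to_bits(1.0f) yields 0x3F800000. No pointer conversion or union is needed, so the questions about alignment and aliasing discussed in this thread do not arise.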
