
The code:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    union {
        int theInt;
        char theChar;
    } u1;

    u1.theChar = 'A';
    printf("%i\n", u1.theInt);
    printf("%c\n\n", u1.theChar);

    u1.theChar = "A";
    printf("%i\n", u1.theInt);
    printf("%c\n\n", u1.theChar);

}

Gives the output:

65
A

45
-

In the first assignment the char 'A' is assigned, and in the second one the array "A" is assigned. Why do these two assignments lead to different union values?

Ludwig S
  • `u1.theChar = "A";` doesn't compile with a standard compiler, so that would be why. Please study [“Pointer from integer/integer from pointer without a cast” issues](https://stackoverflow.com/questions/52186834/pointer-from-integer-integer-from-pointer-without-a-cast-issues). – Lundin Jan 10 '20 at 14:39
  • Obviously it compiles with the OP's compiler; gcc probably also only emits a warning but still produces a binary (unless using `-Werror`, of course). – Ctx Jan 10 '20 at 14:43
  • "A" is not a char, it's a string. If you want to print "A", use %s for a string in printf. – ROOT Jan 10 '20 at 14:43
  • @Ctx There is no such thing as "only a warning". There's "it compiles cleanly" and there's "it compiles with diagnostic messages". – Lundin Jan 10 '20 at 14:59
  • @underscore_d Well not really since the examples speak of `int` type and not of string literals. Nor does it address the multiple union type punning hiccups present in this code, which is the second major problem. I've posted an answer addressing both problems. – Lundin Jan 10 '20 at 15:00
  • @Lundin Yes, but "doesn't compile" is something else than "compiles with diagnostic messages", isn't that obvious? – Ctx Jan 10 '20 at 15:05
  • @Ctx The real problem is that books/teachers don't tell newbies to always compile with `-std=c11 -pedantic-errors -Wall -Wextra` or equivalent. But the "warning instead of error" disease is just present in the gcc-like compilers. If I compile this in some wildly different but standard compliant compiler like Codewarrior, Keil, IAR etc, I get something like "Error: Type mismatch (expected 'unsigned char ', given 'unsigned char *')". This with default settings. Nobody mentioned gcc, clang & friends. – Lundin Jan 10 '20 at 15:13
  • @Lundin You could say, that the code is not standard compliant, that it invokes UB, or whatever. Claiming, that "it doesn't compile" looks like fighting reality, Mr. Quixote ;) – Ctx Jan 10 '20 at 15:17
  • @Ctx But it _doesn't_ compile on my standard compliant compiler, as proven above. The OP isn't necessarily using a standard C compiler, they could be using VS or some crap - we don't know. – Lundin Jan 10 '20 at 15:19
  • @Lundin It compiles with gcc (unless explicitly denying it to do so with some flags), that should be well enough... – Ctx Jan 10 '20 at 15:24
  • @Ctx gcc isn't a C compiler by default - but a compiler for the GNU language. I'm talking about C compilers here. – Lundin Jan 10 '20 at 15:51

2 Answers


In your code

 u1.theChar = "A";

is wrong: the RHS, "A", is a string literal, which decays to a pointer to the first element of the array containing the char 'A' and the terminating null. A pointer cannot be assigned to a char; that is a constraint violation.

If the code compiles and generates a binary anyway, executing it invokes undefined behaviour.

Sourav Ghosh

Gives the output: 65 A

Not necessarily. The code invokes undefined behavior because u1 is never initialized and never has its address taken; see (Why) is using an uninitialized variable undefined behavior?

Undefined behavior means there is no telling what will happen. On a 32-bit little endian system, at best it prints an int made of 3 bytes of indeterminate garbage plus one byte holding the ASCII value. I get the garbage value 2005012545. To make sense of it, I can print it as hex instead: 0x77821041, where 41 is the least significant byte, containing the ASCII value 0x41 (65).

You get 65 only because you didn't initialize the union and are probably running a debug build that happens to zero out all stack values, so the program merely seems to work for you.

Fix this by initializing the union to a known value: ... } u1 = {0};.


As for u1.theChar = "A";, it doesn't even compile on a correctly configured C compiler (see "Pointer from integer/integer from pointer without a cast" issues), because the string literal "A" ends up as a char* when assigned, but theChar is of type char. This is also undefined behavior, so it is pointless to try to make sense of the result. At best, you end up with one of the bytes of the character pointer's address stored in theChar, but the code isn't valid, so there's no telling what it will do.


Necessary fixes:

#include <stdio.h>

int main (void)
{
    union {
        int theInt;
        char theChar;
    } u1 = {0};

    u1.theChar = 'A';
    printf("%d\n", u1.theInt);
    printf("%c\n\n", u1.theChar);

    u1.theChar = "A"[0];
    printf("%d\n", u1.theInt);
    printf("%c\n\n", u1.theChar);
}

The output is now predictable as long as we stick to little endian systems:

65
A

65
A

A big endian 32 bit system should give this output instead:

1090519040
A

1090519040
A

1090519040 = 0x41000000

Lundin