98

I saw the below code in this Quora post:

#include <stdio.h>

struct mystruct { int enabled:1; };
int main()
{
  struct mystruct s;
  s.enabled = 1;
  if(s.enabled == 1)
    printf("Is enabled\n"); // --> we think this to be printed
  else
    printf("Is disabled !!\n");
}

In both C and C++, the output of the code is unexpected:

Is disabled !!

Although that post gives a "sign bit" explanation, I am unable to understand how it is possible that we set something and then it doesn't show up as it is.

Can someone give a more elaborate explanation?


Note: Both the C and C++ tags are required, because their standards differ slightly in how they describe bit-fields. See the answers for the C specification and the C++ specification.

iammilind
  • 47
    Since the bitfield is declared as `int`, I think it can only hold the values `0` and `-1`. – Osiris Dec 19 '18 at 14:45
  • 7
    just think of it how int stores -1. All bits are set to 1. Hence, if you only have one bit it clearly has to be -1. So 1 and -1 in the 1 bit int are the same. Change the check to 'if (s.enabled != 0)' and it works. Because 0 it can't be. – Jürgen Dec 19 '18 at 15:09
  • 3
    It is true that these rules are the same in C and C++. But according to the [tag usage](https://stackoverflow.com/tags/c%2b%2b/info) policies, we should only tag this as C and refrain from cross-tagging when not needed. I'll remove the C++ part, it should not affect any posted answers. – Lundin Dec 19 '18 at 16:06
  • 8
    Have you tried changing it to `struct mystruct { unsigned int enabled:1; };`? – ChatterOne Dec 19 '18 at 16:08
  • 4
    Kindly read the [C and C++ tag policies](https://stackoverflow.com/tags/c%2b%2b/info), particularly the part regarding cross-tagging C and C++ both, established through community consensus [here](https://meta.stackoverflow.com/questions/374306/proposed-update-to-c-and-c-tag-usage-wikis). I'm not going into some rollback war, but this question is incorrectly tagged C++. Even if the languages happen to have some slight difference because of various TC, then make a separate question about the difference between C and C++. – Lundin Dec 20 '18 at 08:04
  • Thanks @Lundin for the tag policy link. Both the C & C++ have different standard specifications for the bit fields. Hence I am interested in knowing the output results from both the perspectives. – iammilind Dec 20 '18 at 10:19
  • @iammilind They don't, until apparently after some TC of C++11. It seems to be that comparing C vs C++ vs C++11 TC:x is a separate question. – Lundin Dec 20 '18 at 10:26
  • Been reading C++11 and C++17. I find no evidence anywhere that C++ bit-fields behave differently than C ones. Multiple cases of implementation-defined behavior are the same for both languages. If anything, they are even more poorly defined in C++17 than in for example C90. Wouldn't have thought such a thing possible. – Lundin Dec 20 '18 at 10:52
  • @Lundin, C++ defines the bit-fields more clearly and coincidentally they have taken an example, which matches the code in Qn. You may refer [my answer](https://stackoverflow.com/a/53867011/514235). It's indeed good to have both the perspectives of C and C++ for this feature. – iammilind Dec 20 '18 at 11:22
  • Bitfields and throw specifications are the two most useless features of C++. Bitfields serve no purpose whatsoever. – user3344003 Dec 20 '18 at 13:43

6 Answers

77

Bit-fields are incredibly poorly defined by the standard. Given this code struct mystruct {int enabled:1;};, then we don't know:

  • How much space this occupies - if there are padding bits/bytes and where they are located in memory.
  • Where the bit is located in memory. This is not defined and also depends on endianness.
  • Whether an int:n bitfield is to be regarded as signed or unsigned.

Regarding the last part, C17 6.7.2.1/10 says:

A bit-field is interpreted as having a signed or unsigned integer type consisting of the specified number of bits 125)

Non-normative note explaining the above:

125) As specified in 6.7.2 above, if the actual type specifier used is int or a typedef-name defined as int, then it is implementation-defined whether the bit-field is signed or unsigned.

If the bit-field is regarded as signed int and you make a bit-field of size 1, then there is no room for data, only for the sign bit. This is the reason why your program might give weird results on some compilers.

Good practice:

  • Never use bit-fields for any purpose.
  • Avoid using signed int type for any form of bit manipulation.
Lundin
  • 5
    At work we have static_asserts on the size and address of bitfields just to make sure that they are not padded. We use bitfields for hardware registers in our firmware. – Michael Dec 19 '18 at 19:38
  • 1
    C++ doesn't have the last part after [CWG 739](https://wg21.link/CWG739); it's always signed. – T.C. Dec 20 '18 at 07:46
  • 1
    @Michael At one time I used bit-fields too, one of those usual sloppily-written register maps you get from the compiler vendor. Then tried to port the code to another compiler for the same MCU. Of course everything went haywire, because the register map was utterly non-portable. Sure, bit-fields were not the only problem, but a major one. The solution was to roll out your own register map. I wrote a little program which grabbed the register names and bit masks from the MCU manual and made an ISO/MISRA C register map header out of it. Based on _bit masks_ to be used with the bit-wise operators. – Lundin Dec 20 '18 at 07:51
  • 4
    @Lundin: The ugly thing with #define-d masks and offsets is that your code gets littered with shifts and bit-wise AND/OR operators. With bitfields the compiler takes care of that for you. – Michael Dec 20 '18 at 08:25
  • @T.C. That link doesn't change the behavior, you seem to have misinterpreted it. It says that if a type is "plain" such as `int` then it has implementation-defined signedness just like C. Basically that TC says "can we clarify just how stupid bit-fields are defined by the standard". – Lundin Dec 20 '18 at 10:31
  • @T.C. They have also apparently entirely removed the part you linked in the C++17 draft, so it is no longer just implementation-defined... but as I read it, using a bit-field of `int` type is undefined behavior in C++17, as the behavior and signedness of the different integer types is not covered by the standard. Or at least I can't find it. – Lundin Dec 20 '18 at 10:50
  • 4
    @Michael *With bitfields the compiler takes care of that for you.* Well, that's OK if your standards for "takes care of that" are "non-portable" and "unpredictable". Mine are higher than that. – Andrew Henle Dec 20 '18 at 10:56
  • The final resolution of CWG739 (you need to scroll down to the February 2012 PR, not the August 2011 one) removes the implementation-definedness entirely. Plain `int` therefore has the same meaning it has everywhere else (which is, of course, signed). – T.C. Dec 20 '18 at 11:26
  • 1
    @AndrewHenle these properties are all implementation-defined, i.e. a conforming implementation is *required* to do something predictable, and document it. I'm not completely convinced the problem is substantially worse than for struct members in general: the alignment and padding of fields is also still implementation-defined (although they can't be reordered or straddle storage units), which makes a "simple" struct's layout technically non-portable between separately compiled binaries as well. – Alex Celeste Dec 20 '18 at 13:49
  • @Leushenko The requirements on structures are much, much stricter than on bit fields. There are many C standard functions such as [`localeconv()`](https://port70.net/~nsz/c/c11/n1570.html#7.11.2.1) that pass structures, so structure layout requirements are stronger than "implementation-defined". I suspect structure layout is [ABI](https://en.wikipedia.org/wiki/Application_binary_interface)-controlled. (Someone with more expertise can comment...) Bit fields are completely controlled by the compiler chosen, and can and do change between different compilers on the same platform. – Andrew Henle Dec 20 '18 at 14:26
  • @AndrewHenle ABI is 100% beyond the scope of the standard. It doesn't and can't mandate that data structures (or even primitive types) are in *any way* compatible with those produced by a different compiler on the same system; how the standard library is made available to a program by an implementation is out of scope too. Implementations choosing to all do the same thing because it's sensible and because sharing a stdlib is practical doesn't make it not-implementation-defined. – Alex Celeste Dec 20 '18 at 14:30
  • @Michael Let's compare `reg.bit0 = 1;` with `reg |= bit0;`. In the former case we have something like `typedef struct { int bit0:1; } reg_t;`, in the latter `#define bit0 (1u << 0)`. The main difference is that `reg |= bit0;` actually sets bit 0, portably, independently of compiler and endianness. Whereas `reg.bit0 = 1;` could set bit 0, or bit 31 or a padding bit, or a padding byte. – Lundin Dec 20 '18 at 15:39
  • @Leushenko *ABI is 100% beyond the scope of the standard. It doesn't and can't mandate that data structures (or even primitive types) are in any way compatible with those produced by a different compiler on the same system* Not true. The [x86_64 ABI](https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf), for example, strictly specifies structure layout while leaving bit field implementations a great deal of freedom. Strict compliance with the ABI would result in different compilers having to create identical structures while having wildly different bit-fields. – Andrew Henle Dec 20 '18 at 16:07
  • 3
    @AndrewHenle Leushenko is saying that from the perspective of *just the C standard itself*, it is up to the implementation whether or not it chooses to follow the x86-64 ABI or not. – mtraceur Dec 20 '18 at 22:41
  • @mtraceur First, I'd say a compiler that doesn't follow a platform's ABI is perverse and deliberately non-portable. Second, I'm saying that even given strict adherence to an ABI, the C standard bit field "specifications" still leave so much unspecified that different compilers - that each strictly follow an ABI - can easily use wildly incompatible bit field layouts on the exact same platform. – Andrew Henle Dec 20 '18 at 22:59
  • 3
    @AndrewHenle Right, I agree on both points. My point was that I think your disagreement with Leushenko boils down to the fact that you're using "implementation defined" to refer only to things neither strictly defined by the C standard nor strictly defined by the platform ABI, and he's using it to refer to anything not strictly defined by just the C standard. – mtraceur Dec 20 '18 at 23:20
59

I am unable to understand, how is it possible that we set something and then it doesn't show up as it is.

Are you asking why it compiles vs. gives you an error?

Yes, it should ideally give you an error. And it does, if you use your compiler's warnings. In GCC, with -Werror -Wall -pedantic:

main.cpp: In function 'int main()':
main.cpp:7:15: error: overflow in conversion from 'int' to 'signed char:1' 
changes value from '1' to '-1' [-Werror=overflow]
   s.enabled = 1;
           ^

The reasoning for why this is left up to being implementation-defined vs. an error may have more to do with historical usages, where requiring a cast would mean breaking old code. The authors of the standard may believe warnings were enough to pick up the slack for those concerned.

To throw in some prescriptivism, I'll echo @Lundin's statement: "Never use bit-fields for any purpose." If you have the kind of good reasons to get low-level and specific about your memory layout details that would get you to thinking you needed bitfields in the first place, the other associated requirements you almost certainly have will run up against their underspecification.

(TL;DR - If you're sophisticated enough to legitimately "need" bit-fields, they're not well-defined enough to serve you.)

  • 16
    The authors of the standard were on holidays the day the bit-field chapter was designed. So the janitor had to do it. There is no rationale about _anything_ regarding how bit-fields are designed. – Lundin Dec 19 '18 at 15:00
  • 10
    There is no coherent *technical* rationale. But that leads me to conclude that there was a *political* rationale: to avoid making any of the existing code or implementations incorrect. But the result is that there's very little about bitfields that you can rely upon. – John Bollinger Dec 19 '18 at 15:20
  • 7
    @JohnBollinger There was definitely politics in place, that caused a lot of damage to C90. I once spoke with a member of the committee who explained the source of lots of the crap - the ISO standard could not be allowed to favour certain existing technologies. This is why we are stuck with moronic things like support for 1's complement and signed magnitude, implementation-defined signedness of `char`, support for bytes that aren't 8 bits etc etc. They weren't allowed to give moronic computers a market disadvantage. – Lundin Dec 19 '18 at 15:26
  • 1
    @Lundin It would be interesting to see a collection of writeups and post-mortems from people who believed tradeoffs had been made in error, and why. I wonder how much study of these *"we did that last time, and it did/didn't work out"* has become institutional knowledge to inform the next such case, vs. just stories in people's heads. – HostileFork says dont trust SE Dec 19 '18 at 15:29
  • 1
    This is still listed as point no. 1 of the original principles of C in the C2x Charter: "Existing code is important, existing implementations are not." ... "no one implementation was held up as the exemplar by which to define C: It is assumed that all existing implementations must change somewhat to conform to the Standard." – Alex Celeste Dec 19 '18 at 16:00
23

This is implementation-defined behavior. To explain why you don't enter the true branch of the if statement, I am assuming that the machines you are running this on use two's-complement signed integers and treat int in this case as a signed integer.

struct mystruct { int enabled:1; };

declares enabled as a 1-bit bit-field. Since it is signed, the valid values are -1 and 0. Setting the field to 1 is out of range for that single bit, and the value wraps to -1 (the result of this out-of-range conversion is implementation-defined).

Essentially when dealing with a signed bit-field the max value is 2^(bits - 1) - 1 which is 0 in this case.

NathanOliver
  • "Since it is signed, the valid values are -1 and 0". Who said it is signed? It's not defined but implementation-defined behavior. If it is signed, then the valid values are `-` and `+`. 2's complement doesn't matter. – Lundin Dec 19 '18 at 14:56
  • 5
    @Lundin A 1-bit two's complement number only has two possible values. If the bit is set, then since it is the sign bit, it is -1. If it isn't set then it is "positive" 0. I know this is implementation defined, I'm just explaining the results using the most common implementation – NathanOliver Dec 19 '18 at 15:00
  • 1
    The key here is rather that 2's complement or any other signed form cannot function with a single bit available. – Lundin Dec 19 '18 at 15:03
  • @Lundin What do you mean it cannot function? [It works fine here](http://coliru.stacked-crooked.com/a/41fa5d5c6fe4f85c), you just have to know the domain you are working with. – NathanOliver Dec 19 '18 at 15:04
  • @NathanOliver, a possible point of contention arises here from the wording of the standard, which does not seem specifically to require that bitfields of type `int` in fact represent signed types. I do think it's the intent that such bitfields should be signed, however. – John Bollinger Dec 19 '18 at 15:05
  • 1
    @JohnBollinger I understand that. That's why I have the disclaimer that this is implementation defined. At least for the big 3 they all treat `int` as signed in this case. It is a shame that bit-fields are so under specified. It's basically here is this feature, consult your compiler on how to use it. – NathanOliver Dec 19 '18 at 15:08
  • 1
    @Lundin, the standard's wording for the representation of signed integers can perfectly well handle the case where there are zero value bits, at least in two of the three allowed alternatives. This works because it assigns (negative) *place values* to sign bits, rather than giving them an algorithmic interpretation. – John Bollinger Dec 19 '18 at 15:08
  • @JohnBollinger In the line `s.enabled = 1;`, the rule of simple assignment applies. We have a conversion from `int` to "sign bit". `Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.` This is what happens, and it is not necessarily "place values in sign bit". – Lundin Dec 19 '18 at 15:22
  • Agreed, @Lundin, that the result of the assignment is implementation-defined, and in particular that it does not have to correspond in any way to placing a value in the sign bit. All I'm saying is that C's rules for integer representation *do* afford meaningful and well-defined interpretations for 1-bit signed integers. I say "afford" rather than "specify" only because there are two viable alternatives and one wobbly one. – John Bollinger Dec 19 '18 at 15:27
10

You could think of it as that in the 2's complement system, the left-most bit is the sign bit. Any signed integer with the left-most bit set is thus a negative value.

If you have a 1-bit signed integer, it has only the sign bit. So assigning 1 to that single bit can only set the sign bit. So, when reading it back, the value is interpreted as negative and so is -1.

The values a 1-bit signed integer can hold range from -2^(n-1) = -2^(1-1) = -2^0 = -1 to 2^(n-1) - 1 = 2^(1-1) - 1 = 0.

Paul Ogilvie
8

As per the C++ standard n4713, a very similar code snippet is provided. The type used is BOOL (custom), but it can apply to any type.

12.2.4

4 If the value true or false is stored into a bit-field of type bool of any size (including a one bit bit-field), the original bool value and the value of the bit-field shall compare equal. If the value of an enumerator is stored into a bit-field of the same enumeration type and the number of bits in the bit-field is large enough to hold all the values of that enumeration type (10.2), the original enumerator value and the value of the bit-field shall compare equal. [ Example:

enum BOOL { FALSE=0, TRUE=1 };
struct A {
  BOOL b:1;
};
A a;
void f() {
  a.b = TRUE;
  if (a.b == TRUE)    // yields true
    { /* ... */ }
}

— end example ]


At first glance, the bold part appears open to interpretation. However, the intent becomes clear when the enum BOOL is given int as its underlying type.

#include <stdio.h>

enum BOOL : int { FALSE=0, TRUE=1 }; // ***this line
struct mystruct { BOOL enabled:1; };
int main()
{
  struct mystruct s;
  s.enabled = TRUE;
  if(s.enabled == TRUE)
    printf("Is enabled\n"); // --> we think this to be printed
  else
    printf("Is disabled !!\n");
}

With the above code, the compiler gives a warning even without -Wall -pedantic:

warning: ‘mystruct::enabled’ is too small to hold all values of ‘enum BOOL’ struct mystruct { BOOL enabled:1; };

The output is:

Is disabled !! (when using enum BOOL : int)

If enum BOOL : int is changed to a plain enum BOOL, then the output is as the above standard passage specifies:

Is enabled (when using enum BOOL)


Hence, it can be concluded, as a few other answers also have, that a single-bit bit-field of int type is not big enough to store the value "1".

grg
iammilind
0

There is nothing wrong with your understanding of bitfields that I can see. What I see is that you redefined mystruct first as struct mystruct { int enabled:1; } and then as struct mystruct s;. What you should have coded was:

#include <stdio.h>

struct mystruct { int enabled:1; };
int main()
{
    mystruct s; // <-- Get rid of "struct" type declaration
    s.enabled = 1;
    if(s.enabled == 1)
        printf("Is enabled\n"); // --> we think this to be printed
    else
        printf("Is disabled !!\n");
}
ar18