7

If I write:

int some_arr[4];
some_arr = {0, 1, 2, 3};

Then my compiler (in this case, GCC) will complain that I don't have an expression before {. So I need to use a compound literal, fine:

int some_arr[4];
some_arr = (int[]){0, 1, 2, 3};

And now we see that I'm not allowed to assign a value to an array.

What?

I can "circumvent" this with something like memcpy(some_arr, (int[]){0, 1, 2, 3}, sizeof(int[4])), or by assigning to each element of some_arr one by one (or through a loop). I can't imagine that GCC is incapable of working out the individual assignments from what I've written (a lazy compiler that doesn't care about the user could probably even do it in the preprocessor), so it seems to come down to "the standard said no." So why does the standard say this particular thing is off-limits?
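
For reference, a minimal sketch of the two workarounds mentioned above. The function names fill_with_memcpy and fill_with_loop are hypothetical, chosen only for this illustration:

```c
#include <string.h>

/* Workaround 1 (hypothetical helper): copy a compound literal
   into the destination array with memcpy. */
void fill_with_memcpy(int dst[4])
{
    /* sizeof(int[4]) is the byte count of the whole source array */
    memcpy(dst, (int[]){0, 1, 2, 3}, sizeof(int[4]));
}

/* Workaround 2 (hypothetical helper): assign element by element. */
void fill_with_loop(int dst[4])
{
    const int src[4] = {0, 1, 2, 3};
    for (size_t i = 0; i < 4; i++)
        dst[i] = src[i];
}
```

Either call, for example fill_with_memcpy(some_arr), should leave some_arr holding 0, 1, 2, 3.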

I'm not looking for the language in the standard that says it's not allowed as much as I'm looking for the history lesson of how that part of the standard came to be.

c-x-berger
  • 2
    Interestingly, if the array is a field in a structure, then C will happily let you assign the whole structure, including the array. (This is sometimes used to return a stack-allocated array from a function.) – ruakh Feb 07 '20 at 08:32
  • 2
    What you are requesting is not an available feature of C. You have to **assign** the elements one by one. In the declaration, you can **initialize** the array all at once with `int some_arr[4] = {0, 1, 2, 3};`. Mind the difference between initialize and assign. – Pierre François Feb 07 '20 at 08:37
  • I know it's not available. I'm asking why it's not available. The standard has been amended to add new features before (like that compound literal)... why not this one? – c-x-berger Feb 07 '20 at 08:37
  • memcpy is the way to do this in C. You cannot assign memory blocks to each other, only primitive types. If you want this, migrate your project to C++. – Simson Feb 07 '20 at 08:38
  • 2
    C is an old language, and its operations were designed to mimic assembler. Assigning the contents of one array to another is a complex operation that requires several instructions and even a loop, so it is not a primitive operation and was not included. – Simson Feb 07 '20 at 08:41
  • I see what you are asking for: the answer of *why* is to be found in the limits the designers of C have put in the definition of this programming language. You have to stop somewhere to keep the compiler light and fast enough. This is perhaps also the reason why C++ has been defined: to enrich C with all kinds of extensions. – Pierre François Feb 07 '20 at 08:41
  • @Simson While that’s the correct answer, C is inconsistent in this: notably, you *can* assign structures, even though the same reasoning should forbid it. You can even assign structures that contain arrays. – Konrad Rudolph Feb 07 '20 at 08:51
  • 1
    @KonradRudolph I think the difference is that the sizeof a struct is always known, whereas the sizeof an array gets lost when passed to a function. – user3386109 Feb 07 '20 at 08:56
  • 1
    Maybe your question reduces to this: [Why array type object is not modifiable?](https://stackoverflow.com/questions/17687429/why-array-type-object-is-not-modifiable) – Bob__ Feb 07 '20 at 09:10
  • @Bob__ one of the edits to the accepted answer to that question looks like what I'm after, but I'm relatively inexperienced in C so I'll leave it to the community to decide if this question is a dupe or not – c-x-berger Feb 07 '20 at 09:21
  • 1
    I remember there being some discussion here on SO about why C doesn't allow this, design choices by Ritchie, old rationales etc. Can't find the post though. – Lundin Feb 07 '20 at 09:38
  • 2
    I guess [Why can't a modifiable lvalue have an array type?](https://stackoverflow.com/questions/45656162/why-cant-a-modifiable-lvalue-have-an-array-type) is the best post we have. I tried to find any direct quote from Ritchie's "development of C" but didn't find one. Also the last edit of [this answer](https://stackoverflow.com/a/17691191/584518) is pretty much an exact summary of Ritchie's paper explaining the rationale. – Lundin Feb 07 '20 at 09:55
  • @user3386109 No, that’s not correct: in C you *cannot* pass arrays to functions. You can only pass pointers, and arrays automatically decay. The `sizeof` an array *is* always known in C (even for VLAs, with special provisions), same as for structs. – Konrad Rudolph Feb 07 '20 at 09:59
  • @user3386109 Absolutely not. Your comment is incorrect in its substance, not just its wording, and therefore doesn’t explain the difference I highlighted (the real reason is absolutely not obvious but the historical explanations linked by Lundin probably come close). – Konrad Rudolph Feb 07 '20 at 10:51
  • Originally, C was only able to manipulate simple objects—essentially just integer values. You could not assign structures, and there were no compound literals. They were building languages out of machine instructions. To make arrays work, they were treated as pointers in many contexts. When development grew to be able to support values of compound objects, so that structures could be assigned, and, later, compound literals could be constructed, it was too late for arrays, largely because of the automatic conversion to pointer. Adding exceptions for the conversion could have broken things. – Eric Postpischil Feb 07 '20 at 12:49
  • 1
    E.g., to make `a = b;` work for arrays `a` and `b`, you might add to the rule that arrays are not converted when they are operands of `sizeof` or unary `&` that arrays are not converted when they are the left operand of `=`. But, as we see from the above, you also need to add that they are not converted when they are the right operand of `=`. But then existing code like `p = a;`, where `p` is a pointer, would not work, because you would be assigning the array `a` to a pointer `p`. Similarly, passing an array by value would present problems with existing semantics and code. – Eric Postpischil Feb 07 '20 at 12:51

2 Answers

5

From ISO/IEC 9899:1999, on the constraints for assignment operators:

§6.5.16 An assignment operator shall have a modifiable lvalue as its left operand.

Then, on modifiable lvalues:

§6.3.2.1 A modifiable lvalue is an lvalue that does not have array type, does not have an incomplete type, does not have a const-qualified type, and if it is a structure or union, does not have any member (including, recursively, any member or element of all contained aggregates or unions) with a const-qualified type.

Why not? Most probably because an array name decays to a pointer to its first element.


However, an array assignment wrapped by a struct is allowed, as such:

// gcc 5.4.0

#include <stdio.h>

struct A
{
    int arr[3];
    int b;
};

// Returning the struct by value copies the array member along with it.
struct A foo(void)
{
    struct A a = {{1, 2, 3}, 10};
    return a;
}

int main(void)
{
    struct A b = foo();
    for (int i = 0; i < 3; i++)
        printf("%d\n", b.arr[i]);
    printf("%d\n", b.b);
    return 0;
}

Yields

1
2
3
10
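
The same trick can stand in for the assignment the question asks about. A minimal sketch, assuming a hypothetical wrapper type struct int4 and a hypothetical helper reset, neither of which comes from the question:

```c
/* Hypothetical wrapper type: the array is a struct member. */
struct int4 {
    int v[4];
};

/* Assign all four elements at once through a struct compound literal.
   The struct is a modifiable lvalue, so this assignment is allowed. */
void reset(struct int4 *p)
{
    *p = (struct int4){{0, 1, 2, 3}};
}
```

After reset(&x), the member x.v should hold 0, 1, 2, 3, with no memcpy and no loop in the source.
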
Tony Tannous
  • 1
    "A modifiable lvalue is an lvalue that does not have array type": I hate to sound like the toddler who keeps saying "why," but... – c-x-berger Feb 07 '20 at 09:05
  • 1
    @c-x-berger I assume it makes the compiler more complex and slower. – Tony Tannous Feb 07 '20 at 09:09
  • Fair enough, I guess. It seems like a weird place to stop to me but I'm the guy who learns about languages, not the guy who designs languages – c-x-berger Feb 07 '20 at 09:11
  • 1
    I don’t think array decay is at all relevant here. It would be trivial for the language definition to define assignment for variables of array type such that no decay happens. Not doing so was a *choice*, it’s neither required nor obvious. – Konrad Rudolph Feb 07 '20 at 10:01
  • 2
    The rationale is simply that Ritchie didn't want to mix up pointer assignment and array assignment. In B language, all arrays were allocated together with a pointer to the first element. This didn't sit well with the C concept of structs though, which were supposed to correspond directly to memory. So Ritchie got rid of the way B stored the address and introduced array decay instead, meaning that when an array is used inside an expression, you get a pointer to the first element. Rather than the B language way of the first element being the pointer. It's all in the linked duplicates. – Lundin Feb 07 '20 at 10:03
4

tl;dr:

because C decided that arrays decay to pointers, and hasn't provided a way for the programmer to avoid it.

Long answer:

When you write

int arr[4];

from that moment on, every time you use arr in a dynamic context, C treats arr as &arr[0]: the array decays to a pointer to its first element (see also here and here).

Therefore:

arr = (int[]){0, 1, 2, 3};

is considered to be

&arr[0] = (int[]){0, 1, 2, 3};

which cannot be assigned. A compiler could implement a full array copy using memcpy(), but then C would have to provide a means to tell the compiler when to decay to a pointer and when not to.

Note that a dynamic context is different from a static context. sizeof(arr) and &arr are static contexts, processed at compile time (apart from variable-length arrays, whose sizeof is evaluated at run time), in which arr is treated as an array.
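
A small sketch of the difference, for illustration only. The three function names are hypothetical, and "static context" is this answer's term, not the standard's:

```c
#include <stddef.h>

/* Under sizeof the array does not decay: this is the size of the
   whole array object, 4 * sizeof(int). */
size_t size_as_array(void)
{
    int arr[4];
    return sizeof arr;
}

/* In the expression arr + 0 the array has already decayed, so sizeof
   sees a pointer: sizeof(int *). */
size_t size_after_decay(void)
{
    int arr[4];
    return sizeof(arr + 0);
}

/* &arr has type int (*)[4], not int *: stepping it by one moves past
   the whole array, a distance of 4 * sizeof(int) bytes. */
ptrdiff_t step_of_address(void)
{
    int arr[4];
    return (char *)(&arr + 1) - (char *)&arr;
}
```

The first and third functions report the size of the full array, while the second reports only the size of a pointer.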

Likewise, the initializations

int arr[4] = {0, 1, 2, 3};

or

int arr[] = {0, 1, 2, 3};

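are static context. For objects with static storage duration, these initializations happen when the program is loaded into memory, before it even executes. For automatic arrays the values are stored at run time, but that is still an initialization generated by the compiler rather than an assignment expression, so arr never decays.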

The language in the standard is:

Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.

When an array is inside a struct, e.g.

struct s {
    int arr[4];
};
struct s s1, s2;

Then again using s1.arr is like &s1.arr[0], and it cannot be assigned.

However, while s1 = s2 is a dynamic context, it does not reference the array itself. The compiler knows it needs to copy the full array, because the array is part of the definition of the structure, and this copy is generated implicitly. For example, if the compiler chooses to implement struct assignment using memcpy(), the array is automatically copied.
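
A minimal sketch of that behaviour, reusing struct s from above. The helper name copy_is_independent is hypothetical:

```c
struct s {
    int arr[4];
};

/* Returns 1 if struct assignment produced an independent copy of the
   member array (hypothetical helper, for illustration only). */
int copy_is_independent(void)
{
    struct s s1 = {{0, 0, 0, 0}};
    struct s s2 = {{0, 1, 2, 3}};

    s1 = s2;            /* allowed: copies the whole struct, array included */
    /* s1.arr = s2.arr;    rejected: s1.arr is not a modifiable lvalue */

    s2.arr[0] = 42;     /* changing the source afterwards ... */

    /* ... leaves the copy untouched */
    return s1.arr[0] == 0 && s1.arr[3] == 3;
}
```

The commented-out line is the member-wise form that a conforming compiler refuses; the whole-struct form just above it copies the same bytes.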

root
  • You answer the question within the quoted language of [C11 Standard - 6.3.2.1 Other Operands - Lvalues, arrays, and function designators(p3)](http://port70.net/~nsz/c/c11/n1570.html#6.3.2.1p3) `..has type "array of type"...` ***and is not an lvalue*** ' ...' – David C. Rankin Feb 07 '20 at 09:41
  • 3
    This is [begging the question](https://en.wikipedia.org/wiki/Begging_the_question). The language could have trivially added an exception for modifiable lvalues to the definition of static contexts, and removed the exception for arrays from modifiable lvalues. The language definition complexity would have remained unchanged… there is no inherent, natural reason for the specific choice C made. – Konrad Rudolph Feb 07 '20 at 10:04
  • IIUC The question isn't "why do arrays decay to pointers" or "what does the standard say about lvalue arrays", but rather "_why_ does the standard say that about lvalue arrays", and the answer is "because arrays decay to pointers". – root Feb 08 '20 at 10:35
  • "C would have to provide a means to tell the compiler when to decay to a pointer and when not to." - it already does, see 6.3.2.1/2 . This could easily be modified to exclude the case in question – M.M Feb 08 '20 at 23:22
  • @M.M you don't get a choice. ex.: `int f(int a[3])` will accept any int array or int pointer, and no operator can change that. You can't tell C to enforce the array size (i.e. not decay). I do agree that it could make the exception, and though it would be slightly inconsistent, it would be immensely useful. – root Feb 08 '20 at 23:59
  • @root we're talking about assignment expressions here, not function arguments – M.M Feb 09 '20 at 10:16