
I noticed C compilers (gcc, clang, tinycc) allow me to assign a pointer to a larger array to a pointer to a smaller VLA without a warning:

#include <stdio.h>
#if !__TINYC__
void take_vla(int N, char const X[1][N]) { printf("%zu\n", sizeof(*X)); }
#endif
int main()
{
    static char bigarray[]="0123456789abcdefghijklmnopqrstuvwxyz";
    //VLA
    int n = 3;
    char const (*subarray2)[n]=&bigarray;
    //char const (*subarray3)[(int){3}]=&bigarray; //VLA but clang doesn't see it as such (a bug, I guess)
#if !__TINYC__
    take_vla(3,&bigarray);
    take_vla(3,&"abcdefg");
#endif

    #if 0
        char const (*subarray1)[3]=&bigarray; //-Wincompatible-pointer-types
    #endif
}

Is this conformant C and why?

Petr Skocik
  • In C, pointers are just numbers to the CPU, there is no additional information about what they're pointing to, and the language doesn't define any meta data either, so it doesn't matter about the size of what it's pointing to, just that the type is the same (and that's a compiler thing not a CPU thing). – Skizz Dec 12 '18 at 13:09
  • In your declaration of `subarray2`, `n` is only used for address calculations and does not result in storage allocation, e.g. in `subarray2++`, the size of `n` chars will be added. It could be considered that the compiler, when able, could give a warning when the assigned pointer points to an object that is not the same size as `n`; however, that would be a run-time issue, generally not a compile-time issue. – Paul Ogilvie Dec 12 '18 at 13:13
  • @Skizz Yes, but with C's type system it's never that easy, is it. Try changing the last `#if 0` to `#if 1` and you get a warning on that assignment (larger plain old array to smaller plain old array). Shouldn't matter if "pointers are just numbers" but to the compiler it somehow does. – Petr Skocik Dec 12 '18 at 13:16
  • Maybe because there _could_ be bounds-check warnings on arrays that you're sidestepping with this? – Jean-François Fabre Dec 12 '18 at 13:18
  • @PaulOgilvie It is a valid consideration that the dynamic `int n` might not be statically known (e.g., it could come from a function argument) and then the compiler can't check, but I wonder if there can ever be any issues even with the non-VLA version. `char (*p)[n] = &(char [N/*N>n*/]){0}; /*... use p*/` doesn't look conformant as per strict aliasing rules, but subarrays kind of are subobjects of larger arrays and you can't ever access them whole anyway (individual elements only), so maybe `char (*p)[n] = (void*)&(char [N/*N>n*/]){0}; /*... use p*/` shouldn't ever lead to any issues. – Petr Skocik Dec 12 '18 at 13:26
  • @PSkocik: if I'm not mistaken (and that is always a possibility!) that code in that last #if does involve an assignment between two different types: an array of pointer to char and a pointer to a pointer to char so the compiler could easily spot that and generate warnings. – Skizz Dec 12 '18 at 13:49
  • @Skizz: In C, pointers are **not** just numbers to the CPU, both because they have type information (which affects pointer arithmetic in C), and because [they are abstractions](https://stackoverflow.com/a/11714314/298225). – Eric Postpischil Dec 12 '18 at 14:04
  • The code is not strictly conforming C because the constraints for assignment operators (C 2018 6.5.16.1 1) say the two pointers must be pointers to versions of compatible types, and the rules that might make pointers to two array types compatible (6.2.7 3) do not provide for that in the case of arrays of different specified lengths. (“Specified” in this case includes the length having been determined by evaluating its size expression.) – Eric Postpischil Dec 12 '18 at 14:17
  • @EricPostpischil I'm thinking even for non-VL arrays, `int (*p)[3]=(char (*)[3])&(int[4]){0}; /*... use p in any way without casting*/` shouldn't ever lead to any strict aliasing issues because while the pointed-to types are incompatible, to violate strict aliasing rules you'd have to access an object through an L-value incompatible with its effective type, but there's no such thing as array-typed L-values in C. The arrays will always decay to a pointer to `int` and the `int`s really are there. What do you think? – Petr Skocik Dec 12 '18 at 14:17
  • @EricPostpischil: I was talking about the code the CPU sees at run time; the CPU doesn't know what language was used to create the code, so they're just numbers. In the C language, it's only the compiler (and with some metadata, a debugger) that is associating a type with an address. That link is lost in the final executable. Some languages (Java, C# for example) do keep that link between type and address so type checks at run time are possible. Pointer arithmetic is something the compiler generates based on type information; to the CPU it's just adding two numbers together. – Skizz Dec 12 '18 at 15:33
  • @Skizz: (a) If you were talking about what the CPU says, then that is not “In C.” (b) That is not true for all CPUs. Some CPUs use segmented address architectures, so addresses are not just single numbers. And there are other embellishments and variations of addressing schemes. – Eric Postpischil Dec 12 '18 at 16:16
  • @Skizz the issue with aliasing types is that it allows the compiler to assume a `restrict` and generate different code based on that. If you do `int f(int *X, short *Y){ *X=2; *Y=3; return *X; }` it can generate code that returns 2 as an intermediate, because since X and Y can't alias, `*Y=3` cannot change `*X`. At first sight, it looks like the same thing might happen with the subarray assignment/casting, but I don't think it can based on the letter of the standard, because the standard defines strict aliasing in terms of lvalue accesses, and accesses through array lvalues don't exist because of array-to-pointer decay. – Petr Skocik Dec 12 '18 at 16:23
  • @EricPostpischil: Ah, I see where the confusion has happened, sorry. Bah, people that read too many standards get so picky about grammar! (Joke) – Skizz Dec 13 '18 at 08:04

1 Answer


const char[3] is not compatible with char[37].

Nor is "pointer to qualified type" compatible with "pointer to type" - don't mix this up with "qualified pointer to type". (Const correctness does unfortunately not work with array pointers.)

The relevant part is the rule for simple assignment, C17 6.5.16.1:

  • the left operand has atomic, qualified, or unqualified pointer type, and (considering the type the left operand would have after lvalue conversion) both operands are pointers to qualified or unqualified versions of compatible types, and the type pointed to by the left has all the qualifiers of the type pointed to by the right;

Looking at various compilers:

  • gcc in "gnu mode" is useless for checking C conformance. You must compile with -std=cxx -pedantic-errors, where xx is the standard revision (for example -std=c17). With those options gcc behaves fine; gcc -std=c17 -pedantic-errors gives:

    error: pointers to arrays with different qualifiers are incompatible in ISO C [-Wpedantic]

  • icc gives the same diagnostics as gcc, so it works fine.

  • Whereas clang -std=c17 -pedantic-errors does not report errors, so it apparently does not conform to the C standard.

Lundin