
When a compiler (GCC or Intel C++) optimizes a for loop whose index is a member of a struct, can it break the struct up into individual variables or registers?

For example, when a struct member a is always equal to another variable b, it would be safe to replace every use of a with b in the program.

When the struct contains a along with other members, can compilers still perform this optimization?

What about other optimizations, such as removing unnecessary indirection (referencing and dereferencing)?

I tried to test this, but the test is incomplete.

Code (compiled with g++ -O3 -std=c++14):

#include <iostream>

int main()
{
    int i, j; // version 1

    // version 2: the indices are references to struct members
    /*
    struct Index{
        int ii;
        int jj;
    }C;
    int &i = C.ii;
    int &j = C.jj;
    */

    int a = 0;
    constexpr int N = 65536;
    int x[N];

    for (int k = 0; k < N; ++k) x[k] = 1;

    for (i = 0; i < N; ++i) {
        a += x[i];
    }

    for (j = 0; j < N; ++j) {
        a += x[j];
    }
    std::cout << a << "\n";
}

gcc godbolt:

Versions 1 and 2 produce identical assembly.

Update:

Things I found online:

From GCC's structure reorganization optimization development plan (gcc.gnu.org):

  • GCC can split a struct with four fields into four structs when two fields are extensively accessed by function f() while the other two fields are extensively accessed by function g(). (Table 2. test1.c)

Why doesn't GCC optimize structs?

  • GCC doesn't reorder the members of a struct because the C standard doesn't allow it.

The new intraprocedural scalar replacement of aggregates (gcc.gnu.org, suggested by Mysticial)

  • So GCC can replace a member of a struct with a register.
hamster on wheels
  • The code you posted does not match the code in the links. The code in the links uses references (which are not needed and complicate the program). – Thomas Matthews Oct 19 '16 at 15:44
  • Please edit your post with the content of the assembly language listing, not links to a web site. – Thomas Matthews Oct 19 '16 at 15:45
  • @Thomas Matthews: Sure. I can copy the assembly, it is just quite long. There are two versions: version 1 doesn't use references, version 2 does. The references are not needed, but I was testing whether the compiler can optimize them away by comparing version 1 and version 2. – hamster on wheels Oct 19 '16 at 15:50
  • You only need to post the assembly language for one of the indexed `for` loops. There should be no difference between the `i` loop and the `j` loop. – Thomas Matthews Oct 19 '16 at 15:52
  • Compiler optimization strategies are not standardized by the C++ language. Different compilers can perform different levels of optimization. Also, the optimization *levels* or *settings* play a big role too. To answer your question completely, you would have to compile the sample with all compilers using the highest optimization setting in release mode and compare the assembly language listings. – Thomas Matthews Oct 19 '16 at 15:55
  • @Thomas Matthews: While packing variables together in memory makes accessing them more efficient, some of the variables are not even needed in some cases. The code was compiled with g++ -O3. – hamster on wheels Oct 19 '16 at 16:01
  • g++ can replace a variable with a register. Is that also possible for a SIMD-type (e.g. `__m128i`) member of a struct? – hamster on wheels Oct 19 '16 at 16:57
  • It's called "Scalar Replacement of Aggregates". It's a standard optimization that all modern compilers should be able to do. – Mysticial Oct 19 '16 at 17:46
