I'm working on an ARMv7 platform and have run into a register-access problem. The registers in the device module strictly require word access:

typedef unsigned char u8;
struct reg {
    u8 byte0; u8 byte1; u8 byte2; u8 byte3;
};

When I write C code such as reg.byte0 = 0x3, gcc normally generates a byte-access instruction (LDRB r1, [r0] for a read, STRB for a write like this one), and that byte operation leads to undefined behavior on my platform. Is there an option that makes gcc emit a word access, "LDR r1, [r0]", and then mask out byte0, rather than using the "LDRB" opcode?

Update: the destination I want to access is a device register on an SoC. It has 4 fields, and we use a struct to represent the register. Accessing the byte0 field with reg.byte0 = 3 normally generates byte-access assembly. I want to know whether C code like reg.byte0 = 3 can be compiled to a word access (32-bit, LDR/STR). Sorry for my poor English!

UPDATE: The example is a simplification of the real code; volatile and memory barriers are also used in the Linux driver, I just forgot to add them to the example. I'm working on an ARM11. 1) memcpy does not seem suitable for me: different registers have different fields, so I cannot write an inline access function for every one of them. 2) Using a union looks promising; I'll update with the result when I finish testing.

UPDATE2: I just tested the union, and it still does not work on my platform. I think the better way is to use explicit word access and not confuse the compiler.

UPDATE3: It seems someone else posted exactly the same question, and it has been resolved: Force GCC to access structs with words

Thanks, guys!

Community
  • What is `u8`? Use standard fixed-width types. And `LDR` uses a word (i.e. 32 bit) access. A dword would be a 64 bit access, i.e. `LDRD`. Not clear what your problem is. If you want a 32 bit access, use a 32 bit variable. For 64 bit accesses, use a 64 bit integer type. – too honest for this site Apr 13 '17 at 04:24
  • sorry for the unclear statement: I meant **word** access (32 bit), and `u8` stands for `unsigned char`. – Jingbo Zhang Apr 13 '17 at 04:35
  • I believe that if you want it to only be accessed as a word, you need to declare it as `volatile int` (or some other word-like type, maybe unsigned). You can then use helper functions to load/store the bytes. – Marc Glisse Apr 13 '17 at 07:57
  • On x86_64, the difference is quite visible if you remove `volatile` in `volatile int x; struct A { char a,b,c,d; }; void f(){ union { int i; struct A a; } e; e.i=x; e.a.b=3; x=e.i; }`. Compiling with -O3 gives a byte access without volatile and a word access with volatile. – Marc Glisse Apr 13 '17 at 08:04
  • Don't use homebrew types if there are standard types available! That's what `stdint.h` is for! Note that `((un)signed) char` is not guaranteed to be 8 bits wide in general. – too honest for this site Apr 13 '17 at 13:28
  • Which ARMv7 platform is it? There are three very different! – too honest for this site Apr 13 '17 at 13:30
  • Use word accesses and isolate the bytes with masks and shifts. That's what you want to happen anyway, so just write the code that way; don't make the compiler guess at what you want it to do, tell it. Note that pointing unions, structs, etc. are a hack that doesn't always work. Also note that unless you actually use the right instruction, you can't be sure that gcc will (so if you want an LDR/STR, write the assembly language to get an LDR/STR; otherwise hope for the best and get what you get). – old_timer Apr 13 '17 at 14:25
  • Using `union` is a clean way to read a byte from a 32bit word variable. But that is only half of your problem. The other half, actually the first half, is to reliably read a word from hardware into such a variable. For that you need a tested, preferably supplier-confirmed method. Your text seems to indicate that you do have something like that. Can you provide that? Can you otherwise describe in which way the union method does not work for you? – Yunnosch Apr 15 '17 at 10:39
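
The "word access plus masks and shifts" suggestion from the comments could look roughly like the sketch below. It is only an illustration under assumptions: the helper names `reg_read_field` and `reg_write_field` and the `REG_BYTE0_*` macros are made up here, and byte0 is assumed to sit in bits 7..0 of the register (little-endian layout). Passing the shift and mask as parameters means one pair of helpers can serve registers with different fields.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout (an assumption): byte0 is bits 7..0 of the word. */
#define REG_BYTE0_SHIFT 0u
#define REG_BYTE0_MASK  0xFFu

/* Read one field: a single 32-bit volatile load, then shift and mask. */
static inline uint32_t reg_read_field(const volatile uint32_t *reg,
                                      unsigned shift, uint32_t mask)
{
    assert(shift < 32u);
    return (*reg >> shift) & mask;
}

/* Write one field: load the word, replace the field bits, store the word. */
static inline void reg_write_field(volatile uint32_t *reg, unsigned shift,
                                   uint32_t mask, uint32_t value)
{
    assert(shift < 32u);
    uint32_t word = *reg;             /* one 32-bit read */
    word &= ~(mask << shift);         /* clear the old field bits */
    word |= (value & mask) << shift;  /* insert the new field value */
    *reg = word;                      /* one 32-bit write */
}
```

With these helpers, `reg.byte0 = 3` would become `reg_write_field(regp, REG_BYTE0_SHIFT, REG_BYTE0_MASK, 3)`. GCC normally turns an aligned `volatile uint32_t` access into a single LDR or STR, but as noted above, only assembly fixes the exact instruction.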

1 Answer

You could go with inline assembly:

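#include <stdint.h>
#include <string.h>

typedef uint32_t u32;

/* Load the whole register with one LDR, then pick byte0 out of a local copy. */
static inline __attribute__((always_inline)) u8 read_reg_b0(const struct reg *rp) {
    struct reg r;
    u32 tmp;
    /* volatile keeps GCC from dropping or merging the device access */
    __asm__ volatile("ldr %0, %1" : "=r" (tmp) : "m" (*rp));
    memcpy(&r, &tmp, 4);
    return r.byte0;
}

/* Read-modify-write: LDR the word, patch byte0 in a local copy, STR it back. */
static inline __attribute__((always_inline)) void write_reg_b0(struct reg *rp, u8 b0) {
    struct reg r;
    u32 tmp;
    __asm__ volatile("ldr %0, %1" : "=r" (tmp) : "m" (*rp));
    memcpy(&r, &tmp, 4);
    r.byte0 = b0;
    memcpy(&tmp, &r, 4);
    __asm__ volatile("str %1, %0" : "=m" (*rp) : "r" (tmp));
}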

GCC will optimize away the memcpy but can't modify the assembly instructions.

ephemient
  • There is no need for assembly language, less for `memcpy` (which is a bad idea here anyway). – too honest for this site Apr 13 '17 at 13:28
  • @Olaf 'you could go' seems to express an opinion that 'there is no need for assembly', yet everyone agrees assembly can work (or nothing will). Also, the `memcpy` explicitly denotes that the compiler is expected to optimize it away into the most efficient way to transfer the word from the device to the 'memory' shadow. This seems an acceptable answer to the OP's poor question. *It has 4 fields and we use a struct representing this register.* and *strict memory ordering* are in direct conflict, and one way to solve it is with shadowing. – artless noise Apr 13 '17 at 14:56
  • @Olaf ASM precisely guarantees which instruction is used to access the register, and memcpy is purely on local variables for safe type punning. You can't guarantee the first with a union or volatile, and the second is well defined behavior. – ephemient Apr 13 '17 at 15:00
  • @ephemient 1) You cannot use `memcpy` for type-punning. 2) There are easier and correct ways to write this in standard C, which are even faster. The overhead of `memcpy` is not tolerable on an embedded system where such accesses are done in interrupt handlers, etc. – too honest for this site Apr 13 '17 at 15:06
  • 1) You absolutely can use memcpy for type punning, and it's well defined in standard C unlike access through different union fields or pointers. http://stackoverflow.com/q/11639947 2) At any optimization level GCC will not call memcpy. If you look at its assembler output, it understands that `r` and `tmp` represent the same bytes in memory, and it is not generating any calls to other functions nor inlining memcpy. – ephemient Apr 13 '17 at 15:15
  • You could *just* use `memcpy` instead of inline asm load/store. `memcpy` is aliasing-safe and alignment-safe, and a good way to express an unaligned aliasing load or store. (In GNU C, another way is `typedef uint32_t unaligned_u32 __attribute((aligned(1), may_alias))` and using pointers of that type.) Oh, but the question is doing MMIO, and GCC might choose to implement the `memcpy` with byte loads if it doesn't know the alignment. So +1 for using asm to make sure, vs. a `volatile` aligned(1) typedef. – Peter Cordes Jul 13 '22 at 09:43
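
For anyone who wants to try the shadow-copy idea from this thread without inline assembly, here is a rough portable sketch. It swaps the answer's `ldr`/`str` asm for a plain aligned `volatile uint32_t` access, which GCC normally compiles to one word load or store but which does not pin the instruction the way asm does. `memcpy` only touches local variables, so it never reaches the device. The names `struct reg_fields`, `reg_load` and `reg_store` are hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Hypothetical name; same four-byte layout as the question's struct reg. */
struct reg_fields {
    uint8_t byte0, byte1, byte2, byte3;
};

static_assert(sizeof(struct reg_fields) == sizeof(uint32_t),
              "shadow struct must be exactly one word");

/* Copy the register into a local shadow using one 32-bit volatile load. */
static inline struct reg_fields reg_load(const volatile uint32_t *addr)
{
    uint32_t word = *addr;                  /* the only device read */
    struct reg_fields shadow;
    memcpy(&shadow, &word, sizeof shadow);  /* punning on locals only */
    return shadow;
}

/* Write a modified shadow back using one 32-bit volatile store. */
static inline void reg_store(volatile uint32_t *addr, struct reg_fields shadow)
{
    uint32_t word;
    memcpy(&word, &shadow, sizeof word);
    *addr = word;                           /* the only device write */
}
```

Usage would be along the lines of `struct reg_fields r = reg_load(p); r.byte0 = 3; reg_store(p, r);`, so every field of every register can be edited through the struct while the hardware only sees word accesses.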