I have read a number of articles and Stack Overflow answers saying that (on Linux x86_64) FS (or GS in some variants) references a thread-specific page table entry, which in turn gives an array of pointers to the actual data, which lives in shareable memory. When threads are swapped, all the registers are switched over, and the thread's base page therefore changes. Thread-local variables are accessed by name with just one extra pointer hop, and the referenced values can be shared with other threads. All good and plausible.
Indeed, if you look at the code for __errno_location(void), the function behind errno, you find that it relies on something like this (from musl, but glibc is not much different):
    static inline struct pthread *__pthread_self()
    {
        struct pthread *self;
        __asm__ __volatile__ ("mov %%fs:0,%0" : "=r" (self) );
        return self;
    }
And from glibc:
    => 0x7ffff6efb4c0 <__errno_location>:     endbr64
       0x7ffff6efb4c4 <__errno_location+4>:   mov 0x6add(%rip),%rax   # 0x7ffff6f01fa8
       0x7ffff6efb4cb <__errno_location+11>:  add %fs:0x0,%rax
       0x7ffff6efb4d4 <__errno_location+20>:  retq
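To make that disassembly concrete, here is a minimal sketch of the arithmetic as I understand it: an offset added to the thread pointer found at %fs:0. It assumes x86_64 Linux, GCC or Clang inline asm, and a thread_local that lives in the main executable. The names tls_value, thread_pointer, tls_offset, Probe and probe_in_new_thread are made up for the illustration and are not taken from musl or glibc.

```cpp
#include <cstdint>
#include <thread>

// Hypothetical thread-local used only for this illustration.
thread_local int tls_value = 0;

// Read the thread pointer the same way musl's __pthread_self() does:
// the word at %fs:0 holds the address of the thread control block.
inline std::uintptr_t thread_pointer()
{
    std::uintptr_t tp;
    __asm__ __volatile__ ("mov %%fs:0,%0" : "=r" (tp));
    return tp;
}

// Signed distance from the thread pointer to this thread's copy of tls_value.
inline std::intptr_t tls_offset()
{
    return static_cast<std::intptr_t>(
        reinterpret_cast<std::uintptr_t>(&tls_value) - thread_pointer());
}

struct Probe
{
    std::uintptr_t tp;     // thread pointer seen by the probing thread
    std::intptr_t offset;  // offset of tls_value seen by the probing thread
};

// Take both readings on a freshly started thread and hand them back.
inline Probe probe_in_new_thread()
{
    Probe p{};
    std::thread t([&p] { p = {thread_pointer(), tls_offset()}; });
    t.join();
    return p;
}
```

Comparing tls_offset() on the main thread with probe_in_new_thread().offset should give the same value, while the two thread pointers differ. That seems consistent with the output further down: the new thread's tls address 0x7ffff695e6fc sits 4 bytes below the thread id 0x7ffff695e700 that gdb prints, if glibc's thread id is the thread pointer, as I believe it is.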
So my expectation is that the actual value of FS would change for each thread. For example, under gdb (info reg or p $fs) I would expect to see a different value of FS in different threads, but no: ds, es, fs and gs are all zero all the time.
In my own code I wrote something like the following and got the same result: FS is unchanged, but the thread-local variable "works":
    #include <ostream>

    // Snapshot of the six segment registers as 16-bit values.
    struct Segregs
    {
        unsigned short int cs, ss, ds, es, fs, gs;
        friend std::ostream& operator << (std::ostream& str, const Segregs& sr)
        {
            str << "[cs:" << sr.cs << ",ss:" << sr.ss << ",ds:" << sr.ds
                << ",es:" << sr.es << ",fs:" << sr.fs << ",gs:" << sr.gs << "]";
            return str;
        }
    };

    // Copy each segment register into a general-purpose register and return them all.
    Segregs GetSegRegs()
    {
        unsigned short int r_cs, r_ss, r_ds, r_es, r_fs, r_gs;
        __asm__ __volatile__ ("mov %%cs,%0" : "=r" (r_cs) );
        __asm__ __volatile__ ("mov %%ss,%0" : "=r" (r_ss) );
        __asm__ __volatile__ ("mov %%ds,%0" : "=r" (r_ds) );
        __asm__ __volatile__ ("mov %%es,%0" : "=r" (r_es) );
        __asm__ __volatile__ ("mov %%fs,%0" : "=r" (r_fs) );
        __asm__ __volatile__ ("mov %%gs,%0" : "=r" (r_gs) );
        return {r_cs, r_ss, r_ds, r_es, r_fs, r_gs};
    }
But the output?
    Main: Seg regs : [cs:51,ss:43,ds:0,es:0,fs:0,gs:0]
    Main: tls @0x7ffff699307c=0
    Main: static @0x96996c=0
    Modified to 1234
    Main: tls @0x7ffff699307c=1234
    Main: static @0x96996c=1234
    Async thread
    [New Thread 0x7ffff695e700 (LWP 3335119)]
    Thread: Seg regs : [cs:51,ss:43,ds:0,es:0,fs:0,gs:0]
    Thread: tls @0x7ffff695e6fc=0
    Thread: static @0x96996c=1234
So is something else actually going on? What extra trickery is happening, and why add the complication?
For context, I'm trying to do something "funky with forks", so I would like to know the gory details.
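One further check, as a sketch only: on x86_64 a segment register has a selector and a separate base address, and `mov %fs,...` only returns the selector. Linux's arch_prctl with ARCH_GET_FS is documented to return the FS base of the calling thread, so the code below reads both halves. It assumes x86_64 Linux and uses the raw syscall interface; fs_selector, fs_base and fs_base_in_new_thread are hypothetical helper names.

```cpp
#include <asm/prctl.h>    // ARCH_GET_FS
#include <sys/syscall.h>  // SYS_arch_prctl
#include <unistd.h>       // syscall
#include <cassert>
#include <cstdint>
#include <thread>

// Selector half of FS: what `mov %fs,reg` in GetSegRegs() reads.
inline unsigned short fs_selector()
{
    unsigned short sel;
    __asm__ __volatile__ ("mov %%fs,%0" : "=r" (sel));
    return sel;
}

// Base half of FS for the calling thread, as reported by the kernel.
inline std::uintptr_t fs_base()
{
    unsigned long base = 0;
    long rc = syscall(SYS_arch_prctl, ARCH_GET_FS, &base);
    assert(rc == 0);  // the call is expected to succeed on x86_64 Linux
    (void)rc;         // keep builds with NDEBUG free of unused warnings
    return base;
}

// Same reading, taken on a freshly started thread.
inline std::uintptr_t fs_base_in_new_thread()
{
    std::uintptr_t base = 0;
    std::thread t([&base] { base = fs_base(); });
    t.join();
    return base;
}
```

If this behaves the way I now suspect, fs_selector() stays the same in every thread while fs_base() differs between the main thread and the new one, which would explain why the thread-local variable works even though the value printed for fs never changes.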