
I'm trying to set bit 30 of the cr0 register with inline assembly. I'm using the following assembly in my kernel module:

__asm__ (
"mov %%cr0, %%rax\n\t"
"or 0x40000000, %%eax\n\t"
"mov %%rax, %%cr0\n\t"
:: 
:"%rax"
);

My module compiles, but when I insert it my terminal freezes. When I try to remove the module from a new terminal, I get the following:

rmmod: ERROR: Module xxx is in use

and dmesg shows the following in red:

RIP  [<ffffffffc09c604c>] hello_start+0x4c/0x1000 [ModuleName]

The questions "how to set control register 0 (cr0) bits in x86-64 using gcc assembly on linux" and "Trying to disable paging through cr0 register" discuss the same problem. I tried to follow their solutions, but I can't make them work. Where is the mistake in my inline assembly?

Update:

I have fixed the code according to @prl's suggestion. Here is my full source code:

u64 get_cr0(void){
  u64 cr0;
  __asm__ (
  "mov %%cr0, %%rax\n\t"
  "mov %%eax, %0\n\t"
  : "=m" (cr0)
  : /* no input */
  : "%rax"
  );

  return cr0;
}   

static int __init hello_start(void){
  printk(KERN_INFO "Loading hello module...\n");
  printk(KERN_INFO "Hello world\n");

  printk(KERN_INFO "cr0 = 0x%8.8X\n", get_cr0());


  __asm__ (
  "mov %%cr0, %%rax\n\t"
  "or $0x40000000, %%eax\n\t"
  "mov %%rax, %%cr0\n\t"
   :: 
   :"%rax"
   );
   printk(KERN_INFO "cr0 after change = 0x%8.8X\n", get_cr0());
   return 0;
 }

static void __exit hello_end(void){
    printk(KERN_INFO "Goodbye Mr.\n");

    __asm__ (
    "mov %%cr0, %%rax\n\t"
    "and $~(0x40000000), %%eax\n\t"
    "mov %%rax, %%cr0\n\t"
    :: 
    :"%rax"
    );

}

Indeed, my system runs extremely slowly after loading the module. But after changing the bit, I still don't see any difference in the cr0 register value. Here is the dmesg output:

[  +0.000400] Loading hello module...
[  +0.000001] Hello world
[  +0.000002] cr0 = 0x80050033
[  +0.000312] cr0 after change = 0x80050033
[  +6.085675] perf interrupt took too long (2522 > 2500), lowering kernel.perf_event_max_sample_rate to 50000

Why can't I see the change in bit 30 of the cr0 register?

When removing the module, I tried to clear bit 30, hoping my system would start responding normally again. But that did not seem to work: my system is still running slowly. Any thoughts on how to bring the system back to its normal state after modifying cr0?
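
For reference, a small sketch of the arithmetic involved, assuming the `cr0 = 0x80050033` value from the dmesg output above. It runs in user space and never touches the real register; `CR0_CD` and the helper names are made up for illustration. With bit 30 set, the value to expect is `0xC0050033`.

```c
#include <stdint.h>
#include <stdio.h>

/* CR0.CD (cache disable) is bit 30. CR0_CD is a name chosen for this sketch. */
#define CR0_CD (UINT64_C(1) << 30)

/* Hypothetical helpers: pure bit arithmetic, no register access. */
static inline uint64_t cr0_with_cd_set(uint64_t cr0)
{
    return cr0 | CR0_CD;
}

static inline uint64_t cr0_with_cd_clear(uint64_t cr0)
{
    return cr0 & ~CR0_CD;
}

/* Print a 64-bit value with a format that matches its width. */
static inline void show_cr0(const char *label, uint64_t cr0)
{
    printf("%s = 0x%016llx\n", label, (unsigned long long)cr0);
}
```

Since `get_cr0()` returns a `u64`, a `printk` format such as `%llx` matches the argument type, whereas `%8.8X` expects an `unsigned int`.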

Michael Petch
user45698746
  • You might be surprised at how poorly a modern system runs with cache disabled. Perhaps it isn’t hung, but is just doing what you told it to do (run like crap). – prl Dec 14 '20 at 10:06
  • You need to use `or $0x40000000, %%eax`. But I would expect it to #GP if it’s reading memory at 0x40000000 and setting random bits in cr0. Or give some other error that would be visible in dmesg. – prl Dec 14 '20 at 10:14
  • @prl thanks. Now I see what you mean by the system running poorly. Indeed, after setting the bit, my system is running horribly slowly. But after setting bit 30, I still do not see the change (please see the updated post). And upon removing the module I clear bit 30, but my system is still responding slowly. Any thoughts on how to fix that? – user45698746 Dec 14 '20 at 20:56
  • The reason CR0 likely does not appear to update is that the optimizer probably assumed the inline assembly in `get_cr0` would return the same value on the first call and on subsequent calls, and thus removed the second call to `get_cr0`. The compiler makes this assumption because it cannot see that the inline assembly reads something it is unaware of: the value of the CR0 register. With that in mind, you need to explicitly mark the inline assembly in `get_cr0` as `__volatile__`. – Michael Petch Dec 14 '20 at 21:57
  • The other inline assembly you have doesn't need `__volatile__` because there are no output operands in them so they are implicitly volatile. In essence your code updated CR0 but the code to properly retrieve the value in CR0 in `get_cr0` is in error. – Michael Petch Dec 14 '20 at 22:02
  • @MichaelPetch thanks. I can see the change now. Since my `__exit` function clears bit 30 when the module is removed, I can see the change in `cr0`. However, even after clearing the bit, my system is still running slowly. Any thoughts on what might cause that? – user45698746 Dec 14 '20 at 22:59
  • It only sets CD on whichever CPU happens to run the init function. Perhaps the exit function runs on a different CPU. – prl Dec 15 '20 at 01:33
  • @prl that makes sense. However, according to the Intel manual `vol 3A chapter 11 section 11.5.2`, setting the `CD` flag disables caching globally. I was assuming that clearing the `CD` flag also enables caching globally. I guess I'm wrong. – user45698746 Dec 15 '20 at 04:27
  • By “globally” it means for all memory accesses generated by that CPU. It is contrasted with the next sentence, which talks about disabling caching at page granularity. Changing CR0 on one CPU cannot affect the behavior of the other CPUs. That could be written more clearly. – prl Dec 15 '20 at 07:27
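
Putting the two points from the comments together, here is a hedged sketch rather than a drop-in fix. `read_cr0_raw()` is a hypothetical name for a reader marked `__volatile__`, so the compiler cannot merge or drop repeated reads; it is only meaningful in ring 0 and is never called here. The rest is a toy user-space model (`toy_cpu`, `NCPUS`, and the `toy_*` helpers are all invented for this example) of each CPU having its own CR0, which is why clearing CD on one CPU leaves it set on another.

```c
#include <stddef.h>
#include <stdint.h>

#define CR0_CD (UINT64_C(1) << 30)   /* CR0.CD is bit 30 */

#if defined(__x86_64__)
/*
 * Hypothetical volatile reader. Ring 0 only: executing this in user
 * space faults, so this sketch defines it but never calls it.
 * "=r" with a 64-bit variable lets the compiler pick a 64-bit register,
 * so no separate mov through %eax and no clobber list are needed.
 */
static inline uint64_t read_cr0_raw(void)
{
    uint64_t val;
    __asm__ __volatile__("mov %%cr0, %0" : "=r"(val));
    return val;
}
#endif

/* Toy model: every logical CPU carries its own copy of CR0. */
#define NCPUS 4
struct toy_cpu { uint64_t cr0; };

/* Affects only the CPU it is applied to, like a real CR0 write. */
static inline void toy_set_cd(struct toy_cpu *cpu)
{
    cpu->cr0 |= CR0_CD;
}

static inline void toy_clear_cd(struct toy_cpu *cpu)
{
    cpu->cr0 &= ~CR0_CD;
}

/* Clearing CD reliably means visiting every CPU, not just the current one. */
static inline void toy_clear_cd_everywhere(struct toy_cpu *cpus, size_t n)
{
    for (size_t i = 0; i < n; i++)
        toy_clear_cd(&cpus[i]);
}

/* Number of CPUs in the model that still have CD set. */
static inline size_t toy_count_cd(const struct toy_cpu *cpus, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (cpus[i].cr0 & CR0_CD)
            count++;
    return count;
}
```

In the model, setting CD on CPU 1 and then clearing it on CPU 2 still leaves one CPU with caching disabled, which matches the slow system seen after `rmmod`. In a real module the analogous step would be to run the set and the clear on every CPU, for example through the kernel's `on_each_cpu()`; that part is an assumption about how one might proceed and is not shown or tested here.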
