9

I read this article: http://static.patater.com/gbaguy/day3pc.htm

It includes the sentence

DON'T EVER CHANGE CS!!

But what exactly would happen if you did modify the CS segment register? Why is it so dangerous?

Peter Cordes
SmRndGuy
  • 2
    `CS` = code segment. I suppose changing it is equivalent (in some sense) to a perverted `jmp`. – valdo Sep 04 '12 at 12:55
  • 6
    That document seems quite unreliable: "DON'T EVER CHANGE `CS`!!, but you can read `CS` like so: `mov ds,cs` ; put `CS`'s value into `DS`." Well, in x86 there is no such instruction as `mov ds,cs` nor any other `mov segreg,segreg`. To read the value of `cs` you can either use `mov reg,cs; mov ds,reg` (where `reg` can be `ax`, `bx`, `cx` etc...), or `push cs; pop ds`. Further, if you decide not to *ever* change `cs`, all interrupt calls are out of the question (e.g. BIOS, DOS and Linux services). http://web.itu.edu.tr/kesgin/mul06/intel/instr/mov.html – nrz Sep 04 '12 at 13:41
  • 1
    @nrz: There is no such thing as "Linux services" accessible through `far` calls (interrupts / syscalls work differently; even though these lead to a change in `cs`, the _caller_ cannot control what that target `cs` will be, that's decided by the OS when setting up the IDT entries / syscall msrs). Ack with all else, obviously `cs` _can_ be changed, just that unless the target code segment exists and is set up in such a way that the target `eip` is reachable, any such call will cause a `#GP` fault and the app will abort. – FrankH. Sep 06 '12 at 10:33
  • 3
    Just having read the document linked to by the orig question: I do wonder why anyone not ultimately masochistic will these days attempt _16bit x86_ assembly programming. If anything, that'll scar your brain and turn you off assembly for the rest of your life... – FrankH. Sep 06 '12 at 10:41
  • 1
    @FrankH With "Linux services" I meant `int 0x80` (32-bit) or `syscall` (64-bit). I agree on your comment about why not learn x86 16-bit assembly. I don't see any reason to start learning assembly with x86 16-bit assembly nowadays, even if that's what I started with more than ten years ago (in DOS environment). I'm currently learning Linux 64-bit assembly and I believe it will be useful for many years to come. – nrz Sep 06 '12 at 12:08

2 Answers

8

cs is the code segment register. cs:ip, that is cs together with ip (the instruction pointer), points to the location of the next instruction. Any change to cs, to ip, or to both therefore changes the address from which the next instruction is fetched and executed.

Usually you change cs with a jmp (far jump), call (far call), retf, int3, int or iret. On the 8088 and 8086, pop cs is also available (opcode 0x0F). pop cs does not work on the 186 and later, where opcode 0x0F is reserved for multibyte instructions. http://en.wikipedia.org/wiki/X86_instruction_listings

There is nothing inherently dangerous in a far jump or a far call. You just have to know where you are jumping or calling, and in protected mode you need sufficient privileges to do it. In 16-bit real mode (e.g. DOS) you can jump to or call whatever address you wish. For example, jmp 0xF000:0xFFF0 sets cs to 0xF000 and ip to 0xFFF0, which is the reset vector where the CPU starts executing BIOS code, and thus reboots the computer. Different memory addresses hold different code and so produce different results. In theory anything can happen: if you jump into the BIOS code used for formatting the hard drive, with valid register and/or stack values, the hard drive will be formatted 'as requested'. In practice, jumps and calls to most addresses probably end in an invalid opcode or some other exception (divide by zero, divide overflow, etc.) quite soon.
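As a rough illustration of the arithmetic behind that jmp example: in real mode the CPU forms the address by shifting the segment left by 4 bits and adding the offset. The sketch below is only a model of that calculation. The function name `real_mode_linear` is made up for this example, and the 20-bit wrap-around assumes an 8086/8088-style address bus.

```python
def real_mode_linear(segment: int, offset: int, address_bits: int = 20) -> int:
    """Return the linear address of segment:offset in real mode.

    The 16-bit segment is shifted left by 4 bits and added to the 16-bit
    offset. address_bits=20 models the wrap-around of a 20-bit address bus
    (an assumption matching the 8086/8088).
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return ((segment << 4) + offset) & ((1 << address_bits) - 1)


# The far jump target from the answer: F000:FFF0
reset_vector = real_mode_linear(0xF000, 0xFFF0)
print(hex(reset_vector))

# Many segment:offset pairs alias the same byte, e.g. FFFF:0000
print(real_mode_linear(0xFFFF, 0x0000) == reset_vector)
```

Because different cs:ip pairs can name the same byte, what matters for a far jump is the resulting address, not the particular cs value used to reach it.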

nrz
  • Changing only `cs` is usually wrong. Normally either `cs` and `xip` are changed together or just `xip` changes. – Alexey Frunze Sep 04 '12 at 16:12
  • 3
    @AlexeyFrunze Well, the only way to change only `cs` without changing/setting `ip`/`eip`/`rip` is `pop cs`, that is available only in 8088 and 8086. In 186+ you can't change `cs` without changing/setting also `ip`/`eip`/`rip`. `pop cs` could possibly be used in obfuscation code targeted for 8088/8086 (or 8088/8086 emulator). – nrz Sep 04 '12 at 16:23
1

In protected mode and long mode (i.e. not real mode), segment registers including CS are no longer just an extra 4 bits of address. They hold selectors that index into a table of segment descriptors. Each descriptor has a base and a limit (normally base=0 and limit=4GiB, i.e. a flat memory model), but also other attributes.
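To make the "index into the table" part concrete, here is a small sketch that splits a selector value into its fields: bits 15..3 are the descriptor index, bit 2 chooses the GDT or LDT, and bits 1..0 are the requested privilege level. The names `Selector` and `decode_selector` are invented for this sketch, and the sample values 0x33 and 0x23 are an assumption based on the user-space CS values commonly seen on x86-64 Linux; other operating systems use different numbers.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int  # descriptor index within the table
    table: str  # "GDT" or "LDT"
    rpl: int    # requested privilege level, 0..3


def decode_selector(selector: int) -> Selector:
    """Split a 16-bit segment selector into index, table indicator and RPL."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a selector is a 16-bit value")
    return Selector(
        index=selector >> 3,
        table="LDT" if selector & 0x4 else "GDT",
        rpl=selector & 0x3,
    )


# Assumed example values (typical x86-64 Linux user-space selectors)
print(decode_selector(0x33))  # 64-bit user code segment
print(decode_selector(0x23))  # 32-bit user code segment
```

So loading a new value into CS selects a different descriptor, which is why the value has to name an entry the OS has actually set up.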

The code segment descriptor determines the CPU mode (e.g. 32-bit compatibility mode vs. 64-bit mode, both of which are sub-modes of long mode). On a 64-bit kernel, a 64-bit user-space process could make a far jmp to some 32-bit code. This is not useful in practice, and may even break when the OS returns to your process after a context switch.
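As a sketch of how a descriptor selects the mode: in long mode the L bit (bit 53 of the 8-byte descriptor) and the D bit (bit 54) of the code segment descriptor decide the default operand size. The function name `code_segment_mode` is hypothetical, and the two descriptor constants are illustrative flat user-code descriptors written out for this example, not values read from any particular system.

```python
def code_segment_mode(descriptor: int) -> str:
    """Classify a 64-bit code segment descriptor by its L and D bits.

    Applies when the CPU is in long mode: L=1, D=0 means 64-bit mode,
    L=0 means compatibility mode with D picking 32-bit or 16-bit.
    """
    l_bit = (descriptor >> 53) & 1
    d_bit = (descriptor >> 54) & 1
    if l_bit and d_bit:
        return "reserved"  # L=1 with D=1 is not a valid combination
    if l_bit:
        return "64-bit"
    return "32-bit" if d_bit else "16-bit"


# Illustrative flat descriptors (assumed values, base=0, limit=0xFFFFF pages)
FLAT_CODE_64 = 0x00AFFB000000FFFF  # G=1, L=1, D=0
FLAT_CODE_32 = 0x00CFFB000000FFFF  # G=1, L=0, D=1

print(code_segment_mode(FLAT_CODE_64))
print(code_segment_mode(FLAT_CODE_32))
```

A far jmp whose selector names a descriptor like the second one is what switches a 64-bit process into 32-bit code.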

TODO: dig up a link where someone showed how to do this. I think there was even a recent question about this with a detailed answer about how to even find the right segment numbers.

Peter Cordes