I want to write a bootloader that simply prints "Hello World!" on the screen, and I don't know why my bytes get mixed up. I'm writing it in AT&T syntax (please don't recommend Intel syntax) and am trying to convert the code from this tutorial to AT&T syntax.
Here is the rather short code for my bootloader:
start:
.code16 #real mode
.text
.org 0x0
.globl _main
_main:
movw hello, %si
movb $0x0e, %ah
loophere:
lodsb
or %al, %al #is al==0 ?
jz halt #if previous instruction sets zero flag jump to halt
int $0x10 #run bios interrupt 0x10 (ah is set to 0x0e so a character is displayed)
jmp loophere
halt:
cli
hlt
hello: .ascii "Hello world!\0"
filloop:
.fill (510-(.-_main)),1,0 #I hope this works. Fill bootloader with 0's until byte 510
end:
.word 0xaa55
If I assemble and link this with
$as -o boot.o boot.as
$ld -Ttext 0x07c00 -o boot.elf boot.o
$objcopy -O binary boot.elf boot.bin
the following command
$objdump -d boot.elf
gives me this disassembly:
Disassembly of section .text:
0000000000007c00 <_main>:
7c00: 8b 36 mov (%rsi),%esi
7c02: 11 7c b4 0e adc %edi,0xe(%rsp,%rsi,4)
0000000000007c06 <loophere>:
7c06: ac lods %ds:(%rsi),%al
7c07: 08 c0 or %al,%al
7c09: 74 04 je 7c0f <halt>
7c0b: cd 10 int $0x10
7c0d: eb f7 jmp 7c06 <loophere>
0000000000007c0f <halt>:
7c0f: fa cli
7c10: f4 hlt
0000000000007c11 <hello>:
7c11: 48 rex.W
7c12: 65 6c gs insb (%dx),%es:(%rdi)
7c14: 6c insb (%dx),%es:(%rdi)
7c15: 6f outsl %ds:(%rsi),(%dx)
7c16: 20 77 6f and %dh,0x6f(%rdi)
7c19: 72 6c jb 7c87 <filloop+0x69>
7c1b: 64 21 00 and %eax,%fs:(%rax)
0000000000007c1e <filloop>:
...
0000000000007dfe <end>:
7dfe: 55 push %rbp
7dff: aa stos %al,%es:(%rdi)
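As a sanity check on the `.fill` arithmetic, here is a minimal Python sketch that uses only the addresses from the listing above. The variable names are my own, and it assumes the listing reflects the final layout of boot.bin.

```python
# Addresses copied from the objdump listing above.
LOAD_ADDR = 0x7C00      # _main, set by `ld -Ttext 0x07c00`
FILLOOP_ADDR = 0x7C1E   # first padding byte (label filloop)
END_ADDR = 0x7DFE       # label end, where .word 0xaa55 is placed

code_size = FILLOOP_ADDR - LOAD_ADDR        # bytes emitted before the .fill
fill_count = 510 - code_size                # value of (510-(.-_main))
signature_offset = END_ADDR - LOAD_ADDR     # offset of the boot signature

# The signature occupies two bytes, stored little-endian (55 then aa).
signature_bytes = (0xAA55).to_bytes(2, "little")
image_size = signature_offset + len(signature_bytes)

print(code_size, fill_count, signature_offset, image_size)
print(signature_bytes.hex(" "))
```

So the padding seems to do what I hoped: the signature lands at offset 510 and the image is one 512-byte sector.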
If I hexdump it (you can also see the bytes in the disassembly above), my first 6 bytes are
8b 36
11 7c b4 0e
compared to be 10 7c b4 0e from the tutorial (the rest of the hexdump is identical, byte for byte). I understand that ac is the opcode for lodsb (load string byte), so b4 0e would have to load 0x0e into %ah, and be 10 7c would have to point %si to the hello label at address 0x7c10 (keep the little-endian byte order in mind). I changed the corresponding bytes with a hex editor and it suddenly worked, although the disassembly then came out mixed up like this:
0000000000007c00 <_main>:
7c00: be 10 7c b4 0e mov $0xeb47c10,%esi
7c05: ac lods %ds:(%rsi),%al
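To double-check my little-endian reading of the two byte sequences, here is a small Python sketch. It assumes that the two bytes right after the differing opcode bytes form a 16-bit operand; that slicing is my guess, not something the tutorial states.

```python
# First bytes of my build and of the tutorial's build, as quoted above.
mine = bytes.fromhex("8b 36 11 7c b4 0e")
tutorial = bytes.fromhex("be 10 7c b4 0e")

# Address of the hello label according to my objdump listing.
HELLO_ADDR = 0x7C11

# Both sequences end with the same two bytes (b4 0e).
same_tail = mine[-2:] == tutorial[-2:]

# Assumed 16-bit little-endian operands (the slice positions are my guess).
mine_operand = int.from_bytes(mine[2:4], "little")
tutorial_operand = int.from_bytes(tutorial[1:3], "little")

print(same_tail)
print(hex(mine_operand), hex(tutorial_operand))
print(mine_operand == HELLO_ADDR)
```

So my version is one byte longer, starts with 8b 36 instead of be, and its operand matches the hello address from my listing (0x7c11), while the tutorial's operand is 0x7c10.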
My original version just printed a capital 'S'. Can someone explain why these first instruction bytes are encoded differently?
I'm coding all this on Debian 9 64-bit and running it in qemu-system-x86_64 as a floppy image.