Is there an AVR disassembler which produces human-readable output, e.g., writes

OUT       SREG,R0 

instead of

OUT       0x3F,R0

(I would like to get a better understanding of what the compiler is doing.)

Windell Oskay
Mike L.
  • If this is from code which you have compiled, your C compiler will be able to show you properly annotated assembly. http://www.delorie.com/djgpp/v2faq/faq8_20.html – Toby Jaffey Aug 10 '11 at 22:13
  • http://stackoverflow.com/questions/5141177/atmel-avr-disassembler – Toby Jaffey Aug 12 '11 at 15:09
  • Mike? Did you figure this one out? I'm looking for exactly the same thing (on Linux) and I'm considering writing a Perl postprocessor if there is no better solution. – jippie Jul 11 '12 at 18:23
  • No, I haven't found a good disassembler yet. – Mike L. Jul 12 '12 at 18:46

3 Answers

Sorry to revive this thread, but I have asked myself several times how I could make avr-objdump output proper register names. I concluded it was hopeless. Or... is it? Not for a Python lover, of course!

That's the reason for this script, which is basically a wrapper/post-processor for avr-objdump. It currently has register names for only two architectures, avr35 and avr5, because I have only used three processors: ATmega64M1, ATmega328 (Arduino) and ATtiny1634. There's room for improvement, such as moving the register lists into separate modules, but this opens the way.

Command line invocation is as simple as this:

adump.py <MCUNAME> <avr-objdump arguments>

Examples:

adump.py atmega328 -S main.flash.hex

If you have the raw binary flash image that you've downloaded from a board:

adump.py atmega328 -D -b binary main.flash.bin

Here's the output for a recovered flash image of a simple blinking program written for the ATmega64M1:

$ adump.py atmega64m1 -D -b binary blink-m64m1.bin

...
00000000 <.data>:
   0:   0c 94 3e 00     jmp 0x7c    ;  0x7c
   4:   0c 94 48 00     jmp 0x90    ;  0x90
   8:   0c 94 48 00     jmp 0x90    ;  0x90
   c:   0c 94 48 00     jmp 0x90    ;  0x90
  10:   0c 94 48 00     jmp 0x90    ;  0x90
  14:   0c 94 48 00     jmp 0x90    ;  0x90
  18:   0c 94 48 00     jmp 0x90    ;  0x90
  1c:   0c 94 48 00     jmp 0x90    ;  0x90
  20:   0c 94 48 00     jmp 0x90    ;  0x90
  24:   0c 94 48 00     jmp 0x90    ;  0x90
  28:   0c 94 48 00     jmp 0x90    ;  0x90
  2c:   0c 94 48 00     jmp 0x90    ;  0x90
  30:   0c 94 48 00     jmp 0x90    ;  0x90
  34:   0c 94 48 00     jmp 0x90    ;  0x90
  38:   0c 94 48 00     jmp 0x90    ;  0x90
  3c:   0c 94 48 00     jmp 0x90    ;  0x90
  40:   0c 94 48 00     jmp 0x90    ;  0x90
  44:   0c 94 48 00     jmp 0x90    ;  0x90
  48:   0c 94 48 00     jmp 0x90    ;  0x90
  4c:   0c 94 48 00     jmp 0x90    ;  0x90
  50:   0c 94 48 00     jmp 0x90    ;  0x90
  54:   0c 94 48 00     jmp 0x90    ;  0x90
  58:   0c 94 48 00     jmp 0x90    ;  0x90
  5c:   0c 94 48 00     jmp 0x90    ;  0x90
  60:   0c 94 48 00     jmp 0x90    ;  0x90
  64:   0c 94 48 00     jmp 0x90    ;  0x90
  68:   0c 94 48 00     jmp 0x90    ;  0x90
  6c:   0c 94 48 00     jmp 0x90    ;  0x90
  70:   0c 94 48 00     jmp 0x90    ;  0x90
  74:   0c 94 48 00     jmp 0x90    ;  0x90
  78:   0c 94 48 00     jmp 0x90    ;  0x90
  7c:   11 24           eor r1, r1
  7e:   1f be           out SREG, r1    ; 63
  80:   cf ef           ldi r28, 0xFF   ; 255
  82:   d0 e1           ldi r29, 0x10   ; 16
  84:   de bf           out SPH, r29    ; 62
  86:   cd bf           out SPL, r28    ; 61
  88:   0e 94 4a 00     call    0x94    ;  0x94
  8c:   0c 94 5a 00     jmp 0xb4    ;  0xb4
  90:   0c 94 00 00     jmp 0   ;  0x0
  94:   82 e0           ldi r24, 0x02   ; 2
  96:   8a b9           out DDRD, r24   ; 10
  98:   92 e0           ldi r25, 0x02   ; 2
  9a:   8b b1           in  r24, PORTD  ; 11
  9c:   89 27           eor r24, r25
  9e:   8b b9           out PORTD, r24  ; 11
  a0:   2f e3           ldi r18, 0x3F   ; 63
  a2:   3d e0           ldi r19, 0x0D   ; 13
  a4:   83 e0           ldi r24, 0x03   ; 3
  a6:   21 50           subi    r18, 0x01   ; 1
  a8:   30 40           sbci    r19, 0x00   ; 0
  aa:   80 40           sbci    r24, 0x00   ; 0
  ac:   e1 f7           brne    .-8         ;  0xa6
  ae:   00 c0           rjmp    .+0         ;  0xb0
  b0:   00 00           nop
  b2:   f3 cf           rjmp    .-26        ;  0x9a
  b4:   f8 94           cli
  b6:   ff cf           rjmp    .-2         ;  0xb6

Now here's the script, in the hope it'll still be useful:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
#  adump.py
#
#  Copyright 2017 Nasha
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software
#  Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
#  MA 02110-1301, USA.

from __future__ import print_function
import sys

def report(*args, **kwargs):
    print(*args, file=sys.stderr, **kwargs)

arches = {
    # List of supported architectures.
    # See http://www.atmel.com/webdoc/AVRLibcReferenceManual/using_tools_1using_avr_gcc_mach_opt.html
    # for updates
    "at90s1200": "avr1",
    "attiny11": "avr1",
    "attiny12": "avr1",
    "attiny15": "avr1",
    "attiny28": "avr1",
    "at90s2313": "avr2",
    "at90s2323": "avr2",
    "at90s2333": "avr2",
    "at90s2343": "avr2",
    "attiny22": "avr2",
    "attiny26": "avr2",
    "at90s4414": "avr2",
    "at90s4433": "avr2",
    "at90s4434": "avr2",
    "at90s8515": "avr2",
    "at90c8534": "avr2",
    "at90s8535": "avr2",
    "at86rf401": "avr25",
    "ata6289": "avr25",
    "ata5272": "avr25",
    "ata6616c": "avr25",
    "attiny13": "avr25",
    "attiny13a": "avr25",
    "attiny2313": "avr25",
    "attiny2313a": "avr25",
    "attiny24": "avr25",
    "attiny24a": "avr25",
    "attiny25": "avr25",
    "attiny261": "avr25",
    "attiny261a": "avr25",
    "attiny4313": "avr25",
    "attiny43u": "avr25",
    "attiny44": "avr25",
    "attiny44a": "avr25",
    "attiny441": "avr25",
    "attiny45": "avr25",
    "attiny461": "avr25",
    "attiny461a": "avr25",
    "attiny48": "avr25",
    "attiny828": "avr25",
    "attiny84": "avr25",
    "attiny84a": "avr25",
    "attiny841": "avr25",
    "attiny85": "avr25",
    "attiny861": "avr25",
    "attiny861a": "avr25",
    "attiny87": "avr25",
    "attiny88": "avr25",
    "atmega603": "avr3",
    "at43usb355": "avr3",
    "atmega103": "avr31",
    "at43usb320": "avr31",
    "at90usb82": "avr35",
    "at90usb162": "avr35",
    "ata5505": "avr35",
    "ata6617c": "avr35",
    "ata664251": "avr35",
    "atmega8u2": "avr35",
    "atmega16u2": "avr35",
    "atmega32u2": "avr35",
    "attiny167": "avr35",
    "attiny1634": "avr35",
    "at76c711": "avr3",
    "ata6285": "avr4",
    "ata6286": "avr4",
    "ata6612c": "avr4",
    "atmega48": "avr4",
    "atmega48a": "avr4",
    "atmega48pa": "avr4",
    "atmega48p": "avr4",
    "atmega8": "avr4",
    "atmega8a": "avr4",
    "atmega8515": "avr4",
    "atmega8535": "avr4",
    "atmega88": "avr4",
    "atmega88a": "avr4",
    "atmega88p": "avr4",
    "atmega88pa": "avr4",
    "atmega8hva": "avr4",
    "at90pwm1": "avr4",
    "at90pwm2": "avr4",
    "at90pwm2b": "avr4",
    "at90pwm3": "avr4",
    "at90pwm3b": "avr4",
    "at90pwm81": "avr4",
    "at90can32": "avr5",
    "at90can64": "avr5",
    "at90pwm161": "avr5",
    "at90pwm216": "avr5",
    "at90pwm316": "avr5",
    "at90scr100": "avr5",
    "at90usb646": "avr5",
    "at90usb647": "avr5",
    "at94k": "avr5",
    "atmega16": "avr5",
    "ata5790": "avr5",
    "ata5702m322": "avr5",
    "ata5782": "avr5",
    "ata6613c": "avr5",
    "ata6614q": "avr5",
    "ata5790n": "avr5",
    "ata5795": "avr5",
    "ata5831": "avr5",
    "atmega161": "avr5",
    "atmega162": "avr5",
    "atmega163": "avr5",
    "atmega164a": "avr5",
    "atmega164p": "avr5",
    "atmega164pa": "avr5",
    "atmega165": "avr5",
    "atmega165a": "avr5",
    "atmega165p": "avr5",
    "atmega165pa": "avr5",
    "atmega168": "avr5",
    "atmega168a": "avr5",
    "atmega168p": "avr5",
    "atmega168pa": "avr5",
    "atmega169": "avr5",
    "atmega169a": "avr5",
    "atmega169p": "avr5",
    "atmega169pa": "avr5",
    "atmega16a": "avr5",
    "atmega16hva": "avr5",
    "atmega16hva2": "avr5",
    "atmega16hvb": "avr5",
    "atmega16hvbrevb": "avr5",
    "atmega16m1": "avr5",
    "atmega16u4": "avr5",
    "atmega32": "avr5",
    "atmega32a": "avr5",
    "atmega323": "avr5",
    "atmega324a": "avr5",
    "atmega324p": "avr5",
    "atmega324pa": "avr5",
    "atmega325": "avr5",
    "atmega325a": "avr5",
    "atmega325p": "avr5",
    "atmega325pa": "avr5",
    "atmega3250": "avr5",
    "atmega3250a": "avr5",
    "atmega3250p": "avr5",
    "atmega3250pa": "avr5",
    "atmega328": "avr5",
    "atmega328p": "avr5",
    "atmega329": "avr5",
    "atmega329a": "avr5",
    "atmega329p": "avr5",
    "atmega329pa": "avr5",
    "atmega3290": "avr5",
    "atmega3290a": "avr5",
    "atmega3290p": "avr5",
    "atmega3290pa": "avr5",
    "atmega32c1": "avr5",
    "atmega32hvb": "avr5",
    "atmega32hvbrevb": "avr5",
    "atmega32m1": "avr5",
    "atmega32u4": "avr5",
    "atmega32u6": "avr5",
    "atmega406": "avr5",
    "atmega64rfr2": "avr5",
    "atmega644rfr2": "avr5",
    "atmega64": "avr5",
    "atmega64a": "avr5",
    "atmega640": "avr5",
    "atmega644": "avr5",
    "atmega644a": "avr5",
    "atmega644p": "avr5",
    "atmega644pa": "avr5",
    "atmega645": "avr5",
    "atmega645a": "avr5",
    "atmega645p": "avr5",
    "atmega6450": "avr5",
    "atmega6450a": "avr5",
    "atmega6450p": "avr5",
    "atmega649": "avr5",
    "atmega649a": "avr5",
    "atmega6490": "avr5",
    "atmega6490a": "avr5",
    "atmega6490p": "avr5",
    "atmega649p": "avr5",
    "atmega64c1": "avr5",
    "atmega64hve": "avr5",
    "atmega64hve2": "avr5",
    "atmega64m1": "avr5",
    "m3000": "avr5",
    "at90can128": "avr51",
    "at90usb1286": "avr51",
    "at90usb1287": "avr51",
    "atmega128": "avr51",
    "atmega128a": "avr51",
    "atmega1280": "avr51",
    "atmega1281": "avr51",
    "atmega1284": "avr51",
    "atmega1284p": "avr51",
    "atmega128rfr2": "avr51",
    "atmega1284rfr2": "avr51",
    "atmega2560": "avr6",
    "atmega2561": "avr6",
    "atmega256rfr2": "avr6",
    "atmega2564rfr2": "avr6",
    "atxmega16a4": "avrxmega2",
    "atxmega16a4u": "avrxmega2",
    "atxmega16c4": "avrxmega2",
    "atxmega16d4": "avrxmega2",
    "atxmega32a4": "avrxmega2",
    "atxmega32a4u": "avrxmega2",
    "atxmega32c3": "avrxmega2",
    "atxmega32c4": "avrxmega2",
    "atxmega32d3": "avrxmega2",
    "atxmega32d4": "avrxmega2",
    "atxmega8e5": "avrxmega2",
    "atxmega16e5": "avrxmega2",
    "atxmega32e5": "avrxmega2",
    "atxmega64a3": "avrxmega4",
    "atxmega64a3u": "avrxmega4",
    "atxmega64a4u": "avrxmega4",
    "atxmega64b1": "avrxmega4",
    "atxmega64b3": "avrxmega4",
    "atxmega64c3": "avrxmega4",
    "atxmega64d3": "avrxmega4",
    "atxmega64d4": "avrxmega4",
    "atxmega64a1": "avrxmega5",
    "atxmega64a1u": "avrxmega5",
    "atxmega128a3": "avrxmega6",
    "atxmega128a3u": "avrxmega6",
    "atxmega128b1": "avrxmega6",
    "atxmega128b3": "avrxmega6",
    "atxmega128c3": "avrxmega6",
    "atxmega128d3": "avrxmega6",
    "atxmega128d4": "avrxmega6",
    "atxmega192a3": "avrxmega6",
    "atxmega192a3u": "avrxmega6",
    "atxmega192c3": "avrxmega6",
    "atxmega192d3": "avrxmega6",
    "atxmega256a3": "avrxmega6",
    "atxmega256a3u": "avrxmega6",
    "atxmega256a3b": "avrxmega6",
    "atxmega256a3bu": "avrxmega6",
    "atxmega256c3": "avrxmega6",
    "atxmega256d3": "avrxmega6",
    "atxmega384c3": "avrxmega6",
    "atxmega384d3": "avrxmega6",
    "atxmega128a1": "avrxmega7",
    "atxmega128a1u": "avrxmega7",
    "atxmega128a4u": "avrxmega7",
    "attiny4": "avrtiny10",
    "attiny5": "avrtiny10",
    "attiny9": "avrtiny10",
    "attiny10": "avrtiny10",
    "attiny20": "avrtiny10",
    "attiny40": "avrtiny10",
}

regs = {
    # Register dictionary for supported architectures
    "avr35": {
        0x7F : "TWSCRA",
        0x7E : "TWSCRB",
        0x7D : "TWSSRA",
        0x7C : "TWSA",
        0x7B : "TWSAM",
        0x7A : "TWSD",
        0x79 : "UCSR1A",
        0x78 : "UCSR1B",
        0x77 : "UCSR1C",
        0x76 : "UCSR1D",
        0x75 : "UBRR1H",
        0x74 : "UBRR1L",
        0x73 : "UDR1",
        0x72 : "TCCR1A",
        0x71 : "TCCR1B",
        0x70 : "TCCR1C",
        0x6F : "TCNT1H",
        0x6E : "TCNT1L",
        0x6D : "OCR1AH",
        0x6C : "OCR1AL",
        0x6B : "OCR1BH",
        0x6A : "OCR1BL",
        0x69 : "ICR1H",
        0x68 : "ICR1L",
        0x67 : "GTCCR",
        0x66 : "OSCCAL1",
        0x65 : "OSCTCAL0B",
        0x64 : "OSCTCAL0A",
        0x63 : "OSCCAL0",
        0x62 : "DIDR2",
        0x61 : "DIDR1",
        0x60 : "DIDR0",
        0x3F : "SREG",
        0x3E : "SPH",
        0x3D : "SPL",
        0x3C : "GIMSK",
        0x3B : "GIFR",
        0x3A : "TIMSK",
        0x39 : "TIFR",
        0x38 : "QTCSR",
        0x37 : "SPMCSR",
        0x36 : "MCUCR",
        0x35 : "MCUSR",
        0x34 : "PRR",
        0x33 : "CLKPR",
        0x32 : "CLKSR",
        0x30 : "WDTCSR",
        0x2F : "CCP",
        0x2E : "DWDR",
        0x2D : "USIBR",
        0x2C : "USIDR",
        0x2B : "USISR",
        0x2A : "USICR",
        0x29 : "PCMSK2",
        0x28 : "PCMSK1",
        0x27 : "PCMSK0",
        0x26 : "UCSR0A",
        0x25 : "UCSR0B",
        0x24 : "UCSR0C",
        0x23 : "UCSR0D",
        0x22 : "UBRR0H",
        0x21 : "UBRR0L",
        0x20 : "UDR0",
        0x1F : "EEARH",
        0x1E : "EEARL",
        0x1D : "EEDR",
        0x1C : "EECR",
        0x1B : "TCCR0A",
        0x1A : "TCCR0B",
        0x19 : "TCNT0",
        0x18 : "OCR0A",
        0x17 : "OCR0B",
        0x16 : "GPIOR2",
        0x15 : "GPIOR1",
        0x14 : "GPIOR0",
        0x13 : "PORTCR",
        0x12 : "PUEA",
        0x11 : "PORTA",
        0x10 : "DDRA",
        0x0F : "PINA",
        0x0E : "PUEB",
        0x0D : "PORTB",
        0x0C : "DDRB",
        0x0B : "PINB",
        0x0A : "PUEC",
        0x09 : "PORTC",
        0x08 : "DDRC",
        0x07 : "PINC",
        0x06 : "ACSRA",
        0x05 : "ACSRB",
        0x04 : "ADMUX",
        0x03 : "ADCSRA",
        0x02 : "ADCSRB",
        0x01 : "ADCH",
        0x00 : "ADCL",
    },
    "avr5": {
        0xFA : "CANMSG",
        0xF9 : "CANSTMPH",
        0xF8 : "CANSTMPL",
        0xF7 : "CANIDM1",
        0xF6 : "CANIDM2",
        0xF5 : "CANIDM3",
        0xF4 : "CANIDM4",
        0xF3 : "CANIDT1",
        0xF2 : "CANIDT2",
        0xF1 : "CANIDT3",
        0xF0 : "CANIDT4",
        0xEF : "CANCDMOB",
        0xEE : "CANSTMOB",
        0xED : "CANPAGE",
        0xEC : "CANHPMOB",
        0xEB : "CANREC",
        0xEA : "CANTEC",
        0xE9 : "CANTTCH",
        0xE8 : "CANTTCL",
        0xE7 : "CANTIMH",
        0xE6 : "CANTIML",
        0xE5 : "CANTCON",
        0xE4 : "CANBT3",
        0xE3 : "CANBT2",
        0xE2 : "CANBT1",
        0xE1 : "CANSIT1",
        0xE0 : "CANSIT2",
        0xDF : "CANIE1",
        0xDE : "CANIE2",
        0xDD : "CANEN1",
        0xDC : "CANEN2",
        0xDB : "CANGIE",
        0xDA : "CANGIT",
        0xD9 : "CANGSTA",
        0xD8 : "CANGCON",
        0xD2 : "LINDAT",
        0xD1 : "LINSEL",
        0xD0 : "LINIDR",
        0xCF : "LINDLR",
        0xCE : "LINBRRH",
        0xCD : "LINBRRL",
        0xCC : "LINBTR",
        0xCB : "LINERR",
        0xCA : "LINENIR",
        0xC9 : "LINSIR",
        0xC8 : "LINCR",
        0xBC : "PIFR",
        0xBB : "PIM",
        0xBA : "PMIC2",
        0xB9 : "PMIC1",
        0xB8 : "PMIC0",
        0xB7 : "PCTL",
        0xB6 : "POC",
        0xB5 : "PCNF",
        0xB4 : "PSYNC",
        0xB3 : "POCR_RBH",
        0xB2 : "POCR_RBL",
        0xB1 : "POCR2SBH",
        0xB0 : "POCR2SBL",
        0xAF : "POCR2RAH",
        0xAE : "POCR2RAL",
        0xAD : "POCR2SAH",
        0xAC : "POCR2SAL",
        0xAB : "POCR1SBH",
        0xAA : "POCR1SBL",
        0xA9 : "POCR1RAH",
        0xA8 : "POCR1RAL",
        0xA7 : "POCR1SAH",
        0xA6 : "POCR1SAL",
        0xA5 : "POCR0SBH",
        0xA4 : "POCR0SBL",
        0xA3 : "POCR0RAH",
        0xA2 : "POCR0RAL",
        0xA1 : "POCR0SAH",
        0xA0 : "POCR0SAL",
        0x97 : "AC3CON",
        0x96 : "AC2CON",
        0x95 : "AC1CON",
        0x94 : "AC0CON",
        0x92 : "DACH",
        0x91 : "DACL",
        0x90 : "DACON",
        0x8B : "OCR1BH",
        0x8A : "OCR1BL",
        0x89 : "OCR1AH",
        0x88 : "OCR1AL",
        0x87 : "ICR1H",
        0x86 : "ICR1L",
        0x85 : "TCNT1H",
        0x84 : "TCNT1L",
        0x82 : "TCCR1C",
        0x81 : "TCCR1B",
        0x80 : "TCCR1A",
        0x7F : "DIDR1",
        0x7E : "DIDR0",
        0x7C : "ADMUX",
        0x7B : "ADCSRB",
        0x7A : "ADCSRA",
        0x79 : "ADCH",
        0x78 : "ADCL",
        0x77 : "AMP2CSR",
        0x76 : "AMP1CSR",
        0x75 : "AMP0CSR",
        0x6F : "TIMSK1",
        0x6E : "TIMSK0",
        0x6D : "PCMSK3",
        0x6C : "PCMSK2",
        0x6B : "PCMSK1",
        0x6A : "PCMSK0",
        0x69 : "EICRA",
        0x68 : "PCICR",
        0x66 : "OSCCAL",
        0x64 : "PRR",
        0x61 : "CLKPR",
        0x60 : "WDTCSR",
        0x3F : "SREG",
        0x3E : "SPH",
        0x3D : "SPL",
        0x37 : "SPMCSR",
        0x35 : "MCUCR",
        0x34 : "MCUSR",
        0x33 : "SMCR",
        0x30 : "ACSR",
        0x2E : "SPDR",
        0x2D : "SPSR",
        0x2C : "SPCR",
        0x29 : "PLLCSR",
        0x28 : "OCR0B",
        0x27 : "OCR0A",
        0x26 : "TCNT0",
        0x25 : "TCCR0B",
        0x24 : "TCCR0A",
        0x23 : "GTCCR",
        0x22 : "EEARH",
        0x21 : "EEARL",
        0x20 : "EEDR",
        0x1F : "EECR",
        0x1E : "GPIOR0",
        0x1D : "EIMSK",
        0x1C : "EIFR",
        0x1B : "PCIFR",
        0x1A : "GPIOR2",
        0x19 : "GPIOR1",
        0x16 : "TIFR1",
        0x15 : "TIFR0",
        0x0E : "PORTE",
        0x0D : "DDRE",
        0x0C : "PINE",
        0x0B : "PORTD",
        0x0A : "DDRD",
        0x09 : "PIND",
        0x08 : "PORTC",
        0x07 : "DDRC",
        0x06 : "PINC",
        0x05 : "PORTB",
        0x04 : "DDRB",
        0x03 : "PINB",
    }
}

import io
import os
import re
from subprocess import Popen, PIPE

decode = [
    re.compile(r'\b(?P<op>in|lds)\s+\w+,\s+(?P<reg>\w+)'),
    re.compile(r'\b(?P<op>out|[cs]bi|sts)\s+(?P<reg>\w+),\s+\w+'),
]

def main(args):
    """ Arguments: mcu_name dump_file """
    # Find MCU name hence architecture name & registers from first argument
    try:
        mcu = args[1]
        arch = arches[mcu]
        cmd = ['avr-objdump', '-m', arch] + args[2:]
        reg_dict = regs[arch]

    except IndexError:
        report("""{0}: Not enough arguments

Syntax: {0} <MCU> <avr-objdump argument list>

Examples:
  {0} atmega328 -S main.flash.hex
  {0} atmega328 -D -b binary main.flash.bin
""".format(args[0]))
        return 1

    # Architecture not found or wrong MCU name. Use a blank register list
    except KeyError:
        reg_dict = []
        try:
            report("Register dictionary for '{}' not available yet...".format(arch))
        except UnboundLocalError:
            cmd = ['avr-objdump'] + args[1:]
            report("Unknown MCU name: '{}'".format(mcu))

    # Architecture & reg dictionary are ready, run disassembler, analyze output
    # Examples:
    #   avr-objdump -D -m avr5 -b binary <file.flash.bin>
    #   avr-objdump -D -m avr5 <file.flash.hex>
    p = Popen(cmd, stdout=PIPE, universal_newlines=True)
    try:

        # Parse each line from objdump standard output
        for line in iter(p.stdout.readline, ''):
            matches = decode[0].search(line) or decode[1].search(line)
            try:
                # If there's a match, replace the special register index (in hex)
                # with the matched register name from the register dictionary.
                if matches:
                    # As per the datasheet, registers 0-0x3F are accessed with a
                    # 0x20 offset only with their memory-mapped version (LDS & STS)
                    regid = int(matches['reg'], 0)
                    if matches['op'] in [ 'sts', 'lds' ] and regid < 0x60:
                        regid -= 0x20

                    line = ''.join( [
                        line[:matches.start(0)],
                        re.sub(matches['reg'], reg_dict[regid], matches[0]),
                        line[matches.end(0):]
                    ] )

            # Skip lines where the operand is not a number (e.g. source text
            # interleaved by -S), where no register list is loaded, or where
            # the address has no entry in the register dictionary.
            except (ValueError, IndexError, KeyError):
                pass

            sys.stdout.write(line)

        p.stdout.close()
        return p.wait()

    # We were piped and the pager exited... bail out!
    except BrokenPipeError:
        return 0


if __name__ == '__main__':
    import sys
    sys.exit(main(sys.argv))
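
The address handling is the part of the script that is easiest to get wrong, so here is a minimal sketch of it in isolation. `SAMPLE_REGS` and `register_name` are hypothetical names used only for this illustration, and the table is a small subset of the avr5 dictionary above. It assumes the usual AVR layout where I/O registers 0x00-0x3F appear at data addresses 0x20-0x5F when accessed with LDS/STS.

```python
# Illustrative subset of the avr5 register table from the script above.
SAMPLE_REGS = {
    0x3F: "SREG",
    0x3E: "SPH",
    0x3D: "SPL",
    0x0B: "PORTD",
    0x80: "TCCR1A",
}

def register_name(op, addr, table=SAMPLE_REGS):
    """Return the register name for an operand address, or None if unknown."""
    if op in ("lds", "sts"):
        # Data addresses below 0x20 are the CPU register file, not I/O.
        if addr < 0x20:
            return None
        # Data addresses 0x20-0x5F alias the I/O registers 0x00-0x3F.
        if addr < 0x60:
            addr -= 0x20
    return table.get(addr)

print(register_name("out", 0x3F))   # I/O address used directly
print(register_name("sts", 0x5F))   # same register through its data address
print(register_name("sts", 0x80))   # extended I/O, looked up unchanged
```

The first two calls both resolve to SREG, which is the same rule as the `regid -= 0x20` adjustment in the script.
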
  • You can add arch support by checking the datasheet for your MCU. Every Atmel datasheet has a register table listing numeric register IDs and names; take the first two columns. There is a catch, though: lower registers appear as 0xnn (0xNN), in which case the leftmost ID (0xnn) is the one to use, not the one in parentheses (0xNN). –  Jul 25 '17 at 12:33
  • I see this script is licensed, so I want to properly credit the author(s). Is this Python script hosted somewhere, such as GitHub or SVN? Should I link to this answer? – AgainPsychoX Mar 07 '24 at 11:30
The best disassembler I have ever seen is IDA. It also supports AVR, although I have only used it for x86. I believe the feature you are asking about is one of its core features; maybe that is why this disassembler is named Interactive. http://www.hex-rays.com/idapro/idaproc.htm

Toby Jaffey
If you have an open-source toolchain, and this is really not already available as an option in it, it seems to me it should be fairly easy to provide.

Otherwise, it shouldn't be too hard to do in something that post-processes the output. Basically, you need to express a rule for matching a pattern where an address should be converted to a register name, and provide a lookup table of low addresses to names. This could be done inefficiently with one sed command per register and instruction format, probably with one very complicated bit of sed wizardry per format, or in just about any programming language in which you know how to read and write strings to streams or files.

One of the tricks would be matching enough context to determine that a number such as 0x3F is an address and not an immediate argument.
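
As a rough sketch of that idea in Python: the pattern below only captures a hex operand when it sits in the I/O-address position of `in`, `out`, `sbi` or `cbi`, so the immediate in `ldi r18, 0x3F` is left alone. The names `IO_NAMES`, `IO_OPERAND` and `annotate` are made up for this illustration, the table is a tiny sample, and the line layout is assumed to be what avr-objdump prints.

```python
import re

# Tiny sample lookup table of I/O addresses to names (illustrative only).
IO_NAMES = {0x3F: "SREG", 0x3E: "SPH", 0x3D: "SPL", 0x0B: "PORTD"}

# The mnemonic provides the context: for out/sbi/cbi the address is the
# first operand, for in it is the second operand after a register.
IO_OPERAND = re.compile(
    r'\b(?:(?:out|sbi|cbi)\s+(?P<dst>0x[0-9a-fA-F]+),'
    r'|in\s+r\d+,\s*(?P<src>0x[0-9a-fA-F]+))'
)

def annotate(line):
    """Replace an I/O address operand with its register name, if known."""
    m = IO_OPERAND.search(line)
    if not m:
        return line
    group = "dst" if m.group("dst") else "src"
    name = IO_NAMES.get(int(m.group(group), 16))
    if name is None:
        return line
    # Splice the name over the matched operand only.
    return line[:m.start(group)] + name + line[m.end(group):]

print(annotate("  7e:\t1f be\tout\t0x3f, r1\t; 63"))
print(annotate("  a0:\t2f e3\tldi\tr18, 0x3F\t; 63"))
```

Only the first line is rewritten; the `ldi` line passes through unchanged because nothing in it matches an I/O instruction.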

At an extreme, writing your own disassembler is not necessarily that large an undertaking, and can be an informative one. You can start by making groups of opcodes that use the same format. Usually the binary encoding of a given style of instruction has a few bits for the ALU operation, perhaps a few option bits, and then a number of bits that encode the operand register numbers.
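
To give a feel for how small one such group can be, here is a sketch in Python that decodes just the IN/OUT pair. It relies on the encodings given in the AVR instruction set manual (`1011 0AAd dddd AAAA` for IN and `1011 1AAr rrrr AAAA` for OUT); the function name `decode_io` is hypothetical, and every other instruction is simply reported as `None`.

```python
def decode_io(word):
    """Decode a 16-bit AVR opcode word if it is IN or OUT, else return None."""
    # Both instructions share the top nibble 1011.
    if (word & 0xF000) != 0xB000:
        return None
    # The 6-bit I/O address is split: bits 10-9 hold A5-A4, bits 3-0 hold A3-A0.
    addr = ((word >> 5) & 0x30) | (word & 0x0F)
    # The 5-bit register number sits in bits 8-4.
    reg = (word >> 4) & 0x1F
    # Bit 11 selects the direction: 1 = OUT, 0 = IN.
    if word & 0x0800:
        return "out 0x%02x, r%d" % (addr, reg)
    return "in r%d, 0x%02x" % (reg, addr)

# Opcode words are stored little-endian in flash, so the bytes "1f be"
# form the word 0xBE1F.
word = int.from_bytes(b"\x1f\xbe", "little")
print(decode_io(word))      # out 0x3f, r1
print(decode_io(0xB18B))    # in r24, 0x0b
print(decode_io(0x2411))    # None (this word is an EOR, not IN/OUT)
```

A fuller disassembler would be a list of such mask/pattern pairs, one per instruction format, tried in turn on each word.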

Chris Stratton