In x86 assembly language, is there any efficient way to convert a byte to a string of binary digits (represented as a byte array of 0s and 1s)? As far as I know, there isn't any 'toString' function in x86 assembly, as in most high-level programming languages.

.stack 2048

.data
theString byte 0, 0, 0, 0, 0, 0, 0, 0 ;store eax as a binary string here.
ExitProcess proto, exitcode:dword 

.code
start:
mov eax, 3;
;now I need to convert eax to a binary string somehow (i. e., a byte array of 0s and 1s)
invoke  ExitProcess, 0
end start
Anderson Green
  • At least it's possible to obtain the first bit from a register in x86 assembly language: http://stackoverflow.com/questions/15238467/get-the-first-bit-of-the-eax-register-in-x86-assembly-language – Anderson Green Apr 03 '13 at 16:20
  • If you mean converting e.g. the value 13 to the string "1101" then see my answer for http://stackoverflow.com/questions/15786970/mips-decimal-to-binary-conversion-code-is-working-but-result-must-be-reversed-ho/15787423# It could be done fairly efficiently on x86 with a loop and a `SHL` / `JC` combination. – Michael Apr 03 '13 at 16:26
  • @Michael That question discusses MIPS assembly instead of x86 assembly. – Anderson Green Apr 03 '13 at 16:27
  • Read the actual answer. It's in no way MIPS-specific and doesn't even contain any MIPS code. – Michael Apr 03 '13 at 16:28

2 Answers

Was it that hard?

.data
mystr db 33 dup(0)

.code

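EaxToBinaryString:
    ; In: EAX = value. Out: mystr = 32 digits, most significant bit first.
    ; Clobbers EBX, ECX and DL; EAX is unchanged after 32 rotations.
    mov     ebx, offset mystr
    mov     ecx, 32
EaxToBinaryString1:
    mov     dl, '0' ; replace '0' with 0 if you don't want an ASCII string
    rol     eax, 1  ; the most significant bit moves into the carry flag
    adc     dl, 0   ; dl = '0' + carry
    mov     byte ptr [ebx], dl
    inc     ebx
    loop    EaxToBinaryString1
    ret     ; the 33rd byte of mystr stays 0, so the string is NUL-terminated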
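
For checking the assembly routine against a known-good result, the same shift-and-carry idea can be sketched in C. This is only a reference sketch, not part of the answer's code: the function name u32_to_binary_string and the 33-byte output buffer are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical reference helper (the name is an assumption):
   writes 32 ASCII digits, most significant bit first, then a NUL. */
void u32_to_binary_string(uint32_t value, char out[33])
{
    for (int i = 0; i < 32; i++) {
        /* Same idea as ROL + ADC: take the top bit and add it to '0'. */
        out[i] = (char)('0' + (value >> 31));
        value <<= 1;  /* move the next bit into the top position */
    }
    out[32] = '\0';
}
```

Calling it with the value 3 from the question should fill the buffer with thirty '0' characters followed by "11".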
Alexey Frunze

Using SSE2 intrinsics, one could write this as:

#include <emmintrin.h>  /* SSE2 intrinsics */

char in[2];       /* the two input bytes */
char string[16];  /* 16 output digits, no NUL terminator */
__m128i zeroes = _mm_set1_epi8('0');
__m128i ones = _mm_set1_epi8('1');
/* _mm_set_epi8 takes its arguments from byte 15 down to byte 0, so byte 0
   tests bit 7 and each input byte is written most significant bit first. */
__m128i mask = _mm_set_epi8(
    1, 2, 4, 8, 0x10, 0x20, 0x40, (char)0x80,
    1, 2, 4, 8, 0x10, 0x20, 0x40, (char)0x80);
__m128i val = _mm_set_epi8(
    in[1], in[1], in[1], in[1], in[1], in[1], in[1], in[1],
    in[0], in[0], in[0], in[0], in[0], in[0], in[0], in[0]);

/* 0xFF where the tested bit is clear, 0x00 where it is set */
val = _mm_cmpeq_epi8(_mm_and_si128(val, mask), _mm_setzero_si128());
val = _mm_or_si128(_mm_and_si128(val, zeroes), _mm_andnot_si128(val, ones));
_mm_storeu_si128((__m128i *)string, val);

The code performs the following steps:

  • replicate each of the 2 input bytes into eight bytes of the XMM register, _mm_set_epi8()
  • create a mask to extract a different bit from each byte
  • bit extract using parallel and
  • compare (equal) the extracted bit with zero.
    the result is an array of bytes that are 0xff if the bit was clear, or 0x0 if it was set.
  • select the '0' and '1' characters using that mask, combine them.
  • write the resulting byte array out

This does away with shift-and-test sequences, but at the price of the _mm_set*() calls, each of which expands into a sequence of a few SSE instructions. It should still be faster than 16 iterations of a bit-test loop.
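
Since the question asks about a single byte, here is a minimal sketch of the same mask-and-compare idea cut down to 8 digits. The function name byte_to_binary_string_sse2 and the 9-byte output buffer are assumptions made for illustration. It also swaps the and/andnot select for an add: a compare result of 0xFF is -1, so adding it to '1' gives '0'.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper (the name is an assumption): writes the 8 binary
   digits of `value`, most significant bit first, followed by a NUL. */
void byte_to_binary_string_sse2(uint8_t value, char out[9])
{
    /* Arguments run from byte 15 down to byte 0, so byte 0 tests bit 7.
       The upper eight bytes are unused and never stored. */
    const __m128i mask = _mm_set_epi8(0, 0, 0, 0, 0, 0, 0, 0,
                                      1, 2, 4, 8, 0x10, 0x20, 0x40, (char)0x80);
    __m128i v = _mm_set1_epi8((char)value);   /* broadcast the input byte */
    /* 0xFF (-1) where the tested bit is clear, 0x00 where it is set */
    __m128i clear = _mm_cmpeq_epi8(_mm_and_si128(v, mask), _mm_setzero_si128());
    /* '1' + (-1) = '0' for clear bits, '1' + 0 = '1' for set bits */
    __m128i digits = _mm_add_epi8(_mm_set1_epi8('1'), clear);
    _mm_storel_epi64((__m128i *)out, digits); /* store only the low 8 bytes */
    out[8] = '\0';
}
```

With the value 13 this should produce "00001101". The store writes only the low 64 bits of the register, so the buffer needs just 8 bytes plus the terminator.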

FrankH.
  • Which type of assembly language syntax is this? I don't recognize it. (I usually use MASM syntax, so I'm a bit confused now.) – Anderson Green Apr 04 '13 at 14:35
  • Not assembly - compiler _intrinsics_, http://software.intel.com/en-us/articles/how-to-use-intrinsics – FrankH. Apr 04 '13 at 16:50
  • I.e. the above can be _compiled_ (with a C/C++ compiler and `#include <emmintrin.h>`); the compiler substitutes some of the SSE intrinsics with exactly-matching SSE instructions (`_mm_or...` = `POR`, `_mm_cmplt...` = `PCMPGT` with inverted operands, ...), others evaluate into a small sequence of instructions (the `_mm_set...` ones). In many cases, it's much easier to write and test x86 SIMD code with intrinsics first, and dump into a plain/pure assembly function after ... – FrankH. Apr 04 '13 at 16:57