Convert byte to string in x86 assembly language

Question

In x86 assembly language, is there any efficient way to convert a byte to a string of binary digits (represented as a byte array of 0s and 1s)? As far as I know, there isn't any 'toString' function in x86 assembly, as in most high-level programming languages.

.stack 2048

.data
theString byte 0, 0, 0, 0, 0, 0, 0, 0 ;store eax as a binary string here.
ExitProcess proto, exitcode:dword 

.code
start:
mov eax, 3;
;now I need to convert eax to a binary string somehow (i.e., a byte array of 0s and 1s)
invoke  ExitProcess, 0
end start

At least it's possible to obtain the first bit from a register in x86 assembly language: http://stackoverflow.com/questions/15238467/get-the-first-bit-of-the-eax-register-in-x86-assembly-language — Anderson Green, Apr 03 '13 at 16:20
If you mean converting e.g. the value 13 to the string "1101" then see my answer for http://stackoverflow.com/questions/15786970/mips-decimal-to-binary-conversion-code-is-working-but-result-must-be-reversed-ho/15787423# It could be done fairly efficiently on x86 with a loop and a `SHL` / `JC` combination. — Michael, Apr 03 '13 at 16:26
@Michael That question discusses MIPS assembly instead of x86 assembly. — Anderson Green, Apr 03 '13 at 16:27
Read the actual answer. It's in no way MIPS-specific and doesn't even contain any MIPS code. — Michael, Apr 03 '13 at 16:28

score 1 · Accepted Answer · answered Apr 05 '13 at 03:44

Was it that hard?:

.data
mystr db 33 dup(0)

.code

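EaxToBinaryString:
    mov     ebx, offset mystr   ; ebx = write pointer (note: ebx is clobbered)
    mov     ecx, 32             ; one iteration per bit of eax
EaxToBinaryString1:
    mov     dl, '0' ; replace '0' with 0 if you don't want an ASCII string
    rol     eax, 1              ; rotate the top bit into the carry flag
    adc     dl, 0               ; dl = '0' + carry, i.e. '0' or '1'
    mov     byte ptr [ebx], dl  ; store the digit
    inc     ebx
    loop    EaxToBinaryString1  ; after 32 rotations eax holds its original value
    ret                         ; mystr[32] is never written, so it stays 0 as a terminator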
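
For readers more at home in C, the same rotate-and-add-carry idea can be sketched for a single byte, which is what the question asks about. This is only an illustration under a few assumptions: the function name byte_to_binary_string is made up here, the caller supplies a 9-byte buffer, and a shift plus a test of the top bit stands in for ROL and the carry flag, since C has no direct access to either.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the answer above): writes the 8 bits of
   `value` into out[0..7] as '0'/'1', most significant bit first, and
   null-terminates at out[8]. `out` must point to at least 9 bytes. */
void byte_to_binary_string(uint8_t value, char *out)
{
    assert(out != 0);
    for (int i = 0; i < 8; i++) {
        /* The top bit plays the role of the carry flag after ROL. */
        unsigned carry = (value >> 7) & 1u;
        out[i] = (char)('0' + carry);   /* same idea as ADC dl, 0 */
        value = (uint8_t)(value << 1);  /* move the next bit to the top */
    }
    out[8] = '\0';
}
```

Calling byte_to_binary_string(3, buf) with a 9-byte buf should leave the string "00000011" in it.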

score 0 · Answer 2 · answered Apr 04 '13 at 14:10

Using SSE intrinsics, one could code this like:

char in[2];
char string[16];
__m128i zeroes = _mm_set1_epi8('0');
__m128i ones = _mm_set1_epi8('1');
/* _mm_set_epi8 lists bytes from highest to lowest, so string[0] receives
   bit 0 of in[0] (least significant bit first); _mm_setr_epi8 reverses the order. */
__m128i mask = _mm_set_epi8(
    (char)0x80, 0x40, 0x20, 0x10, 8, 4, 2, 1,
    (char)0x80, 0x40, 0x20, 0x10, 8, 4, 2, 1);
__m128i val = _mm_set_epi8(
    in[1], in[1], in[1], in[1], in[1], in[1], in[1], in[1],
    in[0], in[0], in[0], in[0], in[0], in[0], in[0], in[0]);

/* 0xff in every byte whose selected bit is clear, 0x00 where it is set */
val = _mm_cmpeq_epi8(_mm_and_si128(val, mask), _mm_setzero_si128());
val = _mm_or_si128(_mm_and_si128(val, zeroes), _mm_andnot_si128(val, ones));
_mm_storeu_si128((__m128i *)string, val);

The code performs the following steps:

replicate each of the 2 input bytes into 8 bytes of the XMM register with _mm_set_epi8()
create a mask that selects a different bit in each byte
extract the bits using a parallel AND
compare the extracted bits with zero (compare-equal). A signed lower-than compare against the mask would misjudge the 0x80 bit, because 0x80 is negative as a signed byte.
the result is an array of bytes, each 0xff if the bit was clear or 0x00 if it was set
select the '0' and '1' characters using that result and combine them
write the resulting byte array out
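
The steps above can also be wrapped into a self-contained function. The following is a sketch rather than a definitive version: the name two_bytes_to_binary_string and the 17-byte output buffer are assumptions made for this example, it needs an x86 target with SSE2, and it uses _mm_setr_epi8 (bytes listed from lowest address to highest) so that the digits come out with in[0] first and the most significant bit first within each byte.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: converts two bytes into 16 ASCII digits plus a
   terminating 0. `out` must point to at least 17 bytes. */
void two_bytes_to_binary_string(const unsigned char *in, char *out)
{
    assert(in != 0 && out != 0);
    const __m128i zeroes = _mm_set1_epi8('0');
    const __m128i ones = _mm_set1_epi8('1');
    /* One mask bit per output character, highest bit first. */
    const __m128i mask = _mm_setr_epi8(
        (char)0x80, 0x40, 0x20, 0x10, 8, 4, 2, 1,
        (char)0x80, 0x40, 0x20, 0x10, 8, 4, 2, 1);
    const char b0 = (char)in[0];
    const char b1 = (char)in[1];
    /* Replicate each input byte across 8 lanes. */
    const __m128i val = _mm_setr_epi8(
        b0, b0, b0, b0, b0, b0, b0, b0,
        b1, b1, b1, b1, b1, b1, b1, b1);
    /* 0xff in lanes whose selected bit is clear, 0x00 where it is set. */
    const __m128i is_clear =
        _mm_cmpeq_epi8(_mm_and_si128(val, mask), _mm_setzero_si128());
    /* Pick '0' for clear lanes and '1' for set lanes. */
    const __m128i digits = _mm_or_si128(
        _mm_and_si128(is_clear, zeroes),
        _mm_andnot_si128(is_clear, ones));
    _mm_storeu_si128((__m128i *)out, digits);
    out[16] = '\0';
}
```

With the input bytes 0x03 and 0xA5, this should produce the string "0000001110100101".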

This does away with shift-and-test sequences, but at the price of the _mm_set*() calls, each of which expands into a sequence of a few SSE instructions. It should still be faster than 16 iterations of a bit-test loop, one per output digit.

Which type of assembly language syntax is this? I don't recognize it. (I usually use MASM syntax, so I'm a bit confused now.) — Anderson Green, Apr 04 '13 at 14:35
Not assembly - compiler _intrinsics_, http://software.intel.com/en-us/articles/how-to-use-intrinsics — FrankH., Apr 04 '13 at 16:50
I.e. the above can be _compiled_ (with a C/C++ compiler and `#include `); the compiler substitutes some of the SSE intrinsics with exactly-matching SSE instructions (`_mm_or...` = `POR`, `_mm_cmplt...` = `PCMPGT` with inverted operands, ...), others evaluate into a small sequence of instructions (the `_mm_set...` ones). In many cases, it's much easier to write and test x86 SIMD code with intrinsics first, and dump into a plain/pure assembly function after ... — FrankH., Apr 04 '13 at 16:57
