I am having trouble understanding how this LZSS compressor assigns dictionary match references. This particular LZSS algorithm has a 4096-byte dictionary that is initialized to all zeros. The first index used is 4078 (0xFEE), and match lengths run from 3 to 18. Each data chunk is a flag byte followed by 8 objects, where an object is either a 1-byte literal or a 2-byte reference/length pair.
Each bit of the flag byte determines whether the corresponding object is a literal or a reference/length pair. As an example, 0xEF (0b11101111) means the fifth object will be a dictionary reference and all the other objects will be literals.
For the reference/length pair, 0xEC F0 means the dictionary reference is 0xFEC and the length is 3 (0 + 3).
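Here is how I am reading the flag byte and the reference/length pair in Python (the LSB-first bit order and the nibble layout are my own assumptions based on the examples below):

def decode_pair(b1, b2):
    # Assumed layout: b1 = low 8 bits of the dictionary offset,
    # high nibble of b2 = top 4 bits of the offset,
    # low nibble of b2 = match length minus 3
    offset = ((b2 & 0xF0) << 4) | b1
    length = (b2 & 0x0F) + 3
    return offset, length

# Flag byte 0xEF = 0b11101111, read LSB-first with 1 = literal:
# bit 4 is clear, so the fifth object is a reference and the rest are literals
flags = 0xEF
print([(flags >> i) & 1 for i in range(8)])         # [1, 1, 1, 1, 0, 1, 1, 1]
print([hex(v) for v in decode_pair(0xEC, 0xF0)])    # ['0xfec', '0x3']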
This set of data:
00 0A 7C 8D 00 00 00 03 00 5B
Compresses into
EF 00 0A 7C 8D EC F0 03 00 5B
This 100% makes sense to me. Dictionary positions 0xFEC, 0xFED, and 0xFEE all contain zeros at that point.
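To show why, here is a small trace of how I think the dictionary fills up while that chunk decodes (a sketch, assuming every output byte is written into the 4096-byte ring buffer starting at index 0xFEE):

N = 4096
dictionary = bytearray(N)    # starts out as all zeros
pos = 0xFEE                  # 4078, the first index used

def put(b):
    global pos
    dictionary[pos] = b
    pos = (pos + 1) % N

for b in (0x00, 0x0A, 0x7C, 0x8D):    # four literals land at 0xFEE..0xFF1
    put(b)

# Reference EC F0 -> offset 0xFEC, length 3.
# 0xFEC and 0xFED still hold the initial zeros and 0xFEE holds the
# literal 0x00 written above, so the copy produces 00 00 00.
for i in range(3):
    put(dictionary[(0xFEC + i) % N])

for b in (0x03, 0x00, 0x5B):          # remaining three literals
    put(b)

print(hex(pos))    # 0xff8, so the next chunk starts writing at 0xFF8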
The next bytes to be encoded are
00 5C 00 5D 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 E0
Which it compresses into
CF 00 5C 00 5D DC FF E5 F7 01 E0
Why is it DC FF E5 F7 (reference 0xFDC length 18, reference 0xFE5 length 10) and not DD FF 0D 07? I thought LZSS was supposed to work backwards. The encoder is ignoring the 0 stored at reference 0xFEE for the first 18 zeros. I have seen other cases where runs of 18 zeros do reference previously written positions. I have even run someone else's implementation and got the same result I did.
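To double-check my reading of the format, here is a minimal decoder sketch built on the same assumptions as above. At least under those assumptions it expands both compressed chunks back to the original bytes, so the output looks valid; I just don't understand how the offsets are chosen:

N = 4096

def decode(data):
    dictionary = bytearray(N)    # assumed zero-initialized
    pos = 0xFEE                  # assumed first write index
    out = bytearray()
    it = iter(data)
    for flags in it:
        for bit in range(8):     # assumed: flag bits used LSB-first, 1 = literal
            if (flags >> bit) & 1:
                b = next(it)
                out.append(b)
                dictionary[pos] = b
                pos = (pos + 1) % N
            else:
                b1, b2 = next(it), next(it)
                offset = ((b2 & 0xF0) << 4) | b1
                length = (b2 & 0x0F) + 3
                for i in range(length):
                    b = dictionary[(offset + i) % N]
                    out.append(b)
                    dictionary[pos] = b
                    pos = (pos + 1) % N
    return bytes(out)

compressed = bytes.fromhex("EF 00 0A 7C 8D EC F0 03 00 5B"
                           "CF 00 5C 00 5D DC FF E5 F7 01 E0")
original = (bytes.fromhex("00 0A 7C 8D 00 00 00 03 00 5B 00 5C 00 5D")
            + bytes(28) + bytes.fromhex("01 E0"))
print(decode(compressed) == original)    # True under these assumptions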
UPDATE
This algorithm is completely baffling. The bytes to be encoded at 0xB36 are 18 zeros, and the encoder pulls them from 0x1DF to 0x1F0. The very next bytes to be encoded (at 0xB48) are 5 zeros, and it goes all the way back to 0xFE9 through 0xFEE to pull them. Where the dictionary references are pulled from seems completely arbitrary.