Does lossy audio compression damage datasette data?

Question

I grew up with the C64 and had software on cassette tapes. These days, you can find "backups" of this software all over the internet, even in mp3 format (by recording the audio signal with a sound interface and converting it).

Since mp3 is lossy, wouldn't this damage the actual data and render the software useless? I assume the decoder fills in the data that was thrown away during compression, but this is never 100% accurate, no?

Not all the data is needed. I bet you could read all the words in this comment, even if it was a low-quality JPEG. — user253751, Mar 30 '22 at 15:44
MP3 is not necessarily lossy. You can encode an audio stream in MP3 that is lossless. It takes more storage space, of course, but that's the tradeoff you can make as a content encoder. — jwh20, Mar 30 '22 at 16:25
@jwh20 ‒ I think we can all agree that any lossless encoding (with a sufficiently high sample rate) will work. — Michael Graf, Mar 30 '22 at 16:49
The OP makes the assertion that "...mp3 is lossy..." My comment is addressing that misconception only. — jwh20, Mar 30 '22 at 17:04
@jwh20: "You can encode an audio stream in MP3 that is lossless". Can you show an example, please? There's always filtering applied to the signal in MP3 encoding. — scruss, Mar 30 '22 at 20:12
@jwh20 MP3 has always been and still is a lossy compression codec. There has never been lossless MP3 technology. Perhaps you mean something other than MP3? There are plenty of lossless audio compression codecs too. — Justme, Mar 30 '22 at 20:18
MP3's core decisions about what data to throw away come down to the perceptual model trying to figure out which frequencies are inaudible, which frequencies are masked, etc. One would need to look at the details of Datasette encoding to see if it uses frequencies that are likely to be considered masked or inaudible in this context. — Charles Duffy, Mar 30 '22 at 23:18
@jwh20 You can reduce the data loss to a very small amount with the right encoding parameters for certain audio streams in MP3, but it is still lossy regardless. And, unlike MP4, MP3 is not a container format, so you can’t, for example, shove an audio stream using a different compression algorithm (such as FLAC) into it. — Austin Hemmelgarn, Mar 31 '22 at 01:41
Tape recording isn't lossless as well (quite the opposite, indeed) - So what you're discussing likely isn't even relevant. — tofro, Mar 31 '22 at 07:46
@jwh20 MP3 is necessarily lossy. MP3 uses reconstruction filters that will never be bit-exact at any bitrate... and the MP3 spec also has a 320k cap. There are some formats that are capable of being either lossy or lossless depending on bit allocation, but MP3 is provably not one of them. — hobbs, Mar 31 '22 at 13:37
Here's more info from Stanford University, re: MP3 = Lossy Data Compression — ashleedawg, Mar 31 '22 at 14:54
@scruss It's of course lossy if you take the analog signal as the starting point, or a signal containing frequencies mp3 will remove. But one can conceive of an input which will be converted without loss. — vsz, Apr 01 '22 at 06:16
@vsz: After a few round-trips through an MP3 encoder/decoder, you might find a digital input that settles down to bit-exact decode, especially if you start with a very simple signal like a single sine wave. But that's not very interesting, since it's very unlikely for any real-world use-case of a non-trivial signal. The frequencies in the output will be very close, perhaps basically identical, but the exact values of at least some of the 16-bit PCM samples will differ in the low bits. — Peter Cordes, Apr 01 '22 at 20:58

score 16 · Accepted Answer · answered Mar 30 '22 at 20:13

TL;DR:

In short, MP3 is a lossy format that does distort the audio waveform in which the C64 data is stored. But just as you can still follow Morse code or music on a noisy or distorted radio channel, the digital data signal on tape can be transferred and stored with audio equipment, and it can survive a lossy audio compression format, as long as the audio signal does not get too distorted along the way for the C64 tape drive to pick up and restore the digital data signal within some margin of error.

A more in-depth answer:

MP3 is a lossy format, and while it does damage the signal waveform by removing frequency content to make the encoding smaller, the distorted waveform can still be good enough to convey data. It depends on the MP3 encoder settings, obviously, but there are other difficulties in transferring tape data between C64 systems as MP3 audio files.

First of all, once the MP3 encoder has thrown data away, it cannot be filled in or recovered.

But what is thrown away is not C64 data bits; it is information about audio signal energy.

Second, audio energy that was not present in the original signal to begin with will not be encoded. For example, silence needs far fewer bits to encode than speech, music, or white noise.

Third, and most importantly, the encoder decides which energy information to keep and which to throw away by using a psychoacoustic model of what we humans can and cannot hear, and it allocates more bits to the parts that must be reproduced with higher fidelity than to the parts that matter less. So the encoder does not care which parts of the audio waveform are important for transferring C64 data; it only cares that the C64 data sounds similar enough to us humans before and after encoding.

Knowing that the C64 data on audio tape is basically represented as square wave signals with a high or low tone frequency (or rather, a short or long time between edges of the square waves) to represent ones and zeroes, we can analyse how well it might pass through MP3 encoding or get distorted. A simple test of generating a 1 kHz square wave in an audio editor, exporting it as an MP3 file at a fairly low quality of 80-120 kbps, and importing it back reveals some distortion, but the result genuinely looks like a good enough square wave to work. There is slight peaking at the edges and the edges have some ringing, which is not surprising at all given the loss of high frequencies in the MP3 encoding.
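As a rough illustration of why slightly rounded edges need not matter, here is a minimal Python sketch. It assumes a simplified scheme in which each bit is one full square-wave cycle (short for 0, long for 1). The cycle lengths, the one-pole low-pass filter standing in for the bandwidth loss, and all function names are hypothetical; this is not the actual C64 tape format and not an MP3 codec.

```python
# Hypothetical cycle lengths in samples (assumption, not real C64 timings).
SHORT, LONG = 16, 28
THRESHOLD = (SHORT + LONG) // 2  # intervals above this count as "long"


def modulate(bits):
    """Turn bits into a +1/-1 square wave, one full cycle per bit."""
    samples = []
    # One trailing cycle gives the last data bit a closing rising edge.
    for bit in list(bits) + [0]:
        period = LONG if bit else SHORT
        half = period // 2
        samples += [1.0] * half + [-1.0] * (period - half)
    return samples


def low_pass(samples, alpha=0.3):
    """One-pole low-pass filter: a crude stand-in for lost high frequencies."""
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


def demodulate(samples):
    """Recover bits from the time between rising zero crossings."""
    edges, prev = [], 0.0
    for i, cur in enumerate(samples):
        if prev <= 0.0 < cur:
            edges.append(i)
        prev = cur
    return [1 if (b - a) > THRESHOLD else 0 for a, b in zip(edges, edges[1:])]


bits = [1, 0, 1, 1, 0, 0, 1, 0]
recovered = demodulate(low_pass(modulate(bits)))
print(recovered == bits)
```

The filter visibly rounds the edges, but the decoder only looks at when the signal crosses zero, so a small and roughly constant delay on every edge leaves the short/long decision intact.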

So, transmitting C64 data over MP3 files is definitely doable.

The problems lie elsewhere than in the MP3 encoding.

The C64 data stream is a digital square wave signal, not an audio signal, even though other home computers of the time did send analogue audio signals to standard cassette recorders. C64 data is transmitted as digital square waves from the C64 to the tape drive (Datassette), and the write head stores the edges of the fast digital square waveform on the cassette tape as fast magnetic transitions. So again, not audio. Playing the cassette back in an audio tape player that outputs analog audio limits the bandwidth to audio frequencies, so the signal edges are slower. A real C64 Datassette converts the sharp magnetic transitions directly into a digital square wave and sends it back to the C64.

If the data tape is played in an audio tape player and fed into a computer sound card for recording, the result is an audio waveform: a bandwidth-limited square wave with a limited slew rate in what were originally sharp transitions. But at least it can be recorded, stored as MP3 and distributed.

Playing the MP3 back to tape is also problematic. Since the computer plays already bandwidth-limited square waves to the tape recorder, the recorder stores the transitions of the square wave less sharply. Recording audio is more complex than just writing the analog waveform directly to tape as a magnetic signal. While many cheap devices did that, higher-end recorders biased, or modulated, the audio with a 100 kHz sine wave to allow a better recording of the analog waveform. The recorders limited the bandwidth anyway, so the square wave transitions stored on tape are not very sharp, and the sharper they are, the better the tape works in a real C64 Datassette. As long as the square wave edges are reasonably sharp and of high enough amplitude, each one is detected as a transition by the Datassette circuitry and output as a fast, clean transition on the digital signal to the C64. But if the transitions are too slow or too low in amplitude, the transition timing may be distorted, or a transition may not be detected at all.

And finally, the reason why lossy MP3 will not distort the waveform beyond recognition is that the signal is essentially just square waves of two different frequencies. A square wave only has energy at its base frequency and at the odd overtones. The more overtones that can be stored, the better the square wave looks, but it is still limited by tape bandwidth. As tapes don't generally go above 15 kHz, and the MP3 format also starts attenuating high frequencies above around 16 kHz, a 1 kHz square wave, for example, needs only 8 frequency peaks in an MP3 file (1, 3, 5, ... 15 kHz). So a recording of C64 tape data has energy at only a few sharp frequency peaks, and the rest of the frequency band is unused, so the MP3 encoder can spend all its bits on encoding that sparse energy content quite faithfully.
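The harmonic count can be checked with a few lines of Python. This is only a sketch of the arithmetic: it lists the odd harmonics of a 1 kHz fundamental up to an assumed 16 kHz cutoff and evaluates the truncated Fourier series of an ideal square wave at the middle of its flat top. The function names are made up for illustration.

```python
import math


def odd_harmonics(fundamental_hz, cutoff_hz):
    """Frequencies of the odd harmonics at or below the cutoff."""
    harmonics = []
    n = 1
    while n * fundamental_hz <= cutoff_hz:
        harmonics.append(n * fundamental_hz)
        n += 2
    return harmonics


def band_limited_square(t_seconds, fundamental_hz, cutoff_hz):
    """Fourier series of a unit square wave, truncated at the cutoff."""
    total = 0.0
    for f in odd_harmonics(fundamental_hz, cutoff_hz):
        n = f / fundamental_hz
        total += math.sin(2 * math.pi * f * t_seconds) / n
    return 4 / math.pi * total


peaks = odd_harmonics(1000, 16000)
print(len(peaks), peaks)

# Level in the middle of the positive half-cycle (a quarter period in).
print(band_limited_square(0.00025, 1000, 16000))
```

With only those eight components the flat top already sits close to the ideal level of 1, which matches the observation that the re-imported waveform still looks like a usable square wave.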

If a square wave is processed in ways that affect the phases of the different harmonics differently, it may have zero crossings in very different places from the original. If, for example, the 5th, 9th, and 13th harmonics were inverted, and everything beyond the 15th omitted, the signal would have extra zero crossings near what should be the centers of the flat portions. — supercat, Mar 30 '22 at 20:59
So... having said all that... does lossy audio compression damage datasette data or not? I.e., does it work? Are all of these effects damaging a significant amount of bits or not? — AnoE, Mar 31 '22 at 08:20
@AnoE So did you read the TLDR part of my answer? In short, it can work, or it can't, depending on MP3 compression settings you set, and the analogue tape equipment you use to read and write tapes. The recorded tapes may or may not work in the C64 datassette, and it may not even have anything to do with MP3 compression. — Justme, Mar 31 '22 at 16:02
I'm pretty sure that the 100 kHz tape bias is necessary to record sound (the magnetic domains are too small to encode the full range of audio frequencies, IIRC - though that possibly depends on the particular cassette type, e.g. ferric, metal, etc). — Toby Speight, Apr 01 '22 at 12:22
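supercat's point about phase can be tried numerically. The sketch below, in Python, builds a square wave from its odd harmonics up to the 15th, then flips the sign of the 5th, 9th and 13th, and counts sign changes over one period. It only illustrates that specific thought experiment; it says nothing about what a real MP3 encoder does to phase, and the function names are hypothetical.

```python
import math


def waveform(t, inverted=()):
    """Sum of odd harmonics 1..15 with 1/n amplitudes; some may be inverted."""
    total = 0.0
    for n in range(1, 16, 2):
        sign = -1.0 if n in inverted else 1.0
        total += sign * math.sin(n * t) / n
    return total


def zero_crossings_per_period(inverted=(), steps=10000):
    """Count sign changes over one period, sampled between grid points."""
    values = [
        waveform((i + 0.5) * 2 * math.pi / steps, inverted)
        for i in range(steps)
    ]
    # values[i - 1] wraps around for i == 0, so the period is treated as cyclic.
    return sum(
        1 for i in range(steps) if (values[i - 1] < 0) != (values[i] < 0)
    )


print(zero_crossings_per_period())
print(zero_crossings_per_period(inverted=(5, 9, 13)))

# Value at the centre of what should be the flat positive half-cycle.
print(waveform(math.pi / 2, inverted=(5, 9, 13)))
```

The unmodified sum crosses zero only at the two edges of each cycle. With the three harmonics inverted, the value at the centre of the "flat" part dips just below zero, which is where the extra crossings come from; a decoder that times edges would see those as additional pulses.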

score 15 · Answer 2 · answered Mar 30 '22 at 15:47

As I understand the Datasette, it is demodulating a binary stream from two quite distinct audio frequencies recorded on the cassette tape. As such, the loss of frequency fidelity would have to be extreme before the two frequencies became close enough to no longer be distinct. So long as your compression is not that lossy, the binary data is still perfectly preserved.

Brian H

Since there are only two frequencies, the FFT of them is quite clear. What normally happens is that frequencies below a certain threshold are removed. But there are only three choices here: two big signals of two frequencies and the rest zero. The frequency transform in the MP3 will extract very simple data, which can be encoded quite efficiently. I think the only way to get damage is to start from a faulty tape. – chthon Mar 30 '22 at 16:16
That's true for Commodore's standard encoding, but does it also hold for fastloaders? If I recall correctly, some of them use only one half-wave per frequency before switching -- I haven't had time to work out what that means for the FFT. – Michael Graf Mar 30 '22 at 16:28
@MichaelGraf Good point. Tape fast loaders are a dark mystery to me. – Brian H Mar 30 '22 at 16:46
A half pulse is nothing special, it just means half a pulse compared to the original pulse. So fast-load data should be just as recordable and transferable as MP3 as standard data is, unless the data rate is so fast that audio equipment such as standard cassette players, recorders and audio interfaces don't have the bandwidth to handle it. – Justme Mar 30 '22 at 20:26
Oh and the Datassette is not sensitive to frequencies and it does not demodulate any frequencies. It is as simple as magnetic transition edge on tape generating a small low-going or high-going voltage step on the tape head and then it is just conditioned and amplified back to a low or high going digital signal sent to C64. – Justme Mar 30 '22 at 20:56
Since MP3 compression works chiefly by removing inaudible frequencies due to masking, the use of only two distinct frequencies would mean that MP3 compression is likely to remove analog noise. – MSalters Apr 01 '22 at 11:38

score 11 · Answer 3 · answered Mar 30 '22 at 18:31

In the world of Spectrum emulation, at least, there was an initial reluctance to use a lossy compression method. Various people experimented and found that as long as the bitrate of the MP3 was sufficiently high (I have seen 192 kbps quoted), the quality was sufficient to allow loading cassette data with standard or turbo loaders.

john_e

192,000 bps for 892.8 bps of raw data seems like a huge waste. I wonder if a FLAC file could hold the same data in fewer bits because it's non-lossy. – snips-n-snails Mar 30 '22 at 20:53
@snips-n-snails I suspect FLAC would be worse because the compression doesn't know the difference between signal and noise. What you really need is a custom compression with baked-in knowledge about what is significant, similar to speech codecs as opposed to general audio codecs. – ssokolow Mar 30 '22 at 21:06
@ssokolow What I would suspect is converting the file to pure binary data, and then having some sort of "tape emulator" that would read that data and say it came from the tape. – trlkly Mar 31 '22 at 03:23
@trlkly: Yes, demodulating back to the original binary data (and maybe using zstd or lzma on that) would be the correct and optimal way to implement such a codec, if you knew exactly what encoding it used. And would maybe be useful for getting data in/out of old machines to edit on a modern desktop. But if different save/load code on the same machine could use different formats (other comments mentioned a "fastloader"), you'd either have to autodetect them, or use a format that would end up with some redundancy for less efficient encodings but could still round-trip every relevant modulation – Peter Cordes Mar 31 '22 at 04:18
Again sticking to the Spectrum community: various file formats have been created to encode cassette data, such as the fairly generic Compressed Square Wave and the more Spectrum-specific TZX and PZX. The latter are able to store the original binary data with the necessary timing / encoding information to recreate the waveform, falling back to a generic count of times between edges if the encoding scheme is not known. – john_e Mar 31 '22 at 11:55
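The "generic count of times between edges" idea mentioned in the last comment can be sketched in a few lines of Python. This is not the CSW, TZX or PZX file layout; it only shows the underlying principle of replacing a sampled square wave with a list of pulse lengths and rebuilding it afterwards. The function names and the +1/-1 sample convention are assumptions made for the example.

```python
def to_pulse_lengths(samples):
    """Reduce a sampled square wave to its starting level and run lengths."""
    if not samples:
        return True, []
    first_high = samples[0] >= 0
    lengths = []
    level, run = first_high, 0
    for s in samples:
        cur = s >= 0
        if cur == level:
            run += 1
        else:
            lengths.append(run)  # an edge: close the current pulse
            level, run = cur, 1
    lengths.append(run)
    return first_high, lengths


def from_pulse_lengths(first_high, lengths):
    """Rebuild an ideal +1/-1 square wave from the pulse lengths."""
    samples, level = [], first_high
    for n in lengths:
        samples += [1.0 if level else -1.0] * n
        level = not level
    return samples


wave = [1.0] * 3 + [-1.0] * 5 + [1.0] * 2 + [-1.0] * 4
first_high, lengths = to_pulse_lengths(wave)
print(first_high, lengths)
print(from_pulse_lengths(first_high, lengths) == wave)
```

Because only the edge timing is kept, a noisy or distorted recording goes in and a clean square wave with the same timing comes out, and the list of lengths is far smaller than the sampled audio.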

score 6 · Answer 4 · answered Mar 31 '22 at 10:37

From a real-world point of view, I used a tool to convert the binary snapshot of Acorn Electron cassette data back to an audio signal (in lossless WAV PCM format).

I then attempted to put this onto a 1st generation iPod touch and load it into a real Acorn Electron. My first attempts failed, presumably due to the lossy AAC compression (the files were automatically converted to AAC as part of the process of getting them onto the iPod). No amount of fiddling with the playback of the file made a difference to the Electron - it never showed any signs of picking up any signal from the fake cassette.

I then converted the WAVs to lossless AAC and tried again. Immediately the Electron picked up the first block of data, but it was slightly corrupted and didn't load. When I adjusted the tone of the playback via the equaliser settings, I was easily able to load the files without any issues at all.

This might not directly equate to the Commodore 64, and maybe the problem was not lossy compression, but I was pretty certain that was the case, as the results were so stark.

Modern equipment (even a first-gen iPod is considered modern in this context) with headphone jacks is generally providing lower-amplitude output than equipment did back then (to protect teenage ears). With a bit of amplification, lossy AAC might have worked perfectly well. — tofro, Mar 31 '22 at 13:38
I would not be surprised if lossy audio codecs were to mangle the phase information of the wavelets, which would make it unreadable. — Jasen, Mar 31 '22 at 22:45

score 4 · Answer 5 · answered Mar 31 '22 at 04:09

Analog tape recording and playback itself is a lossy process. So is converting an analog waveform to digital PCM samples, even when stored in a “lossless” digital format.

So the real issue is whether the tape input circuit and decoder logic are robust enough to handle typical channel losses and distortion.

It’s quite possible that high bit-rate mp3 encoding and computer “sound card” playback could add less distortion than analog tape player noise and channel response.

hotpaw2

One part of the problem is that the C64 Datassette has digital connection to C64. So in order to even transfer data between a cassette and a PC sound card, you have to use a standard audio cassette player and recorder which works with analog audio. C64 has no analogue tape interfaces. – Justme Mar 31 '22 at 11:41

score 3 · Answer 6 · answered Mar 31 '22 at 13:33

Wouldn't this damage the actual data and render the software useless?

No, or at least not necessarily in all cases.

The original audio may encode so few bits per second that even a lossy compression of this audio data will not change these few coded bits in a significant way. One way to get an intuition for this is to compare how many bytes are in the final mp3 with how many bytes are in the coded data. You may conclude that the mp3 is in reality a hugely wasteful way of storing this data.
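To put rough numbers on that comparison, here is a small Python sketch. The two rates are not measurements: 192 kbps is the MP3 bitrate quoted in another answer on this page, and 892.8 bps is the raw data rate quoted in a comment there. The function names and the 60-second duration are arbitrary choices for the example.

```python
MP3_BITRATE_BPS = 192_000   # bitrate quoted elsewhere on this page
RAW_DATA_RATE_BPS = 892.8   # raw tape data rate quoted in a comment


def overhead_ratio(audio_bps, data_bps):
    """How many audio bits are spent per bit of actual payload."""
    return audio_bps / data_bps


def size_bytes(rate_bps, seconds):
    """Storage needed for a stream of the given rate and duration."""
    return rate_bps * seconds / 8


ratio = overhead_ratio(MP3_BITRATE_BPS, RAW_DATA_RATE_BPS)
print(f"audio bits per data bit: {ratio:.0f}")
print(f"one minute as MP3:  {size_bytes(MP3_BITRATE_BPS, 60):.0f} bytes")
print(f"one minute of data: {size_bytes(RAW_DATA_RATE_BPS, 60):.0f} bytes")
```

With these figures the MP3 spends a couple of hundred bits for every bit of payload, which is why the encoder has so much room to preserve the few features that matter.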

In a way, a lossy mp3 of a cassette is not so different from a radio audio recording. Neither is identical to the original audio, but both can be used to obtain the original coded data.
