Hacker News

> If your DSP callback executes in 1ms 99.99% of the time but sometimes takes 10ms, you’re hosed.

I tend to agree, but...

From my recollection of using Zoom-- it has this bizarre but workable recovery method for network interruptions. Either the server or the client keeps some amount of the last input audio in a buffer. Then if the server detects connection problems at time 't', it grabs the buffer from t - 1 seconds all the way until the server detects better connectivity. Then it starts a race, playing the buffered audio back to all clients at something like 1.5x speed until it catches up with real time. From what I remember, this algorithm typically wins the race and saves the speaker from having to repeat themselves.
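The catch-up arithmetic is easy to model, if I'm right about how it works: playing at speed s > 1 drains backlog at (s - 1) seconds of buffered audio per second of wall-clock time. A quick sketch (the function name is mine, and the 1.5x figure is from memory):

```python
def catchup_time(backlog_s: float, speed: float) -> float:
    """Wall-clock seconds needed to drain a playback backlog.

    Playing at `speed` (> 1.0) consumes backlog at (speed - 1.0)
    seconds of buffered audio per second of real time.
    """
    if speed <= 1.0:
        raise ValueError("catch-up requires playback speed > 1.0")
    return backlog_s / (speed - 1.0)

# a 1 s backlog at 1.5x speed drains in 2 s of wall-clock time
```

So even a modest speed-up closes the gap quickly, which matches my experience of the feature.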

That's not happening inside a DSP routine. But my point is that some clever engineer(s) at Zoom realized that missing deadlines in audio delivery does not necessarily mean "hosed." I'm also going to rankly speculate that every other video conferencing tool hard-coupled missing deadlines with "hosed," and that's why Zoom is the only one where I've ever experienced the benefit of that feature.



The context for this article is writing pro audio software, where that kind of distortion would generally be as bad as a dropout, if not worse.


Yeah, 5ms is the threshold for noticeability as far as latency goes in pro audio. It's like frame rate for pro gamers. The problem is your target user is highly specialized, outside the norms by a large margin. What makes audio even more difficult is that sub-ms issues can cause phase and frequency distortion that can become even more noticeable than latency alone.


1. you do not need to be a highly specialized target user to detect latency between pressing a key on a MIDI keyboard and the corresponding sound being produced.

2. 3ms is typical in-air latency between a typical DAW user and their near-field monitors, so claims about sensitivity to times much lower than 5msec should be taken with some skepticism

3. In live contexts, many drum + bass pairings have more than 10ms of air latency between them, so ditto #2

4. On the other hand, no good reason to add to latency

5. For performance purposes, jitter is much worse than latency. Pipe organ players rapidly learn to deal with even whole seconds of latency, but almost nobody can deal with jitter (essentially, variable, unpredictable latency)

6. There are no sub-ms issues that will cause phase and frequency distortion. Those come from DSP errors, not handling of latency, which is just about always a constant, fixed feature of the data signal path. You may be thinking of stuff like comb filtering, but this is not related to the latency in the signal path in a correct setup.
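The in-air numbers in #2 and #3 are easy to sanity-check: sound travels at roughly 343 m/s in room-temperature air, so latency is just distance over that speed. A quick sketch (function name is mine):

```python
SPEED_OF_SOUND_M_S = 343.0  # in air at ~20 degrees C

def air_latency_ms(distance_m: float) -> float:
    """Acoustic travel time from source to listener, in milliseconds."""
    return distance_m / SPEED_OF_SOUND_M_S * 1000.0

# ~1 m to near-field monitors -> roughly 3 ms (#2)
# ~3.4 m between drummer and bassist -> roughly 10 ms (#3)
```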


The "MIDI timing" problem was often a combination of MIDI traffic limitations with limited CPU in the receiver.

What started off as a four note chord would be smeared out a little by MIDI, especially in the early days until everyone worked out that putting MIDI for an entire studio down a single cable was a bad idea.
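The smearing is a direct consequence of the wire format: classic DIN MIDI runs at 31250 baud, each byte costs 10 bits on the wire (start bit, 8 data bits, stop bit), and a note-on message is 3 bytes. A rough sketch of the arithmetic (ignoring running status, which would shave a byte off subsequent messages):

```python
BAUD = 31250          # classic DIN MIDI bit rate
BITS_PER_BYTE = 10    # start bit + 8 data bits + stop bit
NOTE_ON_BYTES = 3     # status, note number, velocity

def message_time_ms(n_bytes: int = NOTE_ON_BYTES) -> float:
    """Wire time to serialize one MIDI message, in milliseconds."""
    return n_bytes * BITS_PER_BYTE / BAUD * 1000.0

def chord_smear_ms(n_notes: int) -> float:
    """Spread between first and last note-on of a chord sent back-to-back."""
    return (n_notes - 1) * message_time_ms()

# each note-on takes ~0.96 ms; a 4-note chord smears by ~2.9 ms
```

Multiply that by a studio's worth of gear on one daisy chain and the smear gets audible fast.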

Then you'd get some more smearing in the target synth CPU as the incoming notes were parsed. Then perhaps some more delay for each note, because it took a while to send trigger and pitch messages to the hardware. Even more if there were software envelopes involved and they had to be initialised.

This is still a problem with VSTs, on a smaller scale. There's some finite amount of processing that has to be done before sound starts being generated. Usually it's not very much, but there's always the possibility that two notes that should start in the same 5ms buffer slot will be spread across two of them because one note is just a little too late.

This isn't as objectionable as glitching, but it can still affect the timing feel, and - depending on the patch design - cause phasing effects between the notes.


1. MIDI traffic limitations are rarely the issue. The chord smearing that some people claim to be able to hear is not because of traffic but because the protocol is a serialized stream of individual note on/note off messages, and thus by definition there is no possible way for every message to arrive at the same time. However, the actual delays between a set of note on messages caused by the protocol are small enough that they are in the same range as human performance on both keyboards and string instruments. Note that MIDI has no collision detection or ACK-style replies, and you do not use "a single cable" for MIDI unless you have only 1 sender and 1 receiver. If it is a DAW sending "a lot" of MIDI to some external MIDI hardware, the only issues arise if the total amount of data to be sent exceeds the serial capacity of the hardware layer. This is not impossible to make happen, but even so-called black MIDI faces a challenge when doing this, even with classic (DIN) serial MIDI.

2. "parsing incoming notes" does not cause more smearing. Block-sized processing of audio causes a delay which is the "performance latency" that people complain about. It does not change the ordering or interval between note onsets.

3. the "finite amount of processing that has to be done before sound starts being generated" is irrelevant in a block processing architecture (which is used these days by all DAWs and all plugin APIs). As long as the plugin gets its work done within the time represented by the block, there is no additional latency caused by the plugin. If it doesn't, then there's a click anyway.

4. "there's always the possibility that two notes that should start in the same 5ms buffer slot will be spread across two of them". No, there isn't. If that happens, that's a coding error in the plugin host, the plugin, or both. But also, time is continuous. If the notes are supposed to be 3msec apart, it doesn't matter if they are 3msec apart within the same buffer/process cycle, or in two consecutive ones.
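Point #4 is easy to illustrate: hosts schedule events by absolute sample time, and mapping that to a (block index, offset within block) pair is just integer division. Two notes can land in different blocks while keeping exactly the same sample spacing. A minimal sketch (the names and the 256-frame block size are my own choices):

```python
def block_position(sample_time: int, block_size: int) -> tuple[int, int]:
    """Map an absolute sample time to (block index, offset within block)."""
    return divmod(sample_time, block_size)

SR = 48_000   # sample rate, Hz
BLOCK = 256   # frames per process cycle (~5.3 ms at 48 kHz)

# two notes 3 ms apart: 3 ms = 144 samples at 48 kHz
a = block_position(200, BLOCK)  # lands in block 0, offset 200
b = block_position(344, BLOCK)  # lands in block 1, offset 88
# different blocks, yet still exactly 144 samples (3 ms) apart
```

The blocks are just a processing convenience; the timeline the notes live on is continuous in exactly the sense the comment describes.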



