
How necessary is the AI for this? At least targeting sounds in the line of sight should be fairly easy to do without AI, but I don't know about the human voice identification.


> but I don’t know about the human voice identification.

> The headphones send that signal to an on-board embedded computer, where the team’s machine learning software learns the desired speaker’s vocal patterns

Their "AI" is good ol dumb machine learning



Depends on how you do it.

If you have good eye tracking, a microphone array, and decent object tracking on your AR glasses, then you don't really need much "AI" (i.e. you have access to https://facebookresearch.github.io/projectaria_tools/docs/AR...).

But it's not quite possible to do it all on-device yet. However, it's not far off.
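
Roughly: with known mic positions and a gaze vector from the eye tracker, steering a beam is plain geometry plus delay-and-sum, no learning required. A minimal sketch, assuming a hypothetical 4-mic array at 16 kHz and a far-field plane-wave model (positions and names are made up for illustration, not from Project Aria's actual API):

    import numpy as np

    SPEED_OF_SOUND = 343.0  # m/s
    FS = 16_000             # sample rate, Hz

    # Hypothetical 4-mic linear array; positions in metres
    # relative to the array centre.
    mic_positions = np.array([
        [-0.06, 0.0, 0.0],
        [-0.02, 0.0, 0.0],
        [ 0.02, 0.0, 0.0],
        [ 0.06, 0.0, 0.0],
    ])

    def delay_and_sum(frames, gaze):
        """frames: (num_mics, num_samples); gaze: vector toward the target."""
        gaze = np.asarray(gaze, float)
        gaze = gaze / np.linalg.norm(gaze)
        # Far field: each mic's relative delay is the projection of its
        # position onto the look direction, divided by the speed of sound.
        delays_s = mic_positions @ gaze / SPEED_OF_SOUND
        delays_s -= delays_s.min()            # make all delays non-negative
        shifts = np.round(delays_s * FS).astype(int)
        n = frames.shape[1]
        out = np.zeros(n)
        for ch, shift in enumerate(shifts):
            # Delay each channel so wavefronts arriving from the gaze
            # direction line up, then average.
            out[shift:] += frames[ch, : n - shift]
        return out / len(shifts)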


Directional mics were a toy 30 years ago, but an AI that can pick out a single voice and isolate it for you is quite the contemporary achievement.


Yeah, I'm not really sure what's going on here. Sonar has been using ML classifiers for decades, but afaik stream splitting with 100% confidence is currently considered magic. So what did they apply, or what advance did they make? Afaict they threw some audio into a GPT blender without a closer look at what's being done.

Edit: I found the link to the paper. It isn't stream splitting so much as it is GPT-assisted beamforming estimation. Good stuff for sure.

https://dl.acm.org/doi/10.1145/3613904.3642057
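
For context on what the non-learned part of that estimation looks like: the textbook way to locate a speaker with a mic pair is GCC-PHAT time-difference-of-arrival. A sketch of that classical building block only, not the paper's pipeline; names and parameters here are made up:

    import numpy as np

    def gcc_phat(sig, ref, fs, max_tau):
        """Estimated delay of `sig` relative to `ref`, in seconds."""
        n = len(sig) + len(ref)
        SIG = np.fft.rfft(sig, n=n)
        REF = np.fft.rfft(ref, n=n)
        # Whitened cross-power spectrum (PHAT): keep only the phase.
        cross = SIG * np.conj(REF)
        cross /= np.abs(cross) + 1e-12
        cc = np.fft.irfft(cross, n=n)
        max_shift = int(fs * max_tau)
        # Re-centre so negative lags sit to the left of index max_shift.
        cc = np.concatenate((cc[-max_shift:], cc[: max_shift + 1]))
        return (np.argmax(np.abs(cc)) - max_shift) / fs

    # e.g. two mics 10 cm apart: tau = gcc_phat(mic2, mic1, 16_000, 0.1 / 343)
    # the bearing then follows from sin(theta) = tau * 343 / 0.1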


I think one could build quite a good system with two directional microphones and then do some beamforming, or whatever it's called, to isolate the depth one wants to perceive.

But this is super expensive, since you need calibrated mics, etc.

The biggest advantage of neural nets in this field is that you can use a dirt-cheap microphone and post-process the signal well enough that the result is good, or even very good, for human listeners.
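
For a sense of what that post-processing replaces: the classic hand-tuned baseline is spectral subtraction, where you estimate a noise spectrum from a noise-only stretch and subtract it frame by frame. A neural enhancer learns a far better version of this mapping; the sketch below is only the classical baseline, with arbitrary frame sizes and floor constant:

    import numpy as np

    def spectral_subtract(noisy, noise_clip, frame=512, hop=256):
        window = np.hanning(frame)
        # Average magnitude spectrum of a noise-only clip.
        noise_mag = np.mean(
            [np.abs(np.fft.rfft(noise_clip[i:i + frame] * window))
             for i in range(0, len(noise_clip) - frame, hop)], axis=0)
        out = np.zeros(len(noisy))
        for i in range(0, len(noisy) - frame, hop):
            spec = np.fft.rfft(noisy[i:i + frame] * window)
            # Subtract the noise floor; keep a small spectral floor so
            # bins don't go fully to zero (guards against musical noise).
            mag = np.maximum(np.abs(spec) - noise_mag, 0.05 * np.abs(spec))
            # Reuse the noisy phase and overlap-add the cleaned frame.
            out[i:i + frame] += np.fft.irfft(
                mag * np.exp(1j * np.angle(spec)), n=frame)
        return out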


It's necessary for sales.



