Why aren't volume controls based on average power output or something?

Sodium_nitride@lemmygrad.ml · 16 days ago

Why aren't volume controls based on average power output or something?

dizzy@lemmy.ml · 16 days ago

I don’t know why they don’t. I work in music rather than TV/film, but it infuriates me too! Give me a voice volume control! It would be technically very easy to implement as a standard, but the powers that be just haven’t come together and done it!

The_Grinch [he/him]@hexbear.net · 16 days ago

I’m glad to hear I’m not the only one thinking it!

Do you think it could be done by diffing a few of the different language tracks?

dizzy@lemmy.ml · edit-2 15 days ago

Unfortunately no. Audio files are actually really dumb in that they’re basically just a list of 44100 (or 48000 or 96000 etc.) amplitude numbers per second.

So there’s nothing really to diff, because it’s basically just a squiggly line, a set of squiggly lines or, when compressed, a mathematical expression that recreates a squiggly line when decompressed.
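
To make the "just a list of numbers" point concrete, here is a minimal sketch in plain Python. The function name `sine_samples` and the 440 Hz test tone are illustrative assumptions, not anything from a real audio library; the only thing taken from the comment above is the 44100 samples-per-second rate.

```python
import math

SAMPLE_RATE = 44100  # amplitude values per second (the CD-quality rate)

def sine_samples(freq_hz, seconds, sample_rate=SAMPLE_RATE):
    """Return a pure tone as a plain list of amplitudes in [-1.0, 1.0].

    Hypothetical helper for illustration: uncompressed audio is
    conceptually this kind of list, one number per sample instant.
    """
    n = int(seconds * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

tone = sine_samples(440.0, 1.0)  # one second of a 440 Hz tone
print(len(tone))                 # one number per sample: 44100
print(tone[:3])                  # the start of the "squiggly line"
```

Nothing in that list says which part is dialog and which part is music; every source in the scene is already added together into the one number per sample.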

You could isolate the dialog if you got ahold of a version with no dialog at all, then inverted the polarity of that and summed it with the original, but it’s unlikely you’ll find a version without any vocals.
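
The polarity trick can be sketched on toy data. This assumes the dialog-free version is sample-for-sample identical to the background in the full mix and perfectly time-aligned, which is rarely true of real releases; the sample values and the helper names `invert` and `mix` are made up for illustration.

```python
def invert(samples):
    """Flip the polarity of every sample."""
    return [-s for s in samples]

def mix(a, b):
    """Sum two equal-length tracks sample by sample."""
    return [x + y for x, y in zip(a, b)]

# Toy values (assumptions for illustration only).
background = [0.25, -0.5, 0.125, 0.0]   # music and effects, no dialog
dialog     = [0.0, 0.5, -0.25, 0.125]   # the voices alone

full_mix = mix(background, dialog)       # what the viewer normally gets

# Original plus the polarity-inverted background: the background cancels
# and only the dialog is left.
recovered = mix(full_mix, invert(background))
print(recovered)
```

If the two versions differ at all (different master, lossy compression, a few samples of offset), the cancellation is only partial and some background remains.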

Machine learning vocal isolation tools are probably going to be the best way to go about it as a DIY approach. Ultimate Vocal Remover 5 with the Demucs v4 model is great FOSS software for extracting vocals, and you could sum the extracted vocals with the original track and adjust the gain to get louder dialogue… it would be a lot of work though…
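
The "sum it with the original and adjust the gain" step might look roughly like this. It is a sketch under assumptions: the vocal stem has already been extracted by some separation tool, and it is the same length as the original and time-aligned with it. The names `db_to_gain` and `boost_dialogue` are hypothetical, and no separation tool is called here.

```python
def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)

def boost_dialogue(original, vocals, boost_db):
    """Raise the dialog in `original` by `boost_db` using an isolated stem.

    The original already contains the vocals at unity gain, so adding
    k * vocals gives a total vocal gain of 1 + k. Choosing
    k = gain - 1 makes the total match the requested boost.
    """
    k = db_to_gain(boost_db) - 1.0
    out = [o + k * v for o, v in zip(original, vocals)]
    # Crude hard clip to keep samples in [-1.0, 1.0]; a real workflow
    # would lower the overall level or use a limiter instead.
    return [max(-1.0, min(1.0, s)) for s in out]

# Toy values (assumptions for illustration only).
original = [0.25, 0.0, -0.125, 0.125]
vocals   = [0.0, 0.5, -0.25, 0.125]
louder = boost_dialogue(original, vocals, boost_db=6.0)
print(louder)
```

A boost of 0 dB leaves the track unchanged, and roughly 6 dB doubles the dialog amplitude relative to everything else.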

The_Grinch [he/him]@hexbear.net · 15 days ago

I don’t really understand still but thanks for trying all the same.