If you stream or game online, you’re probably familiar with RNNoise, which removes the background noise that ruins your recordings or your CS voice chat, thanks to the magic of neural networks. The good news of the day: a new version has just been released.
For those who don’t know, RNNoise is an open-source library developed by the geniuses at Xiph.Org and Mozilla that uses a recurrent neural network model to filter out noise in real time while preserving voice quality.
And the new features are cool:
- Firstly, they’ve added AVX2 and SSE4.1 optimizations to boost performance. RNNoise will run on your CPU like never before!
- Then, they’ve integrated automatic run-time detection of CPU features, so the same binary picks the fastest code path your hardware supports. Whatever CPU you have, it will run RNNoise in style! (There’s a little sketch of the general idea just after this list.)
- Finally, the icing on the cake is that the models provided are now trained only on public datasets. Goodbye private databases and hello transparency! 🌞
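To be clear, what follows is not RNNoise’s actual dispatch code; it’s just a toy sketch of what run-time CPU feature detection generally looks like, using GCC/Clang’s __builtin_cpu_supports() and made-up function names:

/* Illustration only: pick a SIMD code path at run time. */
#include <stdio.h>

static void filter_generic(void) { puts("generic C path"); }
static void filter_sse4_1(void)  { puts("SSE4.1 path"); }
static void filter_avx2(void)    { puts("AVX2 path"); }

int main(void) {
  __builtin_cpu_init();                  /* populate the CPU feature flags */
  void (*filter)(void) = filter_generic;
  if (__builtin_cpu_supports("sse4.1")) filter = filter_sse4_1;
  if (__builtin_cpu_supports("avx2"))   filter = filter_avx2;
  filter();                              /* best path available on this machine */
  return 0;
}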
RNNoise isn’t just about eliminating noise during your video conferences. This marvel can also improve speech recognition front ends, music processing, and many other tasks. On top of the denoised audio, the model returns a voice activity probability for each frame, i.e. how confident it is that someone is actually speaking, which is very useful for automatic speech recognition. Of course, denoising isn’t the only factor that comes into play there: speaker characteristics, language models, and the rest of the signal processing chain matter too.
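If you’d rather call the library directly from C than go through the demo program, here is a minimal sketch of how the public API (rnnoise_create, rnnoise_process_frame, rnnoise_destroy) is typically used, assuming the header and library are visible to your compiler and the usual 480-sample (10 ms) frames of mono 48 kHz audio. Treat it as a sketch, not a polished tool:

/* Sketch: denoise raw 16-bit/48 kHz mono audio from stdin to stdout
 * and print the per-frame voice activity probability on stderr. */
#include <stdio.h>
#include <rnnoise.h>

#define FRAME_SIZE 480  /* 10 ms at 48 kHz, the frame size RNNoise expects */

int main(void) {
  short pcm[FRAME_SIZE];
  float frame[FRAME_SIZE];
  DenoiseState *st = rnnoise_create(NULL);  /* NULL = use the built-in model */

  while (fread(pcm, sizeof(short), FRAME_SIZE, stdin) == FRAME_SIZE) {
    for (int i = 0; i < FRAME_SIZE; i++) frame[i] = pcm[i];  /* floats kept in 16-bit range */
    float vad = rnnoise_process_frame(st, frame, frame);     /* denoise in place */
    fprintf(stderr, "voice probability: %.2f\n", vad);
    for (int i = 0; i < FRAME_SIZE; i++) pcm[i] = (short)frame[i];
    fwrite(pcm, sizeof(short), FRAME_SIZE, stdout);
  }
  rnnoise_destroy(st);
  return 0;
}

Note that rnnoise_process_frame() works on float samples that stay in the 16-bit range, which is why the sketch converts to and from short without any rescaling.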
To test it, it’s very simple:
Start by cloning RNNoise’s GitHub repository:
git clone https://github.com/xiph/rnnoise.git
Then compile the thing by running these commands:
./autogen.sh
./configure
make
Pro tip: use the -march=native option in your CFLAGS to take full advantage of AVX2 optimizations!
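In practice, that means re-running the configure step with something like this (the flags are just an example, adapt them to your compiler):
CFLAGS="-O3 -march=native" ./configure
make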
You can now test RNNoise on a raw mono 16-bit/48 kHz audio file:
./examples/rnnoise_demo input.pcm output.pcm
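If your audio isn’t in that raw format yet, a tool like ffmpeg (assuming you have it installed) can convert it before and after, for example:
ffmpeg -i input.wav -f s16le -ar 48000 -ac 1 input.pcm
ffmpeg -f s16le -ar 48000 -ac 1 -i output.pcm output.wav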
And voila, your audio comes out as clean as new, free of all unwanted noise. Let me know how it goes!
If you want to dig deeper into the topic, I recommend taking a look at the RNNoise benchmarks on OpenBenchmarking. You’ll see it’s far from a gimmicky solution: on a beefy CPU, it can process audio at 60 times real time! Enough to livestream on Twitch with complete peace of mind. By the way, it’s fun to see that RNNoise is also a hit on more exotic architectures such as POWER and ARM chips. The developers have done a really good job of making their code portable. Respect! 🙌
Okay, I’m not going to bother you any longer, and I invite you to read the excellent article by Jean-Marc Valin. It’s fascinating to see how deep learning can be harnessed to improve traditional signal processing algorithms.