Sounds of India by Google (g.co/soundsofindia)
Indian music and musical instruments are loved by people all over the world. India is celebrating its 74th Independence Day on 15 August 2020. For Indian music lovers, a new musical experience is on its way with “Sounds of India” (g.co/soundsofindia). Everyone can enjoy this unique experience on their mobile phone.
Sounds of India is built with TensorFlow to give music lovers a new experience. The project is powered by machine learning: models transform the user's voice into the sound of different musical instruments. The concept is unique and was launched especially for Indians on this auspicious day. Users sing the national anthem into the microphone of their mobile device and hear their voice converted into an instrument, live in the browser.
Sounds of India uses three of the most popular Indian classical instruments: the Sarangi, the Shehnai, and the Bansuri. These instruments are at the heart of Indian music, which is why the developers chose them.
Role of DDSP
The project builds on Magenta, an open-source research project from Google that helps developers bring out their creativity with the help of machine learning. Signal processing is combined with machine learning through an open-source library, “Differentiable Digital Signal Processing (DDSP).” Tone Transfer is another application that uses DDSP to convert sounds into musical instruments.
The developers used DDSP modules that control time-varying sound signals and synthesize them into instrument sounds to produce the final result. One reason to use DDSP with TensorFlow is its efficiency: it is reported to generate audio up to 1000 times faster than its counterparts.
The user's input passes through a neural network, which generates the control signals. Digital signal processing components then synthesize audio from those signals, transforming the user's voice into the sound of an instrument.
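The split described above, where a network predicts control signals and a signal processor turns them into sound, can be illustrated with a minimal harmonic (additive) synthesizer. This is only a sketch in plain Python, not the DDSP library's API: the function name `harmonic_synth`, the sample rate, and the hand-written control values (which stand in for a network's output) are all assumptions made for illustration.

```python
import math

SAMPLE_RATE = 16000  # assumed sample rate in Hz, chosen for illustration


def harmonic_synth(f0_hz, amplitude, harmonic_weights, sample_rate=SAMPLE_RATE):
    """Render audio from per-sample control signals.

    f0_hz: fundamental frequency for every output sample (Hz).
    amplitude: overall loudness for every output sample (0..1).
    harmonic_weights: relative strength of harmonics 1, 2, 3, ...
    In a DDSP-style model a neural network would predict these controls;
    here they are supplied by hand (an assumption for this sketch).
    """
    total = sum(harmonic_weights)
    weights = [w / total for w in harmonic_weights]  # normalize to sum to 1
    phase = 0.0
    audio = []
    for f0, amp in zip(f0_hz, amplitude):
        # Integrate frequency to get phase, wrapped into [0, 2*pi).
        phase = (phase + 2.0 * math.pi * f0 / sample_rate) % (2.0 * math.pi)
        sample = 0.0
        for k, w in enumerate(weights, start=1):
            # Skip harmonics above the Nyquist frequency to avoid aliasing.
            if k * f0 < sample_rate / 2:
                sample += w * math.sin(k * phase)
        audio.append(amp * sample)
    return audio


# Hand-written controls: a 0.1 s note gliding from 220 Hz to 440 Hz.
n = SAMPLE_RATE // 10
f0 = [220.0 + 220.0 * i / (n - 1) for i in range(n)]
amp = [0.5] * n
audio = harmonic_synth(f0, amp, harmonic_weights=[1.0, 0.5, 0.25])
```

Changing `harmonic_weights` changes the timbre while `f0` keeps the melody, which is the intuition behind keeping the pitch of a sung voice while replacing its tone colour with that of an instrument.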
What is TFX (TensorFlow Extended)?
TFX is one of the platforms that made this project possible. It covers data handling, training, validation, and deployment of machine learning models in production. “TFX is responsible for training models that transform the user’s voice into one of the musical instruments. These converted models are then deployed on a web browser through TensorFlow.js to help users hear the transformed voice.”
TFX made it possible to iterate over different model architectures for the different instruments. Each time, the developers could obtain a new model with the required result through the TFX pipeline. Overall, TFX played a vital role in making this project available to users on Independence Day.
How to interact with the browser?
The main reason to deploy the project in the browser is to make it easy for users to interact with machine learning. Users can hear the instrumental version of their song by following these steps:
- Open the link https://soundsofindia.withgoogle.com/ in a web browser, just like any other website.
- No installation is needed, and the project does not require special sensors, a dedicated graphics card, or large amounts of memory.
- Users sing into the microphone. The interaction happens entirely in the browser, so privacy is maintained.
- The machine learning model converts the voice into one of the instruments mentioned above, and the user can hear the new version of the song in the browser itself.
Why is the project browser-based?
It is quite rare to see something like this working entirely in the client's browser, but g.co/soundsofindia is one such project. Browser-based machine learning models need to be quite small and require minimal bandwidth. It was still tough to find the smallest possible model for each instrument, but with the help of TFX, it was eventually achieved. This helped reduce the memory footprint without affecting sound quality on users' devices.
How was the TensorFlow.js model created?
One of the most challenging tasks was creating the TensorFlow.js DDSP model, because high performance and quality were a must for “Sounds of India.” The project had to run on mobile devices, so tone transfer had to be designed accordingly. The transformed audio needed to be good enough to make the experience memorable for users. The developers therefore started exploring TensorFlow.js backends along with model architectures.
The candidates were narrowed down to the WebGL and WebAssembly backends, since both work on low-specification phones. The developers then settled on a ConvNet-based DDSP model with the WebGL backend. Many constant tensors were removed, which reduced the download time and size of the project.
Which three areas were considered for the TensorFlow.js model?
The developers focused on three areas to make the project run smoothly on mobile devices:
- Reduce Memory
The main aim of this project was to make it available on as many mobile devices as possible so that every Indian can try it. Many mobile devices have limited processing power, so reducing the memory footprint was important. It allows the project to run in browsers with limited GPU memory. Intermediate tensors were disposed of, and a new flag was used that allows GPU textures to be disposed of. The developers were able to reduce memory usage by almost 60%, making the project compatible with many more mobile devices.
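To see why disposing of intermediate tensors lowers peak memory, consider the toy simulation below. It is a plain-Python sketch and not TensorFlow.js code: `ToyAllocator` and `run_layers` are hypothetical names, the buffer sizes are made up, and the real project's figures are not reproduced here. In TensorFlow.js itself, the same idea is expressed with `tf.tidy` and `tf.dispose`.

```python
class ToyAllocator:
    """Counts live bytes the way a GPU backend might track its textures."""

    def __init__(self):
        self.live_bytes = 0
        self.peak_bytes = 0

    def alloc(self, n_floats):
        size = 4 * n_floats  # assume 32-bit floats
        self.live_bytes += size
        self.peak_bytes = max(self.peak_bytes, self.live_bytes)
        return size  # in this toy, the "handle" is just the buffer size

    def dispose(self, handle):
        self.live_bytes -= handle


def run_layers(allocator, n_layers, n_floats, dispose_intermediates):
    """Simulate a chain of ops: each layer reads one buffer and writes another."""
    current = allocator.alloc(n_floats)  # model input
    for _ in range(n_layers):
        output = allocator.alloc(n_floats)
        if dispose_intermediates:
            # The layer's input is no longer needed once its output exists.
            allocator.dispose(current)
        current = output
    return allocator.peak_bytes


keep_all = run_layers(ToyAllocator(), 10, 1000, dispose_intermediates=False)
dispose_early = run_layers(ToyAllocator(), 10, 1000, dispose_intermediates=True)
print(keep_all, dispose_early)  # prints "44000 8000"
```

With early disposal, at most two buffers are alive at any moment, regardless of how many layers the chain has.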
- Numerical Stability
DDSP models require high numerical precision to produce smooth instrumental output. The developers were not willing to compromise on accuracy, because our ears easily pick up discontinuities. They worked with a generative model and even wrote some key ops themselves to get the required output. Overflow and underflow were handled, resulting in a musical model that is numerically stable.
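One way such precision problems can show up is in the running phase of an oscillator. The sketch below assumes, purely for illustration, that intermediate values are stored as 16-bit floats, which can happen on mobile GPUs. It simulates that rounding with Python's `struct` module and compares a naive phase accumulator with one that wraps the phase into [0, 2π). The function names and the 440 Hz test tone are assumptions, and this is not necessarily the fix the developers applied.

```python
import math
import struct

TWO_PI = 2.0 * math.pi


def to_half(x):
    """Round x to the nearest IEEE 754 half-precision (16-bit) value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def accumulate_phase(freq_hz, n_samples, sample_rate, wrap):
    """Accumulate oscillator phase, storing it in half precision every step."""
    step = TWO_PI * freq_hz / sample_rate
    phase = 0.0
    history = []
    for _ in range(n_samples):
        phase = phase + step
        if wrap:
            phase = phase % TWO_PI  # keep the value small so precision stays high
        phase = to_half(phase)
        history.append(phase)
    return history


naive = accumulate_phase(440.0, 16000, 16000, wrap=False)
wrapped = accumulate_phase(440.0, 16000, 16000, wrap=True)

# The unwrapped phase stops advancing once the step is smaller than half the
# spacing between neighbouring half-precision values.
print("naive phase frozen:", naive[-1] == naive[-2])
print("wrapped phase frozen:", wrapped[-1] == wrapped[-2])
```

In this simulation the naive accumulator stalls at 512, so the oscillator would output a constant value instead of a tone, while the wrapped accumulator keeps advancing for the whole second.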
- Inference performance
DDSP models combine a neural network with a signal synthesizer. The synthesizer requires significant computation, so its kernels were rewritten as specialized WebGL kernels to improve performance. This lets the model run on mobile devices with a common GPU and also reduces inference time.
Try it out!
If you want to try something new, try this unique musical experience on your mobile device. If you have tried it, don't forget to share your experience and the transformed version of your voice. If you are curious about how the models work, learn about Magenta. Many blog posts about TensorFlow and Magenta can help you learn something new, and make sure to try out “Sounds of India.”