Google details how ML helps transcribe, recognize audio in Recorder app

Recorder uses machine learning to transcribe, recognize and tag audio

Pixel 4 Recorder app

Following up on an extensive explainer on some of the artificial intelligence (AI) and machine learning (ML) technology behind the camera in the Pixel 4, Google’s AI team has published another blog post detailing how the new Recorder app works.

Recorder launched on Pixel 4 and recently made its way to older Pixel devices. The app offers the ability to not only record audio but transcribe it and recognize different sounds so users can search through recordings with ease.

In the post, Google notes that it built Recorder because “much of the world’s information is conveyed through speech,” which can be difficult to search through after the fact. Google created Recorder to help users find that information.

The app uses an on-device machine learning model optimized for long, multi-hour audio sessions. Google also made sure that Recorder could index conversations “by mapping words to timestamps as computed by the speech recognition model.” If you’ve used the Recorder app, this is what allows you to tap on a word in the transcription and start playback from that specific point. It also lets you search recordings for a word and jump directly to that point in the audio.
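To make the word-to-timestamp idea concrete, here is a minimal Python sketch. It is not Google’s implementation: the function names, the sample transcript and the millisecond timestamps are all hypothetical, and it assumes the speech recognizer has already produced a start time for each word.

```python
from collections import defaultdict


def build_word_index(timed_words):
    """Map each lowercased word to the start times (ms) where it occurs.

    `timed_words` is a list of (word, start_ms) pairs. This input shape is an
    assumption for illustration, not the Recorder app's actual data format.
    """
    index = defaultdict(list)
    for word, start_ms in timed_words:
        index[word.lower()].append(start_ms)
    return dict(index)


def find_word(index, query):
    """Return every playback position (ms) for `query`, or an empty list."""
    return index.get(query.lower(), [])


# Hypothetical transcript: each word paired with its start time in milliseconds.
transcript = [("Hello", 0), ("world", 480), ("hello", 1500), ("again", 2100)]
index = build_word_index(transcript)
positions = find_word(index, "HELLO")
```

Searching for a word then becomes a dictionary lookup, and tapping a word in the transcript amounts to seeking playback to its stored timestamp.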

However, Google also had to work out how best to present that information. The transcript view is useful for finding specific words, but not so helpful for getting a sense of the audio itself. So, Google designed a waveform view that’s colour-coded by sound type. In other words, Recorder recognizes different sounds, such as speech, whistling or music, and colours each part of the waveform according to the dominant sound at that point in the recording.
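As a rough illustration of colouring by dominant sound, the sketch below takes a list of per-frame sound labels and picks the most common label in each fixed-size segment. The labels, colour names and segment length are assumptions made for the example, and the classifier that would produce the labels is out of scope here.

```python
from collections import Counter

# Hypothetical colour palette; the real app's colours are not specified in the post.
SOUND_COLOURS = {"speech": "blue", "music": "orange", "whistling": "green"}


def colour_segments(frame_labels, frames_per_segment):
    """Return one colour per segment, chosen by the segment's dominant label."""
    colours = []
    for start in range(0, len(frame_labels), frames_per_segment):
        window = frame_labels[start:start + frames_per_segment]
        # most_common(1) yields the label that appears most often in the window.
        dominant = Counter(window).most_common(1)[0][0]
        colours.append(SOUND_COLOURS.get(dominant, "grey"))
    return colours


# Hypothetical classifier output, one label per audio frame.
labels = ["speech", "speech", "music", "music", "music", "speech", "whistling"]
segment_colours = colour_segments(labels, 3)
```

With three frames per segment, the first segment is mostly speech, the second mostly music, and the trailing single frame is whistling, so each gets its own colour.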

Finally, Recorder also tries to make it easy for users to remember what’s in a recording by offering three suggested tags that it thinks best represent the content. To present these tags immediately after the recording ends, the app analyzes the content during transcription. It counts occurrences of each term and weighs its importance based on its grammatical role in a sentence (with priority given to nouns), among other signals, to suggest the most memorable tags.
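The counting-plus-grammatical-role idea can be sketched in a few lines of Python. This is a simplified stand-in rather than Google’s method: the role weights are invented, the input is assumed to be already tagged with parts of speech, and the other signals Google alludes to are left out.

```python
from collections import defaultdict

# Assumed weights: nouns count for more than other roles, per the article.
ROLE_WEIGHTS = {"NOUN": 2.0, "VERB": 1.0}
DEFAULT_WEIGHT = 0.5


def suggest_tags(tagged_words, max_tags=3):
    """Score each term by weighted occurrence count and return the top terms.

    `tagged_words` is a list of (word, role) pairs from a hypothetical
    part-of-speech tagger.
    """
    scores = defaultdict(float)
    for word, role in tagged_words:
        scores[word.lower()] += ROLE_WEIGHTS.get(role, DEFAULT_WEIGHT)
    # Highest score first; ties broken alphabetically so results are stable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:max_tags]]


# Hypothetical tagged transcript.
tagged = [
    ("budget", "NOUN"), ("review", "NOUN"), ("budget", "NOUN"),
    ("discuss", "VERB"), ("quickly", "ADV"), ("deadline", "NOUN"),
    ("discuss", "VERB"), ("discuss", "VERB"),
]
tags = suggest_tags(tagged)
```

Because scores accumulate one word at a time, a ranking like this could be updated while transcription is still running, which fits the article’s point that tags are ready as soon as the recording ends.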

All in all, Recorder is an impressive bit of software, and if you haven’t had a chance to try it out, you should. It’s been a life-saver for me and will no doubt prove helpful for you as well. If you don’t have a Pixel device, it’s possible to download Recorder’s APK and install it unofficially, but your mileage may vary in terms of effectiveness.

If you’re interested in digging into the finer details, check out the post on the Google AI blog.

Source: Google Via: 9to5Google
