Always On: Best practices for Audio UX on Microphone Enabled Devices

Published on 12 April 2018

Minimalist illustration showing various smart speakers and digital assistants on a blue background. The devices are depicted in black and white, arranged in a timeline-like progression from left to right. The illustration includes different shapes and sizes of smart speakers, representing various brands and models.

With the growing popularity of voice and audio-enabled products, such as the Amazon Echo and Google Home, it seems reasonable for consumers to be concerned about exactly what audio data is being collected, stored and shared.

Furthermore, news stories about Silverpush and, more recently, Alphonso have raised serious privacy concerns over background listening being used to stealthily track users' behaviour. As a result, microphone technologies are increasingly under the spotlight.

At Chirp, we believe in being transparent to the user about how and when their audio data is being used. This applies whether the technology is a voice assistant such as Alexa or Google Assistant, or one of our own audio-mediated connectivity technologies: Chirp Connect (data-over-sound) or Chirp React.

Below we look at a number of points to consider whilst integrating audio technologies in a product or app.

Requesting microphone permissions

A general rule of thumb for requesting permissions is that the request should be in context and communicate the value that the access will provide.

Apps should clarify why each permission is needed, either through the feature name or through an accompanying explanation. Where the purpose of a permission is less obvious, the app should explain what the permission involves.

Presenting the user with a request in that context is a gesture of transparency and good faith on the part of the developer — simultaneously indicating that they have nothing to hide, and providing the inquiring user with a deeper insight into how their new app actually does what it does.

This is especially true when requesting microphone permission, which can be particularly sensitive. Displaying a primer like the following is a good way to explain to the user why the audio permission is required.

Two iPhone permission screens from the Chirp Share app. The left screen shows Chirp's yellow square mascot with a cute face and a request to use the microphone. The right screen displays a system permission dialogue for microphone access, with icons for music, images and video floating in the background.

Providing context for needing microphone permissions, best practice for microphone permissions
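
The primer-then-prompt flow described above can be sketched in a platform-agnostic way. This is a minimal illustration, not Chirp's implementation: the class name, the `show_primer` callback and the `request_system_permission` callback are all hypothetical stand-ins for whatever UI and permission API the target platform provides.

```python
from enum import Enum


class PermissionState(Enum):
    NOT_ASKED = "not_asked"
    GRANTED = "granted"
    DENIED = "denied"


class MicrophonePermissionFlow:
    """Hypothetical model of showing a primer before the system dialog."""

    def __init__(self, show_primer, request_system_permission):
        # show_primer(reason) -> bool: True if the user chooses to continue.
        # request_system_permission() -> bool: True if the OS dialog is accepted.
        # Both callbacks are assumptions standing in for platform APIs.
        self._show_primer = show_primer
        self._request_system_permission = request_system_permission
        self.state = PermissionState.NOT_ASKED

    def ensure_permission(self, reason):
        """Call only when the user triggers a feature that needs audio."""
        if self.state is PermissionState.GRANTED:
            return True
        if self.state is PermissionState.DENIED:
            # Respect an earlier refusal instead of prompting repeatedly.
            return False
        # Explain the value first. If the user declines the primer, the
        # system dialog is never shown and the state stays NOT_ASKED.
        if not self._show_primer(reason):
            return False
        granted = self._request_system_permission()
        self.state = PermissionState.GRANTED if granted else PermissionState.DENIED
        return granted
```

The design choice here is that the request is made in context, at the moment a feature needs the microphone, and that the system dialog only appears after the user has seen why access is needed.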


Foreground use


For mobile apps and visual UIs in general, we advocate foreground use. This means that an app running audio technology should only listen while it is in the foreground, i.e. while it is the app the user is currently using.

Again, this is all about being transparent to the user and operating in good faith. You wouldn’t want your camera to be enabled in the background without your permission, and the same is true for your microphone.

Furthermore, in Android P, the next major release of the Android OS, Google will prevent apps from using your smartphone’s microphone or camera whenever they’re in the background and not actively being used on screen.
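
The foreground-only rule can be expressed as a small wrapper that ties microphone capture to app visibility. This is a rough sketch under stated assumptions: `ForegroundOnlyListener`, `start_capture` and `stop_capture` are hypothetical names, and in a real app the two `on_...` methods would be wired to the platform's own lifecycle callbacks.

```python
class ForegroundOnlyListener:
    """Hypothetical sketch: capture audio only while the app is visible."""

    def __init__(self, start_capture, stop_capture):
        # start_capture / stop_capture are assumed stand-ins for the
        # platform's audio API (open and close the microphone stream).
        self._start_capture = start_capture
        self._stop_capture = stop_capture
        self.listening = False

    def on_foreground(self):
        # Wire this to the "app became visible" lifecycle event.
        if not self.listening:
            self._start_capture()
            self.listening = True

    def on_background(self):
        # Wire this to the "app no longer visible" lifecycle event.
        # The microphone is always released here, without exception.
        if self.listening:
            self._stop_capture()
            self.listening = False
```

Guarding on `self.listening` makes both methods idempotent, so repeated lifecycle events cannot open the microphone twice or leave it open after the app is hidden.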

How the audio data is used

Following on from the theme of transparency, it’s important to let the user know exactly how the audio data is used. This detail should not just be buried in a privacy policy somewhere.

A spectrogram visualization showing audio frequencies over time. The image uses a heat map colour scheme where brighter orange/yellow indicates stronger frequency components and darker purple represents weaker signals. The vertical axis shows frequency bands from 0 to 4096 Hz, while the horizontal axis shows time from 0 to 7 seconds.

A spectrogram, a common way of visualising audio data

When Alexa first emerged on the market, it was unclear how the data was stored. This led to speculation and concern among some users about what happens to their audio data once it is on Amazon’s servers, and whether it is stored there indefinitely.

Since then, Amazon has taken steps in the right direction to address these concerns, and now it is possible to delete all of Alexa’s voice recordings associated with your account.

With Chirp, on the other hand, the audio processing is done entirely on the device. No audio data is ever saved to disk or uploaded to an external server: once the audio data has been processed, it is discarded. This is fundamental to Chirp’s ability to operate offline or in airplane mode.
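
To illustrate what "process on the device, then forget" can look like in practice, here is a minimal sketch. It is not Chirp's decoder: it uses the standard Goertzel algorithm to pick the strongest of a few candidate tones in each in-memory audio frame, and the function names, frame format and candidate frequencies are assumptions made for the example.

```python
import math


def goertzel_power(samples, sample_rate, target_hz):
    """Return the signal power near target_hz using the Goertzel algorithm."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)  # nearest DFT bin to the target
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def detect_tones(frames, sample_rate, candidates_hz):
    """Yield the strongest candidate frequency per frame, keeping no audio."""
    for frame in frames:
        # Only the decoded result leaves this function. The raw frame is
        # never written to disk, sent over a network, or retained.
        yield max(
            candidates_hz,
            key=lambda hz: goertzel_power(frame, sample_rate, hz),
        )
```

Because the function is a generator over frames, each buffer of samples can be dropped as soon as its result has been produced, and no network connection is involved at any point.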

Privacy threats — how not to do it

Unfortunately, some audio technologies have been less than reputable, and the industry’s reputation has been somewhat tarnished by the more nefarious implementations. Prime examples are the notorious Silverpush SDK and the Shopkick app, which use ultrasonic side channels to track a user’s location, behaviour and devices without the user noticing.

In March 2016, the USA’s Federal Trade Commission (FTC) sent warning letters to twelve Android app developers that were apparently using the Silverpush SDK in their apps. The point of contention was that the apps requested microphone permissions without a clear need for them, and did not appear to properly notify users of their intent. This means that the developers may be in violation of US law if users have not been notified about what information the apps are collecting.

Thankfully, Silverpush decided to drop this particular technology. However, a similar technology in the form of Alphonso continues to prevail.

Conclusion

The recurring theme in all of this is the importance of building up trust with the user, via transparency and openness — a demand which is not unique to audio applications within the technology industry.

With voice assistants like Alexa and Google Assistant undeniably on the rise, and some analysts claiming that voice will become the dominant user interface, it’s important that the actions of a minority of rogue actors do not detract from the innovation in this field.

Chirp’s technologies bring unique connectivity experiences and a plethora of new use cases that would not otherwise be possible with existing RF-based connectivity technologies.

Row of four digital assistant logos: Google Assistant's colourful dots, Amazon Alexa's blue ring, Apple Siri's dynamic purple and blue orb, and Cortana's blue circle.

Logos of various voice assistants


Chirp is a technology company enabling a seamless transfer of digital information via sound-waves, using a device’s loudspeaker and microphone only. The transmission uses audible or inaudible ultrasound tones and takes place with no network connection. To learn more visit chirp.io