Apple and Carnegie Mellon University research shows how smart devices can use acoustics to build situational awareness

Some smart devices are smarter than others, with a range of capabilities that make them truly helpful on a regular basis. But they certainly haven’t reached the ceiling of usefulness quite yet, and new research shows even more potential.

A new paper released by Carnegie Mellon University’s Human-Computer Interaction Institute (via TechCrunch), in conjunction with Apple, shows how embedded AIs can learn by listening to noises in their environment. All of this can be done without any up-front training data, and can be learned over time without the individual user having to do much more than let it happen.

The goal is to give smart devices the ability to learn from their surroundings, and, ultimately, give them situational awareness and make them even more helpful.

The paper calls the new system “Listen Learner”, and it allows a smart device with a microphone (like the HomePod) to “interpret events taking place in its environment”. Now, while the majority of this happens behind the scenes without any major input from the user, the speaker will actually need some help, at least initially. So, as an example, if your HomePod is near the kitchen and your microwave dings to let you know it’s done cooking, the HomePod (or other smart speaker) can ask: “What’s that sound?”

Once you tell the smart speaker that the sound was the microwave finishing, it can identify that sound going forward and let you know whenever something is done cooking.
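That flow (hear an unfamiliar sound, ask the user once, then recognize it thereafter) can be sketched in a few lines. This is an illustrative toy, not the paper’s actual pipeline: it assumes incoming audio clips have already been reduced to feature vectors, and the class name and threshold are invented for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

class ListenLearnerSketch:
    """Toy one-shot acoustic-event labeler (illustrative only)."""

    def __init__(self, match_threshold=0.9):
        self.match_threshold = match_threshold  # assumed cutoff
        self.labeled = []  # list of (feature_vector, label) pairs

    def observe(self, features, ask_user):
        """Handle one sound; ask_user is only called for new sounds."""
        for vec, label in self.labeled:
            if cosine_similarity(features, vec) >= self.match_threshold:
                return label  # recognized: e.g. "microwave is done"
        # Unfamiliar sound: one-time open-ended question to the user.
        label = ask_user("What's that sound?")
        self.labeled.append((features, label))
        return label
```

After the user answers the open-ended question once, similar-sounding events match the stored vector and no further questions are asked.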

A general pre-trained model can also be looped in to enable the system to make an initial guess on what an acoustic cluster might signify. So the user interaction could be less open-ended, with the system able to pose a question such as ‘was that a faucet?’ — requiring only a yes/no response from the human in the room.
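In code, that fallback from a confident pre-trained guess to the open-ended prompt might look like the sketch below. The function names and the confidence threshold are assumptions for illustration, not details from the paper.

```python
def label_with_guess(pretrained_guess, confidence, ask_yes_no, ask_open):
    """Prefer a closed yes/no question when a pre-trained model is
    confident; otherwise fall back to the open-ended prompt.
    (Illustrative sketch; threshold is invented.)"""
    GUESS_THRESHOLD = 0.6  # assumed cutoff, not from the paper
    if confidence >= GUESS_THRESHOLD:
        if ask_yes_no(f"Was that a {pretrained_guess}?"):
            return pretrained_guess
    return ask_open("What's that sound?")
```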

Refinement questions could also be deployed to help the system figure out what the researchers dub “edge cases”, i.e. where sounds have been closely clustered yet might still signify a distinct event — say a door being closed vs a cupboard being closed. Over time, the system might be able to make an educated either/or guess and then present that to the user to confirm.
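One simple way to detect such an edge case is to check whether the two nearest clusters are nearly tied, and only then pose the either/or question. Again, this is a hedged sketch; the distance measure and the ambiguity margin are invented, not taken from the paper.

```python
import math

def handle_edge_case(features, labeled, ask_choice, ambiguity_margin=0.1):
    """If the two nearest clusters are nearly tied in distance, ask an
    either/or question instead of guessing. `labeled` is a list of
    (feature_vector, label) pairs; the margin is an invented parameter."""
    dists = sorted(
        (math.dist(features, vec), label) for vec, label in labeled
    )
    (d1, label1), (d2, label2) = dists[:2]
    if d2 - d1 < ambiguity_margin:
        return ask_choice(f"Was that the {label1} or the {label2}?",
                          [label1, label2])
    return label1  # clear winner: no question needed
```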

The paper is readily available to read through if you’re interested. Or, if you prefer the movie adaptation (joking), you can check out the video below, which demonstrates the system working in the real world.

This is a great way to actually avoid needing to litter your house with more and more smart devices. Not that that’s necessarily a bad thing, but the cost can certainly be a bump in the road for many potential customers. But for those who already have a HomePod, or other smart speaker from another company, this could be a nice way to add even more smarts to the home without having to fork over even more cash.

It’s pretty exciting, but considering it’s a new system, it may be a while before it sees the light of day, in any capacity, in consumer-facing products.

Still, what do you think? Would you take advantage of a system like this? Let us know in the comments below.