Patent classifications
G01S3/8006
Persistent interference detection
A multi-microphone algorithm for detecting and differentiating interference sources from desired talker speech in advanced audio processing for smart home applications is described. The approach is based on characterizing a source as a persistent interference source when sounds repeatedly occur from a fixed spatial location relative to the device, which is also fixed. Some examples of such interference sources include TV, music system, air-conditioner, washing machine, and dishwasher. Real human talkers, in contrast, are not expected to remain stationary and speak continuously from the same position for a long time. The persistency of an acoustic source is established based on identifying historically-recurring inter-microphone frequency-dependent phase profiles in multiple time periods of the audio data. The detection algorithm can be used with a beamforming processor to suppress the interference and to improve voice quality and automatic speech recognition rates in smart home applications.
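The recurrence idea in this abstract can be illustrated with a minimal sketch. It is not the patented algorithm. It assumes each time period has already been reduced to one inter-microphone phase profile (a list of phase differences in radians, one per frequency bin), compares profiles with a mean-cosine similarity, and flags a source as persistent once a similar profile has recurred a set number of times. The names `phase_similarity` and `PersistenceTracker`, the similarity measure, and both thresholds are assumptions made for illustration.

```python
import math


def phase_similarity(profile_a, profile_b):
    """Mean cosine of per-bin phase differences; 1.0 means identical profiles."""
    return sum(math.cos(a - b) for a, b in zip(profile_a, profile_b)) / len(profile_a)


class PersistenceTracker:
    """Counts how often a similar phase profile recurs across time periods.

    Hypothetical helper: the thresholds below are illustrative assumptions.
    """

    def __init__(self, match_threshold=0.9, min_recurrences=3):
        self.match_threshold = match_threshold
        self.min_recurrences = min_recurrences
        self.history = []  # list of [stored_profile, recurrence_count]

    def observe(self, profile):
        """Record one time period; return True if its source looks persistent."""
        for entry in self.history:
            if phase_similarity(entry[0], profile) >= self.match_threshold:
                entry[1] += 1
                return entry[1] >= self.min_recurrences
        # No stored profile matched, so start tracking a new candidate location.
        self.history.append([list(profile), 1])
        return False
```

A profile that keeps returning from the same direction (for example, a TV) crosses the recurrence threshold, while a talker who moves produces profiles that rarely match a stored one.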
Device and method for sound localization
A device for sound localization includes a spatial feature generator, a voice detector, an angle selector, and an angle retriever. The spatial feature generator generates M spatial feature signals according to signals of N microphones of a microphone array. The voice detector generates at least one voice detection signal according to at least one of the signals of the N microphones. The angle selector outputs a candidate angle signal according to the M spatial feature signals to indicate a candidate direction of sound. The angle retriever generates a sound detection result according to the M spatial feature signals to indicate whether any sound source exists, and then outputs an estimated angle signal indicative of a direction of sound according to the sound detection result, the at least one voice detection signal, and the candidate angle signal.
System and method for feature based beam steering
A method, computer program product, and computer system for identifying, by a computing device, a plurality of sources. One or more feature values of a plurality of features may be assigned to a first source of the plurality of sources. One or more feature values of the plurality of features may be assigned to a second source of the plurality of sources. A first score for the first source and a second score for the second source may be determined based upon, at least in part, the one or more feature values assigned to the first source and the second source. One of the first source and the second source may be selected for spatial processing based upon, at least in part, the first score for the first source and the second score for the second source.
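As a rough illustration of score-based source selection, the sketch below assigns each source a weighted sum of its feature values and picks the highest-scoring source for spatial processing. The abstract does not say how scores are computed, so the weighted sum, the function names, and the feature names used in the usage note are all assumptions.

```python
def score_source(feature_values, weights):
    """Weighted sum of a source's feature values (assumed scoring rule)."""
    return sum(weights[name] * value for name, value in feature_values.items())


def select_source(sources, weights):
    """Return (best_source_id, scores) for a dict of source_id -> feature values."""
    scores = {source_id: score_source(features, weights)
              for source_id, features in sources.items()}
    best_source_id = max(scores, key=scores.get)
    return best_source_id, scores
```

For example, with hypothetical features `speech_likelihood` and `loudness` weighted 0.7 and 0.3, a source with values 0.9 and 0.4 outscores one with values 0.2 and 0.8, so the beam would be steered toward the first source.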
Simultaneous acoustic event detection across multiple assistant devices
Implementations can detect respective audio data that captures an acoustic event at multiple assistant devices in an ecosystem that includes a plurality of assistant devices, process the respective audio data locally at each of the multiple assistant devices to generate respective measures that are associated with the acoustic event using respective event detection models, process the respective measures to determine whether the detected acoustic event is an actual acoustic event, and cause an action associated with the actual acoustic event to be performed in response to determining that the detected acoustic event is the actual acoustic event. In some implementations, the multiple assistant devices that detected the respective audio data are anticipated to detect the respective audio data that captures the actual acoustic event based on a plurality of historical acoustic events being detected at each of the multiple assistant devices.
AUDIO PROCESSING
According to an example embodiment, a method for audio focusing is provided, the method comprising: receiving a multi-channel audio signal that represents sounds in sound directions that correspond to respective positions in an image area of an image; receiving an indication of an audio focus direction that corresponds to a first position in the image area; selecting a primary sound direction from a plurality of different available candidate directions, wherein said plurality of different available candidate directions comprise said audio focus direction and one or more offset candidate directions and wherein each offset candidate direction corresponds to a respective candidate offset from said first position in the image area; and deriving, based on said multi-channel audio signal in dependence of the selected primary sound direction, an output audio signal where sounds in sound directions defined via the selected primary sound direction are emphasized in relation to sounds in sound directions other than those defined via the selected primary sound direction.
DETECTION AND CLASSIFICATION OF SIREN SIGNALS AND LOCALIZATION OF SIREN SIGNAL SOURCES
In an embodiment, a method comprises: capturing, by one or more microphone arrays of a vehicle, sound signals in an environment; extracting frequency spectrum features from the sound signals; predicting, using an acoustic scene classifier and the frequency spectrum features, one or more siren signal classifications; converting the one or more siren signal classifications into one or more siren signal event detections; computing time delay of arrival estimates for the one or more detected siren signals; estimating one or more bearing angles to one or more sources of the one or more detected siren signals using the time delay of arrival estimates and a known geometry of the microphone array; and tracking, using a Bayesian filter, the one or more bearing angles. If a siren is detected, actions are performed by the vehicle depending on the location of the emergency vehicle and whether the emergency vehicle is active or inactive.
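The step from time delay of arrival to bearing angle can be sketched for the simplest case. The sketch assumes a single pair of microphones, a far-field source, and a fixed speed of sound, so that the delay tau, spacing d, and bearing theta (measured from broadside) satisfy tau = d * sin(theta) / c. The brute-force cross-correlation, the function names, and the constant `SPEED_OF_SOUND` are illustrative assumptions. The classifier and the Bayesian tracking filter from the abstract are not shown.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, assumed (air at about 20 degrees C)


def tdoa_by_cross_correlation(sig_a, sig_b, sample_rate, max_lag):
    """Delay of sig_b relative to sig_a in seconds (positive if sig_b lags)."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(sig_a):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += sample * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


def bearing_from_tdoa(tdoa_seconds, mic_spacing_m):
    """Far-field bearing in degrees from broadside for one microphone pair."""
    ratio = SPEED_OF_SOUND * tdoa_seconds / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against noise-induced overshoot
    return math.degrees(math.asin(ratio))
```

A single pair leaves a front-back ambiguity; resolving it needs more microphones with the known array geometry that the abstract mentions.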
SOUND SOURCE ENUMERATION AND DIRECTION OF ARRIVAL ESTIMATION USING A BAYESIAN FRAMEWORK
One embodiment provides a method of sound source enumeration and direction of arrival (DoA) estimation. The method includes estimating, by an enumeration module, a number of sound sources associated with an acoustic signal. The estimating includes selecting a specific parametric model from a generalized model. The generalized model is related to a microphone array architecture used to capture the acoustic signal. The method further includes estimating, by a DoA module, a direction of arrival of each sound source of the number of sound sources based, at least in part, on the selected model. Estimating the number of sound sources and estimating the DoA of each sound source are performed using a Bayesian framework.
Detection device and method for audio direction orientation and audio processing system
A detection device and a method for audio direction orientation and an audio processing system are provided. The device includes a first filter, which performs a first infinite impulse response operation on each first audio beam to generate second audio beams; an absolute value operator, which performs an absolute value operation on the amplitude of each second audio beam to generate third audio beams; a second filter, which performs a second infinite impulse response operation on each third audio beam to smooth it and generate fourth audio beams; and a DOA processor, which divides the fourth audio beams into audio beam groups and selects one audio beam from each group according to the energy of each fourth audio beam in that group, to output beam information corresponding to the selected audio beams for use in speech recognition and for determining a voice direction.
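The four stages in this abstract (IIR filter, absolute value, IIR smoothing, per-group selection by energy) can be sketched as follows. The abstract does not specify the filter type or coefficients, so the sketch assumes a one-pole low-pass for both IIR stages and contiguous, equally sized beam groups. All function names and coefficient values are illustrative.

```python
def one_pole_iir(samples, alpha):
    """First-order IIR low-pass: y[n] = alpha * x[n] + (1 - alpha) * y[n - 1]."""
    output, previous = [], 0.0
    for x in samples:
        previous = alpha * x + (1.0 - alpha) * previous
        output.append(previous)
    return output


def beam_envelope(first_beam, alpha_first=0.5, alpha_smooth=0.1):
    """First beam -> IIR -> absolute value -> IIR smoothing (the 'fourth' beam)."""
    second_beam = one_pole_iir(first_beam, alpha_first)
    third_beam = [abs(value) for value in second_beam]
    return one_pole_iir(third_beam, alpha_smooth)


def select_beams(first_beams, group_size):
    """Return the index of the highest-energy smoothed beam in each group."""
    energies = [sum(v * v for v in beam_envelope(beam)) for beam in first_beams]
    selected = []
    for start in range(0, len(first_beams), group_size):
        group = range(start, min(start + group_size, len(first_beams)))
        selected.append(max(group, key=lambda index: energies[index]))
    return selected
```

The returned indices stand in for the "beam information" passed on to speech recognition and voice-direction logic.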
Two-dimensional direction-of-arrival estimation method for coprime planar array based on structured coarray tensor processing
A two-dimensional direction-of-arrival estimation method for a coprime planar array based on structured coarray tensor processing is provided. The method includes: deploying a coprime planar array; modeling a tensor of the received signals; deriving the second-order equivalent signals of an augmented virtual array based on cross-correlation tensor transformation; deploying a three-dimensional coarray tensor of the virtual array; deploying a five-dimensional coarray tensor based on a coarray tensor dimension extension strategy; forming a structured coarray tensor including three-dimensional spatial information; and achieving two-dimensional direction-of-arrival estimation through CANDECOMP/PARAFAC decomposition. The present disclosure constructs a processing framework of a structured coarray tensor based on statistical analysis of coprime planar array tensor signals, to achieve multi-source two-dimensional direction-of-arrival estimation in the underdetermined case while preserving performance such as resolution and estimation accuracy, and can be used for multi-target positioning.
Method and Apparatus for Robust Low-Cost Variable-Precision Self-Localization With Multi-Element Receivers in GPS-Denied Environments
A practically implementable, robust direction-of-arrival (DoA) estimation approach is disclosed that is resistant to localization errors due to mobility, multipath reflections, impulsive noise, and multiple-access interference. As part of the disclosed invention, the inventors consider infrastructure-less 3D localization of autonomous underwater vehicles (AUVs) with no GPS assistance and no availability of global clock synchronization. The proposed method can be extended to challenging communication environments and applied for the localization of assets/objects in space, underground, intrabody, underwater and other complex, challenging, congested and sometimes contested environments. Each AUV leverages known-location beacon signals to self-localize and can simultaneously report its sensor data and measurement location. The approach uses two known-location beacon nodes, where the beacons are single-hydrophone acoustic nodes that are deployed at known locations and transmit time-domain coded signals in a spread-spectrum fashion.