Patent classification: G01S3/8006
SYSTEMS AND METHODS FOR DISPLAYING A USER INTERFACE
An electronic device includes a display, wherein the display is configured to present a user interface, wherein the user interface comprises a coordinate system. The coordinate system corresponds to physical coordinates. The display is configured to present a sector selection feature that allows selection of at least one sector of the coordinate system. The at least one sector corresponds to captured audio from multiple microphones. The sector selection may also include an audio signal indicator. The electronic device includes operation circuitry coupled to the display. The operation circuitry is configured to perform an audio operation on the captured audio corresponding to the audio signal indicator based on the sector selection.
Voice processing device, voice processing method, and voice processing program
A separation unit separates voice signals of a plurality of channels into an incoming component for each incoming direction. A selection unit selects, from a storage unit that stores a predetermined statistic and a voice recognition model for each incoming direction, the statistic corresponding to the incoming direction of the component separated by the separation unit. An updating unit updates the voice recognition model on the basis of the statistic selected by the selection unit, and a voice recognition unit recognizes the voice of the separated incoming component using the updated voice recognition model.
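As an illustration only (not the patent's implementation), the selection step — picking precomputed, direction-dependent statistics for the DOA of a separated component and using them to adapt recognition — might look as follows. The direction bins, the stored statistics, and the cepstral-mean-style update are all hypothetical:

```python
import numpy as np

def nearest_direction(doa_deg, stored_dirs):
    """Pick the stored direction bin closest to the component's DOA."""
    return min(stored_dirs, key=lambda d: abs((d - doa_deg + 180) % 360 - 180))

# Hypothetical store: per-direction feature statistics, precomputed offline.
stats = {0: np.array([0.1, 0.2]), 90: np.array([0.4, 0.1]), 180: np.array([0.0, 0.3])}

doa = 75.0                                   # DOA of one separated component
selected = stats[nearest_direction(doa, stats)]
# Model-update sketch: shift the recognizer's feature normalization by the
# selected statistic (cepstral-mean-subtraction-style adaptation).
features = np.array([1.0, 1.0])
normalized = features - selected
print(nearest_direction(doa, stats), normalized)
```

The point of the sketch is only the lookup-then-adapt flow: statistics are keyed by direction, so a component arriving from 75 degrees reuses the 90-degree bin.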
FACILITATION OF EFFICIENT SIGNAL SOURCE LOCATION EMPLOYING A COARSE ALGORITHM AND HIGH-RESOLUTION COMPUTATION
Facilitation of determination of detailed location of a source signal is provided. In one embodiment, a device comprises a memory that stores computer executable components; and a processor that executes computer executable components stored in the memory. The computer executable components can comprise: a low-resolution computation logic component that implements a coarse algorithm and determines an approximate direction of arrival (DOA) of a source signal of an input signal, wherein the coarse algorithm uses both a coarse spatial grid and input data received from the input signal to determine the approximate DOA; and an error estimation logic component that estimates an estimation error of the coarse algorithm, wherein the error estimation logic component uses the estimation error and the approximate DOA to determine a spatial interval range.
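A toy sketch (not the patented implementation) of the coarse-then-refine idea: search a coarse angle grid against a measured inter-microphone delay, then report an interval around the coarse estimate for a high-resolution stage to search. The two-microphone geometry, grid step, and one-grid-cell error bound are assumptions:

```python
import numpy as np

def coarse_doa(signal_delay, mic_dist=0.1, fs=16000, c=343.0, grid_step_deg=10):
    """Coarse grid search: compare hypothesized inter-mic delays on a coarse
    angular grid against the measured delay and pick the best grid angle."""
    angles = np.arange(-90, 91, grid_step_deg)              # coarse spatial grid
    hyp_delays = mic_dist * np.sin(np.radians(angles)) / c  # delay model per angle
    best = angles[np.argmin(np.abs(hyp_delays - signal_delay))]
    # Crude error bound: one grid cell on each side of the coarse estimate
    # defines the spatial interval handed to the high-resolution stage.
    interval = (best - grid_step_deg, best + grid_step_deg)
    return int(best), interval

true_angle = 37.0
delay = 0.1 * np.sin(np.radians(true_angle)) / 343.0  # simulated measured delay
approx, (lo, hi) = coarse_doa(delay)
print(approx, lo, hi)   # → 40 30 50
```

The refinement stage would then run an expensive high-resolution search only over [lo, hi] instead of the full half-circle.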
PERSISTENT INTERFERENCE DETECTION
A multi-microphone algorithm for detecting and differentiating interference sources from desired talker speech in advanced audio processing for smart home applications is described. The approach is based on characterizing a persistent interference source as one whose sounds repeatedly occur from a fixed spatial location relative to the device, which is also fixed. Examples of such interference sources include a TV, music system, air conditioner, washing machine, and dishwasher. Real human talkers, in contrast, are not expected to remain stationary and speak continuously from the same position for a long time. The persistency of an acoustic source is established by identifying historically recurring inter-microphone frequency-dependent phase profiles in multiple time periods of the audio data. The detection algorithm can be used with a beamforming processor to suppress the interference, improving voice quality and automatic speech recognition rates in smart home applications.
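To illustrate the core cue (not the patent's actual algorithm): a fixed source produces the same inter-microphone frequency-dependent phase profile frame after frame, while a moving talker does not. A minimal sketch, using circular shifts to simulate inter-microphone delays:

```python
import numpy as np

def phase_profile(x1, x2, n_fft=256):
    """Frequency-dependent inter-microphone phase for one time frame."""
    X1, X2 = np.fft.rfft(x1, n_fft), np.fft.rfft(x2, n_fft)
    return np.angle(X1 * np.conj(X2))

def is_persistent(profiles, tol=0.2, min_recurrence=0.8):
    """Flag a source as persistent when most frames repeat (nearly) the
    same phase profile, i.e. the source direction never moves."""
    ref = profiles[0]
    # Wrap-aware phase distance per frame, averaged over frequency.
    dists = [np.mean(np.abs(np.angle(np.exp(1j * (p - ref))))) for p in profiles]
    return np.mean(np.array(dists) < tol) >= min_recurrence

rng = np.random.default_rng(0)
frames = [rng.standard_normal(256) for _ in range(10)]
fixed = [phase_profile(f, np.roll(f, 3)) for f in frames]                 # e.g. a TV
moving = [phase_profile(f, np.roll(f, t)) for t, f in enumerate(frames)]  # a walking talker
print(is_persistent(fixed), is_persistent(moving))   # → True False
```

A real system would additionally require the recurrence across well-separated time periods (hours or days), not just consecutive frames.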
AUDIO RECOGNITION METHOD, AND METHOD, APPARATUS, AND DEVICE FOR POSITIONING TARGET AUDIO
This application discloses a method for positioning a target audio signal by a computer device. The method includes: performing echo cancellation on the audio signals collected in a plurality of directions in a space, the audio signals comprising a target-audio direct signal; obtaining weights of a plurality of time-frequency points in the echo-canceled audio signals, a weight of each time-frequency point indicating a relative proportion of the target-audio direct signal in the echo-canceled audio signals at the time-frequency point; obtaining a weighted audio signal energy distribution of the audio signals in the plurality of directions by using the weights of the plurality of time-frequency points in the echo-canceled audio signals; and obtaining a sound source azimuth corresponding to the target-audio direct signal in the audio signals by using the weighted audio signal energy distribution of the audio signals in the plurality of directions.
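As a sketch of the weighting step (illustrative only; shapes, data, and the argmax decision are invented, and echo cancellation is assumed to have already run): per-direction energies are weighted by the per-time-frequency-point proportion of the target's direct signal, and the azimuth with the largest weighted energy is selected:

```python
import numpy as np

def weighted_azimuth(tf_energy, weights, azimuths):
    """tf_energy: (n_directions, n_frames, n_bins) energy per direction;
    weights: (n_frames, n_bins) relative proportion of the target-audio
    direct signal at each time-frequency point (0 = suppressed, 1 = kept).
    Returns the azimuth whose weighted energy is largest."""
    weighted = np.sum(tf_energy * weights[None, :, :], axis=(1, 2))
    return int(azimuths[int(np.argmax(weighted))])

azimuths = np.array([0, 90, 180, 270])
rng = np.random.default_rng(1)
tf = rng.random((4, 5, 8))          # 4 directions, 5 frames, 8 frequency bins
w = np.zeros((5, 8))
w[:, :2] = 1.0                      # target's direct path dominates the low bins...
tf[2, :, :2] += 10.0                # ...and the 180-degree direction is loudest there
print(weighted_azimuth(tf, w, azimuths))   # → 180
```

The weights matter because reverberant and residual-echo time-frequency points would otherwise drag the energy distribution toward spurious directions.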
DEVICE AND METHOD FOR DETERMINING A SOUND SOURCE DIRECTION
A device for determining a sound source direction determines the direction in which the source of a reached sound exists based on at least one of: (1) a sound pressure difference between a first sound pressure, which is the sound pressure of a first frequency component of a first part of the reached sound acquired by a first microphone, and a second sound pressure, which is the sound pressure of the first frequency component of a second part of the reached sound acquired by a second microphone; and (2) a phase difference between a first phase, which is the phase of a second frequency component of the first part of the reached sound, and a second phase, which is the phase of the second frequency component of the second part of the reached sound.
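A minimal sketch of extracting both cues (illustrative; the tone, frequencies, and delays are invented): a level difference at one frequency component and a phase difference at another, taken from the two microphones' spectra. A source on the first microphone's side arrives there earlier and louder, so both cues come out positive:

```python
import numpy as np

def direction_cues(x1, x2, fs, f_level, f_phase, n_fft=512):
    """Inter-microphone level difference (dB) at f_level and phase
    difference (rad) at f_phase, from one frame per microphone."""
    X1, X2 = np.fft.rfft(x1, n_fft), np.fft.rfft(x2, n_fft)
    k_lvl = int(round(f_level * n_fft / fs))
    k_ph = int(round(f_phase * n_fft / fs))
    level_db = 20 * np.log10(np.abs(X1[k_lvl]) / np.abs(X2[k_lvl]))
    phase = np.angle(X1[k_ph] * np.conj(X2[k_ph]))
    return level_db, phase

fs = 16000
t = np.arange(512) / fs
x1 = 1.5 * np.sin(2 * np.pi * 500 * t)           # louder at microphone 1...
x2 = np.sin(2 * np.pi * 500 * (t - 0.0002))      # ...and 0.2 ms later at microphone 2
lvl, ph = direction_cues(x1, x2, fs, f_level=500, f_phase=500)
print(lvl > 0, ph > 0)   # → True True
```

In practice the level cue is most informative at high frequencies (where shadowing by the device body is strong) and the phase cue at low frequencies (where it is unambiguous), which is why the abstract distinguishes a first and a second frequency component.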
PERFORMANCE OF A TIME OF FLIGHT (ToF) LASER RANGE FINDING SYSTEM USING ACOUSTIC-BASED DIRECTION OF ARRIVAL (DoA)
An acoustic-based Direction of Arrival (DoA) system uses acoustic information to determine the direction of incoming sound, such as a person talking. The direction of the sound is then used to focus a laser-based time of flight (ToF) system, narrowing the area of laser illumination and improving the signal-to-noise ratio because the laser illumination is concentrated in the direction of the sound. The DoA system also provides elevation information pertaining to the source of the sound, to further narrow the required field of view of the laser ToF system.
DIRECTION OF ARRIVAL ESTIMATION IN MINIATURE DEVICES USING A SOUND SENSOR ARRAY
A hearing device comprises a sound system for estimating the direction of arrival of sound emitted by one or more sound sources creating a sound field. The sound system comprises an array of N sound receiving transducers (microphones), each providing an electric input signal, and a processing unit comprising a) a model unit comprising a parametric model configured to describe the sound field at the array as a function of the direction of arrival in a region surrounding and adjacent to the array; b) a model optimizing unit configured to optimize said model with respect to its parameters based on said sound samples; c) a cost optimizing unit configured to minimize a cost function of the model with respect to said direction of arrival; d) an estimating unit configured to estimate the direction of arrival based on said parametric model with the optimized parameters and the optimized cost function.
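One way to read items a)–d) (purely illustrative; the plane-wave model, array geometry, and least-squares cost are not from the patent): fit a parametric field model to an array snapshot, with a complex source gain as the optimized parameter and the residual energy as the cost minimized over candidate directions:

```python
import numpy as np

def steering(theta_deg, mic_x, freq, c=343.0):
    """Plane-wave model of the field at a line array for candidate DOA theta."""
    tau = mic_x * np.sin(np.radians(theta_deg)) / c
    return np.exp(-2j * np.pi * freq * tau)

def estimate_doa(snapshot, mic_x, freq, grid=np.arange(-90, 91, 1)):
    """For each candidate DOA, fit the model's free parameter (a complex
    gain) in the least-squares sense, keep the residual as the cost, and
    return the DOA minimizing that cost."""
    costs = []
    for th in grid:
        a = steering(th, mic_x, freq)
        g = (a.conj() @ snapshot) / (a.conj() @ a)   # optimal gain for this DOA
        costs.append(np.sum(np.abs(snapshot - g * a) ** 2))
    return int(grid[int(np.argmin(costs))])

mic_x = np.array([0.0, 0.05, 0.10, 0.15])    # 4-mic line array, 5 cm spacing
snap = steering(30.0, mic_x, freq=1000.0)    # noiseless snapshot from 30 degrees
print(estimate_doa(snap, mic_x, 1000.0))     # → 30
```

The split mirrors the claim: the steering function is the parametric model (a), the gain fit optimizes its parameters (b), the residual is the cost minimized over direction (c), and the argmin is the estimate (d).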
Sound source association
Multiple Holocam Orbs observe a real-life environment and generate an artificial reality representation of the real-life environment. Depth image data is cleansed of error due to LED shadow by identifying the edge of a foreground object in a (near-infrared light) intensity image, identifying an edge in a depth image, and taking the difference between the start of both edges. Depth data error due to parallax is identified by noting when associated text data in a given pixel row, progressing in a given row direction (left-to-right or right-to-left), reverses order. Sound sources are identified by comparing the results of a blind audio source localization algorithm with the spatial 3D model provided by the Holocam Orb. Sound sources that correspond to identified 3D objects are associated together. Additionally, the types of data supported by a standard movie data container, such as an MPEG container, are expanded to incorporate free viewpoint data (FVD) model data. This is done by inserting FVD data of different individual 3D objects at different sample rates into a single video stream. Each 3D object is separately identified by a separately assigned ID.
System and method for speech enhancement in multisource environments
A method, computer program product, and computer system for receiving, by a computing device, a first signal emitted from one or more sources. A second signal emitted from the one or more sources may be received. A first confidence level that a wake-up-word is included in the first signal may be determined. A second confidence level that the wake-up-word is included in the second signal may be determined. It may be identified that the wake-up-word originated from a first source of the one or more sources based upon, at least in part, the first and second confidence levels. The first source may be enabled to participate in a dialog phase. The second source may be excluded from participating in the dialog phase.
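A toy sketch of the confidence-based arbitration (not the patented method; the source names and threshold are invented): whichever source yields the highest wake-up-word confidence is admitted to the dialog phase, and the remaining sources are excluded:

```python
def select_dialog_source(confidences, threshold=0.5):
    """confidences: wake-up-word confidence per source.
    Returns the source admitted to the dialog phase and the excluded ones."""
    best = max(confidences, key=confidences.get)
    if confidences[best] < threshold:
        return None, sorted(confidences)      # nobody said the wake-up-word
    excluded = [s for s in confidences if s != best]
    return best, excluded

speaker, muted = select_dialog_source({"driver": 0.91, "passenger": 0.34})
print(speaker, muted)   # → driver ['passenger']
```

Downstream, the excluded sources' audio would typically be suppressed (e.g. by beamforming away from them) for the remainder of the dialog.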