DETECTION OF ACOUSTIC EVENTS
20170343644 · 2017-11-30
Abstract
Disclosed is a method for detecting an acoustic event of interest in a space. In the method, acoustic signal data is obtained from sensors and at least some candidate impulses are identified. The candidate impulses are mapped to a representation on the basis of an origin of each candidate impulse, and at least one indication quantity representing a likelihood of an acoustic event of interest taking place in the specified positions in space and time is determined from the generated representation. Finally, the at least one indication quantity is compared to a predetermined threshold, and an indication is generated if the at least one indication quantity meets the predetermined threshold. Also disclosed are a computing unit and a computer program product.
Claims
1. A method for detecting an acoustic event of interest in a space comprising a plurality of sub-spaces, the method comprising: a) obtaining (210) acoustic signal data from sensors, wherein the acoustic signal data from the sensors is tied to a common time reference, b) identifying (220) one or more candidate impulses from the acoustic signal data obtained from the sensors, c) defining (230) for each identified candidate impulse at least a time stamp within the common time reference and a sensor coordinate on the basis of the sensor which obtained the acoustic signal identified as candidate impulse, d) determining (240), for each candidate impulse, a signal source time in each spatial sub-space, in order to generate a representation of an origin of the candidate impulses in specified positions in space and time, e) determining (250), from the generated representation, at least one indication quantity representing a likelihood of an acoustic event of interest taking place in the specified positions in space and time, f) comparing (260) the at least one indication quantity to a predetermined threshold defined for the indication quantity in question, and g) generating (270) an indication that an acoustic event of interest is detected if the at least one indication quantity meets the predetermined threshold defined for the indication quantity in question in at least one sub-space.
2. The method of claim 1, wherein the identification (220) is performed by filtering raw data obtained from at least one sensor.
3. The method of claim 1, wherein the step of determining (240) comprises: determining a time scale for a sub-space, dividing the time scale into a plurality of bins, each bin defining a time window within the time scale, positioning the candidate impulses to the time scale of each sub-space on a basis of source times of the candidate impulses within each sub-space.
4. The method of claim 1, wherein the step of determining (240) comprises: determining a grid comprising a spatial position and time as parameters, mapping the candidate impulses into the grid on a basis of source times of the candidate impulses.
5. The method of claim 1, wherein the indication quantity is a weighted sum of candidate impulses in the specified positions in space and time.
6. The method of claim 5, wherein the weighted sum is derived in at least one following way: an equal weight is provided for all candidate impulses, a weight derived from amplitude of a candidate impulse in question is provided for the candidate impulses.
7. The method of claim 6, wherein the amplitude of the candidate impulse in question is determined from the obtained corresponding acoustic signal.
8. The method of claim 1, the method further comprising: dividing at least the sub-space based on which the indication is generated into further sub-spaces, performing the method steps d), e), f) and g) of claim 1 for candidate impulses in the generated further sub-spaces.
9. A computing unit (120) for detecting an acoustic event of interest in a space comprising a plurality of sub-spaces the computing unit (120) comprising at least one processor (510); and at least one memory (520) including computer program code; wherein the processor (510) is configured to cause the computing unit at least to perform the method of claim 1.
10. A non-transitory computer readable medium on which is stored a computer program, comprising portions of computer program code configured to perform any methods of claim 1 when at least some portion of the computer program code is executed in a computing unit.
11. The method of claim 2, wherein the step of determining (240) comprises: determining a time scale for a sub-space, dividing the time scale into a plurality of bins, each bin defining a time window within the time scale, positioning the candidate impulses to the time scale of each sub-space on a basis of source times of the candidate impulses within each sub-space.
12. The method of claim 2, wherein the step of determining (240) comprises: determining a grid comprising a spatial position and time as parameters, mapping the candidate impulses into the grid on a basis of source times of the candidate impulses.
13. A computing unit (120) for detecting an acoustic event of interest in a space comprising a plurality of sub-spaces the computing unit (120) comprising at least one processor (510); and at least one memory (520) including computer program code; wherein the processor (510) is configured to cause the computing unit at least to perform the method of claim 2.
14. A computing unit (120) for detecting an acoustic event of interest in a space comprising a plurality of sub-spaces the computing unit (120) comprising at least one processor (510); and at least one memory (520) including computer program code; wherein the processor (510) is configured to cause the computing unit at least to perform the method of claim 3.
15. A computing unit (120) for detecting an acoustic event of interest in a space comprising a plurality of sub-spaces the computing unit (120) comprising at least one processor (510); and at least one memory (520) including computer program code; wherein the processor (510) is configured to cause the computing unit at least to perform the method of claim 4.
16. A computing unit (120) for detecting an acoustic event of interest in a space comprising a plurality of sub-spaces the computing unit (120) comprising at least one processor (510); and at least one memory (520) including computer program code; wherein the processor (510) is configured to cause the computing unit at least to perform the method of claim 5.
17. A computing unit (120) for detecting an acoustic event of interest in a space comprising a plurality of sub-spaces the computing unit (120) comprising at least one processor (510); and at least one memory (520) including computer program code; wherein the processor (510) is configured to cause the computing unit at least to perform the method of claim 6.
18. A computing unit (120) for detecting an acoustic event of interest in a space comprising a plurality of sub-spaces the computing unit (120) comprising at least one processor (510); and at least one memory (520) including computer program code; wherein the processor (510) is configured to cause the computing unit at least to perform the method of claim 7.
19. A computing unit (120) for detecting an acoustic event of interest in a space comprising a plurality of sub-spaces the computing unit (120) comprising at least one processor (510); and at least one memory (520) including computer program code; wherein the processor (510) is configured to cause the computing unit at least to perform the method of claim 8.
20. A non-transitory computer readable medium on which is stored a computer program, comprising portions of computer program code configured to perform any methods of claim 2 when at least some portion of the computer program code is executed in a computing unit.
Description
BRIEF DESCRIPTION OF FIGURES
[0024] The embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings.
[0025]
[0026]
[0027]
[0028]
[0029]
DESCRIPTION OF SOME EMBODIMENTS
[0030]
[0031] In order to perform the operations described below, the acoustic signal data detected by the different sensors must be tied to the same time base. This may be achieved by arranging a common time reference, i.e. a common clock signal, in the system and assigning a time stamp complying with the common time reference to at least some of the obtained acoustic signal data. The assignment of the time stamp may be performed by the computing unit 120. If the obtained acoustic signal data is in analog form, the computing unit 120 is configured to sample the obtained data prior to assigning the time stamps to the data. If the sensors 110A-110D provide digital data, the computing unit 120 assigns the time stamps directly to the discrete data obtained from the sensors 110A-110D. According to another embodiment of the invention, the sensors 110A-110D may be configured to assign the time stamps directly to the obtained data. In such an implementation the common clock signal is provided to the sensors 110A-110D e.g. from the computing unit 120 or from any other entity.
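The time stamping of sampled data can be illustrated with a minimal sketch. The function name `assign_time_stamps`, the uniform sample rate, and the `start_time` parameter are assumptions made for illustration only; the description does not prescribe how the time stamps are computed.

```python
def assign_time_stamps(samples, start_time, sample_rate_hz):
    """Pair each sample with a time stamp in the common time reference.

    Assumes uniform sampling: sample k is taken at start_time + k / sample_rate_hz,
    where start_time is read from the common clock (hypothetical parameter).
    """
    return [(start_time + k / sample_rate_hz, value)
            for k, value in enumerate(samples)]


# Three samples from one sensor, sampled at 4 Hz starting at t = 10.0 s.
stamped = assign_time_stamps([0.0, 0.5, -0.25], start_time=10.0, sample_rate_hz=4.0)
```

Each sensor's buffer would be stamped against the same clock, so that time stamps from different sensors are directly comparable.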
[0032] Next a method according to an example of the invention is described by referring to
[0033] In step 220 the computing unit 120 is configured to identify one or more candidate impulses from the obtained data. A candidate impulse refers to impulse-type signal data, which may represent information on an impulse-type event detected by at least one sensor. The identification may be based on pre-filtering the obtained information, i.e. pre-filtering the raw data in some predefined manner. The pre-filtering may be based on a plurality of principles. For example, one applicable pre-filtering scheme is based on detecting local maxima and minima meeting predefined thresholds and defining such data values as candidate impulses for further processing. Alternatively or in addition, the pre-filtering may e.g. be arranged with so-called matched filtering, which provides a tool for forming impulse-type signal data from the raw data when the raw data comprises an acoustic event which matches well with the filter response. It may also be arranged that the sensors comprise the pre-filtering functionality, in which case the computing unit 120 directly receives the candidate impulses as an input.
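As an illustration of the extrema-based pre-filtering scheme, the following sketch marks local maxima at or above a threshold and local minima at or below its negative as candidate impulses. The symmetric threshold and the function name `find_candidate_impulses` are assumptions; the description only states that the extrema meet predefined thresholds.

```python
def find_candidate_impulses(stamped, threshold):
    """Return (time_stamp, amplitude) pairs for thresholded local extrema.

    stamped: list of (time_stamp, value) pairs from one sensor.
    A sample is kept if it is a local maximum with value >= threshold
    or a local minimum with value <= -threshold (assumed symmetric limits).
    """
    candidates = []
    for k in range(1, len(stamped) - 1):
        t, x = stamped[k]
        prev_x = stamped[k - 1][1]
        next_x = stamped[k + 1][1]
        is_max = x > prev_x and x >= next_x and x >= threshold
        is_min = x < prev_x and x <= next_x and x <= -threshold
        if is_max or is_min:
            candidates.append((t, x))
    return candidates


signal = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.9), (0.3, 0.1),
          (0.4, -0.8), (0.5, -0.1), (0.6, 0.3), (0.7, 0.0)]
impulses = find_candidate_impulses(signal, threshold=0.5)
```

Here the peak at 0.2 s and the trough at 0.4 s are kept, while the small maximum at 0.6 s falls below the threshold and is discarded.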
[0034] Next, in step 230, the computing unit 120 is configured to define at least the time stamp and the sensor coordinate assigned to each identified candidate impulse. Furthermore, the computing unit 120 may be configured to define, from the obtained acoustic signal(s) identified as the candidate impulse(s), an amplitude for each identified candidate impulse for purposes to be discussed later. If the identified candidate impulse is arranged to carry information on the sensor identifier, the computing unit may be arranged to determine the sensor coordinate by means of the sensor identifier information. For example, the information on a sensor coordinate may be stored together with a corresponding sensor identifier in a memory accessible by the computing unit 120, from which the sensor coordinate can be queried by means of the sensor identifier. As a result of steps 210, 220 and 230 the computing unit 120 holds information on candidate impulses, which may represent information on an impulse-type event, wherein each candidate impulse is provided at least with a time stamp and a sensor coordinate, and additionally with amplitude information if applicable.
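The lookup of a sensor coordinate by sensor identifier can be sketched as a simple table query. The record type `CandidateImpulse`, the table `SENSOR_COORDINATES`, and the coordinate values are hypothetical and serve only to show the shape of the data produced by step 230.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateImpulse:
    time_stamp: float          # within the common time reference
    sensor_xyz: tuple          # coordinate of the detecting sensor
    amplitude: float = 1.0     # optional amplitude information


# Hypothetical table stored in memory: sensor identifier -> coordinate (metres).
SENSOR_COORDINATES = {
    "110A": (0.0, 0.0, 0.0),
    "110B": (10.0, 0.0, 0.0),
}


def build_candidate(sensor_id, time_stamp, amplitude=1.0):
    """Attach the sensor coordinate to an impulse via its sensor identifier."""
    return CandidateImpulse(time_stamp, SENSOR_COORDINATES[sensor_id], amplitude)


candidate = build_candidate("110B", time_stamp=1.25, amplitude=0.7)
```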
[0035] Next, the computing unit is configured to determine 240, for each candidate impulse, a signal source time in each spatial sub-space based on the known sensor coordinate. In other words, the aim is to determine at which time the candidate impulse would have been produced in each spatial sub-space, i.e. when the event would have happened had it originated in that sub-space. The computation may be performed with the following equation:
$$T_{n,i} = t_i - \frac{\lVert d_n - s_{m,i} \rVert_F}{v_s}$$
wherein
[0036] $T_{n,i}$ is the instant of time when an impulse signal $i$ is produced at the center of the sub-space $n$, where $n \in 1, \ldots, N$;
[0037] $t_i$ is the instant of time (time stamp) when the impulse signal $i$ is detected in a sensor;
[0038] $d_n$ is a vector $d_n = [d_{x,n}\ d_{y,n}\ d_{z,n}]$ describing the spatial position of the center of the $n$-th sub-space in the space of interest;
[0039] $s_{m,i}$ is a vector $s_m = [s_{x,m}\ s_{y,m}\ s_{z,m}]^T$ describing the spatial position of the sensor, where $m \in 1, \ldots, M$;
[0040] $\lVert d_n - s_{m,i} \rVert_F$ denotes the Euclidean distance between $d_n$ and $s_{m,i}$, where $\lVert \cdot \rVert_F$ denotes the Frobenius norm; and
[0041] $v_s$ is the velocity of sound.
[0042] The outcome of the step 240 is that for each candidate impulse it is determined at least a signal source time at the center of each sub-space in the space.
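Step 240 can be sketched as follows, reading the definitions above as: source time equals the detection time stamp minus the propagation delay from the sub-space center to the sensor. The value 343 m/s for the velocity of sound and the function names are assumptions for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, roughly air at 20 degrees C


def source_time(t_i, sensor_xyz, subspace_center, v_s=SPEED_OF_SOUND):
    """Time at which impulse i would have been emitted at the sub-space center."""
    return t_i - math.dist(subspace_center, sensor_xyz) / v_s


def source_times(t_i, sensor_xyz, centers, v_s=SPEED_OF_SOUND):
    """Source time of one candidate impulse in every sub-space."""
    return [source_time(t_i, sensor_xyz, c, v_s) for c in centers]


# An impulse detected at t = 1.0 s by a sensor at the origin.
centers = [(0.0, 0.0, 0.0), (343.0, 0.0, 0.0)]
times = source_times(1.0, (0.0, 0.0, 0.0), centers)
```

For the second center, 343 m away, the impulse would have had to be emitted one second before it was detected.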
[0043] Based on the determination 240 a representation of an origin of the candidate impulses is generated in specified positions in space and time. In other words, the sub-space and the source time are used as parameters in the representation.
[0044] According to a first embodiment of the invention the representation may be generated by mapping the candidate impulses resulting from an event to a time scale on the basis of the determined source times, sub-space by sub-space. The time scale is determined so that it spans a sufficiently long history for new candidate impulses to be mapped, i.e. represented, in each sub-space based on their respective source times in that sub-space. Further, the time scale defined for a sub-space may be divided into a number of bins, as depicted in an exemplified way in
[0045] According to another embodiment of the invention the representation may be generated by establishing a four dimensional (4D) grid, wherein the dimensions are x, y, z coordinates and time t, which time refers to source time. Now, the event(s) detected by one or more sensors, i.e. all candidate impulses, are mapped in the four dimensional grid, which corresponds to the representation.
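A minimal sketch of the 4D mapping is given below. It assumes that the time axis is quantized into bins of a fixed width and that each grid cell is keyed by its spatial center and time-bin index; the sparse `Counter` storage, the bin width, and the 343 m/s sound velocity are illustrative choices, not taken from the description.

```python
import math
from collections import Counter


def map_to_grid(impulses, centers, bin_width, v_sound=343.0):
    """Count candidate impulses per (x, y, z, time-bin) cell.

    impulses: list of (time_stamp, sensor_xyz) pairs.
    centers:  list of sub-space centers (x, y, z).
    Every impulse is mapped into every spatial cell at its source time there.
    """
    grid = Counter()
    for t_i, sensor_xyz in impulses:
        for center in centers:
            t_src = t_i - math.dist(center, sensor_xyz) / v_sound
            grid[(center, math.floor(t_src / bin_width))] += 1
    return grid


# Two sensors, 3.43 m on either side of the origin, both detect at t = 1.031 s.
centers = [(0.0, 0.0, 0.0), (3.43, 0.0, 0.0)]
impulses = [(1.031, (3.43, 0.0, 0.0)), (1.031, (-3.43, 0.0, 0.0))]
grid = map_to_grid(impulses, centers, bin_width=0.004)
```

Only in the cell at the origin do the two impulses share a source-time bin, which is what makes the count in that cell stand out.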
[0046] In response to the generation of the representation it is determined 250, from the generated representation, at least one indication quantity representing a likelihood of an acoustic event of interest taking place in the specified positions in space and time. The indication quantity may be a weighted sum of candidate impulses in the specified positions in space and time. According to a first example of the invention an equal weight is applied in the summing of candidate impulses. According to another example, the weights for candidate impulses are derived from the obtained acoustic signal(s) identified as the candidate impulse(s). Example weights of such indication quantities comprise the amplitudes, the absolute values of the amplitudes and the squared amplitudes.
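The weighting options named above can be sketched as a single function. The function name and the string keys selecting the weighting are assumptions for illustration.

```python
def indication_quantity(amplitudes, weighting="equal"):
    """Weighted sum of the candidate impulses mapped to one position in space and time.

    amplitudes: amplitudes of the impulses that fell into the same bin or cell.
    weighting:  "equal", "amplitude", "absolute" or "squared" (hypothetical keys).
    """
    weights = {
        "equal": lambda a: 1.0,       # every impulse counts once
        "amplitude": lambda a: a,     # signed amplitude
        "absolute": abs,              # absolute value of the amplitude
        "squared": lambda a: a * a,   # squared amplitude
    }
    return sum(weights[weighting](a) for a in amplitudes)


amps = [0.5, -0.5, 2.0]
count = indication_quantity(amps)                 # plain impulse count
energy = indication_quantity(amps, "squared")     # energy-like measure
```

With equal weights the quantity reduces to a count of impulses, whereas the squared weighting favors strong impulses over weak ones.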
[0047] Next, the at least one indication quantity is compared to a predetermined threshold defined for the indication quantity in question. For example, if the candidate impulses are mapped to a time scale, it may be determined whether a predetermined number of candidate impulses with an equal weight are mapped to the time scale defined for the sub-space in question, or lie within a given distance of each other on the time scale. The determination of whether a predetermined number of candidate impulses is mapped to the time scale in a predetermined manner may e.g. be performed by setting a predetermined threshold for that number.
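For the time-scale representation, the comparison can be sketched as counting equally weighted impulses per bin within one sub-space and keeping the bins whose count reaches the threshold. The bin width, the threshold value, and the function name are assumptions for illustration.

```python
import math
from collections import Counter


def bins_meeting_threshold(source_times, bin_width, threshold):
    """Return {bin_index: count} for bins of one sub-space that meet the threshold.

    source_times: source times of all candidate impulses in this sub-space.
    Each impulse is given an equal weight of one.
    """
    counts = Counter(math.floor(t / bin_width) for t in source_times)
    return {b: c for b, c in counts.items() if c >= threshold}


# Three impulses agree on a source time near 1.02 s; a fourth is unrelated.
hits = bins_meeting_threshold([1.021, 1.022, 1.0215, 1.405],
                              bin_width=0.01, threshold=3)
```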
[0048] If the candidate impulses are mapped in the 4D grid, as described above, the predetermined threshold may be defined so that it is determined whether one or more cells defined by x, y and z coordinates and source time within the 4D grid comprise a predetermined number of mapped candidate impulses. Such a situation is depicted in
[0049] In step 270, an indication of a detection of an acoustic event of interest is generated if the at least one indication quantity meets the predetermined threshold defined for the indication quantity in question in at least one sub-space (e.g. in at least one bin within the time scale). Meeting the threshold means that the threshold set for the indication is fulfilled (e.g. exceeded). According to an embodiment of the invention, a special treatment for evaluating whether the threshold is met shall be arranged for the case in which, simultaneously within a time window, an equal number of occurrences exists in multiple bins and that number meets the threshold. The special treatment may e.g. be based on a principle in which the indication is assigned to the bin that is closest to the centre of mass of all occurrences involved in the determination. Similarly, a special treatment in the case of the 4D grid may be established if it turns out that multiple cells within the grid comprise an equal number of occurrences. The generation of the indication of a detection of an acoustic event may further generate more information with respect to the event. More specifically, the indication of a detection of an acoustic event provides useful information in two respects: 1) it is very likely that an event occurred in the indicated space and time, and 2) the candidate impulses involved in the detection are verified as likely resulting from an event of interest.
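The centre-of-mass tie-break can be sketched under one possible reading: the centre of mass is taken as the count-weighted mean bin index over all bins in the time window, and the tied bin nearest to it receives the indication. This interpretation and the function name are assumptions; the description does not define the centre of mass more precisely.

```python
def resolve_tie(bin_counts):
    """Pick one bin when several bins hold the same, highest count.

    bin_counts: {bin_index: count} for the bins within the time window.
    Returns the tied bin closest to the count-weighted mean bin index.
    """
    total = sum(bin_counts.values())
    centre = sum(b * c for b, c in bin_counts.items()) / total
    best = max(bin_counts.values())
    tied = [b for b, c in bin_counts.items() if c == best]
    return min(tied, key=lambda b: abs(b - centre))


# Bins 10 and 14 both hold three occurrences; bin 11 pulls the centre toward 10.
chosen = resolve_tie({10: 3, 11: 1, 14: 3})
```

The same idea carries over to the 4D grid by replacing the scalar bin index with the cell coordinates and a Euclidean distance.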
[0050] When the subset of time-stamped impulses is validated as likely relevant through the proposed method, the impulses may be used e.g. as an input for more sophisticated high-accuracy positioning algorithms, which would otherwise be sensitive to errors in their input data. Furthermore, as the impulses also indicate a common time reference on the raw signals from the various sensors, it is possible to perform signal classification by applying specialized processing to data windows taken from the data buffers of each involved sensor, using the validated impulse time stamps as the common time reference for extracting the respective data windows. Thirdly, the coarse information on the source time and position obtained by the method may be used to extract additional impulses from the data buffers of the sensors that missed the detection in the first phase, i.e. those sensors not included in the set of validated impulses. Hence, further processing of the verified candidate impulses may lead to enhanced positioning, timing, and classification of the acoustic event.
[0051] The space under detection in the method is divided into predetermined sub-spaces, as described above. The division is performed for the purpose of analyzing the events within the space. The size of the sub-spaces may vary, which provides a tool for adjusting the accuracy of the solution. As a rule of thumb, the smaller the sub-spaces are, the more accurate the detection is. In some implementations of the invention it is possible to define multiple sizes for the sub-spaces or even to combine sub-spaces in the method. For example, if an event can be assumed to be unlikely in some spatial area of the space, or irrelevant e.g. for the result, it is possible to combine the sub-spaces locally within the space.
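A uniform division of a box-shaped space into cubic sub-spaces can be sketched as below. The box shape, the cubic cells, and the function name `subspace_centers` are assumptions; the description allows sub-spaces of varying size.

```python
import itertools


def subspace_centers(lower, upper, cell):
    """Centers of cubic sub-spaces of edge length `cell` filling a box.

    lower, upper: opposite corners (x, y, z) of the space under detection.
    Assumes the box extents are (close to) integer multiples of `cell`.
    """
    axes = []
    for lo, hi in zip(lower, upper):
        n = max(1, round((hi - lo) / cell))
        axes.append([lo + (k + 0.5) * cell for k in range(n)])
    return list(itertools.product(*axes))


coarse = subspace_centers((0.0, 0.0, 0.0), (4.0, 2.0, 2.0), cell=2.0)
fine = subspace_centers((0.0, 0.0, 0.0), (4.0, 2.0, 2.0), cell=1.0)
```

Halving the cell size multiplies the number of sub-spaces by eight, which illustrates the trade-off between accuracy and computation.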
[0052] Moreover, in some embodiments of the invention, after an initial round of the method steps 210-270, the at least one sub-space which caused the detection, i.e. generated an indication, may be divided into further, i.e. smaller or finer, sub-spaces, and steps 240-270 may be repeated using this finer sub-space division, i.e. the candidate impulses in the finer sub-spaces are evaluated accordingly. This second iteration is useful either for providing a better estimate of the time and location, or for rejecting irrelevant impulses which did not actually result from the event of interest although they were included in the same initial sub-space by chance. In some implementations at least some of the sub-spaces neighboring the sub-space which generated the indication may be included in the further division.
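The refinement step can be sketched as splitting the detecting sub-space into eight half-size cubes, whose centers are then fed back into steps 240-270. The octant split and the function name `subdivide` are assumptions; the description does not fix how the finer division is made.

```python
import itertools


def subdivide(center, size):
    """Split a cubic sub-space into eight cubes of half the edge length.

    center: (x, y, z) of the sub-space that generated the indication.
    size:   its edge length.
    Returns (child_centers, child_size).
    """
    quarter = size / 4.0
    children = [
        tuple(c + sign * quarter for c, sign in zip(center, signs))
        for signs in itertools.product((-1, 1), repeat=3)
    ]
    return children, size / 2.0


children, child_size = subdivide((1.0, 1.0, 1.0), size=2.0)
```

Repeating the source-time mapping and thresholding over `children` with a correspondingly narrower time bin would give the finer estimate of time and location.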
[0053] The generation of indication may be performed in multiple ways. For example, the computing unit may be configured to provide a sound or a visual effect representing the indication. In some further embodiments the indication may be arranged so that a 3D image is generated on a display, wherein the position of the event producing the indication is illustrated in the space under detection. Alternatively or in addition, the indication may be arranged so that generated information on the event is stored in a memory accessible by the computing unit. The mentioned ways to generate the indication are only examples and the invention is not limited to these examples only.
[0054] The above-described method may be applied in multiple application areas. The solution is especially advantageous for monitoring events in a ball game, such as tennis. In tennis there exist some predetermined events, such as strokes and bounces, within a space, i.e. in a volume within the area of the tennis court. In other words, the aforementioned processing steps may, either directly or indirectly, provide the information necessary for acoustic tracking of events in a ball game such as tennis, making it possible to make line calls, animate single shots and rallies, estimate ball trajectories and shot speeds, etc. These pieces of information may be utilized not only for providing information to the audience of a ball game, but also for further statistical analysis of individual players' style of play in order to develop their game.
[0055] As already explained the
[0056] The computing unit 120 is configured to implement the method as described. The implementation of the method may be achieved by arranging the processor 510 to execute at least some portion of computer program code 521a-521n stored in the memory 520, causing the processor 510, and thus the computing unit 120, to implement one or more method steps as described. The processor 510 is thus arranged to access the memory 520 and to retrieve information therefrom and store information thereto. Moreover, the processor 510 is configured to control the communication through the communication interface 530 with any external unit, such as the sensors. The processor 510 may also be configured to control the output of information, i.e. data. The processor 510 may also be configured to control storing of obtained and determined information. For the sake of clarity, the processor herein refers to any unit suitable for processing information and controlling the operation of the apparatus, among other tasks. The mentioned operations may e.g. be implemented with a microcontroller solution with embedded software. Similarly, the invention is not limited to a certain type of memory only, but any memory type suitable for storing the described pieces of information may be applied in the context of the present invention. Some non-limiting examples of a computing unit 120 as described are a personal computer, a laptop computer, a server, a mobile communication device, a tablet computer, a wrist computer, a specific circuit connectable to another apparatus, device or system, and so on.
[0057] An example of the invention also relates to a non-transitory computer-readable storage medium, which stores at least portions of computer program code, wherein the portions of computer program code are computer-executable to implement the method steps in a computing unit or in a system as described. In general, the computer-readable storage medium may include a storage medium or memory medium, such as magnetic or optical media e.g. disc, DVD/CD-ROM, volatile or non-volatile media, such as RAM. The computer program code may be written in any form of programming language, including compiled or interpreted languages, and the computer program may be deployed in any form, including as a stand-alone program or as a subroutine, element or other unit suitable for use in a computing environment. A computer program code may be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network. This definition comprises also any solutions based on so called cloud computing. The computer program code comprises instructions for causing the computing unit to perform one or more of the method steps as described above.
[0058] A minimum of four sensors suitable for detecting acoustic signals is needed in order to implement the invention as described. In practice, however, it is preferred that the number of sensors is more than four in order to improve the accuracy of the invention. For example, it may be arranged that at least two sensors are positioned on each face defining the volume of interest. In order to improve the accuracy of the present invention it may be arranged so that obtained signals from different predefined sensors are compared, and if a match is found the obtained signal may be considered reliable. Furthermore, it may be arranged that not all sensors, i.e. not the obtained signals from all sensors, are used in the calculations for each sub-space.
[0059] Features described in the preceding description may be used in combinations other than the combinations explicitly described. Although functions have been described with reference to certain features, those functions may be performable by other features whether described or not. Although features have been described with reference to certain embodiments, those features may also be present in other embodiments whether described or not.