DETECTION METHODS TO DETECT OBJECTS SUCH AS BICYCLISTS
20230017357 · 2023-01-19
Inventors
Cpc classification
G06V20/58
PHYSICS
International classification
Abstract
To reliably detect an object such as a bicycle at increased range, an Advanced Driving Support System uses a deep neural network(s) to process an ambient (grey-scale) image into an object that is then tracked by a second range detection camera. Most objects of interest, such as bicycles and automobiles, are outfitted with one or more retroreflectors that are used to cue the neural network to the object of most interest. As the retroreflectors also tend to saturate the range detection camera, a method is used to manage the saturation and estimate the correct range to the object.
Claims
1. A method for reliably detecting an object such as a bicycle at increased range comprising: a sensor, and a deep neural network(s) operatively coupled to the sensor, the deep neural network(s) configured to process an ambient (grey-scale) image from the sensor.
2. The method of claim 1 further comprising a combiner that combines an output of the deep neural network(s) with distance information determined by the at least one processor.
3. The method of claim 1 further including at least one processor that processes captured retro-reflector information to determine the distance information based on neighboring pixels.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION OF EXAMPLE NON-LIMITING EMBODIMENTS
[0032] A main difficulty in detecting human-sized objects such as a bicyclist is that a bicyclist or person appears small in the image when located 20 m away from the sensor. This is particularly true for sensors that develop a 3D point cloud, which tend to be much more limited in resolution than standard RGB cameras. This trade-off, however, is usually beneficial in that a 3D camera can provide direct distance measurements at a high frame rate, whereas a monocular RGB camera can only infer distance, normally using AI techniques such as neural networks that to date cannot operate at high frame rates.
[0033] The general approach follows the process flow shown in
[0034] The
[0035] Fortunately, there are two extra pieces of information that can be leveraged to detect a bicyclist from 20 m away. First, it is quite common that there are retroreflectors on either the bicycle or the clothing of the rider. In that case, due to their high-amplitude return, the retroreflectors can in general be detected farther away. By detecting dynamic retroreflectors, we can at least identify potential threats approaching the camera.
[0036] The amplitude map containing defined objects is then merged with the corrected 3D point cloud to determine which points correspond to the object(s) of interest. These may then be tracked as defined objects.
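By way of non-limiting illustration, the merge step may be sketched as follows. The sketch assumes an organized point cloud (one 3D point per pixel, aligned with the amplitude map) and assumes each defined object is given as a pixel bounding box. The names `points_in_box` and `centroid`, the box format, and the sample data are hypothetical and are not taken from the disclosure.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]   # (x, y, z) in metres
Box = Tuple[int, int, int, int]      # (row_min, col_min, row_max, col_max), inclusive


def points_in_box(cloud: List[List[Optional[Point]]], box: Box) -> List[Point]:
    """Return the valid 3D points whose pixels fall inside a bounding box.

    `cloud[r][c]` is the 3D point measured at pixel (r, c), or None where
    the sensor returned no valid range (assumed convention).
    """
    r0, c0, r1, c1 = box
    selected = []
    # Clamp the box to the image so partially visible objects still work.
    for r in range(max(r0, 0), min(r1, len(cloud) - 1) + 1):
        row = cloud[r]
        for c in range(max(c0, 0), min(c1, len(row) - 1) + 1):
            if row[c] is not None:
                selected.append(row[c])
    return selected


def centroid(points: List[Point]) -> Point:
    """Mean position of an object's points, usable as a simple track state."""
    n = len(points)
    return (sum(p[0] for p in points) / n,
            sum(p[1] for p in points) / n,
            sum(p[2] for p in points) / n)


# Hypothetical 3x4 organized cloud: background at 30 m, object at 20 m,
# and one dropout pixel inside the object's box.
cloud = [
    [(0.0, 0.0, 30.0), (0.1, 0.0, 30.0), (0.2, 0.0, 30.0), (0.3, 0.0, 30.0)],
    [(0.0, 0.1, 30.0), (0.1, 0.1, 20.0), (0.2, 0.1, 20.0), (0.3, 0.1, 30.0)],
    [(0.0, 0.2, 30.0), (0.1, 0.2, 20.0), None,             (0.3, 0.2, 30.0)],
]
object_points = points_in_box(cloud, (1, 1, 2, 2))
object_centroid = centroid(object_points)
```

The centroid of the selected points is one simple quantity a tracker could follow from frame to frame; other track states are equally possible.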
[0037] Use Retroreflector Information
[0038] A retroreflector can easily saturate the pixels it occupies and even bloom out to the neighboring pixels. From previous experience, the distance returns of the saturated pixels are erroneous, but their distance can be inferred from the neighboring pixels. By using the neighboring pixel information, distance information of the saturated pixels can be reconstructed as described below.
[0039] After applying saturation compensation, the retroreflectors on the bicycle become visible at a farther distance. In one bicycle capture dataset, the retroreflector was detected at ~25 m away, as
[0040] Use Grayscale Image Information
[0041] The
[0042] The present technology also outputs the ambient (gray-scale) image, and any general purpose pretrained object detection model can be applied to identify objects in the gray-scale image.
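As a minimal sketch of preparing the ambient image for such a model: many pretrained detectors expect three-channel 8-bit input, which is an assumption about the chosen model and is not stated in the disclosure. Under that assumption, the single-channel image can be min-max scaled and copied into three channels. The function name `gray_to_rgb8` and the sample values are hypothetical.

```python
from typing import List


def gray_to_rgb8(gray: List[List[float]]) -> List[List[List[int]]]:
    """Min-max scale a single-channel image to 0..255 and copy it into 3 channels."""
    flat = [v for row in gray for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant image
    out = []
    for row in gray:
        out_row = []
        for v in row:
            g = int(round(255 * (v - lo) / span))
            out_row.append([g, g, g])  # identical R, G, B values
        out.append(out_row)
    return out


# Hypothetical 2x2 ambient image in raw sensor counts.
ambient = [[100.0, 200.0], [300.0, 500.0]]
rgb = gray_to_rgb8(ambient)
```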
[0043] Saturation Compensation Method
[0044] Saturation compensation processes the raw amplitude (
[0045] If the light intensity received by any pixel exceeds its capacity, the phase shift between the emitted and received signals cannot be accurately determined, and thus the calculated distance will be inaccurate.
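For context, and assuming a continuous-wave time-of-flight sensor (the disclosure does not name the modulation scheme), the standard relation between the measured phase shift and distance is given below. The symbols $\Delta\varphi$ (phase shift), $f_{\mathrm{mod}}$ (modulation frequency), and $d_{\max}$ (unambiguous range) are introduced here for illustration only.

```latex
d = \frac{c \,\Delta\varphi}{4\pi f_{\mathrm{mod}}},
\qquad
d_{\max} = \frac{c}{2 f_{\mathrm{mod}}}
```

Under this relation, any error in $\Delta\varphi$ maps linearly to an error in $d$, which is why a phase estimate corrupted by saturation yields an erroneous distance.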
[0046] A main problem due to saturation is that it will disrupt the shape completeness of the observed object, especially when the object has a significant area covered by retroreflectors (as shown in
[0047] In one embodiment, the saturation compensation method consists of the following steps as shown in
[0048] 1. Isolate the saturated pixels: find all pixels with amplitude values greater than 2048 (or whatever the highest threshold value may be for a given sensor) (block 1002)
[0049] 2. Find the border of the saturated region (block 1004)
[0050] 3. Calculate the mean distance on the border (block 1006)
[0051] 4. Replace the distance values of each saturated region with its corresponding border mean value (block 1008). The assumption here is that the saturated region should have a similar distance to its unsaturated border region.
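The four steps above can be sketched as follows. This is a dependency-free illustration under stated assumptions, not a definitive implementation: saturated regions are taken to be 4-connected groups of pixels, the border is taken to be the unsaturated pixels adjacent to a region, and the function name `compensate_saturation` and the sample data are hypothetical. The threshold of 2048 is the example value given in the text and is sensor-dependent.

```python
from collections import deque

SATURATION_THRESHOLD = 2048  # example value from the text; sensor-dependent


def compensate_saturation(amplitude, distance, threshold=SATURATION_THRESHOLD):
    """Return a copy of `distance` with each saturated region set to its border mean."""
    rows, cols = len(amplitude), len(amplitude[0])
    # Step 1: isolate the saturated pixels.
    saturated = [[amplitude[r][c] > threshold for c in range(cols)]
                 for r in range(rows)]
    visited = [[False] * cols for _ in range(rows)]
    result = [row[:] for row in distance]
    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-connectivity (assumption)

    for r in range(rows):
        for c in range(cols):
            if not saturated[r][c] or visited[r][c]:
                continue
            # Flood-fill one connected saturated region.
            region, border = [], set()
            queue = deque([(r, c)])
            visited[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in neighbours:
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < rows and 0 <= nx < cols):
                        continue
                    if saturated[ny][nx]:
                        if not visited[ny][nx]:
                            visited[ny][nx] = True
                            queue.append((ny, nx))
                    else:
                        # Step 2: unsaturated neighbours form the region's border.
                        border.add((ny, nx))
            if not border:
                continue  # entire image saturated: nothing to infer from
            # Step 3: mean distance on the border.
            mean_d = sum(distance[y][x] for y, x in border) / len(border)
            # Step 4: replace the region's distances with the border mean.
            for y, x in region:
                result[y][x] = mean_d
    return result


# Hypothetical frame: a two-pixel retroreflector with erroneous ranges,
# surrounded by valid returns at 25 m.
amplitude = [
    [100, 100, 100, 100],
    [100, 3000, 3000, 100],
    [100, 100, 100, 100],
]
distance = [
    [25.0, 25.0, 25.0, 25.0],
    [25.0, 3.0, 60.0, 25.0],
    [25.0, 25.0, 25.0, 25.0],
]
corrected = compensate_saturation(amplitude, distance)
```

In this example both saturated pixels are replaced with 25.0 m, the mean of their unsaturated neighbours, while the input distance map is left unmodified.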
[0052] Two example object detection deep learning models may be used:
[0053] YOLOv5 (“You only look once”) is an object detection algorithm that divides images into a grid system. Each cell in the grid is responsible for detecting objects within itself. YOLO is one of the most famous object detection algorithms due to its speed and accuracy. The output includes the bounding boxes, class names, and confidences of the detected objects in the image. It has several variants (YOLOv5s, YOLOv5m, YOLOv5l, etc.), and each variant has its own pros and cons.
[0054] Masked-RCNN: it is a deep neural network (see
[0055] Preliminary results using the neural network models to detect a bicyclist may be as follows:
TABLE 3 Preliminary Results
Model | Bicyclist Detection Range | Runtime
YOLOv5s | 15.0 m | 102 fps
YOLOv5m | 15.0 m | 76 fps
YOLOv5l | 19.6 m | 50 fps
Masked-RCNN | 19.5 m | 10 fps
[0056] Combining Retroreflector and Ambient Image Information
[0057] When the processor combines the objects detected by the retro-reflector method and the neural-network method, the redundancy of the detection sources makes bicycle detection more reliable. The detection ranges can be summarized in the following table:
TABLE 4 Example Detection and Classification Ranges
Data Source | Detection Range | Classification Range | Runtime
Retro-reflector approach | 25 m | n/a | ~100 fps
Neural-network approach (YOLOv5s) | 15 m | 15 m | ~100 fps
Combined | 25 m | 15 m | ~50 fps
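One way such a combination might be sketched is shown below. It assumes each retroreflector detection carries a pixel location and a saturation-compensated range, and that each neural-network detection carries a pixel bounding box, a class label, and a confidence. All class names, field names, and sample values are hypothetical. Neural-network detections that contain no retroreflector are omitted from this sketch for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class RetroDetection:
    pixel: Tuple[int, int]  # (x, y) centre of the retroreflector blob
    range_m: float          # saturation-compensated range in metres


@dataclass
class NNDetection:
    box: Box
    label: str
    confidence: float


@dataclass
class FusedObject:
    range_m: float
    label: Optional[str]         # None: detected but not yet classified
    confidence: Optional[float]


def fuse(retro: List[RetroDetection], nn: List[NNDetection]) -> List[FusedObject]:
    """Attach a class label to each retroreflector that lies inside a detector box."""
    fused = []
    for det in retro:
        x, y = det.pixel
        # Neural-network boxes that contain this retroreflector.
        hits = [n for n in nn
                if n.box[0] <= x <= n.box[2] and n.box[1] <= y <= n.box[3]]
        if hits:
            best = max(hits, key=lambda n: n.confidence)
            fused.append(FusedObject(det.range_m, best.label, best.confidence))
        else:
            # Beyond classification range: still reported as an approaching object.
            fused.append(FusedObject(det.range_m, None, None))
    return fused


# Far frame: only the retroreflector is visible, so the object is unclassified.
far = fuse([RetroDetection((50, 40), 24.0)], [])
# Near frame: the detector also fires, so the object gains a class label.
near = fuse([RetroDetection((50, 40), 14.0)],
            [NNDetection((30, 20, 70, 80), "bicycle", 0.8)])
```

The design choice here is to let the retroreflector detection supply the range and the neural network supply the class, so an object can be reported before it is close enough to classify.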
[0058] The combined method can detect an approaching object (with retro-reflectors) from 25 m away, and can identify the class of the object from 15 m away with a frame rate of ~50 fps.
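One reading consistent with Table 4, offered as an assumption because the disclosure does not state how the two approaches are scheduled, is that they run one after the other on each frame, so that their per-frame processing times add:

```latex
t_{\text{combined}} = \frac{1}{100\ \text{fps}} + \frac{1}{100\ \text{fps}}
                    = 10\ \text{ms} + 10\ \text{ms} = 20\ \text{ms}
\quad\Longrightarrow\quad
f_{\text{combined}} = \frac{1}{20\ \text{ms}} = 50\ \text{fps}
```

If the two approaches instead ran in parallel, the combined rate would be bounded by the slower of the two rather than by their sum.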
[0059] While the invention has been described in connection with what is presently considered to be the most practical and preferred embodiments, it is to be understood that the invention is not to be limited to the disclosed embodiments, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.