H04N13/271

Active stereo matching for depth applications

A head-mounted device (HMD) is configured to perform depth detection with a stereo camera pair comprising a first camera and a second camera, both of which are configured to detect/capture visible light and IR light. The fields of view of the two cameras overlap to form an overlapping field of view. The HMD also includes an IR dot-pattern illuminator, mounted on the HMD with the cameras, that emits an IR dot-pattern illumination spanning at least a part of the overlapping field of view. The IR dot-pattern illumination adds texture to objects in the environment and enables the HMD to determine depth for those objects, even if they have textureless/smooth surfaces.
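Once the dot pattern provides texture for stereo correspondence, depth recovery reduces to standard triangulation. Below is a minimal Python sketch of that triangulation step, assuming rectified cameras; the focal length, baseline, and disparity values are illustrative, not from the source.

```python
def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth in meters from pixel disparity via z = f * B / d (rectified stereo)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# The IR dot pattern only helps the matcher find the disparity on smooth
# surfaces; the triangulation itself is unchanged.
print(depth_from_disparity(disparity_px=20.0, focal_px=800.0, baseline_m=0.1))  # 4.0
```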

RANGING SYSTEM AND ELECTRONIC APPARATUS

A system includes a processor, a light source controlled by the processor and configured to emit light, and an event-based vision sensor controlled by the processor. The sensor includes a plurality of pixels. At least one of the plurality of pixels includes a photosensor configured to detect incident light and first circuitry configured to output a first signal based on an output from the photosensor. The first signal indicates a change in the amount of incident light. The sensor includes a comparator configured to output a comparison result based on the first signal and at least one of a first reference voltage and a second reference voltage. The processor is configured to selectively apply one of the first reference voltage and the second reference voltage to the comparator based on an operation of the light source.
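The selective application of reference voltages can be sketched as follows. This is an illustrative software model, not the analog comparator circuit itself; the threshold values and the names `select_reference` and `pixel_event` are assumptions.

```python
V_REF_EMITTING = 0.30  # tighter event threshold while the light source is firing
V_REF_IDLE = 0.60      # looser threshold otherwise

def select_reference(light_source_on: bool) -> float:
    """Choose the comparator reference based on the light source's operation."""
    return V_REF_EMITTING if light_source_on else V_REF_IDLE

def pixel_event(delta_log_intensity: float, light_source_on: bool) -> bool:
    """Fire an event when the change in incident light crosses the selected reference."""
    return abs(delta_log_intensity) >= select_reference(light_source_on)

print(pixel_event(0.4, light_source_on=True))   # True
print(pixel_event(0.4, light_source_on=False))  # False
```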

I-TOF PIXEL CIRCUIT FOR BACKGROUND LIGHT SUPPRESSION
20220399385 · 2022-12-15 ·

A pixel circuit for background light suppression includes: a 2-tap pixel circuit including first and second pixel capacitors, first and second storage switches, and first and second transfer switches; an in-pixel sigma delta circuit including a plurality of switching switches and a storage capacitor for storing charge transferred from the first and second pixel capacitors; an adaptive sigma delta controller configured to determine switching states of the plurality of switching switches according to a first state of the first pixel capacitor, or a second state of the second pixel capacitor, or both; and a chopping controller configured to instruct the storage switches and the transfer switches of the 2-tap pixel circuit to be selectively switched according to an output of the adaptive sigma delta controller.
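The benefit of a differential 2-tap readout can be illustrated numerically: ambient background charges both taps equally and cancels in the tap difference, while the demodulated signal survives. This toy model (names and photon counts invented) ignores the sigma-delta and chopping machinery and shows only the common-mode-rejection idea.

```python
def integrate_taps(signal_photons: int, background_photons: int, demod_phase: bool):
    """Accumulate one integration period: the modulated signal lands in the
    in-phase tap, while background splits evenly between both taps."""
    tap_a = background_photons / 2 + (signal_photons if demod_phase else 0)
    tap_b = background_photons / 2 + (0 if demod_phase else signal_photons)
    return tap_a, tap_b

tap_a, tap_b = integrate_taps(signal_photons=100, background_photons=10_000, demod_phase=True)
print(tap_a - tap_b)  # 100.0 -- the large background term cancels in the difference
```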

Robust use of semantic segmentation for depth and disparity estimation

This disclosure relates to techniques for generating robust depth estimations for captured images using semantic segmentation. Semantic segmentation may be defined as a process of creating a mask over an image, wherein pixels are segmented into a predefined set of semantic classes. Such segmentations may be binary (e.g., a ‘person pixel’ or a ‘non-person pixel’) or multi-class (e.g., a pixel may be labelled as ‘person,’ ‘dog,’ ‘cat,’ etc.). As semantic segmentation techniques grow in accuracy and adoption, it is becoming increasingly important to develop flexible methods for integrating segmentation information into existing computer vision applications, such as depth and/or disparity estimation, to yield improved results in a wide range of image capture scenarios. In some embodiments, an optimization framework that employs both semantic segmentation and color regularization in a robust fashion may be used to refine a camera device's initial scene depth/disparity estimates.
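As a rough intuition for how such an optimization might combine the two cues, the 1-D sketch below averages each depth value with its neighbors, weighting each neighbor by shared segment label and color similarity. Real frameworks solve a global optimization over the whole image; every array, weight, and name here is an invented example.

```python
import math

def regularize(depth, labels, colors, lam=0.5, sigma_c=10.0):
    """One smoothing pass: a neighbor contributes only if it shares the pixel's
    semantic label, scaled by a color-similarity weight."""
    out = depth[:]
    for i in range(1, len(depth) - 1):
        num, den = depth[i], 1.0
        for j in (i - 1, i + 1):
            same_seg = 1.0 if labels[j] == labels[i] else 0.0
            w = lam * same_seg * math.exp(-abs(colors[j] - colors[i]) / sigma_c)
            num += w * depth[j]
            den += w
        out[i] = num / den
    return out

depth  = [1.0, 5.0, 1.0, 1.0]  # noisy spike at index 1
labels = [0, 0, 0, 1]          # index 3 belongs to a different semantic class
colors = [100, 100, 100, 200]
print(regularize(depth, labels, colors))  # spike pulled toward its same-class neighbors
```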

Camera device having shiftable optical path

A camera device according to an embodiment of the present invention includes a light output unit that outputs an output light signal to be irradiated to an object, a lens unit that condenses an input light signal reflected from the object, an image sensor that generates an electric signal from the condensed input light signal, and an image processing unit that extracts a depth map of the object using at least one of a time difference and a phase difference between the output light signal and the input light signal received by the image sensor. The lens unit includes an IR (infrared) filter, a plurality of solid lenses disposed on the IR filter, and a liquid lens disposed on or between the plurality of solid lenses. The camera device further includes a first driving unit that controls shifting of the IR filter or the image sensor, and a second driving unit that controls a curvature of the liquid lens. The optical path of the input light signal is repeatedly shifted according to a predetermined rule by one of the first and second driving units, and is shifted according to predetermined control information by the other.
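For the phase-difference branch of the depth extraction, the standard indirect-ToF conversion from measured phase to distance is d = c·φ / (4π·f_mod). A minimal sketch, with the modulation frequency chosen as an example rather than taken from the source:

```python
import math

C = 299_792_458.0  # speed of light, m/s

def depth_from_phase(phase_rad: float, f_mod_hz: float) -> float:
    """Indirect-ToF depth: d = c * phi / (4 * pi * f_mod)."""
    return C * phase_rad / (4 * math.pi * f_mod_hz)

# At 20 MHz modulation, a phase of pi corresponds to half the unambiguous range.
print(round(depth_from_phase(math.pi, 20e6), 3))  # 3.747 (meters)
```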

IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND STORAGE MEDIUM
20220392092 · 2022-12-08 ·

An image processing apparatus includes a unit (input unit) configured to acquire image data and depth information corresponding to the image data, a unit (layer division image generation unit) configured to generate layer division image data based on the depth information by dividing the image data into a plurality of layers depending on a subject distance, and a unit (output unit) configured to output the layer division image data. The layer division image data includes image data of a first layer including image data corresponding to a subject at a subject distance less than a first distance, and image data of a second layer including image data corresponding to a subject at a subject distance greater than or equal to the first distance. The first distance changes based on the depth information.
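The two-layer division can be sketched directly from the description: pixels closer than the first distance go to the first layer, the rest to the second. Function and variable names below are illustrative, not from the source.

```python
def divide_layers(image, depth, first_distance):
    """Split image data into two layers by subject distance; positions not
    owned by a layer are marked None."""
    layer1 = [[px if d < first_distance else None for px, d in zip(img_row, d_row)]
              for img_row, d_row in zip(image, depth)]
    layer2 = [[px if d >= first_distance else None for px, d in zip(img_row, d_row)]
              for img_row, d_row in zip(image, depth)]
    return layer1, layer2

img  = [[10, 20], [30, 40]]
dist = [[0.5, 2.0], [1.5, 0.9]]
near, far = divide_layers(img, dist, first_distance=1.0)
print(near)  # [[10, None], [None, 40]]
print(far)   # [[None, 20], [30, None]]
```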

GRASP LEARNING USING MODULARIZED NEURAL NETWORKS
20220388162 · 2022-12-08 ·

A method modularizes high-dimensional neural networks into neural networks of lower input dimension. The method is suited to generating full-DOF robot grasping actions based on images of parts to be picked. In one example, a first network encodes grasp positional dimensions and a second network encodes rotational dimensions. The first network is trained to predict a position at which grasp quality is maximized for any value of the grasp rotations. The second network is trained to identify the maximum grasp quality while searching only at the position found by the first network. Thus, the two networks collectively identify an optimal grasp while each network's search space is reduced. Many grasp positions and rotations can be evaluated with a search cost that scales as the sum, rather than the product, of the evaluated positions and rotations. Dimensions may be separated in any suitable fashion, including across three neural networks in some applications.
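The sum-versus-product search saving can be illustrated with a toy quality function standing in for the trained networks. Here the first stage's position prediction is simulated by brute force; in the described method a trained network outputs it directly, so the total cost is roughly the sum of the position and rotation grid sizes rather than their product. All numbers and names are invented for illustration.

```python
def grasp_quality(pos: int, rot: int) -> float:
    """Stand-in for a learned grasp-quality predictor; peaks at pos=3, rot=5."""
    return -((pos - 3) ** 2) - ((rot - 5) ** 2)

positions, rotations = range(10), range(12)

# Stage 1: pick the position whose best-over-rotations quality is highest
# (a trained first network would predict this without the inner max).
best_pos = max(positions, key=lambda p: max(grasp_quality(p, r) for r in rotations))

# Stage 2: search rotations only at that position.
best_rot = max(rotations, key=lambda r: grasp_quality(best_pos, r))

print(best_pos, best_rot)  # 3 5
```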