Patent classifications: H04N13/271
MULTI-CHANNEL DEPTH ESTIMATION USING CENSUS TRANSFORMS
A depth estimation system is described that is capable of determining depth information using two images from two cameras. A first camera captures a first image and a second camera captures a second image, both images including a plurality of light channels. A scan direction is selected from a plurality of scan directions. For the selected scan direction, along each of a plurality of scanlines, the system compares pixels from the first image to pixels from the second image. The comparison is based on calculating a census transform for each pixel in the first image and a census transform for each pixel in the second image. This comparison is used to determine a stereo correspondence between the pixels in the first image and the pixels in the second image. The system generates a depth map based on the stereo correspondence.
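As a rough illustration of the technique the abstract names, the sketch below computes a 3x3 census transform per pixel and matches pixels along each scanline by Hamming distance over the census bit strings. The window size, maximum disparity, and single-channel input are assumptions; a real implementation would repeat the matching per light channel and per scan direction, as the claims describe.

```python
import numpy as np

def census_transform(img, window=3):
    """Census transform: each pixel becomes a bit string encoding whether
    each neighbor in the window is darker than the center pixel."""
    r = window // 2
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.uint32)  # 3x3 window -> 8 bits per pixel
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue
            # np.roll wraps at the borders; a production version would pad instead
            shifted = np.roll(np.roll(img, -dy, axis=0), -dx, axis=1)
            out = (out << 1) | (shifted < img)
    return out

def disparity_scanline(census_l, census_r, max_disp=64):
    """For each pixel on each scanline, pick the disparity whose right-image
    census descriptor has the smallest Hamming distance to the left pixel's."""
    h, w = census_l.shape
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(h):
        for x in range(w):
            best_cost, best_d = 2**31, 0
            for d in range(min(max_disp, x + 1)):
                # Hamming distance between the two census bit strings
                cost = bin(int(census_l[y, x]) ^ int(census_r[y, x - d])).count("1")
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp  # depth is proportional to baseline * focal_length / disparity
```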
Deep Learning-Based Three-Dimensional Facial Reconstruction System
A 3D facial reconstruction system includes a main color range camera, a plurality of auxiliary color cameras, a processor and a memory. The main color range camera is arranged at a front angle of a reference user to capture a main color image and a main depth map of the reference user. The plurality of auxiliary color cameras are arranged at a plurality of side angles of the reference user to capture a plurality of auxiliary color images of the reference user. The processor executes instructions stored in the memory to generate a 3D front angle image according to the main color image and the main depth map, generate 3D side angle images according to the 3D front angle image and the plurality of auxiliary color images, and train an artificial neural network model according to a training image, the 3D front angle image and 3D side angle images.
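The abstract does not spell out how the 3D front angle image is formed from the main color image and the main depth map. A common construction, sketched below under the assumption of a pinhole camera model with known intrinsics (all parameter names are illustrative), back-projects each depth pixel into a colored 3D point:

```python
import numpy as np

def backproject_to_point_cloud(depth, color, fx, fy, cx, cy):
    """Back-project a depth map into a colored 3D point cloud using pinhole
    intrinsics (fx, fy = focal lengths in pixels; cx, cy = principal point).
    depth: (H, W) metric depth; color: (H, W, 3) RGB."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx  # pinhole model: X = (u - cx) * Z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    colors = color.reshape(-1, 3)
    valid = points[:, 2] > 0  # drop pixels with no depth reading
    return points[valid], colors[valid]
```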
Dynamic parallax correction for visual sensor fusion
An augmented reality (AR) vision system is disclosed. A display is configured to present a surrounding environment to eyes of a user of the AR vision system. A depth tracker is configured to produce a measurement of a focal depth of a focus point in the surrounding environment. Two or more image sensors receive illumination from the focus point and generate a respective image. A controller receives the measurement of the focal depth, generates an interpolated look-up-table (LUT) function by interpolating between two or more precalculated LUTs, applies the interpolated LUT function to the images to correct a parallax error and a distortion error at the measured focal depth, generates a single image of the surrounding environment, and displays the single image to the user.
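A minimal sketch of the LUT interpolation step, assuming each precalculated LUT is a per-pixel source-coordinate map keyed to a calibration depth and that OpenCV's remap is used to apply it. The bracketing-and-blending scheme and all names are assumptions, not the patent's implementation:

```python
import numpy as np
import cv2  # OpenCV's remap applies a per-pixel coordinate LUT to an image

def interpolate_lut(luts, depths, focal_depth):
    """Linearly interpolate between the two precalculated LUTs that bracket
    the measured focal depth. Each LUT is an (H, W, 2) array of source
    coordinates correcting parallax and distortion at its calibration depth;
    depths must be sorted ascending."""
    depths = np.asarray(depths)
    i = np.searchsorted(depths, focal_depth).clip(1, len(depths) - 1)
    d0, d1 = depths[i - 1], depths[i]
    t = (focal_depth - d0) / (d1 - d0)  # blend weight; extrapolates outside range
    return (1 - t) * luts[i - 1] + t * luts[i]

def apply_lut(image, lut):
    """Warp a sensor image through the interpolated coordinate LUT."""
    map_x = lut[..., 0].astype(np.float32)
    map_y = lut[..., 1].astype(np.float32)
    return cv2.remap(image, map_x, map_y, interpolation=cv2.INTER_LINEAR)
```

The corrected images from the two or more sensors would then be fused into the single displayed image; that fusion step is not detailed in the abstract.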
Three-Dimensional Tracking Using Hemispherical or Spherical Visible Light-Depth Images
Three-dimensional tracking includes obtaining a hemispherical visible light-depth image capturing an operational environment of a user device. Obtaining the hemispherical visible light-depth image includes obtaining a hemispherical visible light image and obtaining a hemispherical non-visible light depth image. Three-dimensional tracking includes generating a perspective converted hemispherical visible light-depth image, which includes generating a perspective converted hemispherical visible light image and generating a perspective converted hemispherical non-visible light depth image. Three-dimensional tracking includes generating object identification and tracking data representing an external object in the operational environment based on the perspective converted hemispherical visible light-depth image, and outputting the object identification and tracking data.
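The abstract does not define the hemispherical projection. The sketch below assumes an equidistant fisheye model and resamples it into a pinhole perspective view, which is one plausible reading of the perspective conversion; the same per-pixel mapping would be applied to both the visible light image and the depth image.

```python
import numpy as np

def hemispherical_to_perspective(fisheye, out_size=512, fov_deg=90.0):
    """Resample an equidistant-fisheye (hemispherical) image into a pinhole
    perspective view looking along the optical axis. Assumes the fisheye
    covers a 180-degree hemisphere centered in the square input image."""
    h_in, w_in = fisheye.shape[:2]
    f = (out_size / 2) / np.tan(np.radians(fov_deg) / 2)  # perspective focal length
    u, v = np.meshgrid(np.arange(out_size), np.arange(out_size))
    # Ray direction for each output pixel in camera coordinates
    x, y = u - out_size / 2, v - out_size / 2
    z = np.full_like(u, f, dtype=float)
    theta = np.arctan2(np.hypot(x, y), z)  # angle from the optical axis
    phi = np.arctan2(y, x)                 # azimuth around the axis
    # Equidistant model: image radius grows linearly with theta
    r = theta / (np.pi / 2) * (min(h_in, w_in) / 2)
    src_u = (w_in / 2 + r * np.cos(phi)).astype(int).clip(0, w_in - 1)
    src_v = (h_in / 2 + r * np.sin(phi)).astype(int).clip(0, h_in - 1)
    return fisheye[src_v, src_u]  # nearest-neighbor resampling for brevity
```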
DYNAMIC DEPTH DETERMINATION
A depth camera assembly (DCA) determines depth information for a local area. The DCA includes a plurality of cameras and at least one illuminator. The DCA dynamically determines depth sensing modes (e.g., passive stereo, active stereo, structured stereo) based in part on the surrounding environment and/or user activity. The DCA uses the depth information to update a depth model describing the local area. The DCA may determine that a portion of the depth information associated with some portion of the local area is not accurate. The DCA may then select a different depth sensing mode for that portion of the local area and update the depth model with the additional depth information. In some embodiments, the DCA may update the depth model by utilizing a machine learning model to generate a refined depth model.
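As an illustrative sketch only: the abstract does not give the selection criteria beyond "surrounding environment and/or user activity", so the signals and thresholds below (ambient light, texture, confidence) are assumptions about what such a mode selector might weigh for a region of the local area.

```python
from enum import Enum, auto

class DepthMode(Enum):
    PASSIVE_STEREO = auto()    # cameras only, no illumination
    ACTIVE_STEREO = auto()     # cameras plus projected illumination
    STRUCTURED_STEREO = auto() # known pattern projected onto the scene

def select_mode(ambient_light, texture_score, confidence):
    """Pick a depth sensing mode for one region; all inputs are assumed to
    be normalized to [0, 1] and the thresholds are purely illustrative."""
    if confidence >= 0.8:
        return None                        # current depth estimate is accurate enough
    if ambient_light > 0.5 and texture_score > 0.5:
        return DepthMode.PASSIVE_STEREO    # bright, textured regions match well unaided
    if ambient_light > 0.5:
        return DepthMode.ACTIVE_STEREO     # add illumination for textureless regions
    return DepthMode.STRUCTURED_STEREO     # dark regions: project a known pattern
```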
VEHICLE EXTERIOR ENVIRONMENT RECOGNITION APPARATUS
A vehicle exterior environment recognition apparatus includes a monocular distance calculator, a relaxation distance calculator, and an updated distance calculator. The monocular distance calculator calculates a monocular distance of a three-dimensional object from a luminance image generated by an imaging unit. The relaxation distance calculator calculates a relaxation distance of the three-dimensional object from two luminance images generated by two imaging units, based on a degree of image matching between the two luminance images that is determined using a threshold more lenient than the threshold used when determining the degree of image matching to generate a stereo distance of the three-dimensional object. The updated distance calculator calculates an updated distance of the three-dimensional object by mixing the monocular distance and the relaxation distance at a predetermined ratio.
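The mixing step reduces to a weighted average. A minimal sketch, where the default ratio value is an assumption since the abstract only states that the ratio is predetermined:

```python
def updated_distance(monocular, relaxation, ratio=0.5):
    """Mix the monocular and relaxation distances at a predetermined ratio.
    ratio=0.5 is illustrative; the patent does not specify the value."""
    return ratio * monocular + (1.0 - ratio) * relaxation
```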
METHODS, SYSTEMS, AND MEDIA FOR GENERATING AN IMMERSIVE LIGHT FIELD VIDEO WITH A LAYERED MESH REPRESENTATION
Mechanisms for generating compressed images are provided; more particularly, methods, systems, and media for capturing, reconstructing, compressing, and rendering view-dependent immersive light field video with a layered mesh representation.