H04N13/271

Systems and methods for automatically calibrating multiscopic image capture systems
11496722 · 2022-11-08

A method includes receiving, from a multiscopic image capture system, a plurality of images depicting a scene. The method includes determining, by application of a neural network based on the plurality of images, a disparity map of the scene. The neural network includes a plurality of layers, and the layers include a rectification layer. The method includes determining a matching error of the disparity map based on differences between corresponding pixels of two or more images associated with the disparity map. The method includes back-propagating the matching error to the rectification layer of the neural network. Back-propagating the matching error includes updating one or more weights applied to the rectification layer.
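The matching-error step above can be sketched in code. This is a minimal illustration only, assuming a rectified pair with horizontal disparities and a plain photometric (absolute-difference) error; the function and variable names are invented here, not taken from the patent:

```python
import numpy as np

def disparity_matching_error(left, right, disparity):
    """Photometric matching error for a horizontal-disparity stereo pair.

    Each left-image pixel (y, x) is compared with the right-image pixel
    (y, x - d) that the disparity map says corresponds to it; the mean
    absolute difference over all valid pixels is the scalar error that
    would be back-propagated. Illustrative sketch, not the patent's exact
    formulation.
    """
    h, w = disparity.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.round(xs - disparity).astype(int)  # corresponding column in right image
    valid = (src_x >= 0) & (src_x < w)            # discard out-of-frame matches
    diffs = np.abs(left[valid] - right[ys[valid], src_x[valid]])
    return diffs.mean() if diffs.size else 0.0
```

A correct disparity map drives the error to zero; a wrong one leaves it positive, which is what gives the gradient signal for the rectification-layer weights.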

Scene crop via adaptive view-depth discontinuity
11615583 · 2023-03-28

A method, apparatus, and system provide the ability to crop a three-dimensional (3D) scene. The 3D scene is acquired and includes multiple 3D images (with each image from a view angle of an image capture device) and a depth map for each image. The depth values in each depth map are sorted. Multiple initial cutoff depths are determined for the scene based on the view angles of the images (in the scene). A cutoff relaxation depth is determined based on a jump between depth values. A confidence map is generated for each depth map and indicates whether each depth value is above or below the cutoff relaxation depth. The confidence maps are aggregated into an aggregated model. A bounding volume is generated out of the aggregated model. Points are cropped from the scene based on the bounding volume.
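The cutoff-relaxation and confidence-map steps can be sketched as follows. This is a simplified, hedged reading of the abstract: the exact relaxation rule is not specified there, so the "first gap larger than a threshold" criterion and all names below are the editor's assumptions:

```python
import numpy as np

def cutoff_relaxation_depth(depth_map, initial_cutoff, min_jump):
    """Relax an initial cutoff depth to the nearest large jump in sorted depths.

    Depth values at or beyond the initial cutoff are sorted; the relaxed
    cutoff is placed at the first gap between consecutive values that
    exceeds `min_jump`, i.e. at a view-depth discontinuity separating the
    subject from the background. Illustrative only.
    """
    depths = np.sort(depth_map.ravel())
    beyond = depths[depths >= initial_cutoff]
    gaps = np.diff(beyond)
    jumps = np.nonzero(gaps > min_jump)[0]
    if jumps.size == 0:
        return initial_cutoff          # no discontinuity found; keep initial cutoff
    return beyond[jumps[0]]            # last depth before the first large jump

def confidence_map(depth_map, cutoff):
    """True where the depth value is at or below the cutoff (points to keep)."""
    return depth_map <= cutoff
```

Per-view confidence maps like this would then be aggregated into the model from which the bounding volume is built.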

SYSTEM AND METHOD FOR RECONSTRUCTING A 3D HUMAN BODY USING COMPACT KIT OF DEPTH CAMERAS

The invention presents a system and a method for 3D human reconstruction using a compact kit of depth cameras. Instead of using complex and expensive devices as in traditional methods, the proposed system and method employ a simple, easy-to-install setup to accurately collect the human body shape. The generated model is capable of moving thanks to a skeleton system simulating the human skeleton. The proposed system includes four blocks: Data Collection Block, Point Cloud Standardization Block, Human Digitization Block and Output Block. The proposed method includes five steps: Point Cloud Collecting, Point Cloud Filtering, Point Cloud Calibrating, Point Cloud Optimizing and 3D Human Model Generating.
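The abstract does not specify which filter the Point Cloud Filtering step uses; a common choice for depth-camera clouds is statistical outlier removal, sketched below as an illustration (all names and parameters are the editor's, not the patent's):

```python
import numpy as np

def filter_point_cloud(points, k=8, std_ratio=2.0):
    """Statistical outlier removal: a typical point-cloud filtering step.

    A point is kept if the mean distance to its k nearest neighbours is
    within `std_ratio` standard deviations of the cloud-wide mean of that
    statistic. O(N^2) brute-force distances; fine for small clouds, a
    k-d tree would be used in practice.
    """
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    d.sort(axis=1)
    mean_knn = d[:, 1:k + 1].mean(axis=1)  # column 0 is the self-distance
    keep = mean_knn <= mean_knn.mean() + std_ratio * mean_knn.std()
    return points[keep]
```

Isolated sensor-noise points sit far from their neighbours and are dropped, while the dense body surface survives intact.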

Enclosed multi-view visual media representation

Images may be captured at an image capture device mounted on an image capture device gimbal capable of rotating the image capture device around a nodal point in one or more dimensions. Each of the plurality of images may be captured from a respective rotational position. The images may be captured by a designated camera that is not located at the nodal point in one or more of the respective rotational positions. A designated three-dimensional point cloud may be determined based on the plurality of images. The designated three-dimensional point cloud may include a plurality of points each having a respective position in a virtual three-dimensional space.

Depth and vision sensors for challenging agricultural environments

Provided is a method for three-dimensional imaging of a plant in an indoor agricultural environment having an ambient light power spectrum that differs from the power spectrum of natural outdoor light. The method comprises directing a spatially separated stereo pair of cameras at a scene including the plant, illuminating the scene with a non-uniform pattern provided by a light projector utilizing light in a frequency band having a lower-than-average ambient intensity in the indoor agricultural environment, filtering light entering the image sensors of each of the cameras with filters which selectively pass light in the frequency band utilized by the light projector, capturing an image of the scene with each of the cameras to obtain first and second camera images, and generating a depth map including a depth value corresponding to each pixel in the first camera image.
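The band-selection idea (project where the grow lights are dim, then band-pass filter the cameras to that band so they see mostly the pattern) can be reduced to a one-line criterion over a sampled ambient spectrum. A hedged stand-in, not the patent's actual selection procedure:

```python
import numpy as np

def pick_projector_band(wavelengths_nm, ambient_power):
    """Choose the projector band with the lowest ambient intensity.

    Returns the sampled wavelength whose ambient power is minimal,
    provided it is actually below the spectrum's average (the patent's
    'lower than average ambient intensity' condition); returns None for
    a flat spectrum where no band is dimmer than average.
    """
    power = np.asarray(ambient_power, dtype=float)
    idx = int(np.argmin(power))
    if power[idx] >= power.mean():
        return None  # flat spectrum: no usable dim band
    return wavelengths_nm[idx]
```

Indoor horticultural LEDs are typically dim in the near-infrared, which is why a band around 850 nm tends to win this argmin in practice.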

Systems and methods for temporally consistent depth map generation

Systems and methods are provided for performing temporally consistent depth map generation by implementing acts of obtaining a first stereo pair of images of a scene associated with a first timepoint and a first pose, generating a first depth map of the scene based on the first stereo pair of images, obtaining a second stereo pair of images of the scene associated with a second timepoint and a second pose, generating a reprojected first depth map by reprojecting the first depth map to align the first depth map with the second stereo pair of images, and generating a second depth map that corresponds to the second stereo pair of images using the reprojected first depth map.
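The reprojection step can be sketched with a pinhole model: back-project each pixel to 3-D with the intrinsics, apply the relative pose between the two timepoints, and re-project with z-buffering. A minimal sketch under those assumptions; `K`, `T_1_to_2`, and the function name are the editor's, not the patent's:

```python
import numpy as np

def reproject_depth(depth, K, T_1_to_2):
    """Reproject a depth map from camera pose 1 into camera pose 2.

    Pixels are back-projected to 3-D with intrinsics K, moved by the 4x4
    relative pose T_1_to_2, and re-projected; the output depth map is
    filled by z-buffering (nearest surface wins). Unmapped pixels stay inf.
    """
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T  # 3 x N
    rays = np.linalg.inv(K) @ pix
    pts1 = rays * depth.ravel()                       # 3-D points in camera-1 frame
    pts2 = (T_1_to_2 @ np.vstack([pts1, np.ones(pts1.shape[1])]))[:3]
    z = pts2[2]
    proj = K @ pts2
    u = np.round(proj[0] / z).astype(int)
    v = np.round(proj[1] / z).astype(int)
    out = np.full((h, w), np.inf)
    ok = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    for ui, vi, zi in zip(u[ok], v[ok], z[ok]):
        out[vi, ui] = min(out[vi, ui], zi)            # keep the nearest surface
    return out
```

The reprojected map then serves as a temporal prior when estimating the second depth map, which is what keeps successive depth maps consistent.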

TOUCHLESS PHOTO CAPTURE IN RESPONSE TO DETECTED HAND GESTURES
20230093612 · 2023-03-23

Example systems, devices, media, and methods are described for capturing still images in response to hand gestures detected by an eyewear device that is capturing frames of video data with its camera system. A localization system determines the eyewear location relative to the physical environment. An image processing system detects a hand shape in the video data and determines whether the detected hand shape matches a border gesture or a shutter gesture. In response to a border gesture, the system establishes a border that defines the still image to be captured. In response to a shutter gesture, the system captures a still image from the frames of video data. The system determines a shutter gesture location relative to the physical environment. The captured still image is presented on the display at or near the shutter gesture location, such that the still image appears anchored relative to the physical environment. The captured still image is viewable by other devices that are using the image capture system.