Image processing system, and image processing method
09633439 · 2017-04-25
Assignee
- National Institute Of Advanced Industrial Science And Technology (Tokyo, JP)
- Kagoshima University (Kagoshima-Shi, Kagoshima, JP)
- Hiroshima City University (Hiroshima-shi, JP)
Inventors
Cpc classification
G01B11/2545
PHYSICS
G06T7/521
PHYSICS
International classification
Abstract
A high-density shape reconstruction is conducted in measuring animal bodies as well. An image processing system has a projection device, an imaging device, and an image processing apparatus connected to the projection device and the imaging device, wherein the projection device projects a projected pattern to an observation target, the imaging device captures the projected pattern, and the image processing apparatus performs shape reconstruction based on an input image including the projected pattern. The image processing apparatus includes a unit for fetching the input image captured by the imaging device and performing line detection for the projected pattern projected by the projection device, wherein the projected pattern is a grid pattern formed of wave lines; and a unit for performing shape reconstruction by associating intersection points of vertical and horizontal lines extracted by the line detection with the projected pattern.
Claims
1. An image processing system comprising: a projection device for projecting a projected pattern to an observation target; an imaging device for capturing the projected pattern; and an image processing apparatus connected to the projection device and the imaging device, for performing shape reconstruction based on an input image including the projected pattern, the image processing apparatus including a personal computer configured to: fetch the input image captured by the imaging device and performing line detection for the projected pattern projected by the projection device, wherein the projected pattern is a grid pattern formed of wave lines, the wave lines are wavy curves having predetermined periodicity, the grid pattern formed of the wave lines is formed of a plurality of wave lines that are arranged at predetermined intervals, the grid pattern is a set of wave lines that intersect each other in two directions, and the interval of the wave lines in one of the directions is not equal to an integral multiple of a wavelength for the wave line in the other direction; and perform shape reconstruction by associating intersection points of vertical and horizontal lines extracted by the line detection with the projected pattern.
2. The image processing system according to claim 1, wherein the personal computer is further configured to: reproject a patch to an image output by the projection device, wherein the patch obtained by approximating, to a tangent plane, a region around each intersection point of an input image that is captured by the imaging device; calculate energy for stereo matching between each intersection point of the reprojected patch and a correspondence candidate for a grid point of the projected pattern projected by the projection device by employing a sum of a data term assigned to each grid point and a regularization term obtained between the grid point and an adjacent grid point, wherein the grid point is an intersection point of the wave lines in two directions of the grid pattern; and perform shape reconstruction by associating a grid point with the projected pattern, wherein the grid point is a correspondence candidate having a minimum value of energy for stereo matching among the correspondence candidate.
3. The image processing system according to claim 1, wherein the personal computer is further configured to: create a triangular mesh consisting of three pixel samples and calculate a depth of each sub-pixel; and calculate, for all of the pixel samples, an error that occurs when the triangular mesh is re-projected to an output image of the projection device, minimize the error obtained, and perform linear interpolation for depths of pixels other than the pixel samples.
4. The image processing system according to claim 2, wherein the projection device includes first and second imaging devices; and the personal computer is further configured to select the correspondence candidate by adding a regularization term for the grid point that is obtained between the first and second imaging devices to energy for stereo matching of the correspondence candidates.
5. The image processing system according to claim 4, wherein the personal computer is further configured to employ an average to merge a depth for each pixel that is obtained, for the grid point, between the first and second imaging devices.
6. The image processing system according to claim 1, wherein the projection device includes first and second projection devices; and the personal computer is further configured to optimize a depth of each pixel, for grid points for which matching is obtained between a first projected pattern projected by the first projection device and a second projected pattern projected by the second projection device, wherein the grid points are intersection points of the wave lines in two directions of the grid pattern.
7. The image processing system according to claim 1, wherein the personal computer is further configured to perform shape reconstruction by calculating, for a plurality of positions around grid points being intersection points of the wave lines in two directions of the grid pattern, a difference between the projected pattern of the grid points and a result obtained through the line detection, and by employing the result as a matching cost for a correspondence candidate to associate and associating a grid point that is a minimum correspondence candidate with the projected pattern.
8. The image processing system according to claim 1, wherein, when the projected pattern is projected to the observation target, a parameter for the projected pattern is selected by comparing degrees of similarity for two arbitrary intersection points on the same epipolar line so that a degree of similarity becomes minimum.
9. An image processing method of performing shape reconstruction based on an input image including a projected pattern in an image processing apparatus connected to a projection device and an imaging device, wherein the projection device projects a projected pattern to an observation target, and the imaging device captures the projected pattern, the method comprising the steps of: fetching, by the image processing apparatus, the input image captured by the imaging device, and performing line detection for the projected pattern projected by the projection device, wherein the projected pattern is a grid pattern formed of wave lines, the wave lines are wavy curves having predetermined periodicity, the grid pattern formed of the wave lines is formed of a plurality of wave lines that are arranged at predetermined intervals, the grid pattern is a set of wave lines that intersect each other in two directions, and the interval of the wave lines in one of the directions is not equal to an integral multiple of a wavelength for the wave line in the other direction; and performing, by the image processing apparatus, shape reconstruction by associating intersection points of vertical and horizontal lines extracted by the line detection with the projected pattern.
10. A non-transitory computer readable storage medium having a computer program stored therein, said computer program including computer executable commands enabling an imaging device to perform shape reconstruction based on an input image including a projected pattern in an image processing apparatus connected to a projection device and the imaging device, wherein the projection device projects a projected pattern to an observation target, and the imaging device captures the projected pattern, the computer executable commands further enabling the imaging device to performing the steps of: fetching, by the image processing apparatus, the input image captured by the imaging device, and performing line detection for the projected pattern projected by the projection device, wherein the projected pattern is a grid pattern formed of wave lines, the wave lines are wavy curves having predetermined periodicity, the grid pattern formed of the wave lines is formed of a plurality of wave lines that are arranged at predetermined intervals, the grid pattern is a set of wave lines that intersect each other in two directions, and the interval of the wave lines in one of the directions is not equal to an integral multiple of a wavelength for the wave line in the other direction; and performing, by the image processing apparatus, shape reconstruction by associating intersection points of vertical and horizontal lines extracted by the line detection with the projected pattern.
Description
BRIEF DESCRIPTION OF DRAWINGS
DESCRIPTION OF EMBODIMENTS
(52) The embodiments of the present invention will now be described in detail with reference to the drawings. In the embodiments of this invention, a spatial-encoding method using the continuity of a grid pattern is employed. This method is known to suffer from ambiguity in point correspondences and from erroneous reconstruction caused by incorrect determination of the continuity of the detected lines (see, for example, NPL 2 to 4). To resolve these problems, conventional methods have proposed the use of a grid pattern formed of a plurality of colors. However, since the conventional method is adversely affected by the reflectivity and the texture of the surface of a target object, stable measurement cannot be performed. In this embodiment, a single-colored grid pattern is employed, and the two problems for a grid pattern and a multi-colored pattern can be resolved at the same time.
First Embodiment
(53) An image processing system according to a first embodiment of the present invention is illustrated in
(54) The image processing apparatus 104 stores projected patterns, such as grid patterns formed of wave lines, in a storage medium in advance, and can transmit projected pattern data to the projector 102 to project the pattern to the observation target 103. Further, the image processing apparatus 104 fetches an input image captured by the camera 101, stores the input image in the storage medium, and performs the image processing for shape reconstruction based on the input image.
(55) A shape reconstruction algorithm for the first embodiment of the present invention is shown in
(56) For each node, the position of the epipolar line on the projected pattern is calculated to find a correspondence, and when an intersection point is present along the line, this point is defined as a correspondence candidate. Since multiple correspondence candidates are usually found, the optimal combination of the correspondence candidates is obtained for each point by using belief propagation (BP) (S208). Since the reconstruction result is still sparse, the depths of all the pixels are calculated by performing interpolation and pixel-wise matching between the pattern and the captured image (S210), and as a result, a dense 3D shape is reconstructed (S212).
(57) To obtain unique correspondences between the camera image (an image captured on the camera's image plane) and a projector image (a pattern projected from the projector's image plane) by spatial encoding, conventional methods have required a complicated pattern with a large window size. Moreover, while a broad baseline is desirable to improve accuracy, the observed pattern will be greatly distorted, which makes it practically difficult to decode the pattern. Therefore, a simple but highly unique pattern that can be easily detected and decoded is desirable. In this embodiment, a pattern that gives information related to the priority for matching is employed, instead of a pattern for which the correspondence is uniquely determined through the image processing. Specifically, a grid pattern formed of vertical and horizontal wave lines is employed.
(58) An example grid pattern consisting of wave lines is shown in
(59) The grid pattern of wave lines provides useful information for detecting correspondences. In this embodiment, the intersection points of vertical and horizontal wave lines are employed as feature points. The arrangement of intersection points is determined by the intervals and the wavelengths of the wave lines. The same interval and wavelength are employed for the wave lines; however, as will be described below, in a case wherein the interval of the vertical wave lines is not equal to an integral multiple of the wavelength of the horizontal wave lines (or in a case wherein the interval of the horizontal wave lines is not equal to an integral multiple of the wavelength of the vertical wave lines), the intersection points appear at different phases. This means that the local pattern is shifted relative to that of the neighboring intersection points, and this difference can be used as a discriminative feature.
(60) The local pattern around an intersection point is not unique within the whole projected pattern; the same pattern recurs every Nx and Ny wave lines along the horizontal and vertical axes, respectively, based on
Nx=lcm(Sx,Wx)/Sx
Ny=lcm(Sy,Wy)/Sy
where Sx and Sy in
(61) A static pattern projected by the projector 102 is shown in
(62) Sx=10, Sy=11, Wx=Wy=14, Ax=Ay=1.
(63) In this example, each cycle has 7 and 14 wave lines along horizontal and vertical axes, respectively. Consequently, 98 (=7×14) intersection points are present in a rectangle formed in one cycle.
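The repetition counts above can be checked with a few lines of Python. This is only an illustrative sketch: the helper names `lcm` and `pattern_period` are not from the specification, and the values are the example parameters Sx=10, Sy=11, Wx=Wy=14.

```python
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def pattern_period(interval: int, wavelength: int) -> int:
    """Number of wave lines after which the local pattern repeats: lcm(S, W) / S."""
    return lcm(interval, wavelength) // interval


# Example parameters quoted in the text (intervals S and wavelengths W).
Sx, Sy, Wx, Wy = 10, 11, 14, 14

Nx = pattern_period(Sx, Wx)   # lcm(10, 14) / 10 = 70 / 10
Ny = pattern_period(Sy, Wy)   # lcm(11, 14) / 11 = 154 / 11
points_per_cycle = Nx * Ny

print(Nx, Ny, points_per_cycle)  # 7 14 98
```

If an interval were an integral multiple of the wavelength (for example S=14, W=14), `pattern_period` would return 1, meaning every intersection point would look the same along that axis.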
(64) In stereo matching, the candidates of corresponding points are limited to the points on the epipolar line. In a case wherein an intersection point of a specific projector image is located within a certain distance from the epipolar line, the intersection point of the projector image is selected as a candidate. The number of candidates depends on the positions of intersection points in the camera image. Since the correspondence candidates are sparsely located in the projector image, the number of correspondence candidates is much smaller than that employed for pixel-based stereo for searching for candidate points.
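The candidate selection described above can be sketched as a distance test against the epipolar line. This is a minimal illustration, not the implementation of the embodiment: the line is assumed to be given in the form a*x + b*y + c = 0 in projector image coordinates, and the names `distance_to_line`, `correspondence_candidates` and `max_distance` are hypothetical.

```python
import math


def distance_to_line(point, line):
    """Perpendicular distance from (x, y) to the line a*x + b*y + c = 0."""
    x, y = point
    a, b, c = line
    return abs(a * x + b * y + c) / math.hypot(a, b)


def correspondence_candidates(epipolar_line, projector_points, max_distance):
    """Indices of projector intersection points lying close to the epipolar line."""
    return [
        index
        for index, point in enumerate(projector_points)
        if distance_to_line(point, epipolar_line) <= max_distance
    ]


# Assumed example: a horizontal epipolar line y = 5 and four pattern intersections.
line = (0.0, 1.0, -5.0)
points = [(0.0, 5.2), (3.0, 9.0), (7.0, 4.5), (10.0, 0.0)]
candidates = correspondence_candidates(line, points, max_distance=1.0)
print(candidates)  # [0, 2]
```

Because only grid intersections are tested, the candidate list stays short compared with a pixel-based search along the whole line.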
(65) To find the best combinations of correspondences, a method using regularization with local matching will be described while referring to
(66) First, a matching cost is calculated for all the correspondence candidates, and is employed as a data term for energy minimization. The cost is computed as an SSD (Sum of Squared Differences) between the camera image and the projector image (pattern image). However, since there is an error in the detected position of the grid point, and the pattern captured by the camera is distorted according to the surface of the target object, the simple SSD with respect to a quadrilateral area is unsuitable for the data term. Therefore, a patch obtained by approximating the area around the grid point of the target object to the tangent plane of the grid point is employed. With this patch, a more accurate matching cost can be calculated, and the corresponding points can be calculated with sub-pixel accuracy.
(67) A patch obtained by approximation to the tangent plane of a grid point is shown in
ax+by+cz+1=0.
(68) It should be noted that a, b and c are parameters of a plane. The parameters are calculated by minimizing the SSD, while taking the distortion of an image into account.
(69) The algorithm employed for calculation is as follows:
(70) (1) Project a quadrilateral patch R(p) 511 around a grid point p in a camera image 501 to the 3D tangent plane, and re-project this patch onto a projector image 502.
(71) (2) Calculate the SSD of the intensities between the re-projected quadrilateral patch 512 and the projector image 502.
(72) (3) Employ a, b and c as variables to minimize the SSD value.
(73) (4) Repeat the above steps several times.
(74) The initial values of a, b and c are set, so that the tangent plane includes the 3D position of the grid point computed using a parallax error, and is parallel to the camera's image plane, and the SSD value is represented by the following equation:
(75)
$$\mathrm{SSD}_{a,b,c}(p) = \sum_{p' \in R(p)} \left( I_c(p') - I_p\left(H_{a,b,c}(p')\right) \right)^2$$
In this case, R(p) is a quadrilateral patch around p, and H.sub.a,b,c(p′) is the transformation by which a point p′ of the patch is re-projected to the projector's image plane. I.sub.c(·) and I.sub.p(·) are the intensities of the camera image and the projector image, respectively.
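As a rough illustration of the SSD data term, the following sketch sums squared intensity differences between camera pixels in a patch and the projector pixels they are mapped to. It is a simplification under stated assumptions: the function `warp` is a hypothetical stand-in for the plane-induced transformation H.sub.a,b,c, nearest-neighbor sampling replaces any sub-pixel interpolation, and the optimization over a, b and c is not shown.

```python
def ssd_cost(camera_image, projector_image, patch_pixels, warp):
    """Sum of squared intensity differences over a patch.

    camera_image / projector_image: 2D lists indexed as image[y][x].
    patch_pixels: (x, y) camera pixels forming the patch R(p).
    warp: maps a camera pixel (x, y) to projector coordinates (u, v);
          a hypothetical stand-in for the re-projection H_{a,b,c}.
    """
    total = 0.0
    for x, y in patch_pixels:
        u, v = warp(x, y)
        ui, vi = int(round(u)), int(round(v))  # nearest-neighbor sampling
        diff = camera_image[y][x] - projector_image[vi][ui]
        total += diff * diff
    return total


# Toy data: a cross observed by the camera, and the same cross shifted
# one pixel to the right in the projector pattern.
camera = [[0, 1, 0],
          [1, 1, 1],
          [0, 1, 0]]
projector = [[0, 0, 1, 0],
             [0, 1, 1, 1],
             [0, 0, 1, 0]]
patch = [(x, y) for y in range(3) for x in range(3)]

cost_aligned = ssd_cost(camera, projector, patch, lambda x, y: (x + 1, y))
cost_misaligned = ssd_cost(camera, projector, patch, lambda x, y: (x, y))
print(cost_aligned, cost_misaligned)  # 0.0 5.0
```

In this toy case the correct warp drives the cost to zero, which is the behavior the plane parameters are tuned to achieve in steps (1) to (4).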
(76) In this case, the grid pattern consists of nodes p∈V, which are grid points, and edges (p, q)∈U that represent the connections of the grid points. It should be noted that p and q are grid points, V is a set of grid points, and U is a set of edges of a grid graph. A grid point p includes correspondence candidates t.sub.p∈T.sub.p. In this case, T.sub.p is a set of correspondence candidates for the grid point p. While a set of correspondences is employed as a parameter, the energy for stereo matching is defined as follows:
(77)
$$E(T) = \sum_{p \in V} D_p(t_p) + \sum_{(p,q) \in U} W_{pq}(t_p, t_q) \qquad (2)$$
It should be noted that T={t.sub.p|p∈V}, and D.sub.p(t.sub.p) is a data term in case of assigning the point corresponding to p to the candidate t.sub.p. W.sub.pq(t.sub.p, t.sub.q) is a regularization term used to assign candidates t.sub.p and t.sub.q to neighboring grid points.
(78) The data term is a value of the SSD calculated by the method described above. The regularization term is defined as follows:
(79)
(80) It should be noted that the constant appearing in the regularization term is user-defined. The energy is minimized by the BP method.
(81) An advantage of using energy minimization is that the regularization terms defined using the neighboring grid points can be soft constraints. This is important because, in actual data, there is always a chance that incorrect grid connections are generated due to erroneous line detection. According to NPL 3, wrong connections must be removed at the stage of line detection before 3D reconstruction is started, while in this embodiment, removal of wrong connections and 3D reconstruction are performed simultaneously, and therefore, reconstruction with higher density and higher accuracy is enabled.
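The structure of the energy, a data term per grid point plus a regularization term per edge, can be illustrated on a tiny graph. The sketch below is not the BP method of the embodiment: it swaps in exhaustive search, which is only feasible for a handful of nodes. The pairwise penalty is also an assumption made for illustration (zero when two candidates lie on the same pattern row, a constant `PENALTY` otherwise), and all names and numbers are hypothetical.

```python
from itertools import product


def total_energy(assignment, data_cost, edges, pair_cost):
    """Data terms summed over nodes plus regularization terms summed over edges."""
    energy = sum(data_cost[p][t] for p, t in assignment.items())
    energy += sum(pair_cost(assignment[p], assignment[q]) for p, q in edges)
    return energy


def minimize_energy(candidates, data_cost, edges, pair_cost):
    """Exhaustive search over all candidate combinations (stand-in for BP)."""
    nodes = list(candidates)
    best, best_energy = None, float("inf")
    for labels in product(*(candidates[p] for p in nodes)):
        assignment = dict(zip(nodes, labels))
        energy = total_energy(assignment, data_cost, edges, pair_cost)
        if energy < best_energy:
            best, best_energy = assignment, energy
    return best, best_energy


# Three grid points connected along one detected horizontal line.
# Candidates are (column, row) indices of pattern intersections.
candidates = {"p": [(0, 0), (5, 3)], "q": [(1, 0), (6, 2)], "r": [(2, 0), (7, 3)]}
data_cost = {
    "p": {(0, 0): 1.0, (5, 3): 0.8},
    "q": {(1, 0): 1.0, (6, 2): 1.2},
    "r": {(2, 0): 1.0, (7, 3): 0.9},
}
edges = [("p", "q"), ("q", "r")]
PENALTY = 1.0  # assumed user-defined constant


def pair_cost(t_p, t_q):
    """Assumed soft constraint: neighbors on a horizontal line share a pattern row."""
    return 0.0 if t_p[1] == t_q[1] else PENALTY


best, best_e = minimize_energy(candidates, data_cost, edges, pair_cost)
print(best, best_e)
```

Here the candidate (5, 3) has the lowest data cost for p on its own, yet the minimum-energy assignment places all three points on row 0, which shows how the soft constraint overrides a locally attractive but inconsistent match.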
(82) The correspondences for sparse grid points are obtained by the grid-based stereo matching method. At the next step, dense correspondences are acquired by using information for all the pixels. In this process, depth values of densely resampled pixel samples are calculated by interpolating the grid points. Then, the depth values of these pixel samples are employed as variables to minimize a difference of intensities between the camera image and the projector image.
(83) A method employed based on interpolation of the detected grid lines is described in NPL 8. In this embodiment, independent depth estimation for each (sub) pixel is achieved by optimization based on photo-consistency.
(84) When a viewing vector from the camera origin to a pixel x is represented as (u, v, 1), the depth d.sub.x for the pixel is computed as follows.
(85)
$$d_x = \frac{-1}{a_x u + b_x v + c_x}$$
It should be noted that a.sub.x, b.sub.x and c.sub.x are the plane parameters computed for the pixel. a.sub.x for each pixel is interpolated as follows:
(86)
$$a_x = \frac{\sum_{p} G(|p - x|)\, a_p}{\sum_{p} G(|p - x|)}$$
It should be noted that p is a grid point, a.sub.p is the plane parameter obtained at p, G(·) is a Gaussian function and |p−x| is the distance between p and x. b.sub.x and c.sub.x are calculated in the same manner by weighted averaging.
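The two steps above can be sketched together. Substituting the 3D point d·(u, v, 1) into the plane equation ax+by+cz+1=0 gives d = −1/(a·u + b·v + c), and the plane parameters at a pixel are taken as a Gaussian-weighted average of those at the grid points. The Gaussian width `sigma` and the function names are assumptions made for this illustration.

```python
import math


def gaussian(distance, sigma):
    """Unnormalized Gaussian weight for a given distance."""
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


def interpolate_plane(pixel, grid_points, sigma):
    """Gaussian-weighted average of plane parameters (a, b, c).

    grid_points: list of ((x, y), (a, b, c)) pairs, one per grid point.
    """
    weights = [gaussian(math.dist(pixel, position), sigma)
               for position, _ in grid_points]
    total = sum(weights)
    return tuple(
        sum(w * params[k] for w, (_, params) in zip(weights, grid_points)) / total
        for k in range(3)
    )


def depth_from_plane(u, v, plane):
    """Depth along the viewing vector (u, v, 1) for the plane a*x + b*y + c*z + 1 = 0."""
    a, b, c = plane
    return -1.0 / (a * u + b * v + c)


# Two grid points whose tangent planes are z = 2 and z = 4 (c = -1/z).
grid = [((0.0, 0.0), (0.0, 0.0, -0.5)),
        ((10.0, 0.0), (0.0, 0.0, -0.25))]

plane_mid = interpolate_plane((5.0, 0.0), grid, sigma=5.0)
depth_mid = depth_from_plane(0.0, 0.0, plane_mid)
print(plane_mid, depth_mid)
```

Note that the average is taken over the plane parameters, not over the depths, so the midpoint depth in this example is 1/0.375 rather than 3.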
(87) For optimization, it is possible to employ the depths of all the pixels as independent variables and to estimate them directly (pixel-based depth estimation). However, in this embodiment, a triangular mesh formed of three pixel samples is resampled to estimate the depths of the pixel samples (sub-pixel based depth estimation). As a result, a more appropriate resolution of the triangular mesh can be obtained. When the depth is simply estimated for all of the pixels, the accuracy might be reduced, because the resolution of the pattern to be projected is lower than the image resolution. To resolve this problem, a method using a matching window of a certain size, for example, can be employed; however, the calculation cost would be increased.
(88) In contrast, in this embodiment, the following method is employed to reduce the number of points and the number of variables without sacrificing the accuracy, and to perform efficient calculation. The sub-pixel based depth estimation will be described while referring to
(89)
It should be noted that w.sub.x2 and w.sub.x3 are the weights for linear interpolation. Now, D+ΔD is a vector obtained by collecting d.sub.x+Δd.sub.x for all the pixel samples. A reprojection error for the projector image (the pattern image) is calculated for all the pixels including the pixel samples by using the following expression:
(90)
It should be noted that the position of reprojection onto the projector image is represented by P.sub.D+ΔD(x). For reprojection of each pixel, part of D+ΔD is employed. x and x′ are adjacent vertices. The regularization weight is a user-defined parameter. The parameter ΔD is determined so as to minimize the error. When the reprojection and minimization are alternately and repeatedly performed until the solution converges, the depth D is determined.
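The linear interpolation of depths inside a triangle of pixel samples can be sketched with barycentric weights. This is an assumption-laden illustration: the text names only the two weights w.sub.x2 and w.sub.x3, so the sketch assumes the weight of the first vertex is 1 − w2 − w3, and the function names are hypothetical. The reprojection-error minimization itself is not shown.

```python
def barycentric_weights(p, v1, v2, v3):
    """Weights (w2, w3) such that p = v1 + w2*(v2 - v1) + w3*(v3 - v1)."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, v1, v2, v3
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    w2 = ((x - x1) * (y3 - y1) - (x3 - x1) * (y - y1)) / det
    w3 = ((x2 - x1) * (y - y1) - (x - x1) * (y2 - y1)) / det
    return w2, w3


def interpolate_depth(p, vertices, depths):
    """Depth at pixel p, linearly interpolated from the three pixel samples."""
    w2, w3 = barycentric_weights(p, *vertices)
    d1, d2, d3 = depths
    return (1.0 - w2 - w3) * d1 + w2 * d2 + w3 * d3


# Assumed triangle of pixel samples and their current depth estimates.
vertices = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
depths = [10.0, 14.0, 18.0]

depth_inside = interpolate_depth((1.0, 1.0), vertices, depths)
depth_at_vertex = interpolate_depth((4.0, 0.0), vertices, depths)
print(depth_inside, depth_at_vertex)  # 13.0 14.0
```

With this parameterization only the depths of the three pixel samples are free variables, and every other pixel in the triangle follows from them, which is how the number of variables is reduced.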
Second Embodiment
(91) An image processing system according to a second embodiment of the present invention is illustrated in
(92) The image processing apparatus 1105 stores projected patterns, such as grid patterns formed of wave lines, in a storage medium in advance, and can transmit projected pattern data to the projector 1103 to project the pattern to the observation target 1104. Further, the image processing apparatus 1105 fetches input images captured by the cameras 1101 and 1102, stores the input images in the storage medium, and performs the image processing for shape reconstruction based on the input images.
(93) According to the second embodiment, the constraint condition between the two cameras is employed as additional information to find correspondence candidates. A method for assigning corresponding points based on the energy minimization on the grid graph will now be described. The additional constraints are introduced as the edges that connect graphs of two cameras. Generation of edges between two grid graphs will be described while referring to
(94) A search for a corresponding point in a projected pattern 1201 for a node p.sub.0 of the camera 1101 will be described. The correspondence candidates t.sub.p0∈T.sub.p0 are the intersection points of a projected pattern 1204 on an epipolar line 1211 of a grid point p.sub.0, while T.sub.p0 is a set of the correspondence candidates for the grid point p.sub.0. When it is assumed that the correspondence candidate of the grid point p.sub.0 is t.sub.p0, the coordinates P.sub.3D(t.sub.p0) for the grid point p.sub.0 on a surface 1203 of the observation target 1104 are calculated by triangulation between the camera 1101 and the projector 1103. P.sub.1(t.sub.p0) is the point at which the coordinate point P.sub.3D(t.sub.p0) is projected onto a grid pattern 1202 of the camera 1102. When the grid point p.sub.1 of the camera 1102 satisfies the following expression, the grid point p.sub.0 and the grid point p.sub.1 are associated with each other (linear line L1).
D(p.sub.1, P.sub.1(t.sub.p0)) < θ and t.sub.p0∈T.sub.p1
Here, D(a, b) is a distance between points a and b, θ is the radius of the search area for a grid point near P.sub.1(t.sub.p0), and T.sub.p1 is a set of correspondence candidates t.sub.p1.
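The association test above combines a distance check with a membership check. The following sketch assumes that the triangulation and the projection into the second camera have already been done, so each entry carries a precomputed point P.sub.1(t.sub.p0); the data layout and the names `camera_camera_edges` and `radius` are hypothetical.

```python
import math


def camera_camera_edges(projected_points, grid_points_cam1, candidates_cam1, radius):
    """Create edges (p0, p1, t_p0) between the grid graphs of two cameras.

    projected_points: list of (p0, t_p0, (x, y)), where (x, y) is P_1(t_p0),
        the triangulated point re-projected into the second camera.
    grid_points_cam1: dict mapping grid point p1 to its image position (x, y).
    candidates_cam1: dict mapping p1 to its set of correspondence candidates T_p1.
    radius: search radius around the re-projected point.
    """
    edges = []
    for p0, t_p0, projected in projected_points:
        for p1, position in grid_points_cam1.items():
            close_enough = math.dist(position, projected) < radius
            shares_candidate = t_p0 in candidates_cam1[p1]
            if close_enough and shares_candidate:
                edges.append((p0, p1, t_p0))
    return edges


# Assumed example: one grid point of the first camera with two candidates.
projected = [("p0_a", "t3", (10.0, 10.0)), ("p0_a", "t7", (30.0, 12.0))]
grid_cam1 = {"p1_a": (10.5, 9.8), "p1_b": (29.0, 12.5)}
cands_cam1 = {"p1_a": {"t3", "t4"}, "p1_b": {"t1", "t2"}}

edges = camera_camera_edges(projected, grid_cam1, cands_cam1, radius=2.0)
print(edges)  # [('p0_a', 'p1_a', 't3')]
```

The second candidate lands near a grid point of the other camera but is rejected because that grid point does not list it as a candidate, which illustrates why both conditions are required.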
(95) Referring to
(96) Some incorrect edges might be generated by this method (linear line L2). A second projection point 1223 in
(97) Now, a single grid graph is obtained for two cameras by detecting lines and by reprojecting points from one camera to the other camera. Next, the best combination of correspondences is to be found by performing the energy minimization on the grid graph. The grid graph consists of grid points p.sub.0∈V.sub.0 and p.sub.1∈V.sub.1, edges (p.sub.0, q.sub.0)∈U.sub.0 and (p.sub.1, q.sub.1)∈U.sub.1 obtained by line detection, and edges (p.sub.0, p.sub.1)∈S obtained between the cameras. As for the camera 1101, p.sub.0 and q.sub.0 are grid points, V.sub.0 is a set of grid points and U.sub.0 is a set of edges. As for the camera 1102, p.sub.1 and q.sub.1 are grid points, V.sub.1 is a set of grid points and U.sub.1 is a set of edges. S is a set of edges between the cameras. A grid point p.sub.0 includes the correspondence candidates t.sub.p0∈T.sub.p0 of the projector pattern.
(98) For the one-camera one-projector system in the first embodiment, the energy used to assign corresponding points t.sub.p0 to the individual grid points p.sub.0 is defined by expression (2). When this definition is extended for use in the two-camera one-projector system in this embodiment, the following expression is established:
(99)
It should be noted that X.sub.p0, p1(t.sub.p0, t.sub.p1) is a regularization term for the edges (p.sub.0, p.sub.1) between cameras. This term is represented as:
(100)
It should be noted that the constant appearing in this term is user-defined. When a grid point p has camera-camera edges, one of the camera-camera edges is selected for the assignment of t.sub.p for the grid point. This is because the energy will be increased if the assignment of an edge other than the edge between the cameras is selected.
(101) In the first embodiment, a dense range image has been created by interpolating the grid graph in the camera image. The two-camera one-projector system in this embodiment provides two sets of grid graphs. When the graphs are created on the camera image, there is a case wherein the graphs are partially occluded from the other camera, and it is not possible to integrate the grid graphs and to perform dense reconstruction. Therefore, reprojection is performed for the graphs obtained by the two cameras to merge pixel information in the coordinate system of the projector.
(102) A case wherein a grid point t.sub.p of the projector pattern is associated with grid points p.sub.0 and p.sub.1 of the two cameras is shown in
(103)
(104) Here, d(t.sub.p, p) is the depth of the coordinate system calculated based on t.sub.p and p. Further, D(r, t.sub.pk) is a distance between two points r and t.sub.pk, and the distance threshold is a user-defined parameter that determines the neighborhood of a grid point. Since every coordinate point p.sub.3D is visible from the projector, the depth information can be merged. An example method for calculating d(t.sub.p, p) is linear interpolation (e.g., bilinear interpolation) according to the distance from the set of the grid point t.sub.p and its neighboring grid points to p. Furthermore, a weighted average may be employed in expression (9). An angle formed by the camera and the projector, for example, can be employed for weighting.
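Merging in the projector coordinate system can be sketched as an average over nearby grid points. The merging expression itself is not reproduced in this text, so the sketch below assumes a plain (unweighted) average of the depths contributed by either camera for grid points within a user-defined distance of the projector pixel r; the names `merged_depth` and `neighborhood` are hypothetical.

```python
import math


def merged_depth(r, observations, neighborhood):
    """Average depth at projector pixel r from nearby grid-point observations.

    observations: list of ((x, y), depth), where (x, y) is the position of a
        projector grid point t_p and depth is d(t_p, p) from one of the cameras.
    neighborhood: assumed user-defined distance threshold around r.
    Returns None when no observation falls inside the neighborhood.
    """
    nearby = [depth for position, depth in observations
              if math.dist(r, position) < neighborhood]
    if not nearby:
        return None
    return sum(nearby) / len(nearby)


# Assumed example: two cameras observe the surface near projector pixel (5, 5).
observations = [
    ((5.0, 5.5), 100.0),   # from the first camera
    ((4.5, 5.0), 102.0),   # from the second camera
    ((20.0, 20.0), 150.0), # far away, ignored
]
depth_r = merged_depth((5.0, 5.0), observations, neighborhood=2.0)
print(depth_r)  # 101.0
```

A weighted variant, for example using the camera-projector angle mentioned above, would replace the plain mean with a weighted mean over the same nearby observations.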
Third Embodiment
(105) An image processing system according to a third embodiment of the present invention is illustrated in
(106) The image processing apparatus 2401 stores projected patterns, such as grid patterns formed of wave lines, in a storage medium in advance, and can transmit projected pattern data to the projectors 2201 to 2206 to project the patterns to the observation target 2301. Further, the image processing apparatus 2401 fetches input images captured by the cameras 2101 to 2106, stores the input images in the storage medium, and performs the image processing for shape reconstruction based on the input images.
(107) In the third embodiment, since multiple patterns are included in images obtained by the cameras, it is required that a pattern should be examined to identify a projector that projected the pattern. Thus, colors are employed for identification of the projectors. In this case, patterns of the three primary colors of light, red, green and blue, are projected to an observation target respectively by the two projectors.
(108) An image obtained by projecting grid patterns of wave lines of the three primary colors is shown in
(h,s,v)=RGB2HSV(r,g,b)
(r,g,b)=HSV2RGB(h,1,v) (11)
It should be noted that RGB2HSV and HSV2RGB represent conversion in the color space, and colors are represented in the range of [0, 1]. By conversion of the colors into saturated colors, the effect of the green pattern can be reduced, as shown in
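The saturation step of expression (11) maps directly onto Python's standard `colorsys` module, whose `rgb_to_hsv` and `hsv_to_rgb` functions also work on values in [0, 1]. The following is a minimal per-pixel sketch; the function name `saturate` and the sample color are illustrative only.

```python
import colorsys


def saturate(r, g, b):
    """Convert an RGB color to its fully saturated counterpart.

    Keeps hue and value, and forces saturation to 1, as in expression (11).
    All channels are floats in the range [0, 1].
    """
    h, _s, v = colorsys.rgb_to_hsv(r, g, b)
    return colorsys.hsv_to_rgb(h, 1.0, v)


# A reddish pixel contaminated by the green and blue patterns.
r2, g2, b2 = saturate(0.8, 0.3, 0.2)
print(r2, g2, b2)
```

For this pixel the weakest channel (blue) drops to zero while the dominant red channel keeps its value, which is the suppression effect described above. One caveat of this simple sketch: an achromatic pixel (r = g = b) has hue 0 in `colorsys`, so it would be mapped to pure red.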
(109) A method for finding corresponding points for the red pattern and the blue pattern can be performed in the same manner as for the two-camera one-projector case in the second embodiment. Since more projectors are employed in this embodiment, camera images are employed to detect points of correspondence between projectors.
(110) A camera image where a plurality of grid patterns are overlapped is shown in
D(p.sub.ik, p.sub.il) < φ (12)
At this time, D(a, b) is a distance between points a and b, and φ is the radius of a search area around p.sub.ik.
(111) As shown in
(112)
Z.sub.pikpil(t.sub.pik, t.sub.pil) = ν|d.sub.i(P.sub.3D(t.sub.pik)) − d.sub.i(P.sub.3D(t.sub.pil))| (13)
(113) It should be noted that d.sub.i(P.sub.3D) is the depth of the coordinate point P.sub.3D of the camera i, and ν is a user-defined weight. The total energy with multiple cameras and projectors is defined by the following equation:
(114)
It should be noted that A.sub.p(i) is a set of projectors that share the field of view with the camera i, A.sub.c(k) is a set of cameras that share the field of view with the projector k. S.sub.ijk is a set of edges between the cameras i and j given by the pattern of the projector k. Q.sub.ikl is a set of edges between the projectors k and l in the image of the camera i.
(115) To increase the density of an image, a method described while referring to
(116) Next, optimization for the image in the entire range is performed by minimizing the energy. In the second embodiment, the energy consists of the data term and regularization term. The data term is calculated based on the difference of intensities between the camera and the projector, and the regularization term is defined by using the curvature around each vertex of the grid graph. When images in two ranges are superimposed with each other, the shapes are matched, and the depths of the images are optimized by employing the additional constraint.
(117) The state wherein the images in two ranges of two projectors are superimposed with each other is shown in
(118) When the depth at a point r is d.sub.r, and a small change of d.sub.r is Δd.sub.r, iterative minimization is performed by employing Δd.sub.r to update the depth. The energy is defined by using Δd.sub.r as follows:
(119)
It should be noted that ΔD is a set of Δd.sub.r, and E.sub.I is a data term, while E.sub.S is a regularization term. E.sub.P represents the constraint between images in two ranges. G(r.sub.k) is a function to find the corresponding point r.sub.ln of a point r.sub.k. P.sub.3D(Δd.sub.r) represents the coordinate point after it has been moved by a distance Δd.sub.r along the line of sight. d.sub.r for each pixel is iteratively updated by adding the Δd.sub.r that minimizes the error E(ΔD) through non-linear minimization.
(120) According to the third embodiment, a case wherein, for example, six cameras and six projectors are alternately arranged on a circumference has been considered. Since one camera is located on each side of a single projector, six combinations are available as a set of two cameras and one projector, described in the second embodiment. When the colors of patterns projected by the individual projectors are selected as, for example, RGBRGB to avoid the same colors adjacent to each other, two different patterns are projected to one camera by the two projectors located on the respective sides. Therefore, the combination of two colors, RG, GB or BR, is identified by the above described method.
(121) As a conclusion of the above embodiments, correspondence is searched for by additionally employing the camera-projector information in the first embodiment, the camera-camera information in the second embodiment, or the projector-projector information in the third embodiment.
Fourth Embodiment
(122) In the first to the third embodiments, the matching cost has been obtained as the SSD between a camera image and a projector image (pattern image). Since a simple SSD with respect to a quadrilateral area is not appropriate as a data term, a patch obtained by approximating the area around the grid point of a target object to the tangent plane of the grid point has been employed. In a fourth embodiment of this invention, results obtained by line detection are to be compared, instead of comparison of the images.
(123) Another example for the intersection comparison method will be described while referring to
(124) Further, when the camera image and the projector image are directly compared with each other for the calculation of the SSD, the camera image might be adversely affected by a texture on the object. That is, the intensity of the image is changed by the texture, and the difference between the compared images is increased. In contrast, in the case of line detection, the positions of the detected lines are compared, instead of the images themselves, and therefore the result is not affected by the change of the intensity of the image. Thus, the effect of the reflectivity of the object can be reduced.
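The difference between the two kinds of comparison can be illustrated on a 1D intensity profile. This is a toy sketch, not the line detection used in the embodiments: lines are assumed to be simple local maxima, and the texture is modeled as a hypothetical per-pixel reflectance factor.

```python
def ssd(a, b):
    """Sum of squared differences between two intensity profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def line_positions(profile):
    """Indices of strict local maxima, used here as a stand-in for line detection."""
    return [i for i in range(1, len(profile) - 1)
            if profile[i] > profile[i - 1] and profile[i] > profile[i + 1]]


def position_cost(pos_a, pos_b):
    """Squared differences between corresponding detected line positions."""
    return sum((p - q) ** 2 for p, q in zip(pos_a, pos_b))


# Projected pattern and a textured observation of it (reflectance is made up).
projector = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
reflectance = [0.9, 0.9, 0.9, 0.5, 0.5, 0.5, 0.3, 0.3, 0.3]
camera = [p * r for p, r in zip(projector, reflectance)]

intensity_cost = ssd(projector, camera)  # grows with the texture contrast
line_cost = position_cost(line_positions(projector), line_positions(camera))  # unchanged
```

In this example the texture raises the SSD although the geometry is identical, while the detected line positions, and therefore the position-based cost, stay the same.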
Fifth Embodiment
(125) As described while referring to
(126) As shown in
(127) The degrees of similarity are compared for two arbitrary intersection points on the same epipolar line, and a parameter is selected to obtain the smallest degree of similarity. The average of the evaluation values of all of the intersection points is employed as the total evaluation value; however, the average evaluation value obtained by taking only arbitrary intersection points into account, or the smallest or largest value of the evaluation values for all of the intersection points, may also be employed as the total evaluation value. The parameters for which the smallest evaluation values are obtained are determined to be the optimal parameters.
(128) For determining the optimal parameter, only the projector image is employed to compare the intersection points on the epipolar line of the projector image. Assuming that the camera and the projector have been calibrated, when the parameter of the grid pattern is changed, the epipolar line is unchanged, while the intersection points on the same epipolar line are changed. Thus, the parameter for which the evaluation value obtained by calculation using the intersection points on the same epipolar line is the smallest should be selected.
(129) The intervals of the wave lines, the wavelengths of the wave lines, or the amplitudes of the wave lines are changed as the parameters of the grid pattern, or the pattern is rotated, and in every case, the energy is calculated to determine, as an optimal parameter, the parameter for which the total evaluation value is the smallest. It should be noted that the thicknesses or the colors (wavelengths) of the wave lines may also be included in the parameter.
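The parameter search described above can be sketched as follows. The sketch rests on several assumptions that are not in the text: each intersection point is described only by the phase of the crossing wave line at that point, a single row of intersections stands in for the points on one epipolar line, the similarity is a hypothetical phase-based measure in the range 0 to 1, and only the interval and the wavelength are varied.

```python
from itertools import combinations


def phases(interval, wavelength, n_points):
    """Normalized wave phase at n_points intersections spaced by `interval`."""
    return [(k * interval % wavelength) / wavelength for k in range(n_points)]


def similarity(p, q):
    """Hypothetical similarity: 1.0 for equal phases, 0.0 for opposite phases."""
    diff = abs(p - q)
    circular = min(diff, 1.0 - diff)  # circular phase distance in [0, 0.5]
    return 1.0 - 2.0 * circular


def total_evaluation(interval, wavelength, n_points=6):
    """Average similarity over all pairs of points on the same (stand-in) line."""
    ph = phases(interval, wavelength, n_points)
    sims = [similarity(p, q) for p, q in combinations(ph, 2)]
    return sum(sims) / len(sims)


def select_parameters(candidates, n_points=6):
    """Pick the (interval, wavelength) pair with the smallest total evaluation."""
    return min(candidates, key=lambda c: total_evaluation(c[0], c[1], n_points))
```

For example, among the candidates (20, 10), (20, 16) and (20, 14), the pair (20, 10) gives the worst possible evaluation because the interval is an integral multiple of the wavelength and every intersection then looks the same; the search therefore selects one of the other candidates.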
Example 1
(130) The simulation result in the first embodiment is shown in
(131) An input image obtained by a method, described in NPL 8, that employs two colors is shown in
(132) Correspondence errors for
(133) The root-mean-square error (RMSE) for each pixel is shown in a table below:
(134) TABLE 1

Evaluation Method    Input Image    RMSE 1    RMSE 2
First Embodiment     FIG. 16B       0.3957    0.2964
First Embodiment     FIG. 17B       0.6245    0.4210
Method in NPL 8      FIG. 18A       0.6286    0.2356
(135) The RMSE values are RMSE1, obtained by calculation for all of the corresponding points that have been reconstructed, and RMSE2 obtained by calculation for the corresponding points, other than outliers that are beyond one pixel. It is apparent from this table that, in case of no texture, better RMSE1 is obtained for all of the pixels by the method in the first embodiment than by the method in NPL 8, while better RMSE2 for which the outliers are removed is obtained by the method in NPL 8 than by the method in the first embodiment.
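The two error measures can be written down as a brief sketch. The function names are hypothetical, the errors are assumed to be signed correspondence errors in pixels, and the one-pixel outlier threshold follows the description above.

```python
import math


def rmse(errors):
    """Root-mean-square of a non-empty list of errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def rmse_pair(errors, outlier_threshold=1.0):
    """Return (RMSE1, RMSE2).

    RMSE1 uses all reconstructed corresponding points; RMSE2 excludes
    outliers whose absolute error exceeds the threshold (one pixel here).
    """
    inliers = [e for e in errors if abs(e) <= outlier_threshold]
    rmse1 = rmse(errors)
    rmse2 = rmse(inliers) if inliers else float("nan")
    return rmse1, rmse2
```

A single large outlier dominates RMSE1 but is ignored by RMSE2, which is why a method with occasional gross decoding failures can score well on RMSE2 and poorly on RMSE1.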
(136) The probable reason for this is as follows. Since according to the method in NPL 8, the corresponding points are calculated based on the local ID (phase) of the line pattern that appears locally, the accuracy is high so long as the local ID information is correctly obtained. However, when decoding of the local ID is not successful, a large error occurs. This error is observed as salt-and-pepper noise in
(137) Polygon meshes reconstructed in the first embodiment are shown in
Example 2
(138) The results obtained through the experiment based on real data will be described. A camera of 1600×1200 pixels and a projector of 1024×768 pixels were employed. The image sequences were captured at 30 FPS, and a PC equipped with an Intel Core i7 2.93 GHz and an NVIDIA GeForce 580GTX was used. The above described algorithms were implemented in CUDA (Compute Unified Device Architecture). Line detection was implemented as a single thread on a CPU. First, in order to demonstrate the effectiveness of a grid pattern of wave lines, the grid pattern of wave lines was compared with a linear line pattern.
(139) The result of reconstruction based on the grid pattern of wave lines is shown in
(140) The result of 3D reconstruction for this embodiment is shown in
(141) A dense shape generated by the above described method is shown in
(142)
(143) The result for capturing the opening and closing movement of a hand is shown in
(144) The result for capturing the human movement that repels a punch is shown in
(145) The 3D reconstruction (one-shot reconstruction) method for a single image based on the projection of a single-colored and static pattern has been described. The correspondence information is implicitly represented by employing the difference of the patterns at the individual intersection points on a grid pattern of wave lines. When the regularity of the pattern is distorted, the specificity of the pattern is increased, and a stable solution is obtained. Further, a description has also been given of the method whereby shape reconstruction by the stereo matching method is extended for use in the projector-camera system by taking the continuity of the grid into account. At the final stage of reconstruction, the reconstruction by the grid is interpolated to estimate the depth for each pixel. It has been demonstrated that, compared with the conventional method, more stable results are obtained, and effective measurement of a moving object is performed.