Multi-source image correspondence method and system based on heterogeneous model fitting
12131517 · 2024-10-29
Assignee
Inventors
CPC classification (PHYSICS): G06V10/751; G06V10/50; G06V10/449
International classification (PHYSICS): G06V10/75; G06V10/44
Abstract
A multi-source image correspondence method and system based on heterogeneous model fitting are provided. The method includes the following steps: constructing a multi-orientation phase consistency model; fusing phase consistency, image amplitude, and orientation to detect feature points; constructing logarithmic polar coordinate descriptors with variable-size bins using sub-region grids and orientation histograms; effectively estimating model parameters through heterogeneous model fitting; accumulating matching pairs from different heterogeneous models that meet a preset joint position offset transformation error; and outputting a final matching pair to complete multi-source image correspondence. The present disclosure alleviates the influence of nonlinear radiation distortion by constructing the multi-orientation phase consistency model, constructs logarithmic polar coordinate descriptors with variable-size bins from sub-region grids and orientation histograms, and removes abnormal matching relationships in multi-source images with the heterogeneous model fitting method, thereby improving the accuracy and robustness of feature detection and improving multi-source image correspondence performance.
Claims
1. A multi-source image correspondence method based on heterogeneous model fitting, comprising the following steps: obtaining a two-dimensional image, constructing a two-dimensional log-Gabor filter for the two-dimensional image, and constructing a multi-orientation phase consistency model, wherein A.sub.(s,o)(x,y) represents the amplitude component, P.sub.(s,o)(x,y) represents the phase component, W.sub.o(x,y) is a weight coefficient, ⌊.Math.⌋ is a truncation function, ΔΦ.sub.(s,o)(x,y) is a phase deviation with respect to the scale s and orientation o, T and ε are constants, M.sub.max and M.sub.min represent a maximum moment and a minimum moment corresponding to the scale s, PC(x,y) represents the phase consistency model, W.sub.sc represents the multi-orientation phase consistency model, I(x,y) represents the two-dimensional image, and G.sub.(s,o).sup.even(x,y) and G.sub.(s,o).sup.odd(x,y) represent the even-symmetric and odd-symmetric components of the two-dimensional log-Gabor filter G, respectively; extracting image feature information from the multi-orientation phase consistency model by using a Shi-Tomasi operator, and filtering out feature points with response values below a set threshold; constructing a variable-size bin strategy based on the image feature information, wherein the variable-size bin strategy divides a circular neighborhood of feature distribution into a plurality of sub-regions according to different angular quantization rules, and different circular neighborhoods use gradient orientation histograms with different dimensions as a local descriptor; calculating the orientation histogram of each sub-region as a descriptor, defining a quantified orientation histogram for each feature point as a descriptor, and normalizing a descriptor vector; taking the two-dimensional image as a reference image, obtaining a to-be-matched target image, constructing an optimal geometric transformation model, and minimizing feature information between the reference image and the target image; obtaining coordinates of two feature points from the reference image and the target image respectively, and constructing an initial matching pair; generating multiple model hypotheses for every two images based on the heterogeneous model, wherein the model hypotheses are generated by randomly sampling multiple minimum subsets from the feature points; calculating a transformation error of any two feature points in the reference image and the target image with respect to the model hypotheses with a Sampson distance, forming an ascending permutation, and selecting the least k-th-order statistic of a squared transformation error as a minimum cost, wherein k represents an acceptable size of a structure;
extracting more matching pairs by combining horizontal displacement, vertical displacement, and cosine similarity of descriptor vectors as a constraint criterion, calculating offsets of matching pairs in the horizontal and vertical orientations as a position transformation error to constrain feature descriptors, and constructing a joint position offset transformation error; accumulating matching pairs from different heterogeneous models that meet a preset joint position offset transformation error; retaining only one matching pair when two matching pairs have the same feature points; and outputting a final matching pair to complete the multi-source image correspondence after an accumulation operation.
2. The multi-source image correspondence method based on heterogeneous model fitting as claimed in claim 1, wherein the calculating the orientation histogram of each sub-region as a descriptor and defining a quantified orientation histogram for each feature point as a descriptor are represented as:
H.sub.d={R(1,1).Math.H(1,1), . . . , R(h,l).Math.H(h,q), . . . , R(n,k).Math.H(n,m)}, h∈{1, . . . ,n}, q∈{1, . . . ,m}, l∈{1, . . . ,k} wherein,
R(h,l) represents a sub-region of the h.sup.th radial quantization and l.sup.th angular quantization;
H(h,q) represents the q.sup.th quantized orientation histogram in the h.sup.th radial quantization; n is the number of radial quantizations; m is the number of histogram quantizations; k is the number of angular quantizations; and the dimension of each descriptor is described as d=Σ.sub.h=1.sup.n m.sub.h.Math.k.sub.h.
3. The multi-source image correspondence method based on heterogeneous model fitting as claimed in claim 1, wherein the calculating a transformation error of any two feature points in the reference image and target image with respect to the model hypotheses with Sampson distance and forming an ascending permutation are represented as:
r.sub.i.sup.(v)=[r.sub.i,1.sup.(v),r.sub.i,2.sup.(v), . . . ,r.sub.i,M.sup.(v)] which satisfies r.sub.i,1.sup.(v)≤r.sub.i,2.sup.(v)≤ . . . ≤r.sub.i,M.sup.(v).
4. The multi-source image correspondence method based on heterogeneous model fitting as claimed in claim 1, wherein the selecting the least k-th-order statistic of a squared transformation error as a minimum cost is represented as: C(v)={tilde over (r)}.sub.k.sup.2(v), wherein {tilde over (r)}.sub.j.sup.2(v) denotes the j.sup.th sorted squared transformation error.
5. The multi-source image correspondence method based on heterogeneous model fitting as claimed in claim 1, wherein a constructed joint position offset transformation error is represented as:
J.sup.(v)(s.sub.i,s.sub.j)=(1+E.sup.(v)(s.sub.i,s.sub.j)).Math.D(s.sub.i,s.sub.j) wherein, s.sub.i and s.sub.j represent feature points in two heterogeneous images, respectively; subscripts i and j represent indexes of feature points in the image;
D(s.sub.i,s.sub.j) denotes the inverse cosine similarity of the descriptors corresponding to s.sub.i and s.sub.j; and
E.sup.(v)(s.sub.i,s.sub.j) denotes the position offset transformation error of s.sub.i and s.sub.j in the horizontal and vertical orientations.
6. The multi-source image correspondence method based on heterogeneous model fitting as claimed in claim 1, wherein the accumulating matching pairs from different heterogeneous models with a small joint position offset transformation error is represented as: S*=F(∪.sub.v∈V{tilde over (S)}.sup.(v)), wherein F(.Math.) represents removing duplicates by traversing all candidate matching pairs,
{tilde over (S)}.sup.(v) represents a candidate matching pair set defined by the joint position offset transformation error, and ∪.sub.v∈V represents an accumulation operation performed on each heterogeneous model.
Description
BRIEF DESCRIPTION OF DRAWINGS
DESCRIPTION OF EMBODIMENTS
(6) In order to make the purpose, technical solution, and advantages of the present disclosure clearer and more understandable, the following provides further detailed explanations of the present disclosure in combination with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present disclosure and are not intended to limit the present disclosure.
Embodiment 1
(7) As shown in
(8) In this embodiment, the multi-orientation phase consistency model is constructed for multi-orientation feature detection, which includes the following specific steps:
(9) S21: Given a two-dimensional image I(x,y), a two-dimensional log-Gabor filter G for the image is constructed over multiple scales s and orientations o.
(10) S22: Transforming the two-dimensional log-Gabor filter from the frequency domain to the spatial domain based on an inverse Fourier transform to obtain the even-symmetric component G.sub.(s,o).sup.even(x,y) and the odd-symmetric component G.sub.(s,o).sup.odd(x,y).
(11) S23: Calculating the amplitude component A.sub.(s,o)(x,y) and the phase component P.sub.(s,o)(x,y) of the two-dimensional image with respect to the scale s and orientation o.
(14) S24: Constructing a phase consistency (PC) model based on the amplitude component A.sub.(s,o)(x,y) and phase component P.sub.(s,o)(x,y):
(15) PC(x,y)=Σ.sub.sΣ.sub.o W.sub.o(x,y).Math.⌊A.sub.(s,o)(x,y).Math.ΔΦ.sub.(s,o)(x,y)−T⌋/(Σ.sub.sΣ.sub.o A.sub.(s,o)(x,y)+ε)
where, W.sub.o(x,y) is a weight coefficient; ⌊.Math.⌋ is a truncation function, which is used to truncate a real number and can effectively alleviate the influence of nonlinear radiative distortions (NRDs); A.sub.(s,o)(x,y) is the amplitude component; ΔΦ.sub.(s,o)(x,y) denotes a phase deviation with respect to the scale s and orientation o; and T and ε are constants, ε avoiding the denominator being 0.
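The PC model above can be illustrated numerically. The following is a minimal NumPy sketch, not the patented implementation: it assumes precomputed amplitude components, phase deviations, and orientation weights, and it reads the truncation function as keeping only the non-negative part of the weighted energy term, a common convention in phase-consistency implementations.

```python
import numpy as np

def phase_consistency(A, dPhi, W, T=0.1, eps=1e-8):
    """Sketch of the PC model.
    A    : amplitude components A_(s,o)(x,y), shape (S, O, H, W)
    dPhi : phase deviations dPhi_(s,o)(x,y), same shape
    W    : per-orientation weight coefficients W_o(x,y), shape (O, H, W)
    T, eps : the constants T and epsilon (eps avoids a zero denominator)
    """
    # truncation read as the positive part of (A * dPhi - T), assuming W >= 0
    energy = W[None] * np.maximum(A * dPhi - T, 0.0)
    return energy.sum(axis=(0, 1)) / (A.sum(axis=(0, 1)) + eps)
```

Normalizing the accumulated energy by the total amplitude is what makes the response contrast-invariant, which is the property exploited against nonlinear radiometric distortion.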
(16) S25: Calculating phase consistency weighted moments for multiple orientations so as to enhance robustness:
(17) M.sub.max=0.5(c+a+√(b.sup.2+(a−c).sup.2)), M.sub.min=0.5(c+a−√(b.sup.2+(a−c).sup.2))
where, M.sub.max and M.sub.min represent a maximum moment and a minimum moment corresponding to the scale s; a=Σ.sub.o(PC.sub.x).sup.2, b=2Σ.sub.o PC.sub.x.Math.PC.sub.y, c=Σ.sub.o(PC.sub.y).sup.2, with PC.sub.x=PC(x,y).Math.cos(θ.sub.(s,o)) and PC.sub.y=PC(x,y).Math.sin(θ.sub.(s,o)); and θ.sub.(s,o) represents the angle of orientation o at scale s.
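The weighted moment analysis of S25 can be sketched as follows, using the classical moment formulation for oriented phase-consistency maps; the intermediate names `a`, `b`, `c` follow that convention and are assumptions, not taken verbatim from the source.

```python
import numpy as np

def pc_moments(pc_maps, thetas):
    """Maximum and minimum moment maps from per-orientation PC maps.
    pc_maps : PC_o(x,y) per orientation, shape (O, H, W)
    thetas  : orientation angles theta_o in radians, shape (O,)
    """
    c_ = np.cos(thetas)[:, None, None]
    s_ = np.sin(thetas)[:, None, None]
    a = ((pc_maps * c_) ** 2).sum(axis=0)
    b = 2.0 * ((pc_maps * c_) * (pc_maps * s_)).sum(axis=0)
    c = ((pc_maps * s_) ** 2).sum(axis=0)
    root = np.sqrt(b ** 2 + (a - c) ** 2)
    m_max = 0.5 * (c + a + root)   # edge/corner strength
    m_min = 0.5 * (c + a - root)   # corner strength
    return m_max, m_min
```

A large M.sub.min indicates energy in more than one orientation, i.e. a corner-like structure, which is why both moments feed the multi-orientation model.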
(18) S26: Constructing a multi-orientation phase consistency model based on the phase consistency model PC(x,y) and the multi-orientation weighted moments to alleviate the influence of nonlinear radiative distortions (NRDs):
W.sub.sc=0.5(M.sub.max+M.sub.min+ . . . )
where, M.sub.max and M.sub.min are the maximum and minimum moments calculated in S25.
(19) S27: Extracting feature points from the multi-orientation phase consistency model W.sub.sc by using the Shi-Tomasi operator, and filtering out feature points with response values below a set threshold. Specifically, the Shi-Tomasi operator first calculates the structure tensor of each pixel in the multi-orientation phase consistency model W.sub.sc to obtain the minimum eigenvalue at each pixel. Then, based on the set threshold, pixels with larger minimum eigenvalues are selected as corner points; this feature information represents prominent structures in the image, such as corners and edges. The extracted feature information participates in the subsequent descriptor construction.
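As a rough illustration of S27, the minimum-eigenvalue (Shi-Tomasi) corner response can be computed per pixel from a 2×2 structure tensor. This is only a sketch under simplifying assumptions: gradients are taken with plain finite differences, and the window smoothing of the tensor that a production detector would apply is omitted.

```python
import numpy as np

def shi_tomasi_response(img):
    """Minimum eigenvalue of the per-pixel 2x2 structure tensor
    [[Ixx, Ixy], [Ixy, Iyy]] -- the Shi-Tomasi corner response."""
    gy, gx = np.gradient(img.astype(float))  # simple finite differences
    ixx, iyy, ixy = gx * gx, gy * gy, gx * gy
    # closed-form smaller eigenvalue of a symmetric 2x2 matrix
    trace = ixx + iyy
    root = np.sqrt((ixx - iyy) ** 2 + 4.0 * ixy ** 2)
    return 0.5 * (trace - root)

def keep_strong(points, response, thresh):
    """Filter out feature points whose response is below the threshold."""
    return [p for p in points if response[p] >= thresh]
```

In practice the response would be followed by non-maximum suppression before thresholding; that step is left out here for brevity.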
(20) S3: Constructing a logarithmic polar coordinate descriptor with variable-size bins using sub-region grids and orientation histograms; this descriptor is robust to geometric distortion. The variable-size bin design improves the ability of the descriptor to distinguish local geometric distortions and helps establish a high-quality initial correspondence. It includes the following steps:
(21) S31: As shown in
(22) S32: Calculating an orientation histogram of each sub-region as a descriptor and defining a quantified orientation histogram for each feature point as a descriptor, represented as:
H.sub.d={R(1,1).Math.H(1,1), . . . , R(h,l).Math.H(h,q), . . . , R(n,k).Math.H(n,m)}, h∈{1, . . . ,n}, q∈{1, . . . ,m}, l∈{1, . . . ,k}(Formula 7)
where, R(h,l) represents a sub-region of the h.sup.th radial quantization and l.sup.th angular quantization; H(h,q) represents the q.sup.th quantized orientation histogram in the h.sup.th radial quantization; n denotes the number of radial quantizations; m is the number of histogram quantizations; k is the number of angular quantizations; thus, the dimension of each descriptor can be described as d=Σ.sub.h=1.sup.n m.sub.h.Math.k.sub.h.
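The bookkeeping behind Formula 7 can be sketched as follows. This is a minimal illustration, not the patented implementation: `descriptor_dimension` evaluates d = Σ.sub.h m.sub.h·k.sub.h for hypothetical per-ring bin counts, and `ring_histograms` builds one quantized orientation histogram H(h,q) for a sub-region; the per-ring settings are assumptions.

```python
import numpy as np

def descriptor_dimension(m_bins, k_sectors):
    """d = sum_h m_h * k_h: ring h is split into k_h angular sectors,
    each described by an m_h-bin orientation histogram."""
    return sum(m * k for m, k in zip(m_bins, k_sectors))

def ring_histograms(angles, weights, m):
    """Quantize gradient orientations of one sub-region into an m-bin
    histogram over [0, 2*pi) -- one H(h, q) entry of Formula 7."""
    bins = (np.asarray(angles) % (2 * np.pi)) / (2 * np.pi) * m
    hist = np.zeros(m)
    np.add.at(hist, bins.astype(int) % m, weights)  # weighted vote per bin
    return hist
```

For example, three rings with 8/8/4 histogram bins and 4/8/12 angular sectors would give a 144-dimensional descriptor; a coarser inner ring and finer outer rings is one plausible variable-size layout.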
(23) S33: Normalizing a descriptor vector to reduce the influence of illumination variations. A normalized descriptor vector can be used in a subsequent matching process to calculate and evaluate a similarity between image pairs and obtain a determined initial matching pair.
(24) S4: Effectively estimating the parameters of the model by fitting the heterogeneous model and fusing the advantages of various basic transformation models, so as to alleviate the influence of outliers. The specific steps include: S41: Given a two-dimensional image (i.e., a reference image) I(x,y) and another image that needs to be matched (a target image) I′(x,y), the goal of this step is to find an optimal geometric transformation model {circumflex over (f)}(x,y) so as to minimize the feature information (such as distance) between the reference image I(x,y) and the transformed target image I′(f(x,y)):
{circumflex over (f)}(x,y)=argmin[D(I(x,y),I′(f(x,y)))](Formula 8)
where, f(x,y), I′(f(x,y)), and D are the geometric transformation model, the transformed target image, and a distance metric, respectively; {circumflex over (f)}(x,y) represents the optimal geometric transformation model, i.e., the geometric transformation model that minimizes the feature information distance. For example, feature point pairs between two images can be used to estimate an affine transformation matrix (also known as an affine transformation model). If enough feature points can be found to support the affine transformation model (i.e., minimizing the distance between the feature points and the model), then the current geometric transformation is considered the optimal geometric transformation model. In addition, due to factors such as deformation in images, accurate matching between the two images requires compliance with the constraints of geometric transformation models.
(25) S42: Given a set of initial matching pairs S={(s.sub.i,s′.sub.i)}.sub.i=1.sup.N, where N is the number of matching pairs, and s.sub.i=(x.sub.i,y.sub.i) and s′.sub.i=(x′.sub.i,y′.sub.i) represent the coordinates of two feature points from the reference image and the target image, respectively.
(26) S43: Generating a set of model hypotheses θ.sup.(v)={θ.sub.i.sup.(v)}.sub.i=1:M for every two images and for each type of model v∈V, where V represents the heterogeneous model (i.e., a collection of different types of models, including a similarity transformation model, an affine transformation model, and a perspective transformation model). These model hypotheses are generated by randomly sampling minimum subsets of size p from the feature points.
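Hypothesis generation by minimal-subset sampling can be sketched as below. The minimum subset sizes per model type are the usual values from the fitting literature (2 pairs for similarity, 3 for affine, 4 for perspective); the source does not state them explicitly, so treat them as assumptions, and the actual model fitting per subset is elided.

```python
import numpy as np

# Hypothetical minimum subset sizes p for each basic model type.
MIN_SUBSET = {"similarity": 2, "affine": 3, "perspective": 4}

def sample_hypotheses(n_matches, model_type, M, rng=None):
    """Generate M hypotheses for one model type by randomly sampling
    minimum subsets of match indices without replacement."""
    rng = np.random.default_rng(rng)
    p = MIN_SUBSET[model_type]
    return [rng.choice(n_matches, size=p, replace=False) for _ in range(M)]
```

Each sampled index subset would then be passed to the corresponding model estimator to obtain one hypothesis θ.sub.i.sup.(v).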
(27) S44: Calculating a transformation error r.sub.i,j.sup.(v) of any two feature points in the reference image and the target image with respect to each model hypothesis using the Sampson distance, and forming an ascending permutation:
r.sub.i.sup.(v)=[r.sub.i,1.sup.(v),r.sub.i,2.sup.(v), . . . ,r.sub.i,M.sup.(v)](Formula 9)
which satisfies r.sub.i,1.sup.(v)≤r.sub.i,2.sup.(v)≤ . . . ≤r.sub.i,M.sup.(v).
(28) S45: Introducing a modified cost function to select the least k-th-order statistic of the squared transformation error as the minimum cost:
(29) C(v)={tilde over (r)}.sub.k.sup.2(v)(Formula 10)
where, {tilde over (r)}.sub.j.sup.2(v) denotes the j.sup.th sorted squared transformation error, and k represents an acceptable size of a structure, which is greater than the size of the minimum subset (k>p).
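Selecting the least k-th-order statistic of the squared transformation errors does not require a full sort; `np.partition` suffices. This is a sketch of the selection step only, with hypothetical names, not the patent's complete cost function.

```python
import numpy as np

def kth_order_cost(residuals, k):
    """Cost of one model hypothesis: the k-th smallest squared
    transformation error (k counted from 1). np.partition places the
    (k-1)-th order statistic at index k-1 without sorting everything."""
    sq = np.asarray(residuals, float) ** 2
    return np.partition(sq, k - 1)[k - 1]
```

Using an order statistic rather than a sum makes the cost insensitive to the residuals of the (possibly numerous) outliers beyond rank k.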
(30) S46: The significant transformation model is effectively quantified as the minimum cost of the k-th-order statistic with the above cost function, and then horizontal displacement, vertical displacement, and the cosine similarity of descriptor vectors are combined as a constraint criterion to extract more correct matching pairs.
(31) S47: Evaluating the quality of the model hypotheses generated by random sampling through minimizing the k-th-order statistic of the squared transformation error, thereby obtaining a significant transformation model. Based on the significant transformation model, a small number of reliable feature matching pairs can be obtained; however, significant models typically contain only a small number of reliable feature matching pairs. Therefore, it is necessary to calculate the offsets of these matching pairs in the horizontal and vertical orientations as position transformation errors to constrain the feature descriptors. Specifically, the offsets in the horizontal and vertical orientations can be obtained by calculating the Euclidean distance between the matching pairs; these offsets are then used as position transformation errors to constrain the feature descriptors. Finally, a joint position offset transformation error J.sup.(v)(s.sub.i,s.sub.j) is defined as:
J.sup.(v)(s.sub.i,s.sub.j)=(1+E.sup.(v)(s.sub.i,s.sub.j)).Math.D(s.sub.i,s.sub.j)(Formula 11)
where, s.sub.i and s.sub.j represent feature points in the two heterogeneous images, respectively; the subscripts i and j represent the indexes of the feature points in the image; D(s.sub.i,s.sub.j) denotes the inverse cosine similarity of the descriptors corresponding to s.sub.i and s.sub.j; and E.sup.(v)(s.sub.i,s.sub.j) denotes the position offset transformation error of s.sub.i and s.sub.j in the horizontal and vertical orientations.
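Formula 11 can be evaluated directly once E and D are available. In this sketch the inverse cosine similarity D is taken as 1 minus the cosine similarity of the two descriptor vectors; that reading is an assumption, since the source does not define D precisely.

```python
import numpy as np

def joint_error(pos_err, desc_a, desc_b):
    """Joint position offset transformation error J = (1 + E) * D,
    with E the position offset error and D assumed to be
    1 - cosine_similarity of the two descriptor vectors."""
    a, b = np.asarray(desc_a, float), np.asarray(desc_b, float)
    cos_sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return (1.0 + pos_err) * (1.0 - cos_sim)
```

Multiplying rather than adding the two terms means a pair is only accepted when both its descriptors agree and its position offset under the model is small.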
(32) S48: Accumulating matching pairs with smaller joint position offset transformation errors from the different models (i.e., the similarity model, affine model, perspective model, etc.):
(33) S*=F(∪.sub.v∈V{tilde over (S)}.sup.(v))(Formula 12)
(34) where, F(.Math.) represents removing duplicates by traversing all candidate matching pairs; if two matching pairs have the same feature points, only one matching pair is retained; {tilde over (S)}.sup.(v) represents the candidate matching pair set defined by the joint position offset transformation error; and ∪.sub.v∈V represents the accumulation operation performed on each heterogeneous model.
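The accumulation with duplicate removal can be sketched as a simple traversal. Here a "duplicate" is read as an exactly repeated (reference index, target index) pair, which is an assumption about what the source means by "the same feature points".

```python
def accumulate_matches(per_model_matches):
    """Accumulate candidate matching pairs over all heterogeneous models,
    keeping only the first occurrence of any repeated pair (the
    duplicate-removal traversal F of the final step)."""
    seen, final = set(), []
    for matches in per_model_matches:   # one candidate list per model type
        for pair in matches:            # pair = (ref_idx, tgt_idx)
            if pair not in seen:
                seen.add(pair)
                final.append(pair)
    return final
```

Insertion order is preserved, so pairs contributed by earlier (e.g. simpler) models take precedence over later duplicates.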
(35) S5: Outputting a final matching pair S*.
(36) As shown in
Embodiment 2
(37) This embodiment adopts the same technical solution as Embodiment 1, except for the following.
(38) This embodiment provides a multi-source image correspondence system based on heterogeneous model fitting, including: an image acquisition module, a log-Gabor filter construction module, a frequency domain conversion module, a multi-orientation phase consistency model construction module, an image feature information extraction module, a variable-size bin strategy construction module, a descriptor construction module, a normalization module, a target image acquisition module, an optimal geometric transformation model construction module, an initial matching pair construction module, a model hypothesis generation module, a minimum cost calculation module, a joint position offset transformation error construction module, and a matching output module.
(39) In this embodiment, the image acquisition module is configured to obtain a two-dimensional image.
(40) In this embodiment, the log-Gabor filter construction module is configured to construct a two-dimensional log-Gabor filter for the two-dimensional image.
(41) In this embodiment, the frequency domain conversion module is configured to transform the two-dimensional log-Gabor filter from a frequency domain to a spatial domain based on a Fourier inverse transform.
(42) In this embodiment, the multi-orientation phase consistency model construction module is configured to construct a multi-orientation phase consistency model, calculate an amplitude component and a phase component with respect to the scale and orientation, construct a phase consistency model based on the amplitude component and phase component, calculate phase consistency weighted moments in multiple orientations, and construct a multi-orientation phase consistency model based on phase consistency model and phase consistency weighted moments in multiple orientations.
(43) In this embodiment, the image feature information extraction module is configured to extract image feature information from the multi-orientation phase consistency model by using the Shi-Tomasi operator, to filter out a feature point with a response value below a set threshold.
(44) In this embodiment, the variable-size bin strategy construction module is configured to construct a variable-size bin strategy based on image feature information, the variable-size bin strategy divides a circular neighborhood of feature distribution into a plurality of sub-regions with different numbers according to different angle quantization rules, and different circular neighborhoods use gradient orientation histograms with different dimensions as a local descriptor.
(45) In this embodiment, the descriptor construction module is configured to calculate the orientation histogram of each sub-region as a descriptor, and define a quantified orientation histogram for each feature point as a descriptor.
(46) In this embodiment, the normalization module is configured to normalize a descriptor vector.
(47) In this embodiment, the target image acquisition module is configured to take the two-dimensional image as a reference image to obtain a to-be-matched target image.
(48) In this embodiment, the optimal geometric transformation model construction module is configured to construct an optimal geometric transformation model, and minimize feature information between the reference image and the target image.
(49) In this embodiment, the initial matching pair construction module is configured to obtain the coordinates of two feature points from the reference image and the target image, respectively, and construct an initial matching pair.
(50) In this embodiment, the model hypothesis generation module is configured to generate multiple model hypotheses for every two images based on the heterogeneous model, the model hypotheses being generated by randomly sampling multiple minimum subsets from the feature points.
(51) In this embodiment, the minimum cost calculation module is configured to calculate a transformation error of any two feature points in the reference image and the target image with respect to the model hypotheses using the Sampson distance, form an ascending permutation, and select the least k-th-order statistic of the squared transformation error as a minimum cost, where k represents an acceptable size of a structure.
(52) In this embodiment, the joint position offset transformation error construction module is configured to combine horizontal displacement, vertical displacement, and the cosine similarity of descriptor vectors as a constraint criterion to extract more matching pairs; calculate offsets of matching pairs in the horizontal and vertical orientations as position transformation errors to constrain feature descriptors; and construct a joint position offset transformation error.
(53) In this embodiment, the matching output module is configured to accumulate matching pairs from different heterogeneous models that meet a preset joint position offset transformation error; retain only one matching pair when two matching pairs have the same feature points; and output a final matching pair to complete multi-source image correspondence after an accumulation operation.
(54) The above embodiments are preferred embodiments of the present disclosure, but the embodiments of the present disclosure are not limited by the above embodiments. Any other changes, modifications, substitutions, combinations, or simplifications that do not deviate from the spirit and principles of the present disclosure should be included in the protection scope of the present disclosure.