A Robust Subpixel Motion Estimation Algorithm Using HOS in the Parametric Domain

EURASIP Journal on Image and Video Processing, Volume 2009 (2009), Article ID 381673, 10 pages. doi:10.1155/2009/381673

Research Article

E. M. Ismaili Aalaoui,1,2 E. Ibn-Elhaj,2 and E. H. Bouyakhf1
1 Faculté des Sciences de Rabat, Université Mohammed V Agdal, 4 avenue Ibn Battouta, B.P. 1014 RP, 10000 Rabat, Morocco
2 Department of Telecommunication, Institut National des Postes et Télécommunication, avenue Allal Al Fassi, Madinat El Irfane, 10000 Rabat, Morocco

Received 28 April 2008; Accepted 24 October 2008
Academic Editor: Simon Lucey

Copyright © 2009 E. M. Ismaili Aalaoui et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Motion estimation techniques are widely used in today's video processing systems. The most frequently used techniques are the optical flow method and the phase correlation method. The vast majority of these algorithms assume noise-free data. Thus, when the image sequences are severely corrupted by additive Gaussian (or non-Gaussian) noise of unknown covariance, the classical techniques fail because they also estimate the spatial correlation of the noise. In this paper, we study this topic from a different viewpoint in order to explore the fundamental limits of image motion estimation. Our scheme is based on a subpixel motion estimation algorithm that uses the bispectrum in the parametric domain. The motion vector of a moving object is estimated by solving linear equations involving the third-order hologram and a matrix of Dirac delta functions. Simulation results are presented and compared to the optical flow and phase correlation algorithms; the proposed approach provides more reliable displacement estimates, particularly for complex noisy image sequences. In our simulations, we used a database freely available on the web.

1. Introduction

The importance of image sequence processing is constantly growing with the ever increasing use of television and video systems in consumer, commercial, medical, and scientific applications. Image sequences can be acquired by film-based motion picture cameras or electronic video cameras. In either case, several factors related to imaging sensor limitations contribute to the graininess (noise) of the resulting images. Electronic sensor noise and film grain are among these factors [1]. In many cases, graininess may result in visually disturbing degradation of the image quality, or it may mask important image information.

Even if the noise is not perceived in full-speed video because of the temporal masking effect of the eye, it often leads to unacceptable single-frame hardcopies and to poor-quality freeze-frames that adversely affect the performance of subsequent image analysis [2]. The motion estimation process must therefore be able to track objects within a noisy source. In a noisy source, objects appear to change from frame to frame because of the noise, not necessarily as the result of object motion [3]. Tracking objects in a noisy environment is difficult; in particular, if the image frames are severely corrupted by additive Gaussian noise of unknown covariance, second-order statistics methods do not work well.

Higher-order statistics (HOS) in general, and the bispectrum (order 3) in particular, have recently been widely used as an important tool in signal processing. The classical methods based on the power spectrum are being superseded by bispectral methods because of some definite disadvantages of the former: the inability to identify systems fed by non-Gaussian noise (NGN) inputs, to identify nonminimum phase (NMP) systems, and to characterize system nonlinearity [4]. In these cases, autocorrelation-based methods offer no answer. Among these problems, the identifiability of NMP systems has received the most attention from researchers.

HOS-based methods have been proposed to estimate motion between image frames [5–9]. One of these approaches bases the motion estimation on a bispectrum method for subpixel resolution of noisy image sequences. In [7], the displacement vector is obtained by maximizing a third-order statistics criterion. In [8], the global motion parameters are obtained by a new region-recursive algorithm. In [6], several algorithms are developed based on a parametric cumulant method, a cumulant-matching method, and a mean kurtosis error criterion; the latter is an extension of the quadratic pixel-recursive method of Netravali and Robbins [10]. In [11], it is shown that such statistical parameters are insensitive to additive Gaussian noise. In particular, bispectrum parameters are insensitive to any symmetrically distributed noise and also exhibit the capability of better characterizing NGN and of identifying NMP linear systems as well as nonlinear systems. Therefore, transformation to a higher-order domain reduces the effect of noise significantly.

In this correspondence, a novel algorithm for the detection of motion vectors in video sequences is proposed. The algorithm uses bispectrum model-based subpixel motion estimation in the parametric domain for noisy image sequences to obtain a measure of content similarity for temporally adjacent frames, and it responds very well to scene motion. The algorithm is insensitive to the presence of symmetrically distributed noise.

The outline of this paper is as follows. The problem formulation is introduced in Section 2. In Section 3, we first briefly present the definitions and properties of the bispectrum and cross-bispectrum, and then describe motion estimation in the parametric domain. High-accuracy subpixel motion estimation is discussed in Section 4. Section 5 presents an evaluation of the computational complexity of our algorithm. The results of the experimental evaluation of the proposed method are presented in Section 6 and compared to existing methods, while Section 7 concludes the paper.

2. Problem Formulation

The problem of motion estimation can be stated as follows: "Given an image sequence, compute a representation of the motion field that best aligns pixels in one frame of the sequence with those in the next" [9]. This is formulated as

\[ g_{k-1}(x,y) = f_{k-1}(x,y) + n_{k-1}(x,y), \tag{1} \]
\[ g_k(x,y) = f_k(x,y) + n_k(x,y) = f_{k-1}(x-d_x,\, y-d_y) + n_k(x,y), \tag{2} \]

where $(x,y)$ denotes the spatial image position of a point; $g_k(x,y)$ and $g_{k-1}(x,y)$ are the observed image intensities at instants $k$ and $k-1$, respectively; $f_k(x,y)$ and $f_{k-1}(x,y)$ are the noise-free frames; $n_k(x,y)$ and $n_{k-1}(x,y)$ are assumed to be spatially and temporally stationary, zero-mean Gaussian (or non-Gaussian) noise fields with unknown covariance; and $(d_x, d_y)$ is the displacement vector of the object during the time interval $[k-1, k]$. The goal is to estimate $(d_x, d_y)$ from $g_k(x,y)$ and $g_{k-1}(x,y)$.

3. Bispectrum-Based Image Motion Estimation

3.1. Definitions and Properties

In this subsection, some HOS functions are defined and their properties described, in order to provide the necessary tools to understand the motion estimation methodology. The third-order autocumulant/moment sequence $C^{3}_{g_k g_k g_k}(r_1, r_2; s_1, s_2)$ of $g_k(x,y)$ is defined as follows [5]:

\[ C^{3}_{g_k g_k g_k}(r_1, r_2; s_1, s_2) = E\{\, g_k(x,y)\, g_k(x+r_1,\, y+r_2)\, g_k(x+s_1,\, y+s_2) \,\}, \tag{3} \]

where $E\{\cdot\}$ denotes the expectation operator, and $(r_1, r_2)$ and $(s_1, s_2)$ are the lags of two shifted versions of $g_k(x,y)$. To understand the triple correlation physically for 2D data [4], the reader is referred to Figure 1. The figure shows the space occupied by the original data (continuous box) and two shifted versions of the same data (dashed boxes), shifted by $(r_1, r_2)$ and $(s_1, s_2)$, respectively. The product over the overlapping data positions (the shaded portion) yields the triple correlation function defined by (3).

Figure 1: Physical representation of cumulants [4].

The third-order autocumulant/moment sequence of the observed frame decomposes as follows [12]:

\[ C^{3}_{g_k g_k g_k}(r_1, r_2; s_1, s_2) = C^{3}_{f_k f_k f_k}(r_1, r_2; s_1, s_2) + C^{3}_{n_k n_k n_k}(r_1, r_2; s_1, s_2). \tag{4} \]

Equation (4) states that the triple correlation of $f_k(x,y)$ plus $n_k(x,y)$ equals the triple correlation of $f_k(x,y)$ plus the triple correlation of the noise. If $n_k(x,y)$ is zero-mean Gaussian noise, its triple correlation is identically zero [7, 12]. This provides a theoretical basis for using the triple correlation (or the bispectrum) to reduce the effects of additive noise: the term $C^{3}_{n_k n_k n_k}(r_1, r_2; s_1, s_2)$ is negligible, which makes the triple correlation very effective for detecting a signal embedded in noise. Therefore,

\[ C^{3}_{g_k g_k g_k}(r_1, r_2; s_1, s_2) = C^{3}_{f_k f_k f_k}(r_1, r_2; s_1, s_2). \tag{5} \]

The noise $n_k(x,y)$ may also be non-Gaussian, provided it is independent and identically distributed (i.i.d.) and nonskewed (e.g., symmetrically distributed).
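
As a concrete illustration of (3), the following sketch (our own code with hypothetical names, not the authors' implementation) estimates the third-order moment of an image patch for one pair of lags by averaging the triple product over the region where the three shifted copies overlap, as depicted in Figure 1.

```python
import numpy as np

def third_order_moment(g, r, s):
    """Estimate C3(r1, r2; s1, s2) of a 2D array g by averaging
    g(x, y) * g(x + r1, y + r2) * g(x + s1, y + s2) over the region where
    all three shifted copies overlap (cf. Figure 1 and (3)).
    Assumes the lags are small relative to the patch size."""
    g = g - g.mean()                       # work with a zero-mean patch
    (r1, r2), (s1, s2) = r, s
    H, W = g.shape
    # valid ranges of x and y so that all three samples stay inside the image
    x0, x1 = max(0, -r1, -s1), min(H, H - r1, H - s1)
    y0, y1 = max(0, -r2, -s2), min(W, W - r2, W - s2)
    a = g[x0:x1, y0:y1]
    b = g[x0 + r1:x1 + r1, y0 + r2:y1 + r2]
    c = g[x0 + s1:x1 + s1, y0 + s2:y1 + s2]
    return float(np.mean(a * b * c))
```

For a zero-mean Gaussian noise field, the same estimate tends to zero as the patch grows, which is the cancellation property exploited in (4) and (5).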

The bispectrum $B^{3}_{g_k g_k g_k}(u_1, u_2; v_1, v_2)$ is defined as the 4D Fourier transform of the third-order autocumulant (or moment) [4]:

\[ B^{3}_{g_k g_k g_k}(u_1, u_2; v_1, v_2) = \mathcal{F}_4\{ C^{3}_{g_k g_k g_k}(r_1, r_2; s_1, s_2) \} = \mathcal{F}_4\{ C^{3}_{f_k f_k f_k}(r_1, r_2; s_1, s_2) \} = B^{3}_{f_k f_k f_k}(u_1, u_2; v_1, v_2), \tag{6} \]

where $\mathcal{F}_4$ denotes the 4D Fourier transform, and $(u_1, u_2)$ and $(v_1, v_2)$ are the frequency coordinates of the 2D Fourier transform. Let $G_{g_k}(u_1, u_2)$ be the discrete Fourier transform (DFT) of the frame $g_k(x,y)$. Each component of the bispectrum is estimated by a triple product of Fourier coefficients as follows [12]:

\[ B^{3}_{g_k g_k g_k}(u_1, u_2; v_1, v_2) = G_{g_k}(u_1, u_2)\, G_{g_k}(v_1, v_2)\, G^{*}_{g_k}(u_1+v_1,\, u_2+v_2), \tag{7} \]

where $*$ denotes the complex conjugate. The cross-bispectrum is obtained in a similar manner. Thus,

\[ B^{3}_{g_{k-1} g_k g_{k-1}}(u_1, u_2; v_1, v_2) = B^{3}_{g_{k-1} g_{k-1} g_{k-1}}(u_1, u_2; v_1, v_2)\, e^{-j 2\pi (u_1 d_x + u_2 d_y)}, \]
\[ B^{3}_{g_{k-1} g_k g_{k-1}}(u_1, u_2; v_1, v_2) = G_{g_{k-1}}(u_1, u_2)\, G_{g_k}(v_1, v_2)\, G^{*}_{g_{k-1}}(u_1+v_1,\, u_2+v_2). \tag{8} \]

On close observation, and after some algebraic manipulation, (3) shows that the third-order cumulant sequence possesses the following symmetry properties [4]:

\[ C^{3}_{g_k g_k g_k}(r_1, r_2; s_1, s_2) = C^{3}_{g_k g_k g_k}(s_1, s_2; r_1, r_2) = C^{3}_{g_k g_k g_k}(-s_1, -s_2; r_1-s_1, r_2-s_2) = C^{3}_{g_k g_k g_k}(s_1-r_1, s_2-r_2; -r_1, -r_2) = C^{3}_{g_k g_k g_k}(r_1-s_1, r_2-s_2; -s_1, -s_2) = C^{3}_{g_k g_k g_k}(-r_1, -r_2; s_1-r_1, s_2-r_2). \tag{9} \]

This directly leads to the following symmetry properties of the bispectrum [4]:

\[ B^{3}_{g_k g_k g_k}(u_1, u_2; v_1, v_2) = B^{3}_{g_k g_k g_k}(v_1, v_2; u_1, u_2) = B^{3*}_{g_k g_k g_k}(-v_1, -v_2; -u_1, -u_2) = B^{3*}_{g_k g_k g_k}(-u_1, -u_2; -v_1, -v_2) = B^{3}_{g_k g_k g_k}(-u_1-v_1, -u_2-v_2; v_1, v_2) = B^{3}_{g_k g_k g_k}(u_1, u_2; -u_1-v_1, -u_2-v_2) = B^{3}_{g_k g_k g_k}(-u_1-v_1, -u_2-v_2; u_1, u_2) = B^{3}_{g_k g_k g_k}(v_1, v_2; -u_1-v_1, -u_2-v_2). \tag{10} \]

These symmetry properties reduce the computational burden of calculating the bispectrum.

3.2. Parametric Model-Based Motion Estimation

Signal processing using the bispectrum in the parametric domain has recently received considerable attention. Two primary schools of thought exist. The first, headed by Raghuveer and Nikias [13], parametrizes the bispectrum through the solution of a cumulant matrix equation. The other, headed by Giannakis [14], calculates the system impulse response coefficients directly using a linear combination of cumulant slices. Other papers have subsequently been published giving extensions and applications of these basic approaches (see [15] and the references therein). Here, we propose bispectrum model-based subpixel motion estimation in the parametric domain. Simulations demonstrate that this method requires large blocks of data and is thus appropriate for estimating object displacement in background noise. Consequently, our approach is derived in this context, and hence $(d_x, d_y) = \mathrm{const}$ in (2).
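
Before deriving the parametric model, the DFT-based estimators (7) and (8) can be made concrete. The sketch below (our own illustrative code; the function and variable names are not from the paper) evaluates single bispectrum and cross-bispectrum components from the 2D DFTs of two consecutive frames.

```python
import numpy as np

def bispectrum_component(Gk, u, v):
    """B3(u1, u2; v1, v2) = G(u1, u2) G(v1, v2) G*(u1+v1, u2+v2), see (7).
    Gk is the 2D DFT of a frame; u and v are frequency index pairs."""
    (u1, u2), (v1, v2) = u, v
    N1, N2 = Gk.shape
    return Gk[u1, u2] * Gk[v1, v2] * np.conj(Gk[(u1 + v1) % N1, (u2 + v2) % N2])

def cross_bispectrum_component(Gk1, Gk, u, v):
    """Cross-bispectrum component of (8): G_{k-1}(u) G_k(v) G*_{k-1}(u+v)."""
    (u1, u2), (v1, v2) = u, v
    N1, N2 = Gk1.shape
    return Gk1[u1, u2] * Gk[v1, v2] * np.conj(Gk1[(u1 + v1) % N1, (u2 + v2) % N2])

# usage on two consecutive frames g_prev and g_cur (2D arrays):
# Gk1, Gk = np.fft.fft2(g_prev), np.fft.fft2(g_cur)
# b = cross_bispectrum_component(Gk1, Gk, (3, 5), (7, 2))
```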

Substituting $f_{k-1}$ from (1) into (2), we obtain

\[ g_k(x,y) = \sum_{i \in R} \sum_{j \in R} \alpha(i,j)\, g_{k-1}(x-i,\, y-j) + n_k(x,y) + n_{k-1}(x-d_x,\, y-d_y), \tag{11} \]

where, in theory, $\alpha(i,j) = 0$ for all $(i,j)$ except $(i,j) = (d_x, d_y)$, and $\alpha(d_x, d_y) = 1$. If the search region $R$ contains the largest possible horizontal and vertical delays, and if we write $g_k(x+r_1,\, y+r_2)$ in place of $g_k(x,y)$, we obtain

\[ g_k(x+r_1,\, y+r_2) = \sum_{i \in R} \sum_{j \in R} \alpha(i,j)\, g_{k-1}(x-i+r_1,\, y-j+r_2) + n_k(x+r_1,\, y+r_2) + n_{k-1}(x+r_1-d_x,\, y+r_2-d_y). \tag{12} \]

Multiplying both sides of (12) by $g_{k-1}(x,y)\, g_{k-1}(x+s_1,\, y+s_2)$ and taking expectations gives

\[ E\{ g_{k-1}(x,y)\, g_k(x+r_1, y+r_2)\, g_{k-1}(x+s_1, y+s_2) \} = \sum_{i \in R} \sum_{j \in R} \alpha(i,j)\, E\{ g_{k-1}(x,y)\, g_{k-1}(x-i+r_1,\, y-j+r_2)\, g_{k-1}(x+s_1,\, y+s_2) \} + E\{ g_{k-1}(x,y)\, n_k(x+r_1,\, y+r_2)\, g_{k-1}(x+s_1,\, y+s_2) \} + E\{ g_{k-1}(x,y)\, g_{k-1}(x+s_1,\, y+s_2)\, n_{k-1}(x+r_1-d_x,\, y+r_2-d_y) \}. \tag{13} \]

Therefore,

\[ C^{3}_{g_{k-1} g_k g_{k-1}}(r_1, r_2; s_1, s_2) = \sum_{i \in R} \sum_{j \in R} \alpha(i,j)\, C^{3}_{g_{k-1} g_{k-1} g_{k-1}}(r_1-i,\, r_2-j;\, s_1, s_2) + C^{3}_{g_{k-1} n_k g_{k-1}}(r_1, r_2; s_1, s_2) + C^{3}_{g_{k-1} n_{k-1} g_{k-1}}(r_1-d_x,\, r_2-d_y;\, s_1, s_2). \tag{14} \]

However, $C^{3}_{g_{k-1} n_k g_{k-1}}(r_1, r_2; s_1, s_2)$ and $C^{3}_{g_{k-1} n_{k-1} g_{k-1}}(r_1-d_x, r_2-d_y; s_1, s_2)$ are identically zero for all $(r_1, r_2; s_1, s_2)$ because the signal and the noise are zero-mean and independent. Consequently, (14) becomes

\[ C^{3}_{g_{k-1} g_k g_{k-1}}(r_1, r_2; s_1, s_2) = \sum_{i \in R} \sum_{j \in R} \alpha(i,j)\, C^{3}_{g_{k-1} g_{k-1} g_{k-1}}(r_1-i,\, r_2-j;\, s_1, s_2). \tag{15} \]

Taking the 4D Fourier transform of (15) and rearranging, the following pulse transfer function for the system is obtained:

\[ B^{3}_{g_{k-1} g_k g_{k-1}}(u_1, u_2; v_1, v_2) = B^{3}_{g_{k-1} g_{k-1} g_{k-1}}(u_1, u_2; v_1, v_2) \sum_{i \in R} \sum_{j \in R} \alpha(i,j)\, e^{-j 2\pi (u_1 i + u_2 j)}. \tag{16} \]

The third-order hologram $h(r_1, r_2)$ is then defined by

\[ h(r_1, r_2) = \mathcal{F}_4^{-1}\!\left\{ \frac{B^{3}_{g_{k-1} g_k g_{k-1}}(u_1, u_2; v_1, v_2)}{B^{3}_{g_{k-1} g_{k-1} g_{k-1}}(u_1, u_2; v_1, v_2)} \right\} = \sum_{i \in R} \sum_{j \in R} \alpha(i,j)\, \delta(r_1-i,\, r_2-j), \tag{17} \]

where $\mathcal{F}_4^{-1}$ denotes the 4D inverse Fourier transform. Selecting various integer values for $r_1$ and $r_2$, we form an overdetermined system of equations. For example, if the chosen search region $R$ is a rectangle extending from $-P_x$ to $P_x$ in the horizontal direction and from $-P_y$ to $P_y$ in the vertical direction, and if $r_1$ ranges from $-P_x$ to $P_x$ and $r_2$ from $-P_y$ to $P_y$, the following set of linear equations is produced:

\[ \hat{\mathbf{h}} = \hat{\boldsymbol{\delta}}\, \boldsymbol{\alpha}, \tag{18} \]

where

\[ \hat{\mathbf{h}} = \bigl[ \hat{h}(-P_x, -P_y),\; \hat{h}(-P_x+1, -P_y+1),\; \ldots,\; \hat{h}(P_x, P_y) \bigr]^T, \]
\[ \boldsymbol{\alpha} = \bigl[ \alpha(-P_x, -P_y),\; \alpha(-P_x+1, -P_y+1),\; \ldots,\; \alpha(P_x, P_y) \bigr]^T, \]
\[ \hat{\boldsymbol{\delta}} = \begin{bmatrix} \delta(0,0) & \cdots & \delta(-2P_x, -2P_y) \\ \delta(1,1) & \cdots & \delta(-2P_x+1, -2P_y+1) \\ \vdots & & \vdots \\ \delta(2P_x, 2P_y) & \cdots & \delta(0,0) \end{bmatrix}. \tag{19} \]

The least-squares solution of (18) is given by

\[ \hat{\boldsymbol{\alpha}}_{ls} = \bigl( \hat{\boldsymbol{\delta}}^T \hat{\boldsymbol{\delta}} \bigr)^{-1} \hat{\boldsymbol{\delta}}^T \hat{\mathbf{h}}. \tag{20} \]

The least-squares solution $\hat{\boldsymbol{\alpha}}_{ls}$ is computed, and the location $(\hat{d}_x, \hat{d}_y)$ of its maximum is determined. The image motion estimate is then $(\hat{d}_x, \hat{d}_y)$.
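
To make the chain (13)–(17) concrete, the sketch below (our own simplified illustration with hypothetical names, not the authors' implementation) evaluates a reduced, two-dimensional version of the hologram for the pure-translation case: for each fixed slice $(v_1, v_2)$, the ratio of the cross-bispectrum to the auto-bispectrum carries the phase $e^{-j 2\pi (u_1 d_x + u_2 d_y)}$ of (16), so a 2D inverse FFT over $(u_1, u_2)$, averaged over a few slices, peaks near the displacement. Following the lag ordering of (13), the DFT of the current frame is placed in the slot that pairs with $(r_1, r_2)$; the full 4D cumulant-based computation described in the text is what provides the additional robustness to noise.

```python
import numpy as np

def third_order_hologram(g_prev, g_cur, num_v_slices=8, eps=1e-8):
    """Illustrative 2D reduction of the hologram in (17), assuming pure
    translation: average the cross/auto bispectrum ratio over a few (v1, v2)
    slices and take a 2D inverse FFT over (u1, u2)."""
    G1 = np.fft.fft2(g_prev - g_prev.mean())
    G2 = np.fft.fft2(g_cur - g_cur.mean())
    H, W = G1.shape
    acc = np.zeros((H, W), dtype=complex)
    rng = np.random.default_rng(0)
    for _ in range(num_v_slices):
        v1 = int(rng.integers(1, max(2, H // 4)))
        v2 = int(rng.integers(1, max(2, W // 4)))
        # G*_{k-1}(u1+v1, u2+v2) as a function of (u1, u2)
        Gshift = np.roll(np.conj(G1), (-v1, -v2), axis=(0, 1))
        b_cross = G2 * G1[v1, v2] * Gshift   # current frame in the (u1, u2) slot, cf. (13)
        b_auto = G1 * G1[v1, v2] * Gshift
        acc += b_cross / (b_auto + eps)
    return np.real(np.fft.ifft2(acc / num_v_slices))

def peak_to_shift(h):
    """Dominant peak location of the hologram, mapped to a signed (dx, dy)."""
    r1, r2 = np.unravel_index(np.argmax(h), h.shape)
    H, W = h.shape
    dx = r1 - H if r1 > H // 2 else r1
    dy = r2 - W if r2 > W // 2 else r2
    return int(dx), int(dy)
```

The integer peak location found this way is then refined to subpixel accuracy as described in Section 4.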

Although we assume that the signals are non-Gaussian, it can be shown that for binary, deterministic signals and large image sizes the bispectrum is insensitive to Gaussian noise, and thus (20) holds approximately. Therefore,

\[ \hat{h}(r_1, r_2) = \hat{\alpha}(\hat{d}_x, \hat{d}_y)\, \hat{\delta}(r_1-\hat{d}_x,\, r_2-\hat{d}_y) = \hat{\delta}(r_1-\hat{d}_x,\, r_2-\hat{d}_y). \tag{21} \]

4. High-Accuracy Subpixel Motion Estimation

Subpixel performance is a critical element of the proposed algorithm. With reference to our previously published work [16, 17], we introduce a number of important new features that improve the accuracy of the motion estimates. The coordinates $(r_{1m}, r_{2m})$ of the maximum of the real part of the array $\hat{h}(r_1, r_2)$ can be used as an estimate of the horizontal and vertical components of the motion between $g_k(x,y)$ and $g_{k-1}(x,y)$:

\[ (r_{1m}, r_{2m}) = \arg\max\, \mathrm{Re}\{ \hat{h}(r_1, r_2) \}, \tag{22} \]

where $\mathrm{Re}\{\cdot\}$ denotes the real part of the complex array $\hat{h}(r_1, r_2)$. Subpixel accuracy of the motion measurements is obtained by variable-separable fitting performed in the neighborhood of the maximum, using a one-dimensional quadratic function in each direction. Using the notation of (22), prototype functions are fitted to the triplets

\[ \bigl\{ \hat{h}(r_{1m}-1, r_{2m}),\; \hat{h}(r_{1m}, r_{2m}),\; \hat{h}(r_{1m}+1, r_{2m}) \bigr\}, \tag{23} \]
\[ \bigl\{ \hat{h}(r_{1m}, r_{2m}-1),\; \hat{h}(r_{1m}, r_{2m}),\; \hat{h}(r_{1m}, r_{2m}+1) \bigr\}, \tag{24} \]

that is, the dominant peak of the surface $\hat{h}$ and its two neighboring values on either side, horizontally and vertically. The location of the maximum of the fitted function provides the required subpixel motion estimate $(\hat{d}^k_x, \hat{d}^k_y)$. Fitting a parabolic function horizontally to the data triplet (23) yields a closed-form solution for the horizontal component of the motion estimate $\hat{d}^k_x$:

\[ \hat{d}^k_x = \frac{\hat{h}(r_{1m}+1, r_{2m}) - \hat{h}(r_{1m}-1, r_{2m})}{2\hat{H}}, \tag{25} \]

where $\hat{H} = \hat{h}(r_{1m}+1, r_{2m}) - 2\hat{h}(r_{1m}, r_{2m}) + \hat{h}(r_{1m}-1, r_{2m})$. The fractional part $\hat{d}^k_y$ of the vertical component is obtained in the same way, using (24) instead of (23). Finally, the horizontal and vertical components of the subpixel-accurate motion estimate are obtained by computing the locations of the maxima of the two fitted quadratics. In [18], it is shown that half-pixel accuracy motion vectors lead to a very significant improvement over one-pixel accuracy, whereas higher precision results in negligible changes. Therefore, half-pixel accuracy was chosen in our simulations.
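
A minimal sketch of the separable quadratic refinement described above (our own code and names, not the authors' implementation): it returns the location of the maximum of the one-dimensional parabola fitted through the peak sample and its two neighbors in each direction.

```python
def _vertex_offset(minus, center, plus):
    """Fractional offset of the maximum of the parabola through the three
    samples at positions -1, 0, +1 (the curvature is negative at a peak)."""
    denom = 2.0 * center - plus - minus
    return 0.0 if denom == 0.0 else (plus - minus) / (2.0 * denom)

def subpixel_refine(h, r1m, r2m):
    """Refine the integer peak (r1m, r2m) of the hologram surface h with a
    separable one-dimensional quadratic fit, as described in Section 4.
    Assumes the peak is not on the border of the array."""
    dx = r1m + _vertex_offset(h[r1m - 1, r2m], h[r1m, r2m], h[r1m + 1, r2m])
    dy = r2m + _vertex_offset(h[r1m, r2m - 1], h[r1m, r2m], h[r1m, r2m + 1])
    return dx, dy
```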

5. Computational Cost Comparison

The majority of the computational cost of the proposed bispectrum method is due to the fast Fourier transform (FFT). The fundamental computation required for bispectral estimates is given by (7), the triple product of Fourier coefficients. While this computation is straightforward, limitations on computation time and statistical variance impose severe constraints on a direct implementation of the bispectrum definition [19]. On the other hand, we take advantage of the symmetry properties of the bispectrum to reduce the computational complexity and memory requirements of calculating third-order statistics: the bispectrum can be calculated in any one sector and mapped onto the others [20].

The phase correlation is estimated by multiplying each Fourier coefficient $G_{g_k}(u_1, u_2)$ of one frame by the complex conjugate of the corresponding coefficient of the other frame, whereas each component of the bispectrum is estimated by a triple product of Fourier coefficients, as shown in (7). Thus, the number of operations required to compute the bispectrum is significantly larger than for the phase correlation: for an $N \times N$ image there are $N^2/8$ independent components of the bispectrum, while there are only $N/2$ independent components of the phase correlation [21].
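
For reference, the phase correlation baseline used in the comparisons of Section 6 can be sketched as follows (a generic implementation with our own function names, not the authors' exact code). The division by the magnitude of the cross-power spectrum is the normalization (equalization) step whose sensitivity to noise is discussed in Section 6.

```python
import numpy as np

def phase_correlation(g_prev, g_cur, eps=1e-8):
    """Standard phase correlation baseline: inverse FFT of the normalized
    cross-power spectrum; the peak location gives the integer displacement."""
    G1, G2 = np.fft.fft2(g_prev), np.fft.fft2(g_cur)
    cross = G2 * np.conj(G1)
    surface = np.real(np.fft.ifft2(cross / (np.abs(cross) + eps)))
    r1, r2 = np.unravel_index(np.argmax(surface), surface.shape)
    H, W = surface.shape
    dx = r1 - H if r1 > H // 2 else r1
    dy = r2 - W if r2 > W // 2 else r2
    return (int(dx), int(dy)), surface
```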

6. Simulation Results

Our experiments aim at evaluating the performance of the proposed approach and comparing it with that of the optical flow and phase correlation techniques. For the optical flow method we used an implementation of the method of Bruhn et al. [22]. In our simulations we used the database freely available on the web at http://vision.middlebury.edu/flow/. We use three types of data to test different aspects of the techniques: real sequences with independent motion, realistic synthetic sequences, and high frame-rate video. These sequences were chosen for their difficult motion and their differing characteristics. Although the original sequences are in color, only the luminance component is used to estimate the motion vectors.

Figure 2 shows the estimated motion vector fields for the Grove sequence using the three aforementioned motion estimation methods. Note that, for a fair comparison, the optical flow and phase correlation algorithms were also run with half-pixel accuracy. The motion vectors estimated between frames 6 and 7 are shown for the Grove sequence. For this particular sequence, our scheme provides the most consistent and reliable motion vector field, while both the optical flow and phase correlation algorithms fail to detect the true motion. Similar results are shown in Figures 3 and 4 for the motion vectors estimated between frames 2 and 3 of the Walking sequence and between frames 5 and 6 of the Mequon sequence, respectively. Both the optical flow and phase correlation algorithms produce abrupt motion vector fields. Although these abrupt motion vectors may lead to lower numerical mean squared errors (MSEs), they are incorrect motion vectors. Because of the noise-resistant property of the parametric bispectrum method, it produces more reliable estimates. Our approach therefore yields motion fields that are globally more representative of the true motion in the scene.

Figure 2: Motion vector fields of the Grove sequence in the presence of noise using (a) our algorithm, (b) the optical flow algorithm, (c) the phase correlation algorithm.

Figure 3: Motion vector fields of the Walking sequence in the presence of noise using (a) our algorithm, (b) the optical flow algorithm, (c) the phase correlation algorithm.

Figure 4: Motion vector fields of the Mequon sequence in the presence of noise using (a) our algorithm, (b) the optical flow algorithm, (c) the phase correlation algorithm.

To assess the correctness of the motion estimation more clearly, we use the Beanbags sequence as an example. The motion-compensated pictures obtained with the three methods are shown in Figure 5, and portions of these pictures are enlarged in Figure 6 to show the differences. We observe better compensated images with the proposed method; the motion-compensated images for our scheme are much closer to the original images. Thus, the scheme measures the motion vectors more accurately and is more robust in general. Overall, the parametric bispectrum scheme typically offers better visual quality than the other methods.

Figure 5: Prediction of frame 5 of the Beanbags sequence in the presence of noise using (b) our algorithm, (c) the optical flow algorithm, (d) the phase correlation algorithm; (a) is the original image.

Figure 6: Enlarged portions of the motion-compensated pictures of the Beanbags sequence using (a) our algorithm, (b) the optical flow algorithm, (c) the phase correlation algorithm.

The detection of motion vectors relies on successive correlation operations applied to pairs of consecutive block-partitioned frames of a video sequence. The heights of the dominant peaks are monitored, and a sudden change in magnitude at a given location is interpreted as a displacement vector. Figure 7 shows sample correlation surfaces between two blocks $b_{k-1}(x,y)$ and $b_k(x,y)$, taken from frames 3 and 4 of the Hydrangea sequence, respectively. The bispectrum retains both amplitude and phase information from the Fourier transform of a signal, unlike the other methods, and the phase of the Fourier transform contains important shape information. Therefore, the bispectrum minimizes the influence of the noise and simplifies the identification of the dominant peak on the correlation surface.

Figure 7: Correlation surfaces between two blocks using (a) our algorithm, (b) the optical flow algorithm, (c) the phase correlation algorithm.

The PSNR of the motion-compensated frames is a popular performance measure for motion estimation, giving insight into the quality of the prediction. The PSNRs obtained with the three motion estimation algorithms are shown in Figure 8. This result is obtained using two real video sequences, Tempete and Stefan, each run for 60 frames at a frame rate of 30 frames/sec. Both sequences are degraded with additive zero-mean Gaussian noise to a signal-to-noise ratio (SNR) of 10 dB, where we define

\[ \mathrm{SNR} = 10 \log_{10} \frac{\sigma_f^2}{\sigma_n^2}, \tag{26} \]

with $\sigma_f^2$ the variance of the frame and $\sigma_n^2$ the variance of the noise. From Figure 8, it is clear that the implemented optical flow technique is significantly less efficient than the parametric bispectrum technique, mainly because of the difficulty the optical flow technique has in coping with large displacements and discontinuities in the motion field. On the other hand, the normalization (equalization) operation in the phase correlation technique enhances the noise power at high frequencies, so it produces incorrect displacement estimates on noisy image sequences. On the whole, the bispectrum retains both amplitude and phase information from the Fourier transform of a signal, unlike the other techniques. This confirms that the proposed technique is a superior feature selector, utilizing the portions of the image spectrum most likely to contribute to reliable motion estimation.

Figure 8: PSNR obtained for noisy sequences (SNR = 10 dB).

In terms of complexity, measured here by computation time, all computations were performed on an Intel Centrino Duo machine (Toshiba Satellite A100-579, T5500, 2 GHz, 2 CPUs) running Windows XP. The three algorithms were implemented as prototypes written in Matlab 6.5 R13. The comparison of the motion estimation computation time (MECT) of the three methods is shown in Table 1; we employ 60 frames of the Tempete video sequence.

Table 1: Comparison of the computation time of the three methods.
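
As a concrete illustration of the noise model in (26) and of the PSNR figures reported here, the following sketch (our own helper functions with hypothetical names, not the authors' code) degrades a frame to a target SNR and computes the PSNR of a motion-compensated prediction; the per-frame values are then averaged over the $F$ frames of a sequence, as in (27) below.

```python
import numpy as np

def add_noise_at_snr(frame, snr_db, rng=None):
    """Degrade a frame with zero-mean Gaussian noise at a target SNR,
    following (26): SNR = 10*log10(var(frame) / var(noise))."""
    rng = np.random.default_rng() if rng is None else rng
    noise_var = frame.var() / (10.0 ** (snr_db / 10.0))
    return frame + rng.normal(0.0, np.sqrt(noise_var), frame.shape)

def psnr(reference, compensated, peak=255.0):
    """PSNR (dB) of a motion-compensated prediction against the original
    frame, assuming 8-bit intensities by default."""
    mse = np.mean((reference.astype(float) - compensated.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```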

We perform the motion compensation procedure for each current frame $k$ with respect to reference frames $k-r$, where $r = 1, 2, 3, 4$. The average PSNR of the motion-compensated images is given in Table 2, with the Tempete sequence degraded with additive zero-mean Gaussian noise to an SNR of 10 dB.

Table 2: Average PSNR (dB) of the motion-compensated images for the three motion estimation techniques, Tempete sequence.

The average PSNR, $\mathrm{PSNR}_{avg}$, is given by

\[ \mathrm{PSNR}_{avg} = \frac{1}{F} \sum_{i=1}^{F} \mathrm{PSNR}_i, \tag{27} \]

where $\mathrm{PSNR}_i$ is the measured PSNR for frame $i$ and $F$ is the total number of frames. In Table 2, we observe that $\mathrm{PSNR}_{avg}$ decreases as the apparent disparity between the global motion of the background and the local motion of the foreground grows. For each value of $r$, $\mathrm{PSNR}_{avg}$ is higher for the proposed scheme than for the other methods.

7. Conclusion

In this paper, a subpixel motion estimation algorithm using the bispectrum in the parametric domain was presented. We evaluated the method on a collection of datasets available on the web at http://vision.middlebury.edu/flow/. When the data are severely corrupted by additive Gaussian noise of unknown covariance, our method suppresses the effects of the noise and simplifies the identification of the dominant peak on the correlation surface, unlike other techniques. At high noise levels (SNR around 10 dB) the optical flow and phase correlation techniques fail, yet even under these extreme conditions the parametric bispectrum provides an improvement in performance over the other algorithms. Overall, our scheme produces a smoother displacement vector field and a more accurate measure of object motion across different SNR scenarios.

References

[1] N. Benmoussat, M. Faouzi Belbachir, and B. Benamar, "Motion estimation and compensation from noisy image sequences: a new filtering scheme," Image and Vision Computing, vol. 25, no. 5, pp. 686–694, 2007.
[2] J. C. Brailean, R. P. Kleihorst, S. Efstratiadis, A. K. Katsaggelos, and R. L. Lagendijk, "Noise reduction filters for dynamic image sequences: a review," Proceedings of the IEEE, vol. 83, no. 9, pp. 1272–1292, 1995.
[3] R. M. Armitano, R. W. Schafer, F. L. Kitson, and V. Bhaskaran, "Robust block-matching motion-estimation technique for noisy sources," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '97), vol. 4, pp. 2685–2688, Munich, Germany, April 1997.
[4] S. Bhattacharya, N. C. Ray, and S. Sinha, "2-D signal modelling and reconstruction using third-order cumulants," Signal Processing, vol. 62, no. 1, pp. 61–72, 1997.
[5] E. M. Ismaili Aalaoui and E. Ibn-Elhaj, "Estimation of subpixel motion using bispectrum," Research Letters in Signal Processing, vol. 2008, Article ID 417915, 5 pages, 2008.
[6] J. M. M. Anderson and G. B. Giannakis, "Image motion estimation algorithms using cumulants," IEEE Transactions on Image Processing, vol. 4, no. 3, pp. 346–357, 1995.
[7] R. P. Kleihorst, R. L. Lagendijk, and J. Biemond, "Noise reduction of severely corrupted image sequences," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '93), vol. 5, pp. 293–296, Minneapolis, Minn, USA, April 1993.
[8] E. Ibn-Elhaj, D. Aboutajdine, S. Pateux, and L. Morin, "HOS-based method of global motion estimation for noisy image sequences," Electronics Letters, vol. 35, no. 16, pp. 1320–1322, 1999.
[9] E. Sayrol, A. Gasull, and J. R. Fonollosa, "Motion estimation using higher order statistics," IEEE Transactions on Image Processing, vol. 5, no. 6, pp. 1077–1084, 1996.
[10] A. N. Netravali and J. D. Robbins, "Motion-compensated television coding—part I," Bell System Technical Journal, vol. 58, no. 3, pp. 629–668, 1979.
[11] V. Murino, C. Ottonello, and S. Pagnan, "Noisy texture classification: a higher-order statistics approach," Pattern Recognition, vol. 31, no. 4, pp. 383–393, 1998.
[12] B. M. Sadler and G. B. Giannakis, "Shift- and rotation-invariant object reconstruction using the bispectrum," Journal of the Optical Society of America A, vol. 9, no. 1, pp. 57–69, 1992.
[13] M. R. Raghuveer and C. L. Nikias, "Bispectrum estimation: a parametric approach," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 33, no. 5, pp. 1213–1230, 1985.
[14] G. B. Giannakis, "On the identifiability of non-Gaussian ARMA models using cumulants," IEEE Transactions on Automatic Control, vol. 35, no. 1, pp. 18–26, 1990.
[15] J. M. Mendel, "Tutorial on higher-order statistics (spectra) in signal processing and system theory: theoretical results and some applications," Proceedings of the IEEE, vol. 79, no. 3, pp. 278–305, 1991.
[16] E. M. Ismaili Aalaoui and E. Ibn-Elhaj, "Estimation of motion fields from noisy image sequences: using generalized cross-correlation methods," in Proceedings of the IEEE International Conference on Signal Processing and Communications (ICSPC '07), Dubai, UAE, November 2007.
[17] E. M. Ismaili Aalaoui and E. Ibn Elhaj, "Estimation of displacement vector field from noisy data using maximum likelihood estimator," in Proceedings of the 14th IEEE International Conference on Electronics, Circuits, and Systems (ICECS '07), pp. 1380–1383, Marrakech, Morocco, December 2007.
[18] G. Madec, "Half pixel accuracy in block matching," in Proceedings of the Picture Coding Symposium (PCS '90), Cambridge, Mass, USA, March 1990.
[19] K. S. Lii and K. N. Helland, "Cross-bispectrum computation and variance estimation," ACM Transactions on Mathematical Software, vol. 7, no. 3, pp. 284–294, 1981.
[20] J.-M. Le Caillec and R. Garello, "Comparison of statistical indices using third order statistics for nonlinearity detection," Signal Processing, vol. 84, no. 3, pp. 499–525, 2004.
[21] R. W. Means, B. Wallach, and D. Busby, "Bispectrum signal processing on HNC's SIMD numerical array processor (SNAP)," in Proceedings of the ACM/IEEE Conference on Supercomputing (SC '93), pp. 535–537, Portland, Ore, USA, November 1993.
[22] A. Bruhn, J. Weickert, and C. Schnörr, "Lucas/Kanade meets Horn/Schunck: combining local and global optic flow methods," International Journal of Computer Vision, vol. 61, no. 3, pp. 211–231, 2005.


Abstract

A Robust Subpixel Motion Estimation Algorithm Using HOS in the Parametric Domain //// Hindawi Publishing Corporation Home Journals About Us About this Journal Submit a Manuscript Table of Contents Journal Menu Abstracting and Indexing Aims and Scope Article Processing Charges Articles in Press Author Guidelines Bibliographic Information Contact Information Editorial Board Editorial Workflow Free eTOC Alerts Reviewers Acknowledgment Society Affiliation Subscription Information Open Special Issues Published Special Issues Special Issue Guidelines Abstract Full-Text PDF Full-Text HTML Linked References How to Cite this Article Complete Special Issue EURASIP Journal on Image and Video Processing Volume 2009 (2009), Article ID 381673, 10 pages doi:10.1155/2009/381673 Research Article <h2>A Robust Subpixel Motion Estimation Algorithm Using HOS in the Parametric Domain</h2> E. M. Ismaili Aalaoui , 1,2 E. Ibn-Elhaj , 2 and E. H. Bouyakhf 1 1 Faculté des Sciences de Rabat, Université Mohammed V Agdal, 4 avenue Ibn Battouta, B.P. 1014 RP, 10000 Rabat, Morocco 2 Department of Telecommunication, Institut National des Postes et Télécommunication, avenue Allal Al Fassi, Madinat El Irfane, 10000 Rabat, Morocco Received 28 April 2008; Accepted 24 October 2008 Academic Editor: Simon Lucey Copyright © 2009 E. M. Ismaili Aalaoui et al. This is an open access article distributed under the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Abstract Motion estimation techniques are widely used in todays video processing systems. The most frequently used techniques are the optical flow method and phase correlation method. The vast majority of these algorithms consider noise-free data. Thus, in the case of the image sequences are severely corrupted by additive Gaussian (perhaps non-Gaussian) noises of unknown covariance, the classical techniques will fail to work because they will also estimate the noise spatial correlation. In this paper, we have studied this topic from a viewpoint different from the above to explore the fundamental limits in image motion estimation. Our scheme is based on subpixel motion estimation algorithm using bispectrum in the parametric domain. The motion vector of a moving object is estimated by solving linear equations involving third-order hologram and the matrix containing Dirac delta function. Simulation results are presented and compared to the optical flow and phase correlation algorithms; this approach provides more reliable displacement estimates particularly for complex noisy image sequences. In our simulation, we used the database freely available on the web. 1. Introduction The importance of image sequence processing is constantly growing with the ever increasing use of television and video systems in consumer, commercial, medical, and scientific applications. Image sequences can be acquired by film-based motion picture cameras or electronic video cameras. In either case, there are several factors related to imaging sensor limitations that contribute to the graininess (noise) of resulting images. Electronic sensor noise and film grain are among these factors [ 1 ]. In many cases, graininess may result in visually disturbing degradation of the image quality, or it may mask important image information. 
Even if the noise may not be perceived at full-speed video due to the temporal masking effect of the eye, it often leads to unacceptable single-frame hardcopies and to poor-quality freeze-frames that adversely affect the performance of subsequent image analysis [ 2 ]. The motion estimation process must be able to track objects within a noisy source. In a noisy source, objects appear to change from frame to frame because of the noise, not necessarily as the result of object motion [ 3 ]. Tracking objects within a noisy environment is difficult, especially if the image frames are severely corrupted by additive Gaussian noises of unknown covariance; second-order statistics methods do not work well. Higher-order statistics (HOS) in general and the bispectrum (order 3) in particular have recently been widely used as an important tool for signal processing. The classical methods based on the power spectrum are now being effectively superseded by the bispectral ones due to some definite disadvantages of the former. These include the inability to identify systems fed by non-Gaussian noise (NGN) inputs and nonminimum phase (NMP) systems and identification of system nonlinearity [ 4 ]. In these cases, the autocorrelation-based methods offer no answer. Out of all these, the identifiability of NMP systems has received the maximum attention from researchers. HOS-based methods have been proposed to estimate motion between image frames [ 5 – 9 ]. In, the motion estimation is based on the bispectrum method for sub-pixel resolution of noisy image sequences. In [ 7 ], the displacement vector is obtained by maximizing a third-order statistics criterion. In [ 8 ], the global motion parameters are obtained by a new region recursive algorithm. In [ 6 ], several algorithms are developed based on a parametric cumulant method, a cumulant-matching method, and a mean kurtosis error criterion. The latter is an extension of the quadratic pixel-recursive method by Netravali and Robbins [ 10 ]. In [ 11 ], it is shown that such statistical parameters are insensitive to additive Gaussian noises. In particular, bispectrum parameters are insensitive to any symmetrically distributed noise and also exhibit the capability of better characterizing NGN and identifying NMP linear systems as well as nonlinear systems. Therefore, transformation to a higher-order domain reduces the effect of noise significantly. In this correspondence, a novel algorithm for the detection of motion vectors in video sequences is proposed. The algorithm uses bispectrum model-based subpixel motion estimation in the parametric domain for noisy image sequences to obtain a measure of content similarity for temporally adjacent frames and responds very well to scene motion vectors. The algorithm is insensitive to the presence of symmetrically distributed noise. The outline of this paper is as follows. First, the problem formulation is introduced in Section 2 . In Section 3 , we first present briefly the definitions and properties of the bispecrum and cross-bispectrum. Next, we describe the motion estimation in the parametric domain. High-accuracy subpixel motion estimation is discussed in Section 4 . Section 5 presents an evaluation of the computational complexity of our algorithm. The results of the experimental evaluation of the proposed method are shown in Section 6 and compared to existing methods while Section 7 concludes the paper. 2. 
Problem Formulation The problem of motion estimation can be stated as follows: “Given an image sequence, compute a representation of the motion field that best aligns pixels in one frame of the sequence with those in the next” [ 9 ]. This is formulated as follows: 𝑔 𝑘 − 1 ( 𝑥 , 𝑦 ) = 𝑓 𝑘 − 1 ( 𝑥 , 𝑦 ) + 𝑛 𝑘 − 1 𝑔 ( 𝑥 , 𝑦 ) , ( 1 ) 𝑘 ( 𝑥 , 𝑦 ) = 𝑓 𝑘 ( 𝑥 , 𝑦 ) + 𝑛 𝑘 ( 𝑥 , 𝑦 ) = 𝑓 𝑘 − 1  𝑥 − 𝑑 𝑥 , 𝑦 − 𝑑 𝑦  + 𝑛 𝑘 ( 𝑥 , 𝑦 ) , ( 2 ) where ( 𝑥 , 𝑦 ) denotes spatial image position of a point; 𝑔 𝑘 ( 𝑥 , 𝑦 ) and 𝑔 𝑘 − 1 ( 𝑥 , 𝑦 ) are observed image intensities at instants 𝑘 and 𝑘 − 1 , respectively; 𝑓 𝑘 ( 𝑥 , 𝑦 ) and 𝑓 𝑘 − 1 ( 𝑥 , 𝑦 ) are noise-free frames; 𝑛 𝑘 ( 𝑥 , 𝑦 ) and 𝑛 𝑘 − 1 ( 𝑥 , 𝑦 ) are assumed to be spatially and temporally stationary, zero-mean image Gaussian (or non-Gaussian) noise sequences with unknown covariance; and ( 𝑑 𝑥 , 𝑑 𝑦 ) is the displacement vector of the object during the time interval [ 𝑘 − 1 , 𝑘 ] . The goal is to estimate ( 𝑑 𝑥 , 𝑑 𝑦 ) from 𝑔 𝑘 ( 𝑥 , 𝑦 ) and 𝑔 𝑘 − 1 ( 𝑥 , 𝑦 ) . 3. Bispectrum-Based Image Motion Estimation 3.1. Definitions and Properties In this subsection, some HOS functions are defined and their properties are described in order to provide the necessary tools to understand the motion estimation methodology. Also, the third-order autocumulant/moment sequence, 𝐶 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3 ( 𝑟 1 , 𝑟 2 ; 𝑠 1 , 𝑠 2 ) , of 𝑔 𝑘 ( 𝑥 , 𝑦 ) is defined as follows [ 5 ]: 𝐶 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  𝑟 1 , 𝑟 2 ; 𝑠 1 , 𝑠 2   𝑔 = 𝐸 𝑘 ( 𝑥 , 𝑦 ) 𝑔 𝑘  𝑥 + 𝑟 1 , 𝑦 + 𝑟 2  𝑔 𝑘  𝑥 + 𝑠 1 , 𝑦 + 𝑠 2 ,   ( 3 ) where 𝐸 { ⋅ } denotes the expectation operation; ( 𝑟 1 , 𝑟 2 ) and ( 𝑠 1 , 𝑠 2 ) are two shifted versions of the 𝑔 𝑘 ( 𝑥 , 𝑦 ) . To understand the theory of triple correlations physically for 2D data [ 4 ], the reader is referred to Figure 1 . The Figure shows the spaces occupied by the original data (denoted by continuous box) and two shifted versions of the same data (denoted by dashed boxes). The shifts are made by the amounts ( 𝑟 1 , 𝑟 2 ) and ( 𝑠 1 , 𝑠 2 ) , respectively. It is now obvious that the product of the overlapping data positions (shown by the shaded portion) denotes the triple correlation function as defined by ( 3 ). Figure 1: Physical representation of cumulants [ 4 ]. Also, the third-order autocumulant/moment sequence is defined as follows [ 12 ]: 𝐶 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  𝑟 1 , 𝑟 2 ; 𝑠 1 , 𝑠 2  = 𝐶 𝑓 𝑘 𝑓 𝑘 𝑓 𝑘 3  𝑟 1 , 𝑟 2 ; 𝑠 1 , 𝑠 2  + 𝐶 𝑛 𝑘 𝑛 𝑘 𝑛 𝑘 3  𝑟 1 , 𝑟 2 ; 𝑠 1 , 𝑠 2  . ( 4 ) Equation ( 4 ) states that the triple correlation of 𝑓 𝑘 ( 𝑥 , 𝑦 ) plus 𝑛 𝑘 ( 𝑥 , 𝑦 ) is equal to the triple correlation of the 𝑓 𝑘 ( 𝑥 , 𝑦 ) plus the triple correlation of the noise. For 𝑛 𝑘 ( 𝑥 , 𝑦 ) a zero-mean Gaussian noise, then its triple correlation is identically zero [ 7 , 12 ]. This provides a theoretical basis for using the triple correlation (or the bispectrum) as a method of reducing the effects of additive noise. Then the term 𝐶 𝑛 𝑘 𝑛 𝑘 𝑛 𝑘 3 ( 𝑟 1 , 𝑟 2 ; 𝑠 1 , 𝑠 2 ) is negligible which renders the triple correlation very effective in detecting a signal embedded in noise. Therefore, 𝐶 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  𝑟 1 , 𝑟 2 ; 𝑠 1 , 𝑠 2  = 𝐶 𝑓 𝑘 𝑓 𝑘 𝑓 𝑘 3  𝑟 1 , 𝑟 2 ; 𝑠 1 , 𝑠 2  . ( 5 ) Also, 𝑛 𝑘 ( 𝑥 , 𝑦 ) can be non-Gaussian if it is independent and identically distributed (i.i.d.) and nonskewed (e.g., symmetrically distributed). 
The bispectrum, 𝐵 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3 ( 𝑢 1 , 𝑢 2 ; 𝑣 1 , 𝑣 2 ) , is defined as the 4D Fourier transform of the third-order autocumulant (or moment) [ 4 ]: 𝐵 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  𝑢 1 , 𝑢 2 ; 𝑣 1 , 𝑣 2  = ℱ 4  𝐶 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  𝑟 1 , 𝑟 2 ; 𝑠 1 , 𝑠 2   = ℱ 4  𝐶 𝑓 𝑘 𝑓 𝑘 𝑓 𝑘 3  𝑟 1 , 𝑟 2 ; 𝑠 1 , 𝑠 2   = 𝐵 𝑓 𝑘 𝑓 𝑘 𝑓 𝑘 3  𝑢 1 , 𝑢 2 ; 𝑣 1 , 𝑣 2  , ( 6 ) where ℱ 4 denotes the 4D Fourier transform operation; ( 𝑢 1 , 𝑢 2 ) and ( 𝑣 1 , 𝑣 2 ) are the frequency coordinates for the 2D Fourier transform. Let 𝐺 𝑔 𝑘 ( 𝑢 ) be the discrete Fourier transform (DFT) of the frame 𝑔 𝑘 ( 𝑥 , 𝑦 ) . Each component of the bispectrum is estimated by a triple product of Fourier coefficients as follows [ 12 ]: 𝐵 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  𝑢 1 , 𝑢 2 ; 𝑣 1 , 𝑣 2  = 𝐺 𝑔 𝑘  𝑢 1 , 𝑢 2  𝐺 𝑔 𝑘  𝑣 1 , 𝑣 2  𝐺 ∗ 𝑔 𝑘  𝑢 1 + 𝑣 1 , 𝑢 2 + 𝑣 2  , ( 7 ) where ∗ indicates the complex conjugate. The cross-bispectrum is obtained in a similar manner as the bispectrum. Thus, 𝐵 𝑔 𝑘 − 1 𝑔 𝑘 𝑔 𝑘 − 1 3  𝑢 1 , 𝑢 2 ; 𝑣 1 , 𝑣 2  = 𝐵 𝑔 𝑘 − 1 𝑔 𝑘 − 1 𝑔 𝑘 − 1 3  𝑢 1 , 𝑢 2 ; 𝑣 1 , 𝑣 2  𝑒 − 𝑗 2 𝜋 ( 𝑢 1 𝑑 𝑥 + 𝑢 2 𝑑 𝑦 ) , 𝐵 𝑔 𝑘 − 1 𝑔 𝑘 𝑔 𝑘 − 1 3  𝑢 1 , 𝑢 2 ; 𝑣 1 , 𝑣 2  = 𝐺 𝑔 𝑘 − 1  𝑢 1 , 𝑢 2  𝐺 𝑔 𝑘  𝑣 1 , 𝑣 2  𝐺 ∗ 𝑔 𝑘 − 1  𝑢 1 + 𝑣 1 , 𝑢 2 + 𝑣 2  . ( 8 ) On close observation and after certain algebraic manipulations, ( 3 ) shows the third-order cumulant sequence to possess the following symmetry properties [ 4 ]: 𝐶 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  𝑟 1 , 𝑟 2 ; 𝑠 1 , 𝑠 2  = 𝐶 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  𝑠 1 , 𝑠 2 ; 𝑟 1 , 𝑟 2  = 𝐶 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  − 𝑠 1 , − 𝑠 2 ; 𝑟 1 − 𝑠 1 , 𝑟 2 − 𝑠 2  = 𝐶 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  𝑠 1 − 𝑟 1 , 𝑠 2 − 𝑟 2 ; − 𝑟 1 , − 𝑟 2  = 𝐶 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  𝑟 1 − 𝑠 1 , 𝑟 2 − 𝑠 2 ; − 𝑠 1 , − 𝑠 2  = 𝐶 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  − 𝑟 1 , − 𝑟 2 ; 𝑠 1 − 𝑟 1 , 𝑠 2 − 𝑟 2  . ( 9 ) This directly leads to the following symmetry properties of the bispectrum sequence [ 4 ]: 𝐵 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  𝑢 1 , 𝑢 2 ; 𝑣 1 , 𝑣 2  = 𝐵 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  𝑣 1 , 𝑣 2 ; 𝑢 1 , 𝑢 2  = 𝐵 ∗ 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  − 𝑣 1 , − 𝑣 2 ; − 𝑢 1 , − 𝑢 2  = 𝐵 ∗ 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  − 𝑢 1 , − 𝑢 2 ; − 𝑣 1 , − 𝑣 2  = 𝐵 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  − 𝑢 1 − 𝑣 1 , − 𝑢 2 − 𝑣 2 ; 𝑣 1 , 𝑣 2  = 𝐵 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  𝑢 1 , 𝑢 2 ; − 𝑢 1 − 𝑣 1 , − 𝑢 2 − 𝑣 2  = 𝐵 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  − 𝑢 1 − 𝑣 1 , − 𝑢 2 − 𝑣 2 ; 𝑢 1 , 𝑢 2  = 𝐵 𝑔 𝑘 𝑔 𝑘 𝑔 𝑘 3  𝑣 1 , 𝑣 2 ; − 𝑢 1 − 𝑣 1 , − 𝑢 2 − 𝑣 2  . ( 1 0 ) These symmetry properties reduce the computational burden while calculating the bispectrum. 3.2. Parametric Model-Based Motion Estimation The problem of signal processing using bispectrum in the parametric domain has recently been widely addressed by researchers. Two primary schools of thought exist for this. The first line is headed by Raghuveer and Nikias [ 13 ], who have parametrized the bispectrum through solution of a cumulant matrix equation. The other school of thought headed by Giannakis [ 14 ] calculates the system impulse response coefficients directly using a linear combination of cumulant slices. Other papers have been subsequently published which give extensions and applications of these basic approaches (see [ 15 ] and references therein). Here, we propose the bispectrum model-based subpixel motion estimation in the parametric domain. Simulations demonstrate that this method requires large blocks of data and thus may be appropriate for estimating object displacement in background noise. Consequently, our approach will be derived in this context and hence, ( 𝑑 𝑥 , 𝑑 𝑦 ) = c o n s t . in ( 2 ). 
Substituting 𝑓 𝑘 − 1 from ( 1 ) into ( 2 ) we obtain 𝑔 𝑘  ( 𝑥 , 𝑦 ) = 𝑖 ∈ 𝑅  𝑗 ∈ 𝑅 𝛼 ( 𝑖 , 𝑗 ) 𝑔 𝑘 − 1 ( 𝑥 − 𝑖 , 𝑦 − 𝑗 ) + 𝑛 𝑘 ( 𝑥 , 𝑦 ) + 𝑛 𝑘 − 1  𝑥 − 𝑑 𝑥 , 𝑦 − 𝑑 𝑦  , ( 1 1 ) where, in theory, 𝛼 ( 𝑖 , 𝑗 ) = 0 for all { ( 𝑖 , 𝑗 ) } except ( 𝑖 , 𝑗 ) = ( 𝑑 𝑥 , 𝑑 𝑦 ) , and 𝛼 ( 𝑑 𝑥 , 𝑑 𝑦 ) = 1 . If the search region 𝑅 contains the largest possible horizontal and vertical delays, and if we take 𝑔 𝑘 ( 𝑥 , 𝑦 ) for 𝑔 𝑘 ( 𝑥 + 𝑟 1 , 𝑦 + 𝑟 2 ) , we obtain 𝑔 𝑘  𝑥 + 𝑟 1 , 𝑦 + 𝑟 2  =  𝑖 ∈ 𝑅  𝑗 ∈ 𝑅 𝛼 ( 𝑖 , 𝑗 ) 𝑔 𝑘 − 1  𝑥 − 𝑖 + 𝑟 1 , 𝑦 − 𝑗 + 𝑟 2  + 𝑛 𝑘  𝑥 + 𝑟 1 , 𝑦 + 𝑟 2  + 𝑛 𝑘 − 1  𝑥 + 𝑟 1 − 𝑑 𝑥 , 𝑦 + 𝑟 2 − 𝑑 𝑦  . ( 1 2 ) By multiplying both sides of ( 12 ) by 𝑔 𝑘 − 1 ( 𝑥 , 𝑦 ) 𝑔 𝑘 − 1 ( 𝑥 + 𝑠 1 , 𝑦 + 𝑠 2 ) and taking expectations, we obtain 𝐸  𝑔 𝑘 − 1 ( 𝑥 , 𝑦 ) 𝑔 𝑘  𝑥 + 𝑟 1 , 𝑦 + 𝑟 2  𝑔 𝑘 − 1  𝑥 + 𝑠 1 , 𝑦 + 𝑠 2 =    𝑖 ∈ 𝑅  𝑗 ∈ 𝑅  𝑔 𝛼 ( 𝑖 , 𝑗 ) 𝐸 𝑘 − 1 ( 𝑥 , 𝑦 ) 𝑔 𝑘 − 1  𝑥 − 𝑖 + 𝑟 1 , 𝑦 − 𝑗 + 𝑟 2  × 𝑔 𝑘 − 1  𝑥 + 𝑠 1 , 𝑦 + 𝑠 2  𝑔   + 𝐸 𝑘 − 1 ( 𝑥 , 𝑦 ) 𝑛 𝑘  𝑥 + 𝑟 1 , 𝑦 + 𝑟 2  × 𝑔 𝑘 − 1  𝑥 + 𝑠 1 , 𝑦 + 𝑠 2  𝑔   + 𝐸 𝑘 − 1 ( 𝑥 , 𝑦 ) 𝑔 𝑘 − 1  𝑥 + 𝑠 1 , 𝑦 + 𝑠 2  × 𝑛 𝑘 − 1  𝑥 + 𝑟 1 − 𝑑 𝑥 , 𝑦 + 𝑟 2 − 𝑑 𝑦 .   ( 1 3 ) Therefore, 𝐶 𝑔 𝑘 − 1 𝑔 𝑘 𝑔 𝑘 − 1 3  𝑟 1 , 𝑟 2 ; 𝑠 1 , 𝑠 2  =  𝑖 ∈ 𝑅  𝑗 ∈ 𝑅 𝛼 ( 𝑖 , 𝑗 ) 𝐶 𝑔 𝑘 − 1 𝑔 𝑘 − 1 𝑔 𝑘 − 1 3  𝑟 1 − 𝑖 , 𝑟 2 − 𝑗 ; 𝑠 1 , 𝑠 2  + 𝐶 𝑔 𝑘 − 1 𝑛 𝑘 𝑔 𝑘 − 1 3  𝑟 1 , 𝑟 2 ; 𝑠 1 , 𝑠 2  + 𝐶 𝑔 𝑘 − 1 𝑛 𝑘 − 1 𝑔 𝑘 − 1 3  𝑟 1 − 𝑑 𝑥 , 𝑟 2 − 𝑑 𝑦 ; 𝑠 1 , 𝑠 2  . ( 1 4 ) However, 𝐶 𝑔 𝑘 − 1 𝑛 𝑘 𝑔 𝑘 − 1 3 ( 𝑟 1 , 𝑟 2 ; 𝑠 1 , 𝑠 2 ) and 𝐶 𝑔 𝑘 − 1 𝑛 𝑘 − 1 𝑔 𝑘 − 1 3 ( 𝑟 1 − 𝑑 𝑥 , 𝑟 2 − 𝑑 𝑦 ; 𝑠 1 , 𝑠 2 ) are identically zero for all ( 𝑟 1 , 𝑟 2 ; 𝑠 1 , 𝑠 2 ) due to the fact that the signal and noise are zero-mean and independent. Consequently, ( 14 ) becomes 𝐶 𝑔 𝑘 − 1 𝑔 𝑘 𝑔 𝑘 − 1 3  𝑟 1 , 𝑟 2 ; 𝑠 1 , 𝑠 2  =  𝑖 ∈ 𝑅  𝑗 ∈ 𝑅 𝛼 ( 𝑖 , 𝑗 ) 𝐶 𝑔 𝑘 − 1 𝑔 𝑘 − 1 𝑔 𝑘 − 1 3  𝑟 1 − 𝑖 , 𝑟 2 − 𝑗 ; 𝑠 1 , 𝑠 2  . ( 1 5 ) Taking the 4D Fourier transform of ( 15 ) and rearranging, the following pulse transfer function for the system is obtained: 𝐵 𝑔 𝑘 − 1 𝑔 𝑘 𝑔 𝑘 − 1 3  𝑢 1 , 𝑢 2 ; 𝑣 1 , 𝑣 2  = 𝐵 𝑔 𝑘 − 1 𝑔 𝑘 − 1 𝑔 𝑘 − 1 3  𝑢 1 , 𝑢 2 ; 𝑣 1 , 𝑣 2   𝑖 ∈ 𝑅  𝑗 ∈ 𝑅 𝛼 ( 𝑖 , 𝑗 ) 𝑒 − 𝑗 2 𝜋 ( 𝑢 1 𝑖 + 𝑢 2 𝑗 ) . ( 1 6 ) The third-order hologram, ℎ ( 𝑟 1 , 𝑟 2 ) , is then defined by ℎ  𝑟 1 , 𝑟 2  = ℱ − 4  𝐵 𝑔 𝑘 − 1 𝑔 𝑘 𝑔 𝑘 − 1 3  𝑢 1 , 𝑢 2 ; 𝑣 1 , 𝑣 2  𝐵 𝑔 𝑘 − 1 𝑔 𝑘 − 1 𝑔 𝑘 − 1 3  𝑢 1 , 𝑢 2 ; 𝑣 1 , 𝑣 2   =  𝑖 ∈ 𝑅  𝑗 ∈ 𝑅  𝑟 𝛼 ( 𝑖 , 𝑗 ) 𝛿 1 − 𝑖 , 𝑟 2  , − 𝑗 ( 1 7 ) where ℱ − 4 denotes the 4D inverse Fourier transform operation. Selecting various integers for 𝑟 1 and 𝑟 2 , we form an overdetermined system of equations. For example, if the chosen search region 𝑅 is a rectangular that varies from − 𝑃 𝑥 to 𝑃 𝑥 in the horizontal direction and − 𝑃 𝑦 to 𝑃 𝑦 in the vertical direction, and if 𝑟 1 ranges from − 𝑃 𝑥 to 𝑃 𝑥 , and 𝑟 2 ranges from − 𝑃 𝑦 to 𝑃 𝑦 , a set of linear equations can be produced as follows:  ̂ ℎ = 𝛼 𝛿 , ( 1 8 ) where   ℎ ℎ = 𝑙  − 𝑃 𝑥 , − 𝑃 𝑦  , ℎ 𝑙  − 𝑃 𝑥 + 1 , − 𝑃 𝑦  + 1 , … , ℎ 𝑙  𝑃 𝑥 , 𝑃 𝑦   𝑇 ,  𝛼 = [ 𝛼 − 𝑃 𝑥 , − 𝑃 𝑦  ) , 𝛼 − 𝑃 𝑥 + 1 , − 𝑃 𝑦   𝑃 + 1 , … , 𝛼 𝑥 , 𝑃 𝑦   𝑇 , ̂ ⎡ ⎢ ⎢ ⎢ ⎣  𝛿 = 𝛿 ( 0 , 0 ) ⋯ 𝛿 − 2 𝑃 𝑥 , − 2 𝑃 𝑦   𝛿 ( 1 , 1 ) ⋯ 𝛿 − 2 𝑃 𝑥 + 1 , − 2 𝑃 𝑦  + 1 ⋮ ⋮ 𝛿 ( 2 𝑃 𝑥 , 2 𝑃 𝑦 ⎤ ⎥ ⎥ ⎥ ⎦ . ) ⋯ 𝛿 ( 0 , 0 ) ( 1 9 ) The least-squares solution of ( 18 ) is given: 𝛼 𝑙 𝑠 =  ̂ 𝛿 𝑇 ̂ 𝛿  − 1 ̂ 𝛿 𝑇  ℎ . ( 2 0 ) The least-squares solution  𝛼 𝑙 𝑠 is obtained and its maximum  𝑑  𝛼 ( 𝑥 ,  𝑑 𝑦 ) is determined. The image motion estimate is then (  𝑑 𝑥 ,  𝑑 𝑦 ) . 
Although we assume that the signals are non-Gaussian, it can be shown that for binary, deterministic signals and large images sizes bispectrum is insensitive to Gaussian noise, and thus ( 20 ) is approximately true. Therefore,  ℎ  𝑟 1 , 𝑟 2    𝑑 =  𝛼 𝑥 ,  𝑑 𝑦  ̂ 𝛿  𝑟 1 −  𝑑 𝑥 , 𝑟 2 −  𝑑 𝑦  = ̂ 𝛿  𝑟 1 −  𝑑 𝑥 , 𝑟 2 −  𝑑 𝑦  . ( 2 1 ) 4. High-Accuracy Subpixel Motion Estimation Subpixel performance is a critical element of the proposed algorithm. With reference to our previously published work [ 16 , 17 ], we are introducing a number of important new features, which improve the accuracy of the motion estimates. The coordinates ( 𝑟 1 𝑚 , 𝑟 2 𝑚 ) of the maximum of the real-valued array  ℎ ( 𝑟 1 , 𝑟 2 ) can be used as an estimate of the horizontal and vertical components of motion between 𝑔 𝑘 ( 𝑥 , 𝑦 ) and 𝑔 𝑘 − 1 ( 𝑥 , 𝑦 ) as follows:  𝑟 1 𝑚 , 𝑟 2 𝑚    ℎ  𝑟 = a r g m a x R e 1 , 𝑟 2   , ( 2 2 ) where R e { ⋅ } denotes the real part of complex array  ℎ ( 𝑟 1 , 𝑟 2 ) . Subpixel accuracy of motion measurements is obtained by variable-separable fitting performed in the neighborhood of the maximum using one-dimensional quadratic function. Using the notation in ( 22 ), prototype functions are fitted to the triplets:   ℎ  𝑟 1 𝑚 − 1 , 𝑟 2 𝑚  ,  ℎ  𝑟 1 𝑚 , 𝑟 2 𝑚  ,  ℎ  𝑟 1 𝑚 + 1 , 𝑟 2 𝑚     ℎ  𝑟 , ( 2 3 ) 1 𝑚 , 𝑟 2 𝑚  ,  ℎ  𝑟 − 1 1 𝑚 , 𝑟 2 𝑚  ,  ℎ  𝑟 1 𝑚 , 𝑟 2 𝑚   + 1 , ( 2 4 ) that is, the maximum peak of the phase correlation surface and its two neighboring values on either side, vertically and horizontally. The location of the maximum of the fitted function provides the required subpixel motion estimate (   𝑑 𝑘   𝑑 𝑥 , 𝑘 𝑦 ) . Fitting a parabolic function horizontally to the data triplet ( 23 ) yields a closed-form solution for the horizontal component of the motion estimate   𝑑 𝑘 𝑥 as follows:   𝑑 𝑘  ℎ  𝑟 𝑥 = 1 𝑚 + 1 , 𝑟 2 𝑚  −  ℎ  𝑟 1 𝑚 − 1 , 𝑟 2 𝑚  2  𝐻 , ( 2 5 ) where   𝐻 = [ ℎ ( 𝑟 1 𝑚 + 1 , 𝑟 2 𝑚  ) − 2 ℎ ( 𝑟 1 𝑚 , 𝑟 2 𝑚  ) + ℎ ( 𝑟 1 𝑚 − 1 , 𝑟 2 𝑚 ) ] . The fractional part   𝑑 𝑘 𝑦 of the vertical component can be obtained in a similar way using ( 24 ) instead of ( 23 ). Finally the horizontal and vertical components of the subpixel accurate motion estimate are obtained by computing the location of the maxima of each of the above fitted quadratics. In [ 18 ], it is shown that half-pixel accuracy motion vectors lead to a very significant improvement when compared to one pixel accuracy, whereas a higher precision results in negligible changes. Therefore, a half-pixel accuracy was chosen in our simulations. 5. Computational Cost Comparison The majority of the computational cost of the proposed bispectrum is due to the fast Fourier transform (FFT). Therefore, the fundamental computation required for bispectral estimates is given by ( 7 ), the triple product of the three individual Fourier transformations, while this computation is straightforward, limitations on computer time and statistical variance impose severe limitations on implementation of the definition of the bispectrum [ 19 ]. On the other hand, we take advantage of the symmetrical properties of the bispectrum to reduce the computational complexity and memory requirements of calculating third-order statistics. It can now be calculated in any one sector and mapped onto the others [ 20 ]. 
The phase correlation is estimated by multiplying each coefficient 𝐺 𝑔 𝑘 ( 𝑢 1 , 𝑢 2 ) by its complex conjugate, but each component of the bispectrum is estimated by a triple product of Fourier coefficients as demonstrated in ( 7 ). Thus, the number of operations required to compute the bispectrum is significantly increased relative to the phase correlation. There are 𝑁 2 / 8 independent components of the bispectrum while there are only 𝑁 / 2 independent components of the phase correlation for an 𝑁 × 𝑁 image [ 21 ]. 6. Simulation Results Our experiments have aimed at evaluating the performance of the proposed approach and comparing it with that of the optical flow and phase correlation techniques. For the optical flow method we used the implementation obtained from Bruhn method [ 22 ]. In our simulation we used the database freely available on the web at http://vision.middlebury.edu/flow/ . We contribute three types of data to test different aspects of all techniques: real sequences of independent motion; realistic synthetic sequences; and high frame-rate video. These sequences have been chosen for their difficult motion and their different characteristics. Although the original sequences are in color, only the luminance component is used to estimate the motion vectors. Figure 2 shows the estimated motion vector fields for the Grove sequence using the three aforementioned motion estimation methods. Note that for a fair comparison we used optical flow technique and phase correlation algorithm with half-pixel accuracy. The motion vectors estimated between frames 6 and 7 are shown for the Grove sequence. For this particular sequence, our scheme provides the most consistent and reliable motion vector field. Both optical flow and phase correlation algorithms fail to detect the true motion vector. Similar results are shown in Figures 3 and 4 for the motion vectors estimated between frames 2 and 3, and between frames 5 and 6 in the Walking and Mequon sequences, respectively. Both optical flow and phase correlation algorithms produce abrupt motion vector fields. Although these abrupt motion vectors may lead to lower numerical mean squared errors (MSEs), they are incorrect motion vectors. Because of the noise resistant property of the parametric bispectrum method, it produces more reliable estimates. Therefore, our approach motion estimation results globally in motion fields more representative of the true motion in the scene. Figure 2: Motion vector fields of Grove sequence in the presence of noise using (a) our algorithm, (b) optical flow algorithm, (c) phase correlation algorithm. Figure 3: Motion vector fields of Walking sequence in the presence of noise using (a) our algorithm, (b) optical flow algorithm, (c) phase correlation algorithm. Figure 4: Motion vector fields of Mequon sequence in the presence of noise using (a) our algorithm, (b) optical flow algorithm, (c) phase correlation algorithm. To see more clearly the correctness of motion estimation, we use Beanbags sequence as an example. The motion compensated pictures using three methods are shown in Figure 5 . Portions of these three pictures are enlarged in Figure 6 to show the differences. We observe better compensated images by the proposed method. We also observe that the motion compensated images for our scheme are much closer to the original images. Thus, the scheme is able to measure the motion vector more accurately and is more robust in general. 
6. Simulation Results

Our experiments aim at evaluating the performance of the proposed approach and at comparing it with the optical flow and phase correlation techniques. For the optical flow method we used an implementation of the Bruhn et al. method [22]. In our simulations we used the database freely available on the web at http://vision.middlebury.edu/flow/. We use three types of data to test different aspects of the techniques: real sequences with independent motion, realistic synthetic sequences, and high frame-rate video. These sequences were chosen for their difficult motion and their differing characteristics. Although the original sequences are in color, only the luminance component is used to estimate the motion vectors.

Figure 2 shows the motion vector fields estimated between frames 6 and 7 of the Grove sequence using the three aforementioned motion estimation methods. Note that, for a fair comparison, the optical flow and phase correlation algorithms were also run with half-pixel accuracy. For this particular sequence, our scheme provides the most consistent and reliable motion vector field, while both the optical flow and phase correlation algorithms fail to detect the true motion vectors. Similar results are shown in Figures 3 and 4 for the motion vectors estimated between frames 2 and 3 of the Walking sequence and between frames 5 and 6 of the Mequon sequence, respectively. Both the optical flow and phase correlation algorithms produce abrupt motion vector fields. Although these abrupt motion vectors may lead to lower numerical mean squared errors (MSEs), they are incorrect motion vectors. Because of the noise-resistant property of the parametric bispectrum method, it produces more reliable estimates. Therefore, our approach globally yields motion fields that are more representative of the true motion in the scene.

Figure 2: Motion vector fields of the Grove sequence in the presence of noise using (a) our algorithm, (b) the optical flow algorithm, (c) the phase correlation algorithm.

Figure 3: Motion vector fields of the Walking sequence in the presence of noise using (a) our algorithm, (b) the optical flow algorithm, (c) the phase correlation algorithm.

Figure 4: Motion vector fields of the Mequon sequence in the presence of noise using (a) our algorithm, (b) the optical flow algorithm, (c) the phase correlation algorithm.

To show the correctness of the motion estimation more clearly, we use the Beanbags sequence as an example. The motion-compensated pictures obtained with the three methods are shown in Figure 5, and portions of these pictures are enlarged in Figure 6 to highlight the differences. The images compensated by the proposed method are visibly better and much closer to the original images. Thus, the scheme is able to measure the motion vectors more accurately and is more robust in general. Overall, the parametric bispectrum scheme typically produces images of better visual quality than the other methods.

Figure 5: Prediction for frame 5 of the Beanbags sequence in the presence of noise: (a) original image; prediction using (b) our algorithm, (c) the optical flow algorithm, (d) the phase correlation algorithm.

Figure 6: Enlarged portions of the motion-compensated pictures of the Beanbags sequence using (a) our algorithm, (b) the optical flow algorithm, (c) the phase correlation algorithm.

The detection of motion vectors relies on successive correlation operations applied to pairs of consecutive block-partitioned frames of a video sequence. The heights of the dominant peaks are monitored, and when a sudden magnitude change is detected, it is interpreted as a displacement vector. Figure 7 shows sample correlation surfaces between two blocks $b_{k-1}(x, y)$ and $b_k(x, y)$, taken from frames 3 and 4 of the Hydrangea sequence, respectively. Unlike the other methods, the bispectrum retains both amplitude and phase information from the Fourier transform of a signal, and the phase of the Fourier transform contains important shape information. Therefore, the bispectrum minimizes the influence of the noise and simplifies the identification of the dominant peak on the correlation surface.

Figure 7: Correlation surfaces between two blocks using (a) our algorithm, (b) the optical flow algorithm, (c) the phase correlation algorithm.

The PSNR of the motion-compensated frames is a popular performance measure for motion estimation, giving insight into the quality of the prediction. The PSNRs of the three motion estimation algorithms are shown in Figure 8. These results were obtained using two real video sequences, Tempete and Stefan, each run for 60 frames at a frame rate of 30 frames/sec. Both sequences are degraded with additive zero-mean Gaussian noise to a signal-to-noise ratio (SNR) of 10 dB. Here we define

\[ \mathrm{SNR} = 10 \log_{10} \frac{\sigma_f^2}{\sigma_n^2}, \tag{26} \]

where $\sigma_f^2$ is the variance of the frame and $\sigma_n^2$ is the variance of the noise. From Figure 8, it is clear that the implemented optical flow technique is significantly less efficient than the parametric bispectrum technique. This is mainly due to the difficulty the optical flow technique has in coping with large displacements and discontinuities in the motion field. On the other hand, the normalization (equalization) operation in the phase correlation technique amplifies the noise power at high frequencies, and it produces incorrect displacement estimates on noisy image sequences. On the whole, the bispectrum retains both amplitude and phase information from the Fourier transform of a signal, unlike the other techniques. This confirms that the proposed technique is a superior feature selector, utilizing the portions of the image spectrum most likely to contribute to reliable motion estimation.

Figure 8: PSNR obtained for noisy sequences (SNR = 10 dB).
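For concreteness, the following Python sketch shows one way to degrade a frame to a target SNR as defined in (26) and to compute the PSNR of a motion-compensated prediction. It is our own illustration rather than the evaluation code used in the paper, and it assumes 8-bit luminance frames (peak value 255).

```python
# Illustration only: noise injection at a target SNR, Eq. (26), and PSNR.
import numpy as np

def add_noise_at_snr(frame, snr_db, rng=None):
    """Add zero-mean Gaussian noise so that 10*log10(var(frame)/var(noise)) = snr_db."""
    rng = np.random.default_rng() if rng is None else rng
    sigma_f2 = np.var(frame)
    sigma_n2 = sigma_f2 / (10.0 ** (snr_db / 10.0))   # Eq. (26) solved for the noise variance
    return frame + rng.normal(0.0, np.sqrt(sigma_n2), size=frame.shape)

def psnr(original, predicted, peak=255.0):
    """PSNR in dB between an original frame and its motion-compensated prediction."""
    diff = original.astype(np.float64) - predicted.astype(np.float64)
    mse = np.mean(diff ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```

Averaging the per-frame PSNR values over a sequence gives the average PSNR reported later in Table 2.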
In terms of complexity, we measure the computation time. All computations were performed on an Intel Centrino Duo machine (Toshiba Satellite A100-579, T5500, 2 GHz, 2 CPUs) running Windows XP. The three algorithms were implemented as prototypes written in Matlab 6.5 R13. A comparison of the motion estimation computation time (MECT) of the three methods is given in Table 1.

Table 1: Comparison of the computation time of the three methods, using 60 frames of the Tempete sequence.

We perform the motion compensation procedure for each current frame $k$ with respect to reference frames $k - r$, where $r = 1, 2, 3, 4$. The average PSNR of the motion-compensated images is given in Table 2, with the Tempete sequence degraded by additive zero-mean Gaussian noise to an SNR of 10 dB.

Table 2: Average PSNR of motion-compensated images for the three motion estimation techniques (unit: dB) for the Tempete sequence.

The average PSNR, $\mathrm{PSNR}_{\mathrm{avg}}$, is given by

\[ \mathrm{PSNR}_{\mathrm{avg}} = \frac{1}{F} \sum_{i=1}^{F} \mathrm{PSNR}_i, \tag{27} \]

where $\mathrm{PSNR}_i$ is the measured PSNR for frame $i$ and $F$ is the total number of frames. In Table 2, we observe that $\mathrm{PSNR}_{\mathrm{avg}}$ decreases as the apparent disparity between the global motion of the background and the local motion of the foreground grows. For each value of $r$, $\mathrm{PSNR}_{\mathrm{avg}}$ is higher for the proposed scheme than for the other methods.

7. Conclusion

In this paper, a subpixel motion estimation algorithm using the bispectrum in the parametric domain was presented. The method was evaluated on a collection of datasets freely available on the web at http://vision.middlebury.edu/flow/. When the data are severely corrupted by additive Gaussian noise of unknown covariance, our method suppresses the effects of the noise and simplifies the identification of the dominant peak on the correlation surface, unlike the other techniques. At high noise levels (SNR around 10 dB) the optical flow and phase correlation techniques fail; even under these extreme conditions, the parametric bispectrum provides an improvement in performance over the other algorithms. Overall, our scheme produces smoother displacement vector fields with a more accurate measure of object motion across different SNR scenarios.

<h4>References</h4>

[1] N. Benmoussat, M. Faouzi Belbachir, and B. Benamar, "Motion estimation and compensation from noisy image sequences: a new filtering scheme," Image and Vision Computing, vol. 25, no. 5, pp. 686–694, 2007.
[2] J. C. Brailean, R. P. Kleihorst, S. Efstratiadis, A. K. Katsaggelos, and R. L. Lagendijk, "Noise reduction filters for dynamic image sequences: a review," Proceedings of the IEEE, vol. 83, no. 9, pp. 1272–1292, 1995.
[3] R. M. Armitano, R. W. Schafer, F. L. Kitson, and V. Bhaskaran, "Robust block-matching motion-estimation technique for noisy sources," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '97), vol. 4, pp. 2685–2688, Munich, Germany, April 1997.
[4] S. Bhattacharya, N. C. Ray, and S. Sinha, "2-D signal modelling and reconstruction using third-order cumulants," Signal Processing, vol. 62, no. 1, pp. 61–72, 1997.
[5] E. M. Ismaili Aalaoui and E. Ibn-Elhaj, "Estimation of subpixel motion using bispectrum," Research Letters in Signal Processing, vol. 2008, Article ID 417915, 5 pages, 2008.
[6] J. M. M. Anderson and G. B. Giannakis, "Image motion estimation algorithms using cumulants," IEEE Transactions on Image Processing, vol. 4, no. 3, pp. 346–357, 1995.
[7] R. P. Kleihorst, R. L. Lagendijk, and J. Biemond, "Noise reduction of severely corrupted image sequences," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '93), vol. 5, pp. 293–296, Minneapolis, Minn, USA, April 1993.
[8] E. Ibn-Elhaj, D. Aboutajdine, S. Pateux, and L. Morin, "HOS-based method of global motion estimation for noisy image sequences," Electronics Letters, vol. 35, no. 16, pp. 1320–1322, 1999.
[9] E. Sayrol, A. Gasull, and J. R. Fonollosa, "Motion estimation using higher order statistics," IEEE Transactions on Image Processing, vol. 5, no. 6, pp. 1077–1084, 1996.
[10] A. N. Netravali and J. D. Robbins, "Motion-compensated television coding—part I," Bell System Technical Journal, vol. 58, no. 3, pp. 629–668, 1979.
[11] V. Murino, C. Ottonello, and S. Pagnan, "Noisy texture classification: a higher-order statistics approach," Pattern Recognition, vol. 31, no. 4, pp. 383–393, 1998.
[12] B. M. Sadler and G. B. Giannakis, "Shift- and rotation-invariant object reconstruction using the bispectrum," Journal of the Optical Society of America A, vol. 9, no. 1, pp. 57–69, 1992.
[13] M. R. Raghuveer and C. L. Nikias, "Bispectrum estimation: a parametric approach," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 33, no. 5, pp. 1213–1230, 1985.
[14] G. B. Giannakis, "On the identifiability of non-Gaussian ARMA models using cumulants," IEEE Transactions on Automatic Control, vol. 35, no. 1, pp. 18–26, 1990.
[15] J. M. Mendel, "Tutorial on higher-order statistics (spectra) in signal processing and system theory: theoretical results and some applications," Proceedings of the IEEE, vol. 79, no. 3, pp. 278–305, 1991.
[16] E. M. Ismaili Aalaoui and E. Ibn-Elhaj, "Estimation of motion fields from noisy image sequences: using generalized cross-correlation methods," in Proceedings of the IEEE International Conference on Signal Processing and Communications (ICSPC '07), Dubai, UAE, November 2007.
[17] E. M. Ismaili Aalaoui and E. Ibn-Elhaj, "Estimation of displacement vector field from noisy data using maximum likelihood estimator," in Proceedings of the 14th IEEE International Conference on Electronics, Circuits, and Systems (ICECS '07), pp. 1380–1383, Marrakech, Morocco, December 2007.
[18] G. Madec, "Half pixel accuracy in block matching," in Proceedings of the Picture Coding Symposium (PCS '90), Cambridge, Mass, USA, March 1990.
[19] K. S. Lii and K. N. Helland, "Cross-bispectrum computation and variance estimation," ACM Transactions on Mathematical Software, vol. 7, no. 3, pp. 284–294, 1981.
[20] J.-M. Le Caillec and R. Garello, "Comparison of statistical indices using third order statistics for nonlinearity detection," Signal Processing, vol. 84, no. 3, pp. 499–525, 2004.
[21] R. W. Means, B. Wallach, and D. Busby, "Bispectrum signal processing on HNC's SIMD numerical array processor (SNAP)," in Proceedings of the ACM/IEEE Conference on Supercomputing (SC '93), pp. 535–537, Portland, Ore, USA, November 1993.
[22] A. Bruhn, J. Weickert, and C. Schnörr, "Lucas/Kanade meets Horn/Schunck: combining local and global optic flow methods," International Journal of Computer Vision, vol. 61, no. 3, pp. 211–231, 2005.
