In some embodiments, a method of processing a video sequence may include receiving an input video sequence having an input video sequence resolution, aligning images from the input video sequence, reducing noise in the aligned images, and producing an output video sequence from the reduced noise images, wherein the output video sequence has the same resolution as the input video sequence resolution. Other embodiments are disclosed and claimed.
1. A method of processing a video sequence, comprising:
receiving an input video sequence having an input video sequence resolution;
aligning images from the input video sequence;
reducing noise in the aligned images including performing iterative forward and backward remapping of the aligned images corresponding to a conjugate gradient minimization; and
producing an output video sequence from the reduced noise images, wherein the output video sequence has a same resolution as the input video sequence resolution,
wherein a scaling parameter estimated from the input video sequence utilizing an estimate of image noise based on median of absolute differences of vertical and horizontal pixel differences is used in reducing the noise.
7. A non-transitory computer readable media including program instructions which when executed by a processor cause the processor to:
receive an input video sequence having an input video sequence resolution;
align images from the input video sequence;
reduce noise in the input video sequence based on the aligned images including a performance of iterative forward and backward remapping of the aligned images corresponding to a conjugate gradient minimization; and
produce an output video sequence based on the reduced noise input video sequence, wherein the output video sequence has no greater resolution than the input video sequence resolution,
wherein a scaling parameter estimated from the input video sequence utilizing an estimate of image noise based on median of absolute differences of vertical and horizontal pixel differences is used in reducing the noise.
13. A processor-based electronic system, comprising:
a processor;
a memory coupled to the processor, the memory having instructions that, when executed by the processor,
receive an input video sequence having an input video sequence resolution;
align images from the input video sequence;
reduce noise in the input video sequence based on the aligned images including a performance of iterative forward and backward remapping of the aligned images corresponding to a conjugate gradient minimization; and
produce an output video sequence based on the reduced noise input video sequence, wherein the output video sequence has no greater resolution than the input video sequence resolution,
wherein a scaling parameter estimated from the input video sequence utilizing an estimate of image noise based on median of absolute differences of vertical and horizontal pixel differences is used in reducing the noise.
2. The method of
estimating an optical flow of the input video sequence prior to aligning images from the input video sequence.
3. The method of
4. The method of
performing iterative line minimization.
5. The method of
performing image warping of each of the aligned images to a reference frame, wherein intensity values of the aligned images are combined to obtain a de-noised version of the reference frame.
6. The method of
performing bilateral filtering in time and space to the aligned, warped images to combine the intensity values.
8. The non-transitory medium of
9. The non-transitory medium of
10. The non-transitory medium of
11. The non-transitory medium of
12. The non-transitory medium of
14. The processor-based electronic system of
15. The processor-based electronic system of
16. The processor-based electronic system of
17. The processor-based electronic system of
18. The processor-based electronic system of
19. The method of
20. The method of
The invention relates to video processing. More particularly, some embodiments of the invention relate to noise reduction in video.
U.S. Pat. No. 7,447,382 describes computing a higher resolution image from multiple lower resolution images using model-based, robust Bayesian estimation. A result higher resolution (HR) image of a scene, given multiple observed lower resolution (LR) images of the scene, is computed using a Bayesian estimation image reconstruction methodology. The methodology yields the result HR image based on a likelihood probability function that implements a model for the formation of LR images in the presence of noise. This noise is modeled by a probabilistic, non-Gaussian, robust function.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
Various features of the invention will be apparent from the following description of preferred embodiments as illustrated in the accompanying drawings, in which like reference numerals generally refer to the same parts throughout the drawings. The drawings are not necessarily to scale, the emphasis instead being placed upon illustrating the principles of the invention.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular structures, architectures, interfaces, techniques, etc. in order to provide a thorough understanding of the various aspects of the invention. However, it will be apparent to those skilled in the art having the benefit of the present disclosure that the various aspects of the invention may be practiced in other examples that depart from these specific details. In certain instances, descriptions of well known devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
With reference to
With reference to
With reference to
With reference to
With reference to
With reference to
With reference to
With reference to
For example, the memory 82 may have further instructions that when executed by the processor perform iterative forward and backward remapping of the aligned images corresponding to a conjugate gradient minimization. The memory 82 may have further instructions that when executed by the processor perform iterative line minimization. Alternatively, the memory 82 may have further instructions that when executed by the processor perform image warping of the aligned images and perform bilateral filtering in time and space to the aligned, warped images.
For example, the processor 81 and memory 82 may be disposed in a housing 83. For example, the housing may correspond to any of a desktop computer, a laptop computer, a set-top box, or a hand-held device, among numerous other possibilities for the processor-based system 80. For example, the system 80 may further include a display device 84 coupled to the processor 81 and memory 82. The output video sequence may be displayed on the display device 84. For example, the system 80 may further include an input device 85 to provide user input to the processor 81 and memory 82. For example, the input device 85 may be a wireless remote control device.
Advantageously, some embodiments of the invention may reduce noise in video and/or image sequences. Without being limited to theory of operation, some embodiments of the invention may provide optical flow-based robust multi-frame noise reduction in video. For example, some embodiments of the invention may be applied to videos and/or image sequences from a variety of different sources, qualities, and acquisition methods, including consumer-grade videos acquired with inexpensive devices (like the ubiquitous cell-phone cameras, webcams, etc.), professional-grade videos acquired in non-optimal conditions (low illumination), and medical image sequences from various sources (X-ray, ultrasound, etc.), among numerous other sources of video sequences.
Without being limited to theory of operation, some embodiments of the invention may exploit the high degree of redundancy usually present in videos and image sequences. For example, in the case of a noisy video sequence of a static scene captured with a static camera, the many samples of the noisy intensity available at each pixel across frames can be used to obtain a statistical estimate of the underlying constant intensity of that pixel (e.g. by taking the average of all the samples available).
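As a purely illustrative sketch (not from the patent text), the static-scene case reduces to a per-pixel temporal average or median over a stack of co-located frames:

```python
import numpy as np

def denoise_static(frames: np.ndarray, robust: bool = False) -> np.ndarray:
    """frames: array of shape (T, H, W) holding T noisy observations of a
    static scene. The per-pixel mean estimates the underlying constant
    intensity; the median is a more robust alternative."""
    return np.median(frames, axis=0) if robust else frames.mean(axis=0)
```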
In many video sequences, however, the scenes are usually not static due to camera or object motion. Advantageously, some embodiments of the invention may estimate the optical flow of the video sequence. For example, the optical flow may correspond to the apparent motion of the intensity levels from one image to the next in the sequence due to the motion of the camera or the objects in the scene. For example, the optical flow may be estimated from a window of frames around a reference frame (e.g. a nominal central frame), taking for example two frames forward and two frames backward to use a total of five frames, and aligning each frame to the reference frame.
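A rough sketch of this windowed alignment step is shown below, assuming grayscale frames and using OpenCV's Farneback flow purely as a stand-in for the estimator actually described; the helper name `estimate_window_flows` is hypothetical.

```python
import cv2

def estimate_window_flows(frames, ref_idx, radius=2):
    """For each frame in a +/-radius window around the reference, estimate a
    dense flow from the reference frame to that frame (so the flow can later
    be used to warp the frame onto the reference grid)."""
    ref = frames[ref_idx]  # single-channel (grayscale) images assumed
    flows = {}
    for k in range(max(0, ref_idx - radius), min(len(frames), ref_idx + radius + 1)):
        if k == ref_idx:
            continue
        # Positional args: pyr_scale, levels, winsize, iterations, poly_n, poly_sigma, flags
        flows[k] = cv2.calcOpticalFlowFarneback(ref, frames[k], None,
                                                0.5, 4, 15, 3, 5, 1.1, 0)
    return flows
```

Each returned flow maps reference-frame pixels to their positions in frame k, which is the form needed for the backward warping discussed later.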
For example, the image alignment can be estimated using different motion models, ranging from low parameter motion models (e.g., pure translation, rotation, affine, projective, etc.) that have limited applicability in general video, to dense non-parametric models (e.g., optical flow) in which a displacement vector is estimated for every pixel, with intermediate solutions being also possible (one low parameter motion model plus individual optical flow for points considered outliers for this model). For example, U.S. Patent Publication No. 2008-0112630, entitled DIGITAL VIDEO STABILIZATION BASED ON ROBUST DOMINANT MOTION ESTIMATION, describes that an apparatus may receive an input image sequence and estimate dominant motion between neighboring images in the image sequence. The apparatus may use a robust estimator to automatically detect and discount outliers corresponding to independently moving objects.
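For the low-parameter end of that range, a hedged sketch using ORB feature matching and a RANSAC-fitted homography (a stand-in, not the estimator of the cited publication) might look like the following:

```python
import cv2
import numpy as np

def dominant_motion_homography(ref_gray, tgt_gray):
    """Estimate a single projective (homography) dominant motion between two
    frames, letting RANSAC discount outliers such as independently moving
    objects. Illustrative sketch only."""
    orb = cv2.ORB_create(1000)
    kp1, des1 = orb.detectAndCompute(ref_gray, None)
    kp2, des2 = orb.detectAndCompute(tgt_gray, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    pts_ref = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts_tgt = np.float32([kp2[m.trainIdx].pt for m in matches])
    # H maps target-frame coordinates onto the reference frame; mask flags inliers.
    H, mask = cv2.findHomography(pts_tgt, pts_ref, cv2.RANSAC, 3.0)
    return H
```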
Some embodiments of the invention may utilize a gradient based, multi-resolution, optical flow estimation with theoretical sub-pixel accuracy. However, other methods for dense motion estimation are potentially applicable. For example, motion estimation techniques applied to frame rate conversion may be applicable.
With reference to
In accordance with some embodiments of the invention, robust frame de-noising may be applied to the aligned images in the video sequence. For example, once the alignment has been estimated, the noisy intensities coming from different frames that correspond to the same scene location may be combined to obtain a de-noised version of the central, reference frame. In accordance with some embodiments of the invention, a range of robustness may be applied to the frame de-noising. For example, in some embodiments of the invention the robust frame de-noising may involve relatively high computational cost and provide very effective de-noising (e.g. a Bayesian method). For example, in some embodiments of the invention the robust frame de-noising may involve a relatively lower computational cost but provide less effective de-noising (e.g. based on pre-warping).
One important parameter in robust methods (like the ones proposed here) is what is known as a scale parameter, which determines what magnitudes of the error are considered noise and what magnitudes are considered outliers. This parameter is automatically estimated from the input images using a robust estimate of the image noise (based on the MAD, the median of absolute differences, of the vertical and horizontal pixel differences). This allows the proposed methods to work in a completely automatic mode, without user interaction, independent of the noise level of the input image sequence.
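A minimal sketch of such a scale estimate is given below; the Gaussian-noise normalization constants (1.4826 and sqrt(2)) are a common assumption and are not stated in the text.

```python
import numpy as np

def estimate_scale(img):
    """Robust scale estimate from the MAD (median of absolute differences)
    of the horizontal and vertical pixel differences."""
    img = np.asarray(img, dtype=np.float64)
    dh = np.abs(np.diff(img, axis=1)).ravel()   # horizontal differences
    dv = np.abs(np.diff(img, axis=0)).ravel()   # vertical differences
    mad = np.median(np.concatenate([dh, dv]))
    # A difference of two i.i.d. noisy pixels has sqrt(2) times the noise std;
    # 1.4826 converts a median absolute value to a Gaussian-equivalent std.
    return 1.4826 * mad / np.sqrt(2.0)
```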
With reference to
Advantageously, as compared to the super-resolution image processing described in the above-mentioned U.S. Pat. No. 7,447,382, some embodiments of the invention may involve a relatively lower computational cost while still providing good de-noising performance. For example, because some embodiments of the present invention do not increase the resolution of the image, the image formation model may omit down/up sampling. Some embodiments of the invention may be iterative and apply the direct and inverse image formation models (basically backward and forward remapping) multiple times, as well as perform a line minimization at each iteration, resulting in a high computational cost. Advantageously, the results of this iterative model-based method are usually of very high quality, with high noise reduction and virtually no additional blur introduced in the resulting images. In some embodiments of the invention additional de-blurring capabilities may be utilized.
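The following sketch illustrates the flavor of this iterative scheme under several simplifying assumptions that are not from the patent: nearest-neighbor remapping (so the backward remap is an exact scatter-add adjoint of the forward remap), a Cauchy robust penalty, per-frame flows estimated from each observed frame to the reference, and SciPy's conjugate-gradient driver standing in for the custom line minimization.

```python
import numpy as np
from scipy.optimize import minimize

def make_nn_maps(flow_to_ref):
    """flow_to_ref: (H, W, 2) flow defined on an observed frame's grid, giving
    each pixel's (dx, dy) displacement to its location in the reference frame.
    Rounded to nearest-neighbor indices so the adjoint below is exact."""
    H, W = flow_to_ref.shape[:2]
    ys, xs = np.mgrid[0:H, 0:W]
    xr = np.clip(np.rint(xs + flow_to_ref[..., 0]), 0, W - 1).astype(int)
    yr = np.clip(np.rint(ys + flow_to_ref[..., 1]), 0, H - 1).astype(int)
    return yr, xr

def forward_remap(x, maps):
    """Predict an observed frame from the current estimate x (a gather)."""
    yr, xr = maps
    return x[yr, xr]

def backward_remap(residual, maps, shape):
    """Adjoint of forward_remap: scatter-add residuals onto the reference grid."""
    out = np.zeros(shape)
    np.add.at(out, maps, residual)
    return out

def denoise_cg(frames, flows, ref_idx, scale, iters=30):
    """frames: the window of frames including the reference; flows: flow to
    the reference for every non-reference frame index. Minimizes a
    Cauchy-robust data term with conjugate gradients; each gradient
    evaluation performs one forward and one backward remap per frame."""
    shape = frames[ref_idx].shape
    maps = {k: make_nn_maps(f) for k, f in flows.items()}

    def objective_and_grad(xflat):
        x = xflat.reshape(shape)
        f, g = 0.0, np.zeros(shape)
        for k, y in enumerate(frames):
            m = maps.get(k)
            e = (forward_remap(x, m) if m is not None else x) - y
            f += 0.5 * scale ** 2 * np.log1p((e / scale) ** 2).sum()
            psi = e / (1.0 + (e / scale) ** 2)      # Cauchy influence function
            g += backward_remap(psi, m, shape) if m is not None else psi
        return f, g.ravel()

    x0 = frames[ref_idx].astype(np.float64).ravel()
    res = minimize(objective_and_grad, x0, jac=True, method='CG',
                   options={'maxiter': iters})
    return res.x.reshape(shape)
```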
In accordance with some embodiments of the invention, a lower computational cost method may be based on pre-warping the images and then combining the results. For example, once the alignment has been estimated, it is possible to combine the noisy intensities coming from different frames that correspond to the same scene location in a statistical way to obtain a de-noised version of the central frame. For example, warping may be performed on every image in the window of images around the central frame using the estimated optical flow. The warping may be done using backwards interpolation (e.g. using bi-linear interpolation).
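A hedged sketch of this backward (bilinear) warping step, using cv2.remap as the interpolation engine and assuming the reference-to-frame-k flow convention from the earlier alignment sketch:

```python
import cv2
import numpy as np

def warp_to_reference(frame_k, flow_ref_to_k):
    """Backward-warp frame k onto the reference grid: for every reference
    pixel, sample frame k at the position given by the reference-to-frame-k
    flow, using bilinear interpolation."""
    H, W = frame_k.shape[:2]
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float32)
    map_x = xs + flow_ref_to_k[..., 0].astype(np.float32)
    map_y = ys + flow_ref_to_k[..., 1].astype(np.float32)
    return cv2.remap(frame_k, map_x, map_y, cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_REPLICATE)
```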
After the surrounding images have been aligned to the central frame and warped, the result is a collection of frames whose pixels are aligned, with each frame providing at least one noisy sample of the underlying noise-free intensity. For example, at every pixel position there are a number of noisy intensity samples ranging from 1 (the one corresponding to the central frame) up to the number of images, depending on the visibility of that pixel in the different images. In accordance with some embodiments of the invention, all of the available samples may then be combined using an approach similar to bilateral filtering.
For example, a weighted average for each pixel position may be determined, where the weight of each sample is a function of the difference in intensities between the sample and the central frame (e.g. using a Gaussian function). Advantageously, some embodiments of the invention may favor intensities that are close to the intensity value of the central frame, which may avoid combining wrong intensities coming from incorrect alignments. Advantageously, the computational cost of this method is relatively low, as it does not require iterations and the processing is done locally on a pixel-by-pixel basis.
Advantageously, some embodiments of the invention may provide a multi-frame de-noising method for video that is relatively simple (e.g. as compared to the Bayesian method), does not increase the resolution, and has a relatively low computational cost. Once the frames to be combined are aligned, some embodiments of the invention may involve very little additional bandwidth and few additional operations. Advantageously, some embodiments of the invention may utilize the motion already estimated by other modules (e.g. one used for frame rate conversion) to perform the image alignment, resulting in further computational savings (and potentially silicon area or board area savings for hardware implementations).
Some embodiments of the invention may include robust de-noising using bilateral filtering. Once the images are aligned, the intensity values may be combined on a pixel by pixel basis to estimate a value of the noise-free intensity underlying the noisy measurements using an appropriate statistical method. For example, a straightforward method for combining the measurements may be to take a simple statistical estimator, such as the average or the median. However, these estimators may not be sufficiently robust to errors in the alignment. Gross errors in the alignment may occur, even using advanced motion estimation techniques. Advantageously, some embodiments of the invention may combine the noisy measurements based on bilateral filtering, which may improve the results even when there are alignment errors.
Bilateral filtering may be useful for spatial de-noising based on obtaining at each pixel a weighted average of the intensities of the pixels in a surrounding window (e.g. similar to a linear filtering), where the weight at a given pixel is a function of the difference between the intensity at that pixel and the intensity at the central pixel (e.g., a Gaussian, which will give more weight to intensities close to that of the central pixel).
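Purely spatial bilateral filtering of this kind is available off the shelf; as an illustration, with parameter values chosen arbitrarily rather than taken from the patent:

```python
import cv2

def spatial_bilateral(frame, d=7, sigma_color=25, sigma_space=5):
    """frame: 8-bit or float32 image. d is the neighbourhood diameter,
    sigma_color the intensity (range) scale, sigma_space the spatial scale;
    the default values here are illustrative only."""
    return cv2.bilateralFilter(frame, d, sigma_color, sigma_space)
```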
Some embodiments of the invention may combine the intensities coming from different frames, which results in the following solution for a given image (the individual pixel locations are omitted from the equation for clarity):
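A plausible reconstruction of Eq. 1, assuming the standard weighted-average form of a temporal bilateral filter with Gaussian range weights of scale σ:

$$\hat{I} \;=\; \frac{\sum_{k=1}^{N} w_k\, I_k}{\sum_{k=1}^{N} w_k},
\qquad
w_k \;=\; \exp\!\left(-\frac{\bigl(I_k - I_c\bigr)^2}{2\sigma^2}\right)
\qquad \text{(Eq. 1)}$$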
where N is the number of frames being combined, c is the index corresponding to the central (reference) frame, and where the images Ik are already warped according to the estimated alignment to the central frame.
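A short sketch implementing this per-pixel combination over a stack of pre-warped frames (the function name and the use of the MAD-based scale for σ are assumptions):

```python
import numpy as np

def combine_bilateral(warped, ref_idx, sigma):
    """warped: list of frames already warped onto the reference grid.
    Per-pixel weighted average with Gaussian weights on the intensity
    difference to the central frame, as in the reconstructed Eq. 1."""
    stack = np.stack([np.asarray(f, dtype=np.float64) for f in warped])  # (N, H, W)
    ref = stack[ref_idx]
    w = np.exp(-((stack - ref) ** 2) / (2.0 * sigma ** 2))               # (N, H, W)
    return (w * stack).sum(axis=0) / w.sum(axis=0)
```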
Some embodiments of the invention may combine both temporal and spatial information, such that to obtain the value of one given pixel, information may be combined from the same location in multiple pre-aligned frames, together with neighboring pixels in a window surrounding the given pixel in the central frame.
Without being limited to theory of operation, the bilateral filter can be interpreted as one iteration of an iteratively re-weighted least squares (RWLS) estimator, with the solution initialized with the intensity of the central pixel. Advantageously, some embodiments of the invention may be non-iterative, robust to outliers, and produce a de-noised pixel value that is close to the original value of the central pixel. This may be particularly advantageous because a gross alignment error may otherwise bias the estimate of the intensity to values very far from the original value (even using robust estimators, once the breakdown point of the robust estimator is reached), creating artifacts.
In iterative RWLS this procedure may be iterated until convergence is achieved. In accordance with some embodiments of the invention, bilateral filtering in Eq. 1 provides a result which is a reasonable estimate of the RWLS method. For example, one particular noisy measurement may be defined as the central value, used as an initial estimate of g, and then a result may be computed with just one iteration. Therefore, in some embodiments of the invention bilateral filtering can be interpreted as running one iteration of an iterative RWLS with a particular starting point.
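A toy per-pixel sketch of that relationship, using Gaussian (Welsch) weights and not taken from the patent: with iters=1 and the central-frame intensity as the starting point it reproduces the bilateral combination above, while larger iteration counts give the fully iterated RWLS estimate.

```python
import numpy as np

def rwls_pixel(samples, center, sigma, iters=10):
    """samples: the noisy intensities available at one pixel (including the
    central-frame value 'center'); Gaussian (Welsch) weights of scale sigma."""
    samples = np.asarray(samples, dtype=np.float64)
    g = float(center)
    for _ in range(iters):
        w = np.exp(-((samples - g) ** 2) / (2.0 * sigma ** 2))
        g = float((w * samples).sum() / w.sum())
    return g
```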
In standard bilateral filtering the most common choice for the weighting function is a Gaussian, which corresponds to an underlying robust function known as the Welsch function in the robust statistics literature. Other choices of robust functions may be suitable alternatives for the bilateral filter. For example, the Cauchy function may be appropriate for video de-noising, as it may provide high quality results in super-resolution applications. The Huber function may also provide good results. In contrast, the Tukey function, despite having a shape similar to the Welsch function, may be usable but not as good.
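For concreteness, one common parameterization of the corresponding RWLS weight functions is sketched below; the exact cut-off constants vary across references and are assumptions here, not values from the patent.

```python
import numpy as np

def welsch_weight(e, s):   # Gaussian weighting used in standard bilateral filtering
    return np.exp(-0.5 * (e / s) ** 2)

def cauchy_weight(e, s):   # heavier-tailed; reported to work well in super-resolution
    return 1.0 / (1.0 + (e / s) ** 2)

def huber_weight(e, s):    # quadratic near zero, linear in the tails
    a = np.abs(e)
    return np.where(a <= s, 1.0, s / np.maximum(a, 1e-12))

def tukey_weight(e, s):    # redescending biweight; zero beyond the cut-off
    r = np.minimum(np.abs(e) / s, 1.0)
    return (1.0 - r ** 2) ** 2
```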
With reference to
The results of
Advantageously, some embodiments of the invention combine motion estimation with robust statistical methods (like bilateral filtering or Bayesian reconstruction with robust functions) to achieve high quality de-noising that is substantially artifact free and does not introduce substantial spatial blur. Furthermore, the method may be made fully automatic and can be applied to sequences with different levels of noise, from not noisy to very noisy, without introducing noticeable artifacts or a decrease in visual quality in the noise-free sequences.
For example, some embodiments of the invention may be useful in video processing pipelines to be incorporated in displays and/or set-top boxes, which may benefit from a de-noising module to clean potentially noisy content. For example, some embodiments of the invention may be implemented as fixed function hardware. For example, some embodiments of the invention may be implemented as a software solution running on special purpose or general purpose processors. Although described in connection with modules, those skilled in the art will appreciate that such modules are not necessarily discrete parts with fixed components. Rather, such modules may be implemented by various combinations of hardware and/or software distributed throughout the architecture of a processor-based system.
Those skilled in the art will appreciate that the flow diagram of
The foregoing and other aspects of the invention are achieved individually and in combination. The invention should not be construed as requiring two or more of such aspects unless expressly required by a particular claim. Moreover, while the invention has been described in connection with what is presently considered to be the preferred examples, it is to be understood that the invention is not limited to the disclosed examples, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and the scope of the invention.
Kurupati, Sreenath, Nestares, Oscar, Haussecker, Horst W., Gat, Yoram, Ettinger, Scott M.