Fusion of range and color images for denoising and resolution enhancement with
a non-local filter
Benjamin Huhle ⇑, Timo Schairer, Philipp Jenke, Wolfgang Straßer
University of Tübingen, WSI/GRIS, Sand 14, 72076 Tübingen, Germany
article info
Article history:
Received 5 January 2009
Accepted 7 November 2009
Available online 19 August 2010
Keywords:
Denoising
Outlier removal
Super-resolution
Time-of-flight
NL-Means
abstract
We present an integrated method for post-processing of range data which removes outliers, smooths the
depth values and enhances the lateral resolution in order to achieve visually pleasing 3D models from
low-cost depth sensors with additional (registered) color images. The algorithm is based on the non-local
principle and adapts the original NL-Means formulation to the characteristics of typical depth data.
By explicitly handling outliers in the sensor data, our denoising approach achieves unbiased reconstructions
from error-prone input data. Taking intra-patch similarity into account, we reconstruct strong
discontinuities without disturbing artifacts and preserve fine detail structures, obtaining piece-wise smooth
depth maps. Furthermore, we exploit the dependencies between the depth data and the additionally available
color information to increase the lateral resolution of the depth maps. Finally, we discuss how to
parallelize the algorithm in order to achieve processing times that are adequate for post-processing of data
from fast depth sensors such as time-of-flight cameras.
© 2010 Elsevier Inc. All rights reserved.
1. Introduction
A wide range of applications, such as static or dynamic model
acquisition for 3DTV, other virtual reality and entertainment
applications, or scene acquisition as a basic technology in general, rely
on 3D data captured with different devices. In recent years,
cameras that measure real-world distances based on the time-of-flight
principle at real-time frame rates have become increasingly
popular. These cameras, which use an active modulated light source (e.g.,
from PMDTec, Canesta or Mesa Imaging), are prospective low-cost
sensors that could be deployed in standard applications, i.e.,
mass markets. However, data from these sensors are afflicted with
noise and contain frequent outliers due to difficult illumination
situations and limitations of the sensor. A further drawback of these
cameras is their low resolution compared, for example, to digital
color cameras. Therefore, appropriate post-processing and fusion
with other sensors of higher resolution are essential in order to
obtain photo-realistic models. The algorithm presented in this article
does not depend on the type of sensor. However, it is to be
expected that upcoming generations of low-cost depth
sensors will also suffer from inaccuracies, outliers and limited
resolution of the output data.
There are several reasons to process the data directly in sensor
space instead of considering 3D point clouds reconstructed from
the raw data. For most sensors, we have to deal with (additive)
noise in the viewing direction, which is implicitly modeled if we
consider the data as a depth map. Furthermore, the filtering is
performed earlier in the pipeline, and subsequent steps such as the
registration of several frames can rely on the refined data. In
contrast, common surface reconstruction techniques in 3D typically
require multiple merged frames; otherwise, occlusions
would impede sound solutions. In practice, we believe that a
two-stage data enhancement is necessary, comprising an early
data filtering in sensor space and a subsequent surface
reconstruction computed on a registered model. In many applications, color
images are recorded in addition to depth data, but the color data is
often used for texturing only. We propose to use color information as
a supplementary cue for smoothing. Since these images are
typically of higher resolution, a lateral resolution enhancement
exploiting the dependencies of depth and color data can be
performed at the same time.
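To illustrate why noise in the viewing direction is naturally captured by a depth map, the following minimal sketch back-projects a single pixel into 3D and perturbs its measured distance. It is not part of the presented method. It assumes an ideal pinhole camera and a sensor that reports the radial distance along the viewing ray; the intrinsics (FX, FY, CX, CY), the pixel position and the noise magnitude are hypothetical values chosen for illustration only.

```python
import math


def pixel_ray(u, v, fx, fy, cx, cy):
    """Unit viewing ray through pixel (u, v) of an ideal pinhole camera."""
    x = (u - cx) / fx
    y = (v - cy) / fy
    norm = math.sqrt(x * x + y * y + 1.0)
    return (x / norm, y / norm, 1.0 / norm)


def back_project(u, v, distance, fx, fy, cx, cy):
    """3D point at the given radial distance along the pixel's viewing ray."""
    rx, ry, rz = pixel_ray(u, v, fx, fy, cx, cy)
    return (distance * rx, distance * ry, distance * rz)


# Hypothetical intrinsics (assumption, not taken from any real sensor).
FX, FY = 200.0, 200.0
CX, CY = 88.0, 72.0

# One pixel, its true distance, and an additive measurement error (assumed).
U, V = 120, 40
TRUE_DISTANCE = 2.0
NOISE = 0.05

ray = pixel_ray(U, V, FX, FY, CX, CY)
clean = back_project(U, V, TRUE_DISTANCE, FX, FY, CX, CY)
noisy = back_project(U, V, TRUE_DISTANCE + NOISE, FX, FY, CX, CY)

# Displacement of the 3D point caused by the depth error.
offset = tuple(n - c for n, c in zip(noisy, clean))

# The cross product of the displacement with the viewing ray vanishes,
# i.e. the 3D point moves along the ray only.
cross = (
    offset[1] * ray[2] - offset[2] * ray[1],
    offset[2] * ray[0] - offset[0] * ray[2],
    offset[0] * ray[1] - offset[1] * ray[0],
)
offset_length = math.sqrt(sum(o * o for o in offset))

print("displacement parallel to ray:", all(abs(c) < 1e-9 for c in cross))
print("displacement length equals noise:", abs(offset_length - NOISE) < 1e-9)
```

Under these assumptions, an additive error in the measured value changes a single scalar per pixel in the depth map, whereas in a reconstructed point cloud the same error appears as a displacement along a direction that differs from pixel to pixel.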
Our approach to post-processing and fusion of depth and color
data is a denoising algorithm that consists of two stages. In the first
stage, outliers are detected and removed. In the second stage, a
unified smoothing and upsampling technique yields unbiased and
feature-preserving reconstructions. Due to the computational
complexity of this algorithm and since programmable graphics
hardware (GPUs) and multi-core CPUs are widely available, it is
useful to look for a scalable and efficient parallel formulation. We
1077-3142/$ - see front matter © 2010 Elsevier Inc. All rights reserved.
doi:10.1016/j.cviu.2009.11.004
⇑ Corresponding author. Fax: +49 7071 295466.
E-mail addresses: huhle@gris.uni-tuebingen.de (B. Huhle), schairer@gris.uni-tuebingen.de (T. Schairer), jenke@gris.uni-tuebingen.de (P. Jenke), strasser@gris.uni-tuebingen.de (W. Straßer).
Computer Vision and Image Understanding 114 (2010) 1336–1345