# Model-driven deep-learning

Deep learning is widely recognized as the representative advance of machine learning, and of artificial intelligence more generally [1,2]. This recognition can be attributed to a series of recent breakthroughs on challenging applications. Deep learning has raised face-recognition accuracy above 99%, surpassing human performance [3]. In speech recognition and machine translation, it is approaching the level of a simultaneous interpreter [4]. In the game of Go, it has beaten the human world champion [5]. In diagnosing certain diseases, it has matched mid-level and senior physicians [6]. By now it is hard to find an area in which deep learning has not been tried. One can observe, however, that these breakthroughs almost always take place in large IT companies or specialized R&D institutes such as Google, Microsoft and Facebook. This is because deep-learning applications have demanding prerequisites: a huge volume of labeled data, sufficient computational resources, and engineering experience in determining the network topology, including the number of layers, the number of neurons per layer and the non-linear transforms of the neurons. Meeting these prerequisites requires substantial expertise in neural-network design and a long period of accumulating and labeling data, requirements that professional IT companies and specialized R&D institutions can obviously meet.

Figure 1. Model-driven deep-learning approach.

With the arrival of the big-data era, data requirements are gradually ceasing to be an obstacle (at least in many areas), but the determination of network topology remains a bottleneck.
This is mainly due to the lack of theoretical understanding of the relationship between network topology and performance. At present, the selection of network topology is an engineering practice rather than scientific research, so most existing deep-learning approaches lack theoretical foundations. The difficulty of network design, the lack of interpretability and the limited understanding of generalization ability are common limitations of the deep-learning approach, and they may prevent its widespread use as machine learning and artificial intelligence move toward standardization and commercialization. A natural question is whether we can design network topologies with theoretical foundations and make the network structure explainable and predictable. We believe a positive answer is possible by combining the model-driven approach with the data-driven deep-learning approach. Here we regard deep learning as data-driven because it treats a standard network architecture as a black box and relies heavily on huge amounts of data to train it. In contrast, the model-driven approach uses a model (e.g. a loss function) constructed from the objective, the physical mechanism and the domain knowledge of a specific task. A prominent feature of the model-driven approach is that, when the model is sufficiently accurate, the solution can generally be expected to be optimal, and the minimization algorithm is typically deterministic. Its fatal flaw is the difficulty of modeling a real-world task accurately; sometimes the pursuit of accurate modeling is a luxury.
In recent years we have studied and implemented a series of model-driven deep-learning methods [7–10] that combine the modeling-based and deep-learning-based approaches, demonstrating their feasibility and effectiveness in real applications. Given a specific task, the basic procedure of our model-driven deep-learning method is shown in Fig. 1 and proceeds as follows:

1. A model family is first constructed from the task background (e.g. objective, physical mechanism and prior knowledge). The model family is a family of functions with a large set of unknown parameters, amounting to the hypothesis space in machine learning. Unlike the accurate model of the pure model-driven approach, the model family provides only a rough, broad definition of the solution space; it retains the advantages of a model-driven approach while greatly reducing the pressure of accurate modeling.

2. An algorithm family is then designed for solving the model family, and its convergence theory is established. The algorithm family is an algorithm with unknown parameters for minimizing the model family in function space. The convergence theory should include an estimate of the convergence rate and the constraints on the parameters that guarantee convergence.

3. The algorithm family is unfolded into a deep network, and parameter learning is performed as in a deep-learning approach. The depth of the network is determined by the convergence-rate estimate; the parameter space of the deep network is determined by the parameter constraints; and all the parameters of the algorithm family are learnable.

In this way, the topology of the deep network is determined by the algorithm family, and the network can be trained through back-propagation.

Figure 2. Topology of ADMM-Net [7]: given under-sampled k-space data, it outputs the reconstructed MRI image after T stages of processing.
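The unfolding step above can be illustrated with a simpler model family than the one used in [7]: sparse recovery under an l1-regularized least-squares model, minimized by ISTA (iterative shrinkage-thresholding). The sketch below is purely illustrative, not the authors' code; the per-stage step sizes and thresholds stand in for the unknown parameters of an algorithm family, and the loop length T plays the role of the network depth.

```python
import numpy as np

def soft_threshold(v, theta):
    # Proximal operator of the l1 norm: shrink each entry toward zero.
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def unfolded_ista(y, A, step_sizes, thresholds):
    """Unfold T ISTA iterations for min_x 0.5*||Ax - y||^2 + lam*||x||_1
    into a T-stage feed-forward pass. Each stage carries its own step
    size and threshold; in a model-driven deep network these would be
    the learnable parameters, fitted by back-propagation."""
    x = np.zeros(A.shape[1])
    for alpha, theta in zip(step_sizes, thresholds):
        # One 'stage': gradient step on the data term, then proximal step.
        x = soft_threshold(x - alpha * A.T @ (A @ x - y), theta)
    return x
```

Giving every stage its own (alpha, theta), rather than one shared pair, is exactly what turns a single algorithm into an algorithm family whose parameters can then be learned from data.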
Taking [7] as an example, we apply the above model-driven deep-learning approach to compressive-sensing magnetic resonance imaging (CS-MRI), i.e. recovering a high-quality MR image from k-space data sampled below the Nyquist rate. The model family is defined as

$$
\hat{x} = \arg\min_x \left\{ \frac{1}{2}\left\| Ax - y \right\|_2^2 + \sum_{l=1}^{L} \lambda_l\, g(D_l x) \right\}, \tag{1}
$$

where A = PF is the measurement matrix, P is the sampling matrix, F is the Fourier-transform matrix, Dl is a linear (convolutional) transform, g(·) is the regularization function, λl is the regularization parameter and L is the number of linear transforms. All the parameters (Dl, g, λl, L) are unknown and reflect the uncertainty in modeling (notice that these parameters are known and fixed in traditional CS-MRI models). Following the ADMM (alternating direction method of multipliers), the algorithm family for solving the model family can be written as

$$
\left\{
\begin{aligned}
x^{(n)} &= F^{T}\Big(P^{T}P + \sum\nolimits_{l}\rho_l F D_l^{T} D_l F^{T}\Big)^{-1}\Big[P^{T}y + \sum\nolimits_{l}\rho_l F D_l^{T}\big(z_l^{(n-1)} + \beta_l^{(n-1)}\big)\Big]\\
z_l^{(n)} &= S\Big(D_l x^{(n)} + \beta_l^{(n-1)};\ \tfrac{\lambda_l}{\rho_l}\Big)\\
\beta_l^{(n)} &= \beta_l^{(n-1)} + \eta_l\big(D_l x^{(n)} - z_l^{(n)}\big)
\end{aligned}
\right. \tag{2}
$$

where S(·; ·) is a non-linear transform related to g(·). According to ADMM convergence theory, this algorithm converges linearly.
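One pass of iteration (2) can be written down directly with small dense matrices. The sketch below is a hypothetical, simplified transcription (real-valued matrices, and soft-thresholding as a stand-in for S(·; ·)); in [7] the same updates are realized with FFTs and learned convolutions rather than explicit matrix solves.

```python
import numpy as np

def soft_threshold(v, t):
    # Stand-in for the non-linear transform S(.; .) of iteration (2).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_stage(z_prev, beta_prev, y, P, F, D_list, rho, lam, eta):
    """One stage of iteration (2), with explicit matrices for readability.
    z_prev, beta_prev: lists of auxiliary variables / multipliers, one
    entry per linear transform D_l."""
    # Reconstruction: solve the linear system of the x-subproblem.
    M = P.T @ P + sum(r * F @ D.T @ D @ F.T for r, D in zip(rho, D_list))
    rhs = P.T @ y + sum(r * F @ D.T @ (z + b)
                        for r, D, z, b in zip(rho, D_list, z_prev, beta_prev))
    x = F.T @ np.linalg.solve(M, rhs)
    # Non-linear transform: z_l update with threshold lambda_l / rho_l.
    z = [soft_threshold(D @ x + b, l / r)
         for D, b, l, r in zip(D_list, beta_prev, lam, rho)]
    # Multiplier update.
    beta = [b + e * (D @ x - zl)
            for b, e, D, zl in zip(beta_prev, eta, D_list, z)]
    return x, z, beta
```

Each of the three updates becomes one layer type of the unfolded network (the reconstruction, non-linear transform and multiplier-update layers of Fig. 2), with (Dl, λl, ρl, ηl) left free to be learned.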
By unfolding the algorithm family into a deep network, we obtain ADMM-Net, composed of T successive stages, as shown in Fig. 2. Each stage consists of a reconstruction layer (R), a convolution layer (C), a non-linear transform layer (Z) and a multiplier-update layer (M). We learn the parameters (S, Dl, λl, ρl, ηl) by back-propagation. In [7] we reported state-of-the-art CS-MRI results using this model-driven deep-learning method. The model-driven deep-learning approach thus retains the advantages of the model-driven approach (determinacy and theoretical soundness) while avoiding the requirement of accurate modeling; it also retains the powerful learning ability of the deep-learning approach while overcoming the difficulty of network-topology selection. This makes the deep-learning approach designable and predictable, and strikes a good balance between versatility and task specificity in real applications. We point out that the model-driven and data-driven approaches are not opposed to each other. If the model is accurate, it provides the essential description of the problem's solutions, from which unlimited ideal samples can be generated; conversely, when sufficient samples are provided, the model of the problem is fully (if in discretized form) represented. This is the essential reason for the effectiveness of the model-driven deep-learning approach. Please refer to [2,8] for earlier investigations of the model-driven deep-learning approach; recent advances can be found in [7,9–11]. Most of these successful applications are inverse problems in the imaging sciences, for which there is domain knowledge that can be well captured by a model family. We believe the model-driven deep-learning approach can be applied broadly wherever a model family can be designed from domain knowledge, with the deep architecture then derived by the procedure above.
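Because the algorithm family of (2) converges linearly, the number of stages T can be read off from the rate estimate: if the error contracts by a factor c per iteration, the depth needed to reach a target tolerance follows directly. A back-of-the-envelope helper (illustrative, not from [7]):

```python
import math

def depth_for_tolerance(c, eps):
    """Smallest T with c**T <= eps, for a linear convergence rate 0 < c < 1.
    This is the sense in which the convergence-rate estimate of the
    algorithm family determines the depth of the unfolded network."""
    assert 0.0 < c < 1.0 and 0.0 < eps < 1.0
    return math.ceil(math.log(eps) / math.log(c))
```

A faster-converging algorithm family thus yields a shallower unfolded network for the same accuracy target.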
References

1. LeCun Y, Bengio Y, Hinton G. Nature 2015; 521: 436–44.
2. Gregor K, LeCun Y. ICML 2010.
3. Schroff F, Kalenichenko D, Philbin J. CVPR 2015.
4. Wu Y, Schuster M, Chen Z et al. arXiv:1609.08144, 2016.
5. Silver D, Huang A, Maddison CJ et al. Nature 2016; 529: 484–9.
6. Gulshan V, Peng L, Coram M et al. JAMA 2016; 316: 2402–10.
7. Yang Y, Sun J, Li H et al. NIPS 2016.
8. Sun J, Tappen M. CVPR 2011.
9. Sun J, Tappen M. IEEE T Image Process 2013; 22: 402–8.
10. Sun J, Sun J, Xu Z. IEEE T Image Process 2015; 24: 4148–59.
11. Sprechmann P, Bronstein AM, Sapiro G. IEEE TPAMI 2015; 37: 1821–33.

© The Author(s) 2017. Published by Oxford University Press on behalf of China Science Publishing & Media Ltd. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

National Science Review, Volume 5 (1), Jan 1, 2018
Publisher: Oxford University Press
ISSN: 2095-5138 | eISSN: 2053-714X
DOI: 10.1093/nsr/nwx099


