Machine Vision and Applications (2017) 28:551–568
Learning typographic style: from discrimination to synthesis
Received: 26 July 2016 / Revised: 1 March 2017 / Accepted: 21 April 2017 / Published online: 9 May 2017
© Springer-Verlag Berlin Heidelberg 2017
Abstract Typography is a ubiquitous art form that affects
our understanding, perception and trust in what we read.
Thousands of different font-faces have been created with
enormous variations in the characters. In this paper, we learn
the style of a font by analyzing a small subset of only four
letters. From these four letters, we address two tasks. The first
is discrimination: given the four letters and a new candidate
letter, does the new letter belong to the same font?
The second is generation: given the four basis letters, can we generate all of
the other letters with the same characteristics as those in the
basis set? We use deep neural networks for both tasks,
quantitatively and qualitatively measure the results in a variety
of novel ways, and present a thorough investigation
of the strengths and weaknesses of the approach. All of the
experiments are conducted with publicly available font sets.
Keywords Style analysis · Typography · Image generation ·
Image synthesis · Machine learning
The history of fonts and typography is vast, dating back
at least to fifteenth-century Germany with the creation of
the movable-type press by Johannes Gutenberg and the first
font, 'Blackletter.' This font was based on the handwriting style
of the time and was used to print the first books [29,49].
Centuries later, numerous studies have consistently shown
the large impact that fonts have on not only the readability of
text, but also the comprehensibility and perceived trustworthiness
of what is written [5,25,38].
Despite the prevalence of just the few standard fonts used
throughout academic literature, there are innumerable creative,
stylized and unique fonts available. Many have been
created by individual designers as hobbies or for particular
applications such as logos, movies or print advertisements.
A small sample of the more than 10,000 fonts used in this
study is shown in Fig. 1. All of the experiments conducted
in this paper, in both training and testing, use only fonts that
are available for download as TrueType font (TTF) files.
The seminal work of Tenenbaum and Freeman 
toward separating style from content was applied to letter
generation. Our motivation and goals are similar to theirs:
we hope that a learner can exploit the structure in samples of
related content to extract representations necessary for
modeling style. The end goal is to perform tasks, such as synthesis
and analysis, on new styles that have never been encountered.
In contrast to that work, we do not attempt to explicitly model style
and content separately; rather, through training a learning
model (a deep neural network) to reproduce style, content
and style are implicitly distinguished. We also extend their
work in four directions. First, we demonstrate that a very
small subset of characters is required to effectively learn
both discriminative and generative models for representing
typographic style; we use only 4 instead of the 62 alphanumeric
characters used in their work. Second, with these 4 letters,
we learn individual letter and combined letter models capable
of generating all the remaining letters. Third, we broaden
Wang et al. have studied retrieving fonts found in photographs.
Extraction from photographs is not addressed in this paper. However,
the underlying task of font retrieval will be presented in the experimental