Silog: Speech input logon
Sergio Grau *, Tony Allen, Nasser Sherkat
The Centre for Innovation and Technology Exploitation, Nottingham Trent University, Clifton Lane, Nottingham, Nottinghamshire NG11 8NS, United Kingdom
article info
Article history:
Available online 9 January 2009
Keywords:
Biometrics
Voice verification
VoiceXML
abstract
Silog is a biometric authentication system that extends the conventional PC logon process using voice
verification. Users enter their ID and password using a conventional Windows logon procedure but then
the biometric authentication stage makes a voice over IP (VoIP) call to a VoiceXML (VXML) server. User
interaction with this speech-enabled component then allows the user’s voice characteristics to be
extracted as part of a simple user/system spoken dialogue. If the captured voice characteristics match
those of a previously registered voice profile, then network access is granted. If no match is possible, then
a potential unauthorised system access has been detected and the logon process is aborted.
© 2009 Elsevier B.V. All rights reserved.
1. Introduction
The hierarchy of secure computer applications is most efficiently
expressed by the basic taxonomy of authentication methods [1]:
I. Something you have
II. Something you know
III. Something you are
Physical tokens such as smart cards, USB sticks and mobile phones
represent the something you have category of security, whilst
user-remembered data such as PINs, passwords and memorable
information represent the something you know layer. Biometrics, in
the form of fingerprint [2], iris [3] and voice [4] characterisation,
are examples of the something you are genre of secure solutions.
Each biometric technique has a unique set of advantages and
disadvantages and its own dedicated group of advocates.
Fingerprint capture hardware is readily available, has reasonably
robust false acceptance and false rejection rates (FAR/FRR) [2] and
is starting to become more widely used (US customs, high-end
laptops, national ID systems etc.). Iris detection has good FAR/FRR
values but is fairly invasive in terms of its data capture process [3].
As a consequence, perhaps, few commercial applications of this form
of biometric technique are in evidence. Voice characterisation,
meanwhile, has reasonable FAR/FRR values [4] and requires only a
microphone as its data capture hardware. Voice thus lends itself well
to telephone-based user authentication solutions, where a microphone
and speaker are inherently present [5].
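The FAR and FRR figures referred to above can be made concrete with a small calculation. The sketch below assumes the usual definitions (FAR as the fraction of impostor attempts wrongly accepted, FRR as the fraction of genuine attempts wrongly rejected). The function name `far_frr`, the score lists and the threshold are hypothetical illustration values, not data from any of the cited systems.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for a verifier that accepts scores >= threshold."""
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    # Impostor attempts that were wrongly accepted
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    # Genuine attempts that were wrongly rejected
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr


# Made-up similarity scores in [0, 1], for illustration only
genuine = [0.91, 0.85, 0.62, 0.78]
impostor = [0.30, 0.55, 0.71, 0.12]

far, frr = far_frr(genuine, impostor, threshold=0.70)
strict_far, strict_frr = far_frr(genuine, impostor, threshold=0.80)
```

Comparing the two thresholds shows the usual trade-off: a stricter acceptance threshold admits fewer impostors but rejects more genuine users.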
In general, two-layer systems are more secure than single-layer
systems [6], as evidenced by the token and PIN verification
architecture used in the many smart card based credit/debit card
systems currently available. However, most conventional PC logon
systems, especially large networked systems with multiple users,
continue to be one-layer systems (ID and password are both forms of
something you know) because the cost of providing token or biometric
data capture devices at multiple nodes is prohibitive. Consequently,
such systems are inherently vulnerable to unauthorised access attacks
in which the user ID and password have been stolen by software such
as spyware [7]. This is particularly the case for remote access
virtual private network (VPN) systems, where security at the remote
user node is not under the complete control of a centralised IT service.
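The point that an ID plus password is still a one-layer system can be expressed as counting distinct factor categories rather than counting credentials. The following is a minimal sketch. The `Factor` enum, the `CREDENTIAL_FACTOR` mapping and the `layer_count` function are hypothetical names chosen for illustration.

```python
from enum import Enum


class Factor(Enum):
    HAVE = "something you have"
    KNOW = "something you know"
    ARE = "something you are"


# Hypothetical mapping of credential types to taxonomy categories
CREDENTIAL_FACTOR = {
    "smart_card": Factor.HAVE,
    "user_id": Factor.KNOW,
    "password": Factor.KNOW,
    "pin": Factor.KNOW,
    "voice": Factor.ARE,
    "fingerprint": Factor.ARE,
}


def layer_count(credentials):
    """Number of distinct factor categories covered by the credentials."""
    return len({CREDENTIAL_FACTOR[c] for c in credentials})


password_only = layer_count(["user_id", "password"])         # one category
card_and_pin = layer_count(["smart_card", "pin"])            # two categories
with_voice = layer_count(["user_id", "password", "voice"])   # two categories
```

Under this view, adding a second password would not add a layer, whereas adding a token or a biometric would.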
In the Speech Input Logon system (Silog), voice characterisation
and knowledge verification have been integrated into the
conventional Windows logon process in order to produce a two-layer
identity management solution that can offer greater security for
network access systems than a password-only logon.
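The two-layer logon decision can be sketched as follows. This is a simplified illustration under stated assumptions, not the Silog implementation: the function `two_layer_logon`, the callback signatures and the stub checks are all hypothetical, and in the real system the second stage is carried out through a VoIP call to a VoiceXML server.

```python
from typing import Callable


def two_layer_logon(
    user_id: str,
    password: str,
    check_password: Callable[[str, str], bool],
    verify_voice: Callable[[str], bool],
) -> bool:
    """Grant access only if both authentication layers succeed."""
    # Layer 1: something you know (conventional ID and password)
    if not check_password(user_id, password):
        return False
    # Layer 2: something you are; only attempted once layer 1 has passed
    return verify_voice(user_id)


# Stub checks standing in for the real password store and voice verifier
def stub_password_check(user_id: str, password: str) -> bool:
    return (user_id, password) == ("alice", "s3cret")


def voice_matches(user_id: str) -> bool:
    return True


def voice_mismatch(user_id: str) -> bool:
    return False


granted = two_layer_logon("alice", "s3cret", stub_password_check, voice_matches)
stolen_password = two_layer_logon("alice", "s3cret", stub_password_check, voice_mismatch)
wrong_password = two_layer_logon("alice", "guess", stub_password_check, voice_matches)
```

The `stolen_password` case is the one the second layer is meant to catch: the correct ID and password are supplied, but the voice does not match the registered profile, so access is refused.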
2. Silog: a user perspective
In the conventional Windows logon, the system asks for a username
and password using the Microsoft Graphical Identification and
Authentication (GINA) procedure. In order to add voice and
knowledge verification we were required to replace the standard
Microsoft GINA with a C++ plugin (pGINA [8]) that allows system
designers to introduce additional functionality into the logon
process. In our case, a VoIP softphone facility [9] is provided to
manage SIP [10] calls to an external VXML [11] spoken dialogue system.
Additional external database access functionality is also provided
0950-7051/$ - see front matter © 2009 Elsevier B.V. All rights reserved.
doi:10.1016/j.knosys.2008.10.002
* Corresponding author.
E-mail addresses: sergio.graupuerto@ntu.ac.uk (S. Grau), tony.allen@ntu.ac.uk
(T. Allen), nasser.sherkat@ntu.ac.uk (N. Sherkat).
Knowledge-Based Systems 22 (2009) 535–539