Chapter 4

4.1 The Likelihood of Finite Mixture Models

Estimation of the parameters of the mixing distribution $P$ is predominantly done using maximum likelihood. Given a sample of iid observations

$$x_i \sim f(x \mid P), \quad i = 1, \ldots, n, \qquad (4.1)$$

we are interested in finding the maximum likelihood estimate (MLE) of $P$, denoted $\hat{P}$, that is

$$\hat{P} = \arg\max_P L(P), \qquad L(P) = \prod_{i=1}^{n} \sum_{j=1}^{k} f(x_i, \lambda_j)\, p_j, \qquad (4.2)$$

or alternatively finding the estimate of $P$ which maximizes the log-likelihood function

$$\ell(P) = \log L(P) = \sum_{i=1}^{n} \log \sum_{j=1}^{k} f(x_i, \lambda_j)\, p_j. \qquad (4.3)$$

An estimate of $P$ can be obtained as a solution to the likelihood equation

$$S(x, P) = \frac{\partial \ell(P)}{\partial P} = 0, \qquad (4.4)$$

where $S(x, P)$ is the gradient vector of the log-likelihood function, with differentiation taken with respect to the parameter vector $P$. Maximum likelihood estimation of $P$ is by no means trivial, since closed-form solutions are mostly unavailable.

[P. Schlattmann, Medical Applications of Finite Mixture Models, Statistics for Biology and Health, Springer-Verlag Berlin Heidelberg, 2009, DOI: 10.1007/978-3-540-68651-4_4, pp. 55–56.]
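Equation (4.3) is straightforward to evaluate once a concrete component density $f(x, \lambda_j)$ is chosen. As a minimal sketch (not from the text), the following Python snippet computes the mixture log-likelihood for a two-component Poisson mixture, a common choice in the medical applications this book treats; the data values, weights, and rates below are made-up illustrative numbers:

```python
import math

def poisson_pmf(x, lam):
    # component density f(x | lambda): Poisson probability mass function
    return math.exp(-lam) * lam**x / math.factorial(x)

def log_likelihood(data, weights, lambdas):
    # l(P) = sum_i log( sum_j p_j * f(x_i, lambda_j) )   -- Eq. (4.3)
    total = 0.0
    for x in data:
        mix = sum(p * poisson_pmf(x, lam) for p, lam in zip(weights, lambdas))
        total += math.log(mix)
    return total

# hypothetical count data with two apparent subpopulations
data = [0, 1, 0, 2, 7, 8, 6]
# candidate mixing distribution P: p = (0.6, 0.4), lambda = (1.0, 7.0)
ll = log_likelihood(data, [0.6, 0.4], [1.0, 7.0])
```

Because the likelihood equation (4.4) has no closed-form solution here, in practice one would maximize `log_likelihood` numerically over the weights and rates, e.g. with the EM algorithm or a general-purpose optimizer.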
Published: Dec 8, 2008
Keywords: Mixture Model; Convex Hull; Bayesian Information Criterion; Expectation Maximization; Directional Derivative