Modeling annotated data

David M. Blei
Division of Computer Science
University of California, Berkeley
Berkeley, CA 94720

Michael I. Jordan
Division of Computer Science and Department of Statistics
University of California, Berkeley
Berkeley, CA 94720

ABSTRACT

We consider the problem of modeling annotated data: data with multiple types where the instance of one type (such as a caption) serves as a description of the other type (such as an image). We describe three hierarchical probabilistic mixture models which aim to describe such data, culminating in correspondence latent Dirichlet allocation, a latent variable model that is effective at modeling the joint distribution of both types and the conditional distribution of the annotation given the primary type. We conduct experiments on the Corel database of images and captions, assessing performance in terms of held-out likelihood, automatic annotation, and text-based image retrieval.

Categories and Subject Descriptors
G.3 [Mathematics of Computing]: Probability and Statistics - statistical computing, multivariate statistics

General Terms
algorithms, experimentation

Keywords
probabilistic graphical models, empirical Bayes, variational methods, automatic image annotation, image retrieval

1. INTRODUCTION

Traditional methods of information retrieval are organized around the representation and processing of a document in a (high-dimensional) word-space. Modern multimedia documents, however, are

Association for Computing Machinery — Jul 28, 2003

Datasource: Association for Computing Machinery
Copyright: Copyright © 2003 by ACM Inc.
ISBN: 1-58113-646-3
DOI: 10.1145/860435.860460
