Information extraction from unstructured, ungrammatical data such as classified listings is difficult because traditional structural and grammatical extraction methods do not apply. Previous work has exploited reference sets to aid such extraction, but it relied on supervised machine learning. In this paper, we present an unsupervised approach that automatically selects the relevant reference set(s) and then uses them for extraction. We validate our approach with experimental results showing that our unsupervised extraction is competitive with supervised machine learning approaches, including the previous supervised approach that exploits reference sets.
International Journal of Document Analysis and Recognition (IJDAR) – Springer Journals
Published: Oct 16, 2007