Michael Cafarella, A. Halevy, D. Wang, Eugene Wu, Yang Zhang (2008)
WebTables: exploring the power of tables on the web. Proc. VLDB Endow., 1
Schmidt greatly facilitated our user study. This work was supported by NSF grant IIS-0307906, ONR grant N00014-06-1-0147, SRI CALO grant 03-000225, the WRF
Simone Ponzetto, M. Strube (2007)
Deriving a Large-Scale Taxonomy from Wikipedia
J. Lafferty, A. McCallum, Fernando Pereira (2001)
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
Marti Hearst (1992)
Automatic Acquisition of Hyponyms from Large Text Corpora
Cheng-Tao Chu, Sang Kim, Yi-An Lin, YuanYuan Yu, G. Bradski, A. Ng, K. Olukotun (2006)
Map-Reduce for Machine Learning on Multicore
Matthew Richardson, Pedro Domingos (2006)
Markov logic networks. Machine Learning, 62
Acknowledgments: We thank Eytan Adar
R. Snow, Dan Jurafsky, A. Ng (2006)
Semantic Taxonomy Induction from Heterogenous Evidence
S. Bryant, Andrea Forte, A. Bruckman (2005)
Becoming Wikipedian: transformation of participation in a collaborative online encyclopedia. Proceedings of the 2005 ACM International Conference on Supporting Group Work
Wolfgang Gatterbauer, Paul Bohunsky, M. Herzog, Bernhard Krüpl, Bernhard Pollak (2007)
Towards domain-independent information extraction from web tables
Benjamin Durme, Lenhart Schubert (2008)
Open Knowledge Extraction through Compositional Language Processing. Proceedings of the 2008 Conference on Semantics in Text Processing - STEP '08
Michele Banko, Michael Cafarella, S. Soderland, M. Broadhead, Oren Etzioni (2007)
Open Information Extraction from the Web
Fei Wu, Daniel Weld (2007)
Autonomously semantifying wikipedia
Hoifung Poon, Pedro Domingos (2007)
Joint Inference in Information Extraction
Michael Wick, Khashayar Rohanimanesh, Karl Schultz, A. McCallum (2008)
A unified approach for schema matching, coreference and canonicalization
Pedro DeRose, Xiaoyong Chai, Byron Gao, Warren Shen, A. Doan, P. Bohannon, Xiaojin Zhu (2008)
Building Community Wikipedias: A Machine-Human Partnership Approach. 2008 IEEE 24th International Conference on Data Engineering
Michele Banko, Oren Etzioni (2008)
The Tradeoffs Between Open and Traditional Relation Extraction
K. Yee, Kirsten Swearingen, Kevin Li, Marti Hearst (2003)
Faceted metadata for image search and browsing
Oren Etzioni, Michael Cafarella, Doug Downey, Ana-Maria Popescu, Tal Shaked, S. Soderland, Daniel Weld, A. Yates (2005)
Unsupervised named-entity extraction from the Web: An experimental study. Artif. Intell., 165
Raphael Hoffmann, S. Amershi, Kayur Patel, Fei Wu, J. Fogarty, Daniel Weld (2009)
Amplifying community content creation with mixed initiative information extraction. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Fei Wu, Raphael Hoffmann, Daniel Weld (2008)
Information extraction from Wikipedia: moving down the long tail
A. McCallum, R. Rosenfeld, Tom Mitchell, Andrew Ng (1998)
Improving Text Classification by Shrinkage in a Hierarchy of Classes
Fei Wu, Daniel Weld (2008)
Automatically refining the wikipedia infobox ontology
J. Voß (2005)
Measuring Wikipedia
Robert Doorenbos, Oren Etzioni, Daniel Weld (1997)
A scalable comparison-shopping agent for the World-Wide Web
D. Cosley, Dan Frankowski, L. Terveen, J. Riedl (2007)
SuggestBot: using intelligent task routing to help people find work in wikipedia
YAGO: A Core of Semantic Knowledge. WWW 2007, Track: Semantic Web, Session: Ontologies
Kedar Bellare, A. McCallum (2007)
Learning Extractors from Unlabeled Text using Relevant Databases
K. Nigam, J. Lafferty, A. McCallum (1999)
Using Maximum Entropy for Text Classification
Using Wikipedia to Bootstrap Open Information Extraction

Daniel S. Weld, Raphael Hoffmann, Fei Wu
Computer Science & Engineering, University of Washington, Seattle, WA-98195, USA
weld@cs.washington.edu, raphaelh@cs.washington.edu, wufei@cs.washington.edu

in the corpus and R denotes the number of relations; in contrast, scalability to the Web demands that open IE scale linearly in D.

1. INTRODUCTION

We often use Data Management to refer to the manipulation of relational or semi-structured information, but much of the world's data is unstructured, for example the vast amount of natural-language text on the Web. The ability to manage the information underlying this unstructured text is therefore increasingly important. While information retrieval techniques, as embodied in today's sophisticated search engines, offer important capabilities, they lack the most important faculties found in relational databases: 1) queries comprising aggregation, sorting and joins, and 2) structured visualization such as faceted browsing [29]. Information extraction (IE), the process of generating structured data from unstructured text, has the potential to convert much of the Web to relational form, enabling these powerful querying and visualization methods. Implemented systems
ACM SIGMOD Record – Association for Computing Machinery
Published: Mar 20, 2009