This paper documents research toward building a complete lexicon containing all the words found in general newspaper text. It is intended to provide the reader with an understanding of the inherent limitations of existing vocabulary collection methods and the need for greater attention to multi-word phrases as the building blocks of text. Additionally, while traditional reference books define many proper nouns, they appear to be very limited in their coverage of the new proper nouns appearing daily in newspapers. Proper nouns appear to require a grammar and lexicon of components much the way general parsing of text requires syntactic rules and a lexicon of common nouns.
/lp/association-for-computing-machinery/research-toward-the-development-of-a-lexical-knowledge-base-for-4hnIIV0KJL