Dictionary-Based Cross-Language Information Retrieval: Principles, System Design and Evaluation

Turid Hedlund
Department of Information Studies, Faculty of Information Sciences, University of Tampere
hedlund@shh.fi

The research problems of the thesis relate to the Scandinavian language Swedish. When work on this thesis started, there was very limited knowledge of information retrieval or cross-language information retrieval research in Swedish. The linguistic features of this and other compound-rich languages indicate that research focusing on languages of other types than English is of great importance. A further problem was the lack of automated dictionary-based systems for query translation of Scandinavian languages and other compound-rich languages.

Firstly, cross-language information retrieval problems for non-English languages, particularly Swedish, are discussed. In the article, the need to extend research on information retrieval techniques to undertreated languages is demonstrated.

Secondly, one of the main problems identified for Swedish, the frequent presence of compounds, is discussed in detail and solutions are proposed. Retrieval efficiency may be improved by splitting compounds that cannot be translated directly into their constituents using morphological analysis programs, and by normalising the constituents into base form before translation using machine-readable dictionaries. This solution is tested for 80 cross-language information retrieval queries.

Thirdly, this thesis deals with bilingual natural language information retrieval techniques where English is the target or document language and Swedish, Finnish and German are the source or query languages. The system design of UTACLIR, an extendable bilingual dictionary-based query translation system, is presented. The approach is to apply linguistic tools in an automated dictionary-based system able to handle several languages.
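The compound handling described above (split a compound that has no direct dictionary entry, normalise the constituents to base form, then translate each constituent) can be sketched as follows. This is a minimal illustration, not the UTACLIR implementation: the tiny lexicon, the dictionary entries and the function names are all hypothetical, and a lookup table with linking-s stripping stands in for a real morphological analyser.

```python
from typing import Dict, List, Optional

# Toy data invented for illustration only (not from the thesis).
# Maps a constituent's surface form to its base form.
BASE_FORMS: Dict[str, str] = {
    "arbete": "arbete",
    "arbet": "arbete",    # stem that appears before the linking "s"
    "marknad": "marknad",
    "politik": "politik",
}

# Toy Swedish-to-English dictionary keyed by base form.
DICTIONARY: Dict[str, List[str]] = {
    "arbete": ["work", "labour"],
    "marknad": ["market"],
    "politik": ["policy", "politics"],
}


def split_compound(word: str) -> Optional[List[str]]:
    """Return the base forms of the constituents, or None if no full split exists."""
    if not word:
        return []
    # Try the longest possible first constituent before shorter ones.
    for end in range(len(word), 0, -1):
        head = word[:end]
        candidates = [head]
        if head.endswith("s") and len(head) > 1:
            candidates.append(head[:-1])  # drop a linking "s"
        for cand in candidates:
            if cand in BASE_FORMS:
                rest = split_compound(word[end:])
                if rest is not None:
                    return [BASE_FORMS[cand]] + rest
    return None


def translate(word: str) -> List[List[str]]:
    """Translate a query word; each inner list holds the equivalents of one constituent."""
    if word in DICTIONARY:
        return [DICTIONARY[word]]  # directly translatable, no splitting needed
    parts = split_compound(word)
    if parts is None:
        return []                  # untranslatable with this dictionary
    return [DICTIONARY.get(part, []) for part in parts]
```

With this toy data, `translate("arbetsmarknadspolitik")` yields one list of equivalents per constituent (`arbete`, `marknad`, `politik`), while a word with no possible split yields an empty list and would need another strategy.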
Fourthly, the performance of the system is evaluated in international evaluation campaigns and shown to be effective. The automated CLIR process is also tested for the performance of its components. The tests with structuring of the queries indicate that structuring is a good way to reduce the effect of ambiguity caused by several dictionary translation equivalents for a source language word. This is true for all the source languages, but is particularly notable for Finnish and German, where the translation dictionaries used in the study were comprehensive. Compound handling for the compound-rich source languages Swedish, German and Finnish is found to be beneficial to system performance. An n-gram based algorithm was implemented in the process in order to solve the problem of untranslatable words, such as proper names. The process was particularly successful for Finnish, where proper names usually appear in inflected forms and matching to the target language document index is therefore difficult.

Electronic version of thesis: http://acta.uta.fi/pdf/951-44-5790-0.pdf

ACM SIGIR Forum Vol. 38, No. 1, June 2004
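The n-gram matching of untranslatable words mentioned in the abstract can be illustrated with character bigrams and the Dice coefficient. This is a sketch under assumptions: the abstract does not specify the n-gram size or the similarity measure, and the function names and sample index terms are invented for illustration.

```python
from typing import Iterable, List, Set, Tuple


def char_ngrams(word: str, n: int = 2) -> Set[str]:
    """Return the set of lowercase character n-grams of a word."""
    w = word.lower()
    return {w[i:i + n] for i in range(len(w) - n + 1)}


def dice(a: Set[str], b: Set[str]) -> float:
    """Dice coefficient between two n-gram sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def best_matches(word: str, index_terms: Iterable[str],
                 k: int = 3, n: int = 2) -> List[Tuple[str, float]]:
    """Rank target-language index terms by n-gram similarity to an untranslated word."""
    source = char_ngrams(word, n)
    scored = [(term, dice(source, char_ngrams(term, n))) for term in index_terms]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


# Hypothetical example: an inflected Finnish proper name matched
# against a small English index.
ranked = best_matches("Clintonin", ["clinic", "clinton", "london"])
```

In this example the inflected form "Clintonin" shares most of its bigrams with the index term "clinton", which is therefore ranked first, ahead of "clinic" and "london".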

Dictionary-based cross-language information retrieval: principles, system design and evaluation

Hedlund, Turid
ACM SIGIR Forum, Volume 38 (1)
Association for Computing Machinery, Jul 1, 2004
