Purpose
The purpose of this paper is to present an efficient and scalable Arabic semantic search engine based on a domain-specific ontological graph for Colleges of Applied Science, Sultanate of Oman (CASOnto). The engine also supports factoid question answering and offers two types of search, keyword-based and semantics-based, in both Arabic and English. It is built on a variety of technologies, such as resource description framework (RDF) data and an ontological graph. Two experiments are conducted: the first compares entity search with classical search within the system itself; the second compares CASOnto with well-known semantic search engines such as Kngine, Wolfram Alpha and Google to measure their performance and efficiency.

Design/methodology/approach
The design and implementation of the system comprise the following phases: inference design, storing, indexing, searching, query processing and a user-friendly interface. The system is designed for the specific domain of the Ibri College of Applied Science (CAS) and covers its academic and nonacademic departments. The ontologically inferred data are stored in the tuple data base (TDB) and MySQL to support keyword-based as well as entity-based search. Indexing and searching are built on Lucene for the keyword search, while TDB is used for the entity search. Query processing is an important component of a search engine: it helps improve the user's search results and makes the system efficient and scalable. CASOnto handles Arabic-specific issues such as spelling correction, query completion, stop-word removal and diacritics removal. It also supports the analysis of factoid questions.

Findings
In this paper, an efficient and scalable Arabic semantic search engine is proposed. The results show that the semantic search built on SPARQL performs better than the classical search on both simple and complex queries; the accuracy of the semantic search reaches 100 per cent for both types of query. In the comparison with Wolfram Alpha, Kngine and Google, CASOnto returned better results. The proposed engine therefore appears to retrieve better results, and to do so more efficiently, than the other engines. It is built on a domain-specific ontology, offers highly scalable performance and handles complex queries well by understanding the context behind the query.

Research limitations/implications
The proposed engine is built for a specific domain (CAS Ibri, Oman). Future work will address non-factoid question answering and expand the domain of CASOnto to integrate additional domains.

Originality/value
The main contribution of this paper is an efficient and scalable Arabic semantic search engine. The widespread use of search engines creates a new challenge: keeping up with the evolution of the semantic Web. Meeting users' needs for accurate information in the shortest possible time has become a matter of paramount importance in light of advances in artificial intelligence and technology. However, research in this area is still in its infancy owing to the lack of search engines that support the Arabic language, which can be traced back to the complexity of Arabic morphology and grammar.
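
As an illustration of two of the Arabic query-processing steps named under Design/methodology/approach (diacritics removal and stop-word removal), the following is a minimal Python sketch. It is not the CASOnto implementation: the function names, the set of characters treated as diacritics and the small stop-word list are all assumptions made for the example.

```python
import re

# Characters stripped in this sketch (an assumption, not CASOnto's rule set):
# tashkeel marks U+064B..U+0652, superscript alef U+0670, tatweel U+0640.
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0670\u0640]")

# Tiny illustrative stop-word list; hypothetical, not the list used by CASOnto.
STOP_WORDS = {"في", "من", "على", "عن", "هو", "هي"}


def strip_diacritics(text: str) -> str:
    """Remove Arabic diacritic marks and tatweel from the text."""
    return _DIACRITICS.sub("", text)


def normalize_query(query: str) -> list[str]:
    """Strip diacritics, split on whitespace and drop stop words."""
    tokens = strip_diacritics(query).split()
    return [token for token in tokens if token not in STOP_WORDS]
```

Under these assumptions, a query would pass through `normalize_query` before being handed to the keyword index or translated into a structured query; spelling correction and query completion would be separate steps not shown here.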
International Journal of Web Information Systems – Emerald Publishing
Published: Jun 20, 2016