Tools for the assessment of the quality and reliability of Web applications are based on the possibility of downloading the target of the analysis. This is achieved through Web crawlers, which can automatically navigate within a Web site and perform proper actions (such as download) during the visit. The most important performance indicators for a Web crawler are its completeness and robustness, measuring respectively the ability to visit the Web site entirely and without errors. The variety of implementation languages and technologies used for Web site development makes these two indicators hard to maximize. We conducted an evaluation study, in which we tested several of the available Web crawlers.
International Journal of Web Information Systems – Emerald Publishing
Published: May 1, 2006
Keywords: Web crawlers; Web site analysis; Completeness; Robustness; Download limiting; Supported features