The Internet and the World Wide Web offer a tremendous amount of information, and increasing emphasis is being placed on search services and functionality. Most web portals now offer their own search utilities, of varying quality. These utilities search the content of a site, mainly the textual content of documents. In this paper, a novel similarity measure called SMDR (Similarity Measure for Documents Retrieval) is proposed to help retrieve more similar documents from the repository, thus contributing considerably to the effectiveness of the Web Information Retrieval (WIR) process. A bio-inspired Particle Swarm Optimization (PSO) methodology is used to reduce the response time of the system and to optimize the WIR process, thereby contributing to the efficiency of the system. The paper also presents a comparative study of the proposed system against the existing method in terms of accuracy, sensitivity, F-measure, and specificity. Extensive experiments are conducted on the CACM collection, and the proposed system achieves better precision-recall rates than the existing system. The experimental results demonstrate the effectiveness and efficiency of the proposed system.
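The four comparison criteria named above (accuracy, sensitivity, specificity, and F-measure) have standard definitions in terms of retrieved and relevant document sets. The following is a minimal sketch of those textbook definitions, not the evaluation code of the proposed system; the function name `retrieval_metrics`, its parameters, and the toy document IDs are assumptions made for illustration.

```python
def retrieval_metrics(retrieved, relevant, collection_size):
    """Standard set-based metrics for a single query (hypothetical helper).

    retrieved:       IDs of documents returned by the system
    relevant:        IDs of documents judged relevant for the query
    collection_size: total number of documents in the collection
    """
    retrieved, relevant = set(retrieved), set(relevant)

    tp = len(retrieved & relevant)        # relevant and retrieved
    fp = len(retrieved - relevant)        # retrieved but not relevant
    fn = len(relevant - retrieved)        # relevant but missed
    tn = collection_size - tp - fp - fn   # correctly left out

    def ratio(num, den):
        # Return 0.0 for an empty denominator instead of raising.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)      # also called recall
    specificity = ratio(tn, tn + fp)
    accuracy = ratio(tp + tn, collection_size)
    f_measure = ratio(2 * precision * sensitivity, precision + sensitivity)

    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


# Toy example with made-up document IDs in a 10-document collection.
metrics = retrieval_metrics(retrieved={1, 2, 3, 4}, relevant={2, 3, 5},
                            collection_size=10)
```

Because most documents in a collection are irrelevant to any given query, accuracy and specificity tend to be dominated by true negatives, which is one reason precision, recall, and F-measure are usually reported alongside them.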