MEASURING SEMANTIC SIMILARITY BETWEEN WORDS USING WEB SEARCH ENGINES
Abstract:
Measuring the semantic similarity between words is an important component of various Semantic Web applications such as community mining, relation extraction, and automatic metadata extraction. Despite the usefulness of similarity measures in these applications, accurately measuring the similarity between two words (or entities) remains a challenging task. This project proposes a cosine similarity measure that uses information available on the Web to measure the similarity between words or entities. The proposed method exploits the page counts and text snippets returned by a Web search engine. It defines various similarity scores for two given words P and Q, using the page counts for the queries P, Q, and P AND Q. Moreover, it proposes a novel approach to computing cosine similarity using lexico-syntactic patterns automatically extracted from text snippets. These different similarity scores are integrated using the K-Means algorithm to obtain a robust cosine similarity measure. Experimental results show that the method groups web pages efficiently using the similarity measure. The proposed similarity measure, which is explicitly based on precision and coverage, allows the discovery of more correct profiles at the same precision or recall levels.
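The page-count scores and the pattern-based cosine similarity described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the abstract does not name the specific page-count scores, so the Jaccard, Dice, and overlap coefficients used here are assumed as common co-occurrence measures, and all function names and the sample pattern strings are hypothetical. Page counts are passed in as plain integers; no search engine is queried.

```python
import math

# Assumption: p, q, and pq are the page counts a search engine reports
# for the queries P, Q, and "P AND Q" respectively.

def web_jaccard(p, q, pq):
    """Jaccard coefficient: pages containing both words over pages containing either."""
    denom = p + q - pq
    return pq / denom if denom > 0 else 0.0

def web_dice(p, q, pq):
    """Dice coefficient: twice the co-occurrence count over the sum of both counts."""
    denom = p + q
    return 2 * pq / denom if denom > 0 else 0.0

def web_overlap(p, q, pq):
    """Overlap coefficient: co-occurrence count over the smaller of the two counts."""
    denom = min(p, q)
    return pq / denom if denom > 0 else 0.0

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse frequency vectors stored as dicts.

    Each dict maps a lexical pattern (hypothetical examples: "X is a Y",
    "X and Y") to how often it was seen in the snippets.
    """
    dot = sum(count * v.get(pattern, 0) for pattern, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

if __name__ == "__main__":
    # Made-up page counts, for illustration only.
    p, q, pq = 100, 50, 25
    print(web_jaccard(p, q, pq), web_dice(p, q, pq), web_overlap(p, q, pq))
    print(cosine_similarity({"X is a Y": 3, "X and Y": 4}, {"X is a Y": 3}))
```

The resulting scores for a word pair could then be collected into one feature vector per pair and passed to a clustering step such as K-Means, which is the integration step the abstract mentions.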
Frontend:
DOTNET
Backend:
SQL SERVER
Area:
IEEE PROJECT
Domain:
WEB SECURITY