Grouping User by Employing Spectral Clustering Algorithm based on the Interest

Authors

  • Chaitra H K, Suneetha K R

Keywords:

Weblog, Coverage, Session, Matrix, Graph, Euclidean

Abstract

The Internet has become an increasingly popular medium of communication between businesses and consumers, so a company's website must be designed to meet the specific needs of its clients. Users prefer to return to websites that are simple to navigate, and the usability of a website can be improved by observing and analyzing the habits of its visitors. Recommending the website to users based on their interests is important for growing the business, and grouping users by interest is the first stage of website recommendation. This research addresses user grouping. The weblog was collected from Kannada University and preprocessed. Adequate pre-processing techniques are needed for the analysis to succeed; the pre-processing activities addressed in this research are data cleaning, user identification, and session identification. The preprocessed weblog is then used to identify users with similar interests. The workflow is divided into three stages: matrix calculation, graph construction, and clustering. Three matrices are produced: the page matrix, the distance matrix, and the adjacency matrix. The page matrix is formed from the user ID and page URL. The distance matrix is calculated using the Euclidean distance. The adjacency matrix is populated by applying threshold values. The graph is constructed from the adjacency matrix, and clustering is then performed with the spectral clustering algorithm. The Silhouette index and the Davies-Bouldin index are applied to validate the clustering.
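
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything named here is an assumption made for illustration: the function names, the sample visits, the use of visit counts as page-matrix entries, the rule that two users are connected when their distance is at or below a threshold of 1.5, and the restriction to two clusters (spectral bisection on the sign of the Fiedler vector, found by power iteration). Only the Silhouette index is computed; the Davies-Bouldin index is omitted for brevity.

```python
import math

def page_matrix(visits):
    """Rows are users, columns are pages, entries are visit counts (assumed)."""
    users = sorted({u for u, _ in visits})
    pages = sorted({p for _, p in visits})
    row = {u: i for i, u in enumerate(users)}
    col = {p: j for j, p in enumerate(pages)}
    matrix = [[0] * len(pages) for _ in users]
    for u, p in visits:
        matrix[row[u]][col[p]] += 1
    return users, matrix

def distance_matrix(matrix):
    """Pairwise Euclidean distance between user rows."""
    return [[math.dist(a, b) for b in matrix] for a in matrix]

def adjacency_matrix(dist, threshold):
    """Connect two distinct users when their distance is within the threshold."""
    n = len(dist)
    return [[1 if i != j and dist[i][j] <= threshold else 0 for j in range(n)]
            for i in range(n)]

def fiedler_vector(adj, iterations=500):
    """Approximate the eigenvector of the second-smallest eigenvalue of the
    graph Laplacian L = D - A by power iteration on (shift*I - L), with the
    constant vector projected out. Assumes at least two users."""
    n = len(adj)
    degree = [sum(r) for r in adj]
    shift = 2 * max(degree) + 1  # exceeds the largest Laplacian eigenvalue
    v = [float(i) for i in range(n)]
    for _ in range(iterations):
        mean = sum(v) / n
        v = [x - mean for x in v]  # stay orthogonal to the constant vector
        w = [(shift - degree[i]) * v[i]
             + sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def spectral_bisection(adj):
    """Two clusters, split on the sign of the Fiedler vector."""
    return [1 if x >= 0 else 0 for x in fiedler_vector(adj)]

def silhouette(dist, labels):
    """Mean silhouette coefficient computed from a distance matrix."""
    n = len(labels)
    scores = []
    for i in range(n):
        by_cluster = {}
        for j in range(n):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(dist[i][j])
        own = by_cluster.pop(labels[i], None)
        if own is None or not by_cluster:
            scores.append(0.0)  # singleton cluster, or only one cluster
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / n

# Hypothetical (user ID, page URL) pairs standing in for a preprocessed weblog.
visits = [
    ("u1", "/courses"), ("u1", "/courses"), ("u1", "/admissions"),
    ("u2", "/courses"), ("u2", "/admissions"),
    ("u3", "/library"), ("u3", "/library"), ("u3", "/results"),
    ("u4", "/library"), ("u4", "/results"),
]
users, matrix = page_matrix(visits)
dist = distance_matrix(matrix)
adj = adjacency_matrix(dist, threshold=1.5)
labels = spectral_bisection(adj)
score = silhouette(dist, labels)
print(dict(zip(users, labels)), round(score, 3))
```

In this toy data the threshold leaves two disconnected pairs of users, so the bisection simply separates them. On a real weblog the threshold and the number of clusters would need to be chosen from the data, and a library implementation of spectral clustering would normally replace the hand-written power iteration.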

Published

2022-04-06