ABSTRACT

Many blogs accumulate large quantities of data reflecting user opinions. This volume of information can be analyzed automatically to discover those opinions. In this paper, we present a new hybrid approach for blog classification, CARs, based on a four-step process. First, we extract our dataset from blogs. Second, we preprocess the corpus using lexicon-based tools and identify the opinion holders. Third, we classify the corpus using our new algorithm, Semantic Association Classification (SAC). Finally, we represent the generated classes with a chart visualization tool. Experiments carried out on real blogs confirm the soundness of our approach.