ABSTRACT

We have developed methods to identify online communities, or groups, by combining structural and content variables drawn from weblog posts and their comments to build a characteristic footprint for each group. We have worked with both explicitly connected groups and ‘abstract’ groups, in which individuals are connected by shared interest (as determined by content-based features) and behavior (metadata-based features) rather than by explicit links. We find that these variables are effective at identifying groups, placing members within a group, and helping to determine the appropriate granularity for group boundaries. The group footprint can then be used to identify differences between online groups.