ABSTRACT

This chapter focuses on the techniques that can be used to analyze the contents of the dark web. A useful starting point for understanding web content analysis is the familiar example of how surface web indexing engines work. The surface web suffers from a well-known problem of content replication and duplicate sites, which can severely degrade the quality of search results because the same content appears repeatedly in the listings. Unlike most results on the surface web, a significant portion of deep web results consists of non-HTML content. It is important for certain people and institutions, such as law enforcement agencies, to keep abreast of the sites that exist on the deep web. The media has been criticized for presenting a jaundiced view of the deep web, often covering it as a dangerous part of the internet where all manner of crimes take place.
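
To make the duplicate-content problem concrete, the following is a minimal sketch, not taken from the chapter, of how an indexer might collapse replicated pages by hashing normalized content. The function names and example URLs are hypothetical illustrations.

```python
import hashlib

def normalize(html_text: str) -> str:
    # Crude normalization: lowercase and collapse whitespace so that
    # trivially reformatted copies of the same page hash identically.
    return " ".join(html_text.lower().split())

def deduplicate(pages: dict[str, str]) -> dict[str, str]:
    # Keep only the first URL seen for each distinct content hash,
    # dropping replicated pages that would otherwise clutter results.
    seen: set[str] = set()
    unique: dict[str, str] = {}
    for url, body in pages.items():
        digest = hashlib.sha256(normalize(body).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique[url] = body
    return unique

# Hypothetical example: two mirrors of the same page collapse to one entry.
pages = {
    "http://example.com/a": "<p>Same   Content</p>",
    "http://mirror.example.com/a": "<p>same content</p>",
    "http://example.com/b": "<p>Different content</p>",
}
print(list(deduplicate(pages)))  # two URLs survive
```

Real indexing engines use more robust techniques, such as shingling or locality-sensitive hashing, to catch near-duplicates that exact hashing would miss; the sketch above illustrates only the basic principle.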