ABSTRACT
The web as it stands today is sometimes called the
“syntactic web” because it consists entirely of syntactic
constructs. Syntactic rules govern HTML, the primary
language of the web, and its supporting technologies
(CSS, DOM, XHTML, etc.). Hyperlinks add connectivity
and structure to the web as a whole, but still do not enable
the construction of meaning from existing web pages. Text
mining, link analysis, and some heuristics (for example,
“give more weight to words appearing in header
elements”) do enable search engines and similar tools to
deduce enough about page content to provide users with
reasonable results to queries, but they cannot be said to
truly discern meaning. Currently, a human reader is
required to determine the semantics of a web page or group
of pages; indeed, many popular online resources
rely on human reviewers.