ABSTRACT

The web as it stands now is sometimes referred to as
the “syntactic web” because it consists entirely of syntactic
constructs. Syntactic rules govern HTML, the primary
language of the web, along with its supporting technologies
(CSS, DOM, XHTML, etc.). Hyperlinks add connectivity
and structure to the web as a whole, but still do not enable
the construction of meaning from existing web pages. Text
mining, link analysis, and some heuristics (for example,
“give more weight to words appearing in header
elements”) do enable search engines (and the like) to
deduce enough about page content to provide users with
reasonable results to queries, but they cannot be said to
truly discern meaning. Currently, a human reader is
required to determine the semantics of a web page or group
of pages; indeed, many of the popular online resources are
based on human reviewers.