ABSTRACT

The idea of using trees to represent evolution dates to Darwin. If there is an ancestral sequence, labeled by root, then each branching of the tree represents the time of a divergence. There are two distinct features of rooted or unrooted trees. First is the branching structure or topology of the tree. Second is the branch lengths of the tree, which often indicate the distance in time or evolutionary events between the vertices. A tree is a cycle-free connected graph. A binary tree has all vertices of degree one or of degree three. A tree is labeled if its terminal vertices are labeled. Some easy combinatorics for these trees is in the next proposition. The techniques of cluster analysis have long been applied to distance data. The idea of the first class of algorithms is to repeatedly merge or cluster pairs of labels. The algorithms are called pair group methods (PGM).