ABSTRACT

Several human organs and biological structures possess readily identiable

shapes The shapes of the human heart brain kidneys and several bones

are well known and in normal cases do not deviate much from an aver

age shape However disease processes can aect the structure of organs

and cause deviation from their expected or average shapes Even abnormal

entities such as masses and calcications in the breast tend to demonstrate

dierences in shape between benign and malignant conditions For exam

ple most benign masses in the breast appear as wellcircumscribed areas on

mammograms with smooth boundaries that are circular or oval some benign

masses may be macrolobulated On the other hand malignant masses can

cerous tumors are typically illdened on mammograms and possess a rough

or stellate starlike shape with strands or spicules appearing to radiate from

a central mass some malignant masses may be microlobulated

Shape is a key feature in discriminating between normal and abnormal cells

in Papsmear tests However biological entities demonstrate wide

ranges of manifestation with signicant overlap between their characteris

tics for various categories Furthermore it should be borne in mind that the

imaging geometry DtoD projection and the superimposition of multi

ple objects commonly aect the shapes of objects as perceived on biomedical

images

Several techniques have been proposed to characterize shape

We shall study a selection of shape analysis techniques in this chapter A

few applications will be described to demonstrate the usefulness of shape

characteristics in the analysis of biomedical images

Representation of Shapes and Contours

The most general form of representation of a contour in discretized space

is in terms of the x y coordinates of the digitized points pixels along

the contour A contour with N points could be represented by the series of

coordinates fxn yn g n N Observe that there is no gray

level associated with the pixels along a contour A contour may be depicted

as a binary or bilevel image

Signatures of contours

The dimensionality of representation of a contour may be reduced from two

to one by converting from a coordinatebased representation to distances from

each contour point to a reference point A convenient reference is the centroid

or center of mass of the contour whose coordinates are given by

x

N

N

X

n

xn and y

N

N

X

n

yn

The signature of the contour is then dened as

dn

p

xn x

yn y

n N see Figure It should be noted that the centroids of

regions that are concave or have holes could lie outside the regions

A radialdistance signature may also be derived by computing the distance

from the centroid to the contour points intersected for angles of the radial

line spanning the range

o

o

However for irregular contours such a

signature may be multivalued for some angles that is a radial line may

intersect the contour more than once see for example Pohlman et al

It is obvious that going around a contour more than once generates the same

signature hence the signature signal is periodic with the period equal to N

the number of pixels on the contour The signature of a contour provides

general information on the nature of the contour such as its smoothness or

roughness

Examples Figures a and a show the contours of a benign breast

mass and a malignant tumor respectively as observed on mammograms

The marks within the contours represent their centroids Figures b

and b show the signatures of the contours as dened in Equation

It is evident that the smooth contour of the benign mass possesses a smooth

signature whereas the spiculated malignant tumor has a rough signature with

several signicant rapid variations over its period

Chain coding

An ecient representation of a contour may be achieved by specifying the

x y coordinates of an arbitrary starting point on the contour the direction of

traversal clockwise or counterclockwise and a code to indicate the manner

of movement to reach the next contour point on a discrete grid A coarse

representation may be achieved by using only four possible movements to the

point at the left of right of above or below the current point as indicated in

Figure a A ner representation may be achieved by using eight possible

movements including diagonal movements as indicated in Figure b

The sequence of codes required to traverse through all the points along the

FIGURE

A contour represented by its boundary points zn and distances dn to its

centroid

contour is known as the chain code The technique was proposed by

Freeman and is also known as the Freeman chain code

The chain code facilitates more compact representation of a contour than

the direct specication of the x y coordinates of all of its points Except

the initial point the representation of each point on the contour requires only

two or three bits depending upon the type of code used Furthermore chain

coding provides the following advantages

The code is invariant to shift or translation because the starting point

is kept out of the code

To a certain extent the chain code is invariant to size scaling Con

tours of dierent sizes may be generated from the same code by using

dierent sampling grids step sizes A contour may also be enlarged by

a factor of n by repeating each code element n times and maintaining

the same sampling grid A contour may be shrunk to half of the

original size by reducing pairs of code elements to single numbers with

approximation of unequal pairs by their averages reduced to integers

The chain code may be normalized for rotation by taking the rst dif

ference of the code and adding or to negative dierences depending

upon the code used

a

b

FIGURE

a Contour of a benign breast mass N The mark represents the

centroid of the contour b Signature dn as dened in Equation

a

b

FIGURE

a Contour of a malignant breast tumor N The mark represents

the centroid of the contour b Signature dn as dened in Equation

With reference to the symbol code the rotation of a given contour by

n

o

in the counterclockwise direction may be achieved by adding a

value of n to each code element followed by integer division by The

addition of an odd number rotates the contour by the corresponding

multiple of

o

however the rotation of a contour by angles other than

integral multiples of

o

on a discrete grid is subject to approximation

In the case of the symbol code the length of a contour is given by the

number of even codes plus

p

times the number of odd codes multiplied

by the grid sampling interval

The chain code may also be used to achieve reduction check for closure

check for multiple loops and determine the area of a closed loop

Examples Figure shows a contour represented using the chain codes

with four and eight symbols The use of a discrete grid with large spacings

leads to the loss of ne detail in the contour However this feature may be

used advantageously to lter out minor irregularities due to noise artifacts

due to drawing by hand etc

FIGURE

Chain code with a four directional codes and b eight directional codes

Segmentation of contours

The segmentation of a contour into a set of piecewisecontinuous curves is a

useful step before analysis and modeling Segmentation may be performed by

locating the points of inection on the contour

Consider a function fx Let f

x f

x and f

x represent the rst

second and third derivatives of fx A point of inection of the function or

Chain code

Figure a

curve fx is dened as a point where f

x changes its sign Note that the

derivation of f

x requires fx and f

x to be continuous and dierentiable

It follows that the following conditions apply at a point of inection

f

x

f

x

f

x f

x and

f

x f

x

Let C fxn yn g n N represent in vector form

the x y coordinates of the N points on the given contour The points of

inection on the contour are obtained by solving

C

C

C

C

where C

C

and C

are the rst second and third derivatives of C

respectively and represents the vector cross product Solving Equation

is equivalent to solving the system of equations given by

x

n y

n x

n y

n

Chain code

b

FIGURE

A closed contour represented using the chain code a using four directional

codes as in Figure a and b with eight directional codes as in Figure

b The o mark represents the starting point of the contour which is

traversed in the clockwise direction to derive the code

x

n y

n x

n y

n

where x

n y

n x

n y

n x

n and y

n are the rst second and

third derivatives of xn and yn respectively

Segments of contours of breast masses between successive points of inection

were modeled as parabolas by Menut et al Diculty lies in segmen

tation because the contours of masses are in general not smooth False or

irrelevant points of inection could appear on relatively straight parts of a

contour when x

n and y

n are not far from zero In order to address this

problem smoothed derivatives at each contour point could be estimated by

considering the cumulative sum of weighted dierences of a certain number of

pairs of points on either side of the point xn under consideration as

x

n

m

X

i

xn i xn i

i

where m represents the number of pairs of points used to compute the deriva

tive x

n the same procedure applies to the computation of y

n

In the works reported by Menut et al and Rangayyan et al

the value of m was varied from to to compute derivatives that resulted

in varying numbers of inection points for a given contour The number of

inection points detected as a function of the number of dierences used was

analyzed to determine the optimal number of dierences that would provide

the most appropriate inection points the value of m at the rst straight

segment on the function was selected

Examples Figure shows the contour of a spiculated malignant tumor

The points of inection detected are marked with The number of inec

tion points detected is plotted in Figure as a function of the number of

dierences used m in Equation the horizontal and vertical lines indi

cate the optimal number of dierences used to compute the derivative at each

contour point and the corresponding number of points of inection that were

located on the contour

The contour in Figure is shown in Figure overlaid on the corre

sponding part of the original mammogram Segments of the contours are

shown in black or white indicating if they are concave or convex respec

tively Figure provides a similar illustration for a circumscribed benign

mass Analysis of concavity of contours is described in Section

Polygonal modeling of contours

Pavlidis and Horowitz and Pavlidis and Ali proposed methods

for segmentation and approximation of curves and shapes by polygons for

computer recognition of handwritten numerals cell outlines and ECG signals

Ventura and Chen presented an algorithm for segmenting and polygonal

modeling of D curves in which the number of segments is to be prespecied

FIGURE

Contour of a spiculated malignant tumor with the points of inection indicated

by Number of points of inection See also Figure

FIGURE

Number of inection points detected as a function of the number of dierences

used to estimate the derivative for the contour in Figure The horizontal

and vertical lines indicate the optimal number of dierences used to compute

the derivative at each contour point and the corresponding number of points

of inection that were located on the contour

FIGURE

Concave and convex parts of the contour of a spiculated malignant tumor

separated by the points of inection See also Figure The concave parts

are shown in black and the convex parts in white The image size is

pixels or mm with a pixel size of m Shape factors

f

cc

SI cf Reproduced with permission from RM

Rangayyan NR Mudigonda and JEL Desautels Boundary modeling and

shape analysis methods for classication of mammographic masses Medical

and Biological Engineering and Computing

c

IFMBE

FIGURE

Concave and convex parts of the contour of a circumscribed benign mass

separated by the points of inection The concave parts are shown in black and

the convex parts in white The image size is pixels or mm

with a pixel size of m Shape factors f

cc

SI cf

Reproduced with permission from RM Rangayyan NR Mudigonda

and JEL Desautels Boundary modeling and shape analysis methods for

classication of mammographic masses Medical and Biological Engineering

and Computing

c

IFMBE

for initiating the process in relation to the complexity of the shape This is

not a desirable step when dealing with complex or spiculated shapes of breast

tumors In a modied approach proposed by Rangayyan et al the

polygon formed by the points of inection detected on the original contour was

used as the initial input to the polygonal modeling procedure This step helps

in automating the polygonalization algorithm the method does not require

any interaction from the user in terms of the initial number of segments

Given an irregular contour C as specied by the set of its x y coordinates

the polygonal modeling algorithm starts by dividing the contour into a set

of piecewisecontinuous curved parts by locating the points of inection on

the contour as explained in Section Each segmented curved part is

represented by a pair of linear segments based on its arctochord deviation

The procedure is iterated subject to predened boundary conditions so as to

minimize the error between the true length of the contour and the cumulative

length computed from the polygonal segments

Let C fxn yn g n N represent the given contour

Let SC

mk

SC

mk

C m M be M curved parts each con

taining a set of contour points at the start of the k

th

iteration such that

SC

k

S

SC

k

S

S

SC

Mk

C The iterative procedure proposed by

Rangayyan et al is as follows

In each curved part represented by SC

mk

the arctochord distance

is computed for all the points and the point on the curve with the

maximum arctochord deviation d

max

is located

If d

max

mm pixels in the images with a pixel size of m

used in the work of Rangayyan et al the curved part is segmented

at the point of maximum deviation to approximate the same with a

pair of linear segments irrespective of the length of the resulting linear

segments If mm d

max

mm the curved part is segmented

at the point of maximum deviation subject to the condition that the

resulting linear segments satisfy a minimumlength criterion which was

specied as mm in the work of Rangayyan et al If d

max

mm the curved part SC

mk

is considered to be almost linear and is

not segmented any further

After performing Steps and on all the curved parts of the contour

available in the current k

th

iteration the resulting vector of the poly

gons vertices is updated

If the number of polygonal segments following the k

th

iteration equals

that of the previous iteration the algorithm is considered to have con

verged and the polygonalization process is terminated Otherwise the

procedure Steps to is repeated until the algorithm converges

The criterion for choosing the threshold for arctochord deviation was based

on the assumption that any segment possessing a smaller deviation is insignif

icant in the analysis of contours of breast masses

Examples Figure a shows the points of inection denoted by

and the initial stage of polygonal modeling straightline segments of the

contour of a spiculated malignant tumor see also Figure Figure b

shows the nal result of polygonal modeling of the same contour The al

gorithm converged after four iterations as shown by the convergence plot in

Figure The result of the application of the polygonal modeling algorithm

to the contour of a circumscribed benign mass is shown in Figure

The number of linear segments required for the approximation of a contour

increases with its shape complexity polygons with the number of sides in the

range were used in the work of Rangayyan et al to model

contours of breast masses and tumors The number of iterations required for

the convergence of the algorithm did not vary much for dierent mass contour

shapes remaining within the range This is due to the fact that the

relative complexity of the contour to be segmented is taken into considera

tion during the initial preprocessing step of locating the points of inection

hence the subsequent polygonalization process is robust and computationally

ecient The algorithm performed well and delivered satisfactory results on

various irregular shapes of spiculated cases of benign and malignant masses

Parabolic modeling of contours

Menut et al proposed the modeling of segments of contours of breast

masses between successive points of inection as parabolas An inspection

of the segments of the contours illustrated in Figures and a see

also Figures and indicates that most of the curved portions between

successive points inection lend themselves well to modeling as parabolas

Some of the segments are relatively straight however such segments may

not contribute much to the task of discrimination between benign masses and

malignant tumors

Let us consider a segment of a contour represented in the continuous D

space by the points xs ys over the interval S

s S

where s indicates

distance along the contour and S

and S

are the endpoints of the segment

Let us now consider the approximation of the curve by a parabola Regardless

of the position and orientation of the given curve let us consider the simplest

representation of a parabola as Y AX

in the coordinate space XY The

parameter A controls the narrowness of the parabola the larger the value of

A the narrower is the parabola Allowing for a rotation of and a shift of

c d between the x y and XY spaces we have

xs Xs cos Y s sin c

ys Xs sin Y s cos d

a b

FIGURE

Polygonal modeling of the contour of a spiculated malignant tumor a Points

of inection indicated by and the initial polygonal approximation

straightline segments number of sides b Final model after four

iterations number of sides See also Figure Reproduced with

permission from RM Rangayyan NR Mudigonda and JEL Desautels

Boundary modeling and shape analysis methods for classication of mam

mographic masses Medical and Biological Engineering and Computing

c

IFMBE

FIGURE

Convergence plot of the iterative polygonal modeling procedure for the con

tour of the spiculated malignant tumor in Figure Reproduced with

permission from RM Rangayyan NR Mudigonda and JEL Desautels

Boundary modeling and shape analysis methods for classication of mam

mographic masses Medical and Biological Engineering and Computing

c

IFMBE

a b

FIGURE

Polygonal modeling of the contour of a circumscribed benign mass a Points

of inection indicated by and the initial polygonal approximation

straightline segments initial number of sides b Final model num

ber of sides number of iterations See also Figure Reproduced

with permission from RM Rangayyan NR Mudigonda and JEL Desau

tels Boundary modeling and shape analysis methods for classication of

mammographic masses Medical and Biological Engineering and Computing

c

IFMBE

We also have the following relationships

Xs xs c cos ys d sin

Y s xs c sin ys d cos

Xs s

Y s A s

Taking the derivatives of Equation with respect to s we get the following

X

s

Y

s As

X

s

Y

s A

Similarly taking the derivatives of Equation with respect to s we get the

following

X

s x

s cos y

s sin

Y

s x

s sin y

s cos

Combining Equations and we get

X

s x

s cos y

s sin

which upon multiplication with sin yields

x

s sin cos y

s sin sin

Similarly we also get

Y

s A x

s sin y

s cos

which upon multiplication with cos yields

A cos x

s sin cos y

s cos cos

Combining Equations and we get

A cos y

s

The equations above indicate that y

s and x

s are constants with values

related to A and The values of the two derivatives may be computed

from the given curve over all available points and averaged to obtain the

corresponding constant values Equations and may then be solved

simultaneously to obtain and A Thus the parameter of the parabolic model

is obtained from the given contour segment

Menut et al hypothesized that malignant tumors due to the presence of

narrow spicules or microlobulations would have several parabolic segments

with large values of A on the other hand benign masses due to their char

acteristics of being oval or macrolobulated would have a small number of

parabolic segments with small values of A The same reasons were also ex

pected to lead to a larger standard deviation of A for malignant tumors than

for benign masses In addition to the parameter A Menut et al proposed to

use the width of the projection of each parabola on to the X axis with the

expectation that its values would be smaller for malignant tumors than for

benign masses A classication accuracy of was obtained with a set of

contours

Thinning and skeletonization

Objects that are linear or oblong or structures that have branching anos

tomotic patterns may be eectively characterized by their skeletons The

skeleton of an object or region is obtained by its medialaxis transform or via

a thinning algorithm

The medialaxis transformation proposed by Blum is as follows First

the given image needs to be binarized so as to include only the patterns of

interest Let the set of pixels in the binary pattern be denoted as B let C be

the set of contour pixels of B and let c

i

be an arbitrary contour point in C

For each point b in B a point c

i

is found such that the distance between the

point b and c

i

represented as db c

i

is at its minimum If a second point c

k

is found in C such that db c

k

db c

i

then b is a part of the skeleton of

B otherwise b is not a part of the skeleton

A simple algorithm for thinning is as follows Assume that

the image has been binarized with the pixels inside the ROIs being labeled

as and the background pixels as A contour point is dened as any pixel

having the value and at least one connected neighbor valued Let the

connected neighboring pixels of the pixel p

being processed be indexed as

p

p

p

p

p

p

p

p

p

Flag a contour point p

for deletion if the following are true

a Np

b Sp

c p

p

p

d p

p

p

where Np

is the number of nonzero neighbors of p

and Sp

is the

number of transitions in the sequence p

p

p

p

Delete all agged pixels

Do the same as Step above replacing the conditions c and d with

c p

p

p

d p

p

p

Delete all agged pixels

Iterate Steps until no further pixels are deleted

The algorithm described above has the properties that it does not remove

end points does not break connectivity and does not cause excessive erosion

of the region

Example Figure a shows a pattern of blood vessels in a section

of a ligament The vessels were perfused with black ink prior to

extraction of the tissue for study Figure b shows the skeleton of the

image in part a of the gure It is seen that the skeleton represents the

general orientational pattern and overall shape of the blood vessels in the

original image However information regarding the variation in the thickness

diameter of the blood vessels is lost in skeletonization Eng et al

studied the eect of injury and healing on the microvascular structure of

ligaments by analyzing the statistics of the volume and directional distribution

of blood vessels as illustrated in Figure see Section for details

Shape Factors

Although contours may be eectively characterized by representations and

models such as the chain code and the polygonal model described in the

preceding section it is often desirable to encode the nature or form of a

contour using a small number of measures commonly referred to as shape

factors The nature of the contour to be encapsulated in the measures may

vary from one application to another Regardless of the application a few

basic properties are essential for ecient representation of which the most

important are

invariance to shift in spatial position

invariance to rotation and

invariance to scaling enlargement or reduction

Invariance to reection may also be desirable in some applications Shape fac

tors that meet the criteria listed above can eectively and eciently represent

contours for pattern classication

a b

FIGURE

a Binarized image of blood vessels in a ligament perfused with black ink Im

age courtesy of RC Bray and MR Doschak University of Calgary b Skele

ton of the image in a after iterations of the algorithm described in Sec

tion

A basic method that is commonly used to represent shape is to t an ellipse

or a rectangle to the given closed contour The ratio of the major axis of

the ellipse to its minor axis or equivalently the ratio of the larger side

to the smaller side of the bounding rectangle is known as its eccentricity

and represents its deviation from a circle for which the ratio will be equal to

unity Such a measure however represents only the elongation of the object

and may have on its own limited application in practice Several shape

factors of increasing complexity and specicity of application are described in

the following sections

Compactness

Compactness is a simple and popular measure of the eciency of a contour

to contain a given area and is commonly dened as

Co

P

A

where P and A are the contour perimeter and area enclosed respectively The

smaller the area contained by a contour of a given length the larger will be the

value of compactness Compactness as dened in Equation has a lower

bound of for a circle except for the trivial case of zero for P but

no upper bound It is evident that compactness is invariant to shift scaling

rotation and reection of a contour

In order to restrict and normalize the range of the parameter to as

well as to obtain increasing values with increase in complexity of the shape

the denition of compactness may be modied as

cf

A

P

With this expression cf has a lower bound of zero for a circle and increases

with the complexity of the contour to a maximum value of unity

Examples Figure illustrates a few simple geometric shapes along

with their values of compactness Elongated contours with large values of the

perimeter and small enclosed areas possess high values of compactness

Figure illustrates a few objects with simple geometric shapes including

scaling and rotation the values of compactness Co and cf for the contours

of the objects are listed in Table It is evident that

both denitions of compactness provide the desired invariance to scaling and

rotation within the limitations due to the use of a discrete grid

Figure illustrates a few objects of varying shape complexity prepared

by cutting construction paper The values of compactness

cf for the contours of the objects are listed in Table It is seen that

compactness increases with shape roughness and or complexity

FIGURE

Examples of contours with their values of compactness Co and cf as dened

in Equations and a Circle b Square c Rectangle with sides

equal to and units d Rectangle with sides equal to and

units e Rightangled triangle of height and base units

FIGURE

A set of simple geometric shapes including scaling and rotation created on

a discrete grid to test shape factors Reproduced with permission from L

Shen RM Rangayyan and JEL Desautels Application of shape analysis

to mammographic calcications IEEE Transactions on Medical Imaging

c

IEEE

B i o m e d i c a l I m a g e A n a l y s i s

TABLE

Shape Factors for the Shapes in Figure

Shape Co cf F

F

F

F

F

F

mf ff

Large circle a

Medium circle b

Small circle c

Large square d

Medium square e

Rotated square f

Small square g

Large rectangle i

Medium rectangle j

Rotated rectangle k

Small rectangle l

Large isosceles triangle m

Medium isosceles triangle n

Rotated isosceles triangle o

Small isosceles triangle p

Large rightangled triangle q

Medium rightangled triangle r

Rotated rightangled triangle s

Small rightangled triangle t

FIGURE

A set of objects of varying shape complexity The objects were prepared by

cutting construction paper The contours of the objects include imperfections

and artifacts Reproduced with permission from L Shen RM Rangayyan

and JEL Desautels Application of shape analysis to mammographic cal

cications IEEE Transactions on Medical Imaging

c

IEEE

Shen et al applied compactness to shape analysis of mammo

graphic calcications The details of this application are presented in Sections

and The use of compactness in benignversusmalignant classication

of breast masses is discussed in Sections and

Moments

Statistical moments of PDFs and other data distributions have been utilized

as pattern features in a number of applications the same concepts have been

extended to the analysis of images and contours Given

a D continuous image fx y the regular moments m

pq

of order p q are

TABLE

Shape Factors for the Objects in

Figure Arranged in Increasing Order

of ff

Shape cf mf ff Type

Circle

Circle

Ellipse

Rectangle

Rectangle

Rectangle

Pentagon

Triangle

Triangle

Other

Other

Other

dened as

m

pq

Z

Z

x

p

y

q

fx y dx dy

for p q A uniqueness theorem states that if fx y is

piecewise continuous and has nonzero values only in a nite part of the x y

plane then moments of all orders exist and the moment sequence m

pq

p q

is uniquely determined by fx y Conversely the sequence m

pq

uniquely determines fx y

The central moments are dened with respect to the centroid of the image

as

pq

Z

Z

x x

p

y y

q

fx y dx dy

where

x

m

m

y

m

m

Observe that the gray levels of the pixels provide weights for the moments as

dened above If moments are to be computed for a contour only the contour

pixels would be used with weights equal to unity the internal pixels would

have weights of zero and eectively do not participate in the computation of

the moments

For an M N digital image the integrals are replaced by summations for

example Equation becomes

pq

M

X

m

N

X

n

m x

p

n y

q

fmn

The central moments have the following relationships

m

m

x

m

x y

m

y

m

m

x x

m

m

y m

x x

y

m

m

x m

y x y

m

m

y y

Normalization with respect to size is achieved by dividing each of the mo

ments by

where

pq

to obtain the normalized moments as

pq

pq

Hu see also Gonzalez and Woods dened a set of seven shape

factors that are functions of the secondorder and thirdorder central moments

as follows

M

M

M

M

M

M

M

The shape factors M

through M

are invariant to shift scaling and ro

tation within limits imposed by representation on a discrete grid and have

been found to be useful for pattern analysis Rangayyan et al computed

several versions of the factors M

through M

for breast masses and tu

mors using the mass ROIs with and without their gray levels as well as the

contours of the masses with and without their gray levels The features pro

vided benignversusmalignant classication accuracies in the range

see Section

Moments of distances to the centroid When an ROI is represented

using only its contour an alternative denition of moments is based upon

a sequence that represents the Euclidean distances between the centroid of

the region and all of the points or pixels along the contour shown as dn in

Figure The distances from the center of a circle to its contour points are all

equal to the radius of the circle the variance of the values is zero On the other

hand for rough shapes the distances will vary considerably see Figures

and for examples The variance and higherorder moments of the distance

values could be expected to provide indicators of shape complexity The p

th

moment of the sequence dn is dened as

m

p

N

N

X

n

dn

p

and the p

th

central moment is dened as

M

p

N

N

X

n

dn m

p

The corresponding normalized moments are dened as

m

p

m

p

M

p

N

N

X

n

dn

p

N

N

X

n

dn m

p

M

p

M

p

M

p

N

N

X

n

dn m

p

N

N

X

n

dn m

p

Gupta and Srinath showed that the normalized moments m

p

and M

p

are invariant to translation rotation and scaling This set of moments in an

innite series reversibly represents the shape of a contour

Although moments of any arbitrarily large order can be derived from a

contour and used as features for shape classication highorder moments are

sensitive to noise and hence the resulting classier will be less tolerant to

noise Therefore Gupta and Srinath selected four normalized loworder

moments to form a set of shape features as follows

F

M

m

N

N

X

n

dn m

m

F

M

M

N

N

X

n

dn m

N

N

X

n

dn m

F

M

M

N

N

X

n

dn m

N

N

X

n

dn m

A study by Shen et al showed that the variations in F

and F

for diering shape complexity are small and do not show a simple progression

Furthermore F

was observed to vary signicantly for the same geometric

shape with scaling and rotation see Figure and Table In order to

overcome these limitations F

and F

were modied by Shen et al

as follows

F

M

m

N

N

X

n

dn m

m

F

M

m

N

N

X

n

dn m

m

Compared with the feature set proposed by Gupta and Srinath the

set fF

F

F

g has the following properties

All of the three features are directly comparable

F

describes the roughness of a contour better than F

In general the

larger the value of F

the rougher is the contour

Although F

was observed by Shen et al to have better invariance

with respect to size and rotation for a given geometric shape it showed no

better variation than F

across the shape categories tested However it was

shown that the combination mf F

F

is a good indicator of shape

roughness because the fourthorder term in F

will be much larger than the

secondorder term in F

as the contour becomes rougher Also mf provides

the desired invariance for a given contour type as well as the desired variation

across the various shape categories see Figure and Table Note that

the denition of the features F

and F

makes it possible to perform the

subtraction directly and that mf is limited to the range

The values of mf for the contours of the objects in Figure are listed in

Table Observe that the values of mf do not demonstrate the same trends

as those of the other shape factors listed in Tables and for the same

contours the shape factors characterize dierent notions of shape complexity

see Section

Chordlength statistics

Methods to characterize D closed contours using their chordlength distri

butions were proposed by You and Jain A chordlength measure L

k

is

dened as the length of the line segment that links a pair of contour points

normalized by the length of the longest chord The complete set of chords

for a given object consists of all possible chords drawn from every boundary

pixel to every other boundary pixel see Figure You and Jain considered

the K NN unique chords of the N boundary points of an object

as a sample distribution set and computed the KolmogorovSmirnov KS

statistics of the chordlength distribution for use as shape factors as follows

M

c

K

K

X

k

L

k

M

c

K

K

X

k

L

k

M

c

M

c

M

c

K

K

X

k

L

k

M

c

M

c

M

c

K

K

X

k

L

k

M

c

The measures listed above in order represent the mean variance skewness

and kurtosis of the chordlength distributions

The chordlength statistics are invariant to translation scaling and ro

tation and are robust in the presence of noise and distortion in the shape

FIGURE

The set of all possible chords for a contour with N boundary points

There exist K NN unique chords including the sides of the

polygonal contour in the example The contour points are shown in

regular font the chord numbers are shown in italics

boundary The method has a major disadvantage it is possible for contours

of dierent shapes to have the same chordlength distribution

The technique was applied by You and Jain to the boundary maps of seven

countries and six machine parts with dierent levels of resolution and the

results indicated good discrimination between the shapes Rangayyan et

al applied chordlength statistics to the analysis of contours of breast

masses but obtained accuracies of no more than in discriminating be

tween benign masses and malignant tumors see Section

Fourier Descriptors

Given a contour with N points having the coordinates fxn yn g n

N we could form a complex sequence zn xn j yn

n N see Figure Traversing the contour it is evident

that zn is a periodic signal with a period of N samples The sequence jzn j

may be used as a signature of the contour

Periodic signals lend themselves to analysis via the Fourier series Given a

discretespace sequence zn we could derive its Fourier series as one period

of its DFT Zk dened as

Zk

N

N

X

n

zn exp

j

N

nk

k

N

N

The frequency index k could be interpreted

as the index of the harmonics of a fundamental frequency The contour sample

sequence zn is given by the inverse DFT as

zn

N

N

X

k

N

Zk exp

j

N

nk

n N The coecients Zk are known as the Fourier de

scriptors of the contour zn Note Other denitions of Fourier

descriptors exist

Observe that Zk represents a twosided complex spectrum with fold

ing or shifting of the DFT or FFT array the frequency index would run as

k

N

N

without folding as usually provided by

common DFT or FFT algorithms the coecients are provided in the order

N

N

A few important properties and characteristics of Fourier descriptors are as

follows

The frequency index k represents the harmonic number or order in a

Fourier series representation of the periodic signal zn The fundamen

tal frequency k represents a sinusoid in each of the coordinates x

and y that exhibits one period while traversing once around the closed

contour zn

Diering from the usual spectra of real signals Zk does not possess

conjugate symmetry due to the complex nature of zn

The zerofrequency DC coecient Z represents the centroid or cen

ter of mass x y as

Z

N

N

X

n

zn x y

Each of the fundamental frequency coecients Z and Z repre

sents a circle The set of coecients fZ Z g represents an ellipse

Highorder Fourier descriptors represent ne details or rapid excursions

of the contour

The rotation of a contour by an angle may be expressed as z

n zn

expj where z

n represents the rotated contour Rotation leads to

an additional phase component as Z

k Zk expj

If zn represents the points along a contour obtained by traversing the

contour in the clockwise direction and z

n represents the points ob

tained by traversing in the counterclockwise direction we have z

n

zn zN n and Z

k Zk

Shifting or translating zn by x

o

y

o

to obtain z

n zn x

o

j y

o

leads to an additional DC component as Z

k Zk x

o

j y

o

k

Scaling a contour as z

n zn leads to a similar scaling of the

Fourier descriptors as Z

k Zk

Shifting the starting point by n

o

samples expressed as z

n zn

n

o

leads to an additional linearphase component with Z

k Zk

exp j

N

n

o

k

Fourier descriptors may be ltered in a manner similar to the ltering of

signals and images in the frequency domain The full set or a subset of the

coecients may also be used to represent the contour and to derive shape

factors

Examples Figure shows the normalized Fourier descriptors up to

k only for the squares and rightangled triangles in Figure It is

seen that the magnitude of the normalized Fourier descriptors is invariant to

scaling and rotation

a

b

FIGURE

Normalized Fourier descriptors NFD up to k for a the squares and b the

rightangled triangles in Figure Each gure shows the NFD for four objects

however due to invariance with respect to scaling and rotation the functions overlap

completely Reproduced with permission from L Shen RM Rangayyan and JEL

Desautels Application of shape analysis to mammographic calci cations IEEE

Transactions on Medical Imaging

c

IEEE

Figures and show the jzn j signatures and Fourierdescriptor

sequences for the benignmass and malignanttumor contours shown in Fig

ures a and a It is evident that the signatures reect the smoothness

or roughness of the contours the Fourier descriptors of the spiculated contour

indicate the presence of more highfrequency energy than those of the nearly

oval contour of the benign mass

Figure shows the results of ltering the benignmass contour in Figure

a using Fourier descriptors The coecients Z and Z provide two

circles with one of them tting the contour better than the other depending

upon the direction of traversal of the contour The combined use of Z

and Z has provided an ellipse that ts the original contour well The use

of additional Fourier descriptors has provided contours that t the original

contour better

Figure shows the results of ltering the malignanttumor contour in

Figure a using Fourier descriptors The inclusion of more highorder

coecients has led to contours that approximate the original contour to better

levels of t The ltered contours illustrate clearly the role played by high

order Fourier descriptors in representing the ner details of the given contour

Persoon and Fu showed that Fourier descriptors may be used to char

acterize the skeletons of objects with applications in character recognition

and machinepart recognition Lin and Chellappa showed that the clas

sication of D shapes based on Fourier descriptors is accurate even when

of the data are missing

Shape factor based upon Fourier descriptors In a procedure pro

posed by Shen et al to derive a single shape factor the Fourier

descriptors are normalized as follows Z is set equal to zero in order to

make the descriptors independent of position and each coecient is divided

by the magnitude of Z in order to normalize for size After these steps

the magnitudes of the Fourier descriptors are independent of position size

orientation and starting point of the contour note that the orientation and

starting point aect only the phase of the Fourier descriptors The normalized

Fourier descriptors Z

o

k are dened as

Z

o

k

k

Zk

jZ j

otherwise

Note For normalization as above the points of the contour must be in

dexed from to N in counterclockwise order in the opposite case

jZ j should be used

Contours with sharp excursions possess more highfrequency energy than

smooth contours However applying a weighting factor that increases with

frequency leads to unbounded values that are also sensitive to noise A shape

factor ff based upon the normalized Fourier descriptors was dened by Shen

a

b

FIGURE

a Signature with jzn j of the benignmass contour in Figure a

b Magnitude of the Fourier descriptors shown only for k

a

b

FIGURE

a Signature with jzn j of the malignanttumor contour in Figure a

b Magnitude of the Fourier descriptors shown only for k

a b

FIGURE

Filtering of the benignmass contour in Figure a using Fourier descrip

tors a Using coecients for k smaller circle in dashed line k

larger circle in dashed line and k f g ellipse in solid line b Us

ing coecients for k dashed line and k solid line The

original contour is indicated with a dotted line for reference

a b

FIGURE

Filtering of the malignanttumor contour in Figure a using Fourier de

scriptors a Using coecients for k smaller circle in dashed line

k larger circle in dashed line and k f g ellipse in solid line

b Using coecients for k dashed line and k solid

line The original contour is indicated with a dotted line for reference

et al as

ff

P

N

kN

jZ

o

k jjkj

P

N

kN

jZ

o

k j

The advantage of this measure is that it is limited to the range and is

not sensitive to noise which would not be the case if weights increasing with

frequency were used ff is invariant to translation rotation starting point

and contour size and increases in value as the object shape becomes more

complex and rough

Other forms of weighting could be used in Equation to derive several

variants or dierent shape factors based upon Fourier descriptors For exam

ple the normalized frequency given by

jkj

N

could be used to provide weights

increasing with frequency and the computation limited to frequencies up to

a fraction of the highest available frequency such as in order to limit the

eect of noise and highfrequency artifacts Highorder moments could also

be computed by using powers of the normalized frequency Subtraction from

unity as in Equation could then be removed so as to obtain shape factors

that increase with roughness

The values of ff for the contours of the objects in Figures and are

listed in Tables and The values of ff do not demonstrate the same

trends as those of the other shape factors listed in the tables for the same

contours Several shape factors that characterize dierent notions of shape

complexity may be required for ecient pattern classication of contours in

some applications see Sections and for illustrations

Malignant calcications that have elongated and rough contours lead to

larger ff values than benign calcications that are mostly smooth round

or oval in shape Furthermore tumors with microlobulations and jagged

boundaries are expected to have larger ff values than masses with smooth or

macrolobulated boundaries Shen et al applied ff to shape analysis

of mammographic calcications The details of this application are presented

in Section Rangayyan et al used ff to discriminate between benign

breast masses and malignant tumors and obtained an accuracy of see

Section Sahiner et al tested the classication performance of sev

eral shape factors and texture measures with a dataset of benign breast

masses and malignant tumors ff was found to give the best individual

performance with an accuracy of

Fractional Concavity

Most benign mass contours are smooth oval or have major portions of con

vex macrolobulations Some benign masses may have minor concavities and

spicules On the other hand malignant tumors typically possess both con

cave and convex segments as well as microlobulations and prominent spicules

Rangayyan et al proposed a measure of fractional concavity f

cc

of

contours to characterize and quantify these properties

In order to compute f

cc

after performing segmentation of the contour as

explained in Section the individual segments between successive inec

tion points are labeled as concave or convex parts A convex part is dened as

a segment of the contour that encloses a portion of the mass inside of the con

tour whereas a concave part is one formed by the presence of a background

region within the segment Figure shows a section of a mammogram with

a circumscribed benign mass overlaid with the contour drawn by a radiolo

gist specialized in mammography the black and white portions represent the

concave and convex parts respectively Figure shows a similar result of

the analysis of the contour of a spiculated malignant tumor

The contours used in the work of Rangayyan et al were manually

drawn and included artifacts and minor modulations that could lead to in

ecient representation for pattern classication The polygonal modeling

procedure described in Section was applied in order to reduce the eect

of the artifacts The cumulative length of the concave segments was com

puted using the polygonal model and normalized by the total length of the

contour to obtain f

cc

It is obvious that f

cc

is limited to the range and

is independent of rotation shift and the size scaling of the contour The

performance of f

cc

in discriminating between benign masses and malignant

tumors is illustrated in Sections and

Lee et al proposed an irregularity index for the classication of cuta

neous melanocytic lesions based upon their contours The index was derived

via an analysis of the curvature of the contour and the detection of local in

dentations concavities and protrusions convexities The irregularity index

was observed to have a higher correlation with clinical assessment of the le

sions than other shape factors based upon compactness see Section

and fractal analysis see Section

Analysis of Spicularity

It is known that invasive carcinomas due to their nature of inltration into

surrounding tissues form narrow stellate distortions or spicules at their

boundaries Based upon this observation Rangayyan et al proposed

a spiculation index SI to represent the degree of spiculation of a mass con

tour In order to emphasize narrow spicules and microlobulations a weight

ing factor was included to enhance the contributions of narrow spicules in the

computation of SI

For each curved part of a mass contour or the corresponding polygonal

model segment obtained as described in Sections and the ratio of

its length to the base width can represent the degree of narrowness or spicula

tion A nonlinear weighting function was proposed by Rangayyan et al

based upon the segments length S and angle of spiculation to deliver pro

gressively increasing weighting with increase in the narrowness of spiculation

of each segment Spicule candidates were identied as portions of the contour

delimited by pairs of successive points of inection The polygonal model

obtained as described in Section was used to compute the parameters

S and for each spicule candidate

If a spicule includesM polygonal segments then there existM angles at

the points of intersection of the successive segments Let s

m

m M

be the polygonal segments and !