ABSTRACT
Several human organs and biological structures possess readily identiable
shapes The shapes of the human heart brain kidneys and several bones
are well known and in normal cases do not deviate much from an aver
age shape However disease processes can aect the structure of organs
and cause deviation from their expected or average shapes Even abnormal
entities such as masses and calcications in the breast tend to demonstrate
dierences in shape between benign and malignant conditions For exam
ple most benign masses in the breast appear as wellcircumscribed areas on
mammograms with smooth boundaries that are circular or oval some benign
masses may be macrolobulated On the other hand malignant masses can
cerous tumors are typically illdened on mammograms and possess a rough
or stellate starlike shape with strands or spicules appearing to radiate from
a central mass some malignant masses may be microlobulated
Shape is a key feature in discriminating between normal and abnormal cells
in Papsmear tests However biological entities demonstrate wide
ranges of manifestation with signicant overlap between their characteris
tics for various categories Furthermore it should be borne in mind that the
imaging geometry DtoD projection and the superimposition of multi
ple objects commonly aect the shapes of objects as perceived on biomedical
images
Several techniques have been proposed to characterize shape
We shall study a selection of shape analysis techniques in this chapter A
few applications will be described to demonstrate the usefulness of shape
characteristics in the analysis of biomedical images
Representation of Shapes and Contours
The most general form of representation of a contour in discretized space
is in terms of the x y coordinates of the digitized points pixels along
the contour A contour with N points could be represented by the series of
coordinates fxn yn g n N Observe that there is no gray
level associated with the pixels along a contour A contour may be depicted
as a binary or bilevel image
Signatures of contours
The dimensionality of representation of a contour may be reduced from two
to one by converting from a coordinatebased representation to distances from
each contour point to a reference point A convenient reference is the centroid
or center of mass of the contour whose coordinates are given by
x
N
N
X
n
xn and y
N
N
X
n
yn
The signature of the contour is then dened as
dn
p
xn x
yn y
n N see Figure It should be noted that the centroids of
regions that are concave or have holes could lie outside the regions
A radialdistance signature may also be derived by computing the distance
from the centroid to the contour points intersected for angles of the radial
line spanning the range
o
o
However for irregular contours such a
signature may be multivalued for some angles that is a radial line may
intersect the contour more than once see for example Pohlman et al
It is obvious that going around a contour more than once generates the same
signature hence the signature signal is periodic with the period equal to N
the number of pixels on the contour The signature of a contour provides
general information on the nature of the contour such as its smoothness or
roughness
Examples Figures a and a show the contours of a benign breast
mass and a malignant tumor respectively as observed on mammograms
The marks within the contours represent their centroids Figures b
and b show the signatures of the contours as dened in Equation
It is evident that the smooth contour of the benign mass possesses a smooth
signature whereas the spiculated malignant tumor has a rough signature with
several signicant rapid variations over its period
Chain coding
An ecient representation of a contour may be achieved by specifying the
x y coordinates of an arbitrary starting point on the contour the direction of
traversal clockwise or counterclockwise and a code to indicate the manner
of movement to reach the next contour point on a discrete grid A coarse
representation may be achieved by using only four possible movements to the
point at the left of right of above or below the current point as indicated in
Figure a A ner representation may be achieved by using eight possible
movements including diagonal movements as indicated in Figure b
The sequence of codes required to traverse through all the points along the
FIGURE
A contour represented by its boundary points zn and distances dn to its
centroid
contour is known as the chain code The technique was proposed by
Freeman and is also known as the Freeman chain code
The chain code facilitates more compact representation of a contour than
the direct specication of the x y coordinates of all of its points Except
the initial point the representation of each point on the contour requires only
two or three bits depending upon the type of code used Furthermore chain
coding provides the following advantages
The code is invariant to shift or translation because the starting point
is kept out of the code
To a certain extent the chain code is invariant to size scaling Con
tours of dierent sizes may be generated from the same code by using
dierent sampling grids step sizes A contour may also be enlarged by
a factor of n by repeating each code element n times and maintaining
the same sampling grid A contour may be shrunk to half of the
original size by reducing pairs of code elements to single numbers with
approximation of unequal pairs by their averages reduced to integers
The chain code may be normalized for rotation by taking the rst dif
ference of the code and adding or to negative dierences depending
upon the code used
a
b
FIGURE
a Contour of a benign breast mass N The mark represents the
centroid of the contour b Signature dn as dened in Equation
a
b
FIGURE
a Contour of a malignant breast tumor N The mark represents
the centroid of the contour b Signature dn as dened in Equation
With reference to the symbol code the rotation of a given contour by
n
o
in the counterclockwise direction may be achieved by adding a
value of n to each code element followed by integer division by The
addition of an odd number rotates the contour by the corresponding
multiple of
o
however the rotation of a contour by angles other than
integral multiples of
o
on a discrete grid is subject to approximation
In the case of the symbol code the length of a contour is given by the
number of even codes plus
p
times the number of odd codes multiplied
by the grid sampling interval
The chain code may also be used to achieve reduction check for closure
check for multiple loops and determine the area of a closed loop
Examples Figure shows a contour represented using the chain codes
with four and eight symbols The use of a discrete grid with large spacings
leads to the loss of ne detail in the contour However this feature may be
used advantageously to lter out minor irregularities due to noise artifacts
due to drawing by hand etc
FIGURE
Chain code with a four directional codes and b eight directional codes
Segmentation of contours
The segmentation of a contour into a set of piecewisecontinuous curves is a
useful step before analysis and modeling Segmentation may be performed by
locating the points of inection on the contour
Consider a function fx Let f
x f
x and f
x represent the rst
second and third derivatives of fx A point of inection of the function or
Chain code
Figure a
curve fx is dened as a point where f
x changes its sign Note that the
derivation of f
x requires fx and f
x to be continuous and dierentiable
It follows that the following conditions apply at a point of inection
f
x
f
x
f
x f
x and
f
x f
x
Let C fxn yn g n N represent in vector form
the x y coordinates of the N points on the given contour The points of
inection on the contour are obtained by solving
C
C
C
C
where C
C
and C
are the rst second and third derivatives of C
respectively and represents the vector cross product Solving Equation
is equivalent to solving the system of equations given by
x
n y
n x
n y
n
Chain code
b
FIGURE
A closed contour represented using the chain code a using four directional
codes as in Figure a and b with eight directional codes as in Figure
b The o mark represents the starting point of the contour which is
traversed in the clockwise direction to derive the code
x
n y
n x
n y
n
where x
n y
n x
n y
n x
n and y
n are the rst second and
third derivatives of xn and yn respectively
Segments of contours of breast masses between successive points of inection
were modeled as parabolas by Menut et al Diculty lies in segmen
tation because the contours of masses are in general not smooth False or
irrelevant points of inection could appear on relatively straight parts of a
contour when x
n and y
n are not far from zero In order to address this
problem smoothed derivatives at each contour point could be estimated by
considering the cumulative sum of weighted dierences of a certain number of
pairs of points on either side of the point xn under consideration as
x
n
m
X
i
xn i xn i
i
where m represents the number of pairs of points used to compute the deriva
tive x
n the same procedure applies to the computation of y
n
In the works reported by Menut et al and Rangayyan et al
the value of m was varied from to to compute derivatives that resulted
in varying numbers of inection points for a given contour The number of
inection points detected as a function of the number of dierences used was
analyzed to determine the optimal number of dierences that would provide
the most appropriate inection points the value of m at the rst straight
segment on the function was selected
Examples Figure shows the contour of a spiculated malignant tumor
The points of inection detected are marked with The number of inec
tion points detected is plotted in Figure as a function of the number of
dierences used m in Equation the horizontal and vertical lines indi
cate the optimal number of dierences used to compute the derivative at each
contour point and the corresponding number of points of inection that were
located on the contour
The contour in Figure is shown in Figure overlaid on the corre
sponding part of the original mammogram Segments of the contours are
shown in black or white indicating if they are concave or convex respec
tively Figure provides a similar illustration for a circumscribed benign
mass Analysis of concavity of contours is described in Section
Polygonal modeling of contours
Pavlidis and Horowitz and Pavlidis and Ali proposed methods
for segmentation and approximation of curves and shapes by polygons for
computer recognition of handwritten numerals cell outlines and ECG signals
Ventura and Chen presented an algorithm for segmenting and polygonal
modeling of D curves in which the number of segments is to be prespecied
FIGURE
Contour of a spiculated malignant tumor with the points of inection indicated
by Number of points of inection See also Figure
FIGURE
Number of inection points detected as a function of the number of dierences
used to estimate the derivative for the contour in Figure The horizontal
and vertical lines indicate the optimal number of dierences used to compute
the derivative at each contour point and the corresponding number of points
of inection that were located on the contour
FIGURE
Concave and convex parts of the contour of a spiculated malignant tumor
separated by the points of inection See also Figure The concave parts
are shown in black and the convex parts in white The image size is
pixels or mm with a pixel size of m Shape factors
f
cc
SI cf Reproduced with permission from RM
Rangayyan NR Mudigonda and JEL Desautels Boundary modeling and
shape analysis methods for classication of mammographic masses Medical
and Biological Engineering and Computing
c
IFMBE
FIGURE
Concave and convex parts of the contour of a circumscribed benign mass
separated by the points of inection The concave parts are shown in black and
the convex parts in white The image size is pixels or mm
with a pixel size of m Shape factors f
cc
SI cf
Reproduced with permission from RM Rangayyan NR Mudigonda
and JEL Desautels Boundary modeling and shape analysis methods for
classication of mammographic masses Medical and Biological Engineering
and Computing
c
IFMBE
for initiating the process in relation to the complexity of the shape This is
not a desirable step when dealing with complex or spiculated shapes of breast
tumors In a modied approach proposed by Rangayyan et al the
polygon formed by the points of inection detected on the original contour was
used as the initial input to the polygonal modeling procedure This step helps
in automating the polygonalization algorithm the method does not require
any interaction from the user in terms of the initial number of segments
Given an irregular contour C as specied by the set of its x y coordinates
the polygonal modeling algorithm starts by dividing the contour into a set
of piecewisecontinuous curved parts by locating the points of inection on
the contour as explained in Section Each segmented curved part is
represented by a pair of linear segments based on its arctochord deviation
The procedure is iterated subject to predened boundary conditions so as to
minimize the error between the true length of the contour and the cumulative
length computed from the polygonal segments
Let C fxn yn g n N represent the given contour
Let SC
mk
SC
mk
C m M be M curved parts each con
taining a set of contour points at the start of the k
th
iteration such that
SC
k
S
SC
k
S
S
SC
Mk
C The iterative procedure proposed by
Rangayyan et al is as follows
In each curved part represented by SC
mk
the arctochord distance
is computed for all the points and the point on the curve with the
maximum arctochord deviation d
max
is located
If d
max
mm pixels in the images with a pixel size of m
used in the work of Rangayyan et al the curved part is segmented
at the point of maximum deviation to approximate the same with a
pair of linear segments irrespective of the length of the resulting linear
segments If mm d
max
mm the curved part is segmented
at the point of maximum deviation subject to the condition that the
resulting linear segments satisfy a minimumlength criterion which was
specied as mm in the work of Rangayyan et al If d
max
mm the curved part SC
mk
is considered to be almost linear and is
not segmented any further
After performing Steps and on all the curved parts of the contour
available in the current k
th
iteration the resulting vector of the poly
gons vertices is updated
If the number of polygonal segments following the k
th
iteration equals
that of the previous iteration the algorithm is considered to have con
verged and the polygonalization process is terminated Otherwise the
procedure Steps to is repeated until the algorithm converges
The criterion for choosing the threshold for arctochord deviation was based
on the assumption that any segment possessing a smaller deviation is insignif
icant in the analysis of contours of breast masses
Examples Figure a shows the points of inection denoted by
and the initial stage of polygonal modeling straightline segments of the
contour of a spiculated malignant tumor see also Figure Figure b
shows the nal result of polygonal modeling of the same contour The al
gorithm converged after four iterations as shown by the convergence plot in
Figure The result of the application of the polygonal modeling algorithm
to the contour of a circumscribed benign mass is shown in Figure
The number of linear segments required for the approximation of a contour
increases with its shape complexity polygons with the number of sides in the
range were used in the work of Rangayyan et al to model
contours of breast masses and tumors The number of iterations required for
the convergence of the algorithm did not vary much for dierent mass contour
shapes remaining within the range This is due to the fact that the
relative complexity of the contour to be segmented is taken into considera
tion during the initial preprocessing step of locating the points of inection
hence the subsequent polygonalization process is robust and computationally
ecient The algorithm performed well and delivered satisfactory results on
various irregular shapes of spiculated cases of benign and malignant masses
Parabolic modeling of contours
Menut et al proposed the modeling of segments of contours of breast
masses between successive points of inection as parabolas An inspection
of the segments of the contours illustrated in Figures and a see
also Figures and indicates that most of the curved portions between
successive points inection lend themselves well to modeling as parabolas
Some of the segments are relatively straight however such segments may
not contribute much to the task of discrimination between benign masses and
malignant tumors
Let us consider a segment of a contour represented in the continuous D
space by the points xs ys over the interval S
s S
where s indicates
distance along the contour and S
and S
are the endpoints of the segment
Let us now consider the approximation of the curve by a parabola Regardless
of the position and orientation of the given curve let us consider the simplest
representation of a parabola as Y AX
in the coordinate space XY The
parameter A controls the narrowness of the parabola the larger the value of
A the narrower is the parabola Allowing for a rotation of and a shift of
c d between the x y and XY spaces we have
xs Xs cos Y s sin c
ys Xs sin Y s cos d
a b
FIGURE
Polygonal modeling of the contour of a spiculated malignant tumor a Points
of inection indicated by and the initial polygonal approximation
straightline segments number of sides b Final model after four
iterations number of sides See also Figure Reproduced with
permission from RM Rangayyan NR Mudigonda and JEL Desautels
Boundary modeling and shape analysis methods for classication of mam
mographic masses Medical and Biological Engineering and Computing
c
IFMBE
FIGURE
Convergence plot of the iterative polygonal modeling procedure for the con
tour of the spiculated malignant tumor in Figure Reproduced with
permission from RM Rangayyan NR Mudigonda and JEL Desautels
Boundary modeling and shape analysis methods for classication of mam
mographic masses Medical and Biological Engineering and Computing
c
IFMBE
a b
FIGURE
Polygonal modeling of the contour of a circumscribed benign mass a Points
of inection indicated by and the initial polygonal approximation
straightline segments initial number of sides b Final model num
ber of sides number of iterations See also Figure Reproduced
with permission from RM Rangayyan NR Mudigonda and JEL Desau
tels Boundary modeling and shape analysis methods for classication of
mammographic masses Medical and Biological Engineering and Computing
c
IFMBE
We also have the following relationships
Xs xs c cos ys d sin
Y s xs c sin ys d cos
Xs s
Y s A s
Taking the derivatives of Equation with respect to s we get the following
X
s
Y
s As
X
s
Y
s A
Similarly taking the derivatives of Equation with respect to s we get the
following
X
s x
s cos y
s sin
Y
s x
s sin y
s cos
Combining Equations and we get
X
s x
s cos y
s sin
which upon multiplication with sin yields
x
s sin cos y
s sin sin
Similarly we also get
Y
s A x
s sin y
s cos
which upon multiplication with cos yields
A cos x
s sin cos y
s cos cos
Combining Equations and we get
A cos y
s
The equations above indicate that y
s and x
s are constants with values
related to A and The values of the two derivatives may be computed
from the given curve over all available points and averaged to obtain the
corresponding constant values Equations and may then be solved
simultaneously to obtain and A Thus the parameter of the parabolic model
is obtained from the given contour segment
Menut et al hypothesized that malignant tumors due to the presence of
narrow spicules or microlobulations would have several parabolic segments
with large values of A on the other hand benign masses due to their char
acteristics of being oval or macrolobulated would have a small number of
parabolic segments with small values of A The same reasons were also ex
pected to lead to a larger standard deviation of A for malignant tumors than
for benign masses In addition to the parameter A Menut et al proposed to
use the width of the projection of each parabola on to the X axis with the
expectation that its values would be smaller for malignant tumors than for
benign masses A classication accuracy of was obtained with a set of
contours
Thinning and skeletonization
Objects that are linear or oblong or structures that have branching anos
tomotic patterns may be eectively characterized by their skeletons The
skeleton of an object or region is obtained by its medialaxis transform or via
a thinning algorithm
The medialaxis transformation proposed by Blum is as follows First
the given image needs to be binarized so as to include only the patterns of
interest Let the set of pixels in the binary pattern be denoted as B let C be
the set of contour pixels of B and let c
i
be an arbitrary contour point in C
For each point b in B a point c
i
is found such that the distance between the
point b and c
i
represented as db c
i
is at its minimum If a second point c
k
is found in C such that db c
k
db c
i
then b is a part of the skeleton of
B otherwise b is not a part of the skeleton
A simple algorithm for thinning is as follows Assume that
the image has been binarized with the pixels inside the ROIs being labeled
as and the background pixels as A contour point is dened as any pixel
having the value and at least one connected neighbor valued Let the
connected neighboring pixels of the pixel p
being processed be indexed as
p
p
p
p
p
p
p
p
p
Flag a contour point p
for deletion if the following are true
a Np
b Sp
c p
p
p
d p
p
p
where Np
is the number of nonzero neighbors of p
and Sp
is the
number of transitions in the sequence p
p
p
p
Delete all agged pixels
Do the same as Step above replacing the conditions c and d with
c p
p
p
d p
p
p
Delete all agged pixels
Iterate Steps until no further pixels are deleted
The algorithm described above has the properties that it does not remove
end points does not break connectivity and does not cause excessive erosion
of the region
Example Figure a shows a pattern of blood vessels in a section
of a ligament The vessels were perfused with black ink prior to
extraction of the tissue for study Figure b shows the skeleton of the
image in part a of the gure It is seen that the skeleton represents the
general orientational pattern and overall shape of the blood vessels in the
original image However information regarding the variation in the thickness
diameter of the blood vessels is lost in skeletonization Eng et al
studied the eect of injury and healing on the microvascular structure of
ligaments by analyzing the statistics of the volume and directional distribution
of blood vessels as illustrated in Figure see Section for details
Shape Factors
Although contours may be eectively characterized by representations and
models such as the chain code and the polygonal model described in the
preceding section it is often desirable to encode the nature or form of a
contour using a small number of measures commonly referred to as shape
factors The nature of the contour to be encapsulated in the measures may
vary from one application to another Regardless of the application a few
basic properties are essential for ecient representation of which the most
important are
invariance to shift in spatial position
invariance to rotation and
invariance to scaling enlargement or reduction
Invariance to reection may also be desirable in some applications Shape fac
tors that meet the criteria listed above can eectively and eciently represent
contours for pattern classication
a b
FIGURE
a Binarized image of blood vessels in a ligament perfused with black ink Im
age courtesy of RC Bray and MR Doschak University of Calgary b Skele
ton of the image in a after iterations of the algorithm described in Sec
tion
A basic method that is commonly used to represent shape is to t an ellipse
or a rectangle to the given closed contour The ratio of the major axis of
the ellipse to its minor axis or equivalently the ratio of the larger side
to the smaller side of the bounding rectangle is known as its eccentricity
and represents its deviation from a circle for which the ratio will be equal to
unity Such a measure however represents only the elongation of the object
and may have on its own limited application in practice Several shape
factors of increasing complexity and specicity of application are described in
the following sections
Compactness
Compactness is a simple and popular measure of the eciency of a contour
to contain a given area and is commonly dened as
Co
P
A
where P and A are the contour perimeter and area enclosed respectively The
smaller the area contained by a contour of a given length the larger will be the
value of compactness Compactness as dened in Equation has a lower
bound of for a circle except for the trivial case of zero for P but
no upper bound It is evident that compactness is invariant to shift scaling
rotation and reection of a contour
In order to restrict and normalize the range of the parameter to as
well as to obtain increasing values with increase in complexity of the shape
the denition of compactness may be modied as
cf
A
P
With this expression cf has a lower bound of zero for a circle and increases
with the complexity of the contour to a maximum value of unity
Examples Figure illustrates a few simple geometric shapes along
with their values of compactness Elongated contours with large values of the
perimeter and small enclosed areas possess high values of compactness
Figure illustrates a few objects with simple geometric shapes including
scaling and rotation the values of compactness Co and cf for the contours
of the objects are listed in Table It is evident that
both denitions of compactness provide the desired invariance to scaling and
rotation within the limitations due to the use of a discrete grid
Figure illustrates a few objects of varying shape complexity prepared
by cutting construction paper The values of compactness
cf for the contours of the objects are listed in Table It is seen that
compactness increases with shape roughness and or complexity
FIGURE
Examples of contours with their values of compactness Co and cf as dened
in Equations and a Circle b Square c Rectangle with sides
equal to and units d Rectangle with sides equal to and
units e Rightangled triangle of height and base units
FIGURE
A set of simple geometric shapes including scaling and rotation created on
a discrete grid to test shape factors Reproduced with permission from L
Shen RM Rangayyan and JEL Desautels Application of shape analysis
to mammographic calcications IEEE Transactions on Medical Imaging
c
IEEE
B i o m e d i c a l I m a g e A n a l y s i s
TABLE
Shape Factors for the Shapes in Figure
Shape Co cf F
F
F
F
F
F
mf ff
Large circle a
Medium circle b
Small circle c
Large square d
Medium square e
Rotated square f
Small square g
Large rectangle i
Medium rectangle j
Rotated rectangle k
Small rectangle l
Large isosceles triangle m
Medium isosceles triangle n
Rotated isosceles triangle o
Small isosceles triangle p
Large rightangled triangle q
Medium rightangled triangle r
Rotated rightangled triangle s
Small rightangled triangle t
FIGURE
A set of objects of varying shape complexity The objects were prepared by
cutting construction paper The contours of the objects include imperfections
and artifacts Reproduced with permission from L Shen RM Rangayyan
and JEL Desautels Application of shape analysis to mammographic cal
cications IEEE Transactions on Medical Imaging
c
IEEE
Shen et al applied compactness to shape analysis of mammo
graphic calcications The details of this application are presented in Sections
and The use of compactness in benignversusmalignant classication
of breast masses is discussed in Sections and
Moments
Statistical moments of PDFs and other data distributions have been utilized
as pattern features in a number of applications the same concepts have been
extended to the analysis of images and contours Given
a D continuous image fx y the regular moments m
pq
of order p q are
TABLE
Shape Factors for the Objects in
Figure Arranged in Increasing Order
of ff
Shape cf mf ff Type
Circle
Circle
Ellipse
Rectangle
Rectangle
Rectangle
Pentagon
Triangle
Triangle
Other
Other
Other
dened as
m
pq
Z
Z
x
p
y
q
fx y dx dy
for p q A uniqueness theorem states that if fx y is
piecewise continuous and has nonzero values only in a nite part of the x y
plane then moments of all orders exist and the moment sequence m
pq
p q
is uniquely determined by fx y Conversely the sequence m
pq
uniquely determines fx y
The central moments are dened with respect to the centroid of the image
as
pq
Z
Z
x x
p
y y
q
fx y dx dy
where
x
m
m
y
m
m
Observe that the gray levels of the pixels provide weights for the moments as
dened above If moments are to be computed for a contour only the contour
pixels would be used with weights equal to unity the internal pixels would
have weights of zero and eectively do not participate in the computation of
the moments
For an M N digital image the integrals are replaced by summations for
example Equation becomes
pq
M
X
m
N
X
n
m x
p
n y
q
fmn
The central moments have the following relationships
m
m
x
m
x y
m
y
m
m
x x
m
m
y m
x x
y
m
m
x m
y x y
m
m
y y
Normalization with respect to size is achieved by dividing each of the mo
ments by
where
pq
to obtain the normalized moments as
pq
pq
Hu see also Gonzalez and Woods dened a set of seven shape
factors that are functions of the secondorder and thirdorder central moments
as follows
M
M
M
M
M
M
M
The shape factors M
through M
are invariant to shift scaling and ro
tation within limits imposed by representation on a discrete grid and have
been found to be useful for pattern analysis Rangayyan et al computed
several versions of the factors M
through M
for breast masses and tu
mors using the mass ROIs with and without their gray levels as well as the
contours of the masses with and without their gray levels The features pro
vided benignversusmalignant classication accuracies in the range
see Section
Moments of distances to the centroid When an ROI is represented
using only its contour an alternative denition of moments is based upon
a sequence that represents the Euclidean distances between the centroid of
the region and all of the points or pixels along the contour shown as dn in
Figure The distances from the center of a circle to its contour points are all
equal to the radius of the circle the variance of the values is zero On the other
hand for rough shapes the distances will vary considerably see Figures
and for examples The variance and higherorder moments of the distance
values could be expected to provide indicators of shape complexity The p
th
moment of the sequence dn is dened as
m
p
N
N
X
n
dn
p
and the p
th
central moment is dened as
M
p
N
N
X
n
dn m
p
The corresponding normalized moments are dened as
m
p
m
p
M
p
N
N
X
n
dn
p
N
N
X
n
dn m
p
M
p
M
p
M
p
N
N
X
n
dn m
p
N
N
X
n
dn m
p
Gupta and Srinath showed that the normalized moments m
p
and M
p
are invariant to translation rotation and scaling This set of moments in an
innite series reversibly represents the shape of a contour
Although moments of any arbitrarily large order can be derived from a
contour and used as features for shape classication highorder moments are
sensitive to noise and hence the resulting classier will be less tolerant to
noise Therefore Gupta and Srinath selected four normalized loworder
moments to form a set of shape features as follows
F
M
m
N
N
X
n
dn m
m
F
M
M
N
N
X
n
dn m
N
N
X
n
dn m
F
M
M
N
N
X
n
dn m
N
N
X
n
dn m
A study by Shen et al showed that the variations in F
and F
for diering shape complexity are small and do not show a simple progression
Furthermore F
was observed to vary signicantly for the same geometric
shape with scaling and rotation see Figure and Table In order to
overcome these limitations F
and F
were modied by Shen et al
as follows
F
M
m
N
N
X
n
dn m
m
F
M
m
N
N
X
n
dn m
m
Compared with the feature set proposed by Gupta and Srinath the
set fF
F
F
g has the following properties
All of the three features are directly comparable
F
describes the roughness of a contour better than F
In general the
larger the value of F
the rougher is the contour
Although F
was observed by Shen et al to have better invariance
with respect to size and rotation for a given geometric shape it showed no
better variation than F
across the shape categories tested However it was
shown that the combination mf F
F
is a good indicator of shape
roughness because the fourthorder term in F
will be much larger than the
secondorder term in F
as the contour becomes rougher Also mf provides
the desired invariance for a given contour type as well as the desired variation
across the various shape categories see Figure and Table Note that
the denition of the features F
and F
makes it possible to perform the
subtraction directly and that mf is limited to the range
The values of mf for the contours of the objects in Figure are listed in
Table Observe that the values of mf do not demonstrate the same trends
as those of the other shape factors listed in Tables and for the same
contours the shape factors characterize dierent notions of shape complexity
see Section
Chordlength statistics
Methods to characterize D closed contours using their chordlength distri
butions were proposed by You and Jain A chordlength measure L
k
is
dened as the length of the line segment that links a pair of contour points
normalized by the length of the longest chord The complete set of chords
for a given object consists of all possible chords drawn from every boundary
pixel to every other boundary pixel see Figure You and Jain considered
the K NN unique chords of the N boundary points of an object
as a sample distribution set and computed the KolmogorovSmirnov KS
statistics of the chordlength distribution for use as shape factors as follows
M
c
K
K
X
k
L
k
M
c
K
K
X
k
L
k
M
c
M
c
M
c
K
K
X
k
L
k
M
c
M
c
M
c
K
K
X
k
L
k
M
c
The measures listed above in order represent the mean variance skewness
and kurtosis of the chordlength distributions
The chordlength statistics are invariant to translation scaling and ro
tation and are robust in the presence of noise and distortion in the shape
FIGURE
The set of all possible chords for a contour with N boundary points
There exist K NN unique chords including the sides of the
polygonal contour in the example The contour points are shown in
regular font the chord numbers are shown in italics
boundary The method has a major disadvantage it is possible for contours
of dierent shapes to have the same chordlength distribution
The technique was applied by You and Jain to the boundary maps of seven
countries and six machine parts with dierent levels of resolution and the
results indicated good discrimination between the shapes Rangayyan et
al applied chordlength statistics to the analysis of contours of breast
masses but obtained accuracies of no more than in discriminating be
tween benign masses and malignant tumors see Section
Fourier Descriptors
Given a contour with N points having the coordinates fxn yn g n
N we could form a complex sequence zn xn j yn
n N see Figure Traversing the contour it is evident
that zn is a periodic signal with a period of N samples The sequence jzn j
may be used as a signature of the contour
Periodic signals lend themselves to analysis via the Fourier series Given a
discretespace sequence zn we could derive its Fourier series as one period
of its DFT Zk dened as
Zk
N
N
X
n
zn exp
j
N
nk
k
N
N
The frequency index k could be interpreted
as the index of the harmonics of a fundamental frequency The contour sample
sequence zn is given by the inverse DFT as
zn
N
N
X
k
N
Zk exp
j
N
nk
n N The coecients Zk are known as the Fourier de
scriptors of the contour zn Note Other denitions of Fourier
descriptors exist
Observe that Zk represents a twosided complex spectrum with fold
ing or shifting of the DFT or FFT array the frequency index would run as
k
N
N
without folding as usually provided by
common DFT or FFT algorithms the coecients are provided in the order
N
N
A few important properties and characteristics of Fourier descriptors are as
follows
The frequency index k represents the harmonic number or order in a
Fourier series representation of the periodic signal zn The fundamen
tal frequency k represents a sinusoid in each of the coordinates x
and y that exhibits one period while traversing once around the closed
contour zn
Diering from the usual spectra of real signals Zk does not possess
conjugate symmetry due to the complex nature of zn
The zerofrequency DC coecient Z represents the centroid or cen
ter of mass x y as
Z
N
N
X
n
zn x y
Each of the fundamental frequency coecients Z and Z repre
sents a circle The set of coecients fZ Z g represents an ellipse
Highorder Fourier descriptors represent ne details or rapid excursions
of the contour
The rotation of a contour by an angle may be expressed as z
n zn
expj where z
n represents the rotated contour Rotation leads to
an additional phase component as Z
k Zk expj
If zn represents the points along a contour obtained by traversing the
contour in the clockwise direction and z
n represents the points ob
tained by traversing in the counterclockwise direction we have z
n
zn zN n and Z
k Zk
Shifting or translating zn by x
o
y
o
to obtain z
n zn x
o
j y
o
leads to an additional DC component as Z
k Zk x
o
j y
o
k
Scaling a contour as z
n zn leads to a similar scaling of the
Fourier descriptors as Z
k Zk
Shifting the starting point by n
o
samples expressed as z
n zn
n
o
leads to an additional linearphase component with Z
k Zk
exp j
N
n
o
k
Fourier descriptors may be ltered in a manner similar to the ltering of
signals and images in the frequency domain The full set or a subset of the
coecients may also be used to represent the contour and to derive shape
factors
Examples Figure shows the normalized Fourier descriptors up to
k only for the squares and rightangled triangles in Figure It is
seen that the magnitude of the normalized Fourier descriptors is invariant to
scaling and rotation
a
b
FIGURE
Normalized Fourier descriptors NFD up to k for a the squares and b the
rightangled triangles in Figure Each gure shows the NFD for four objects
however due to invariance with respect to scaling and rotation the functions overlap
completely Reproduced with permission from L Shen RM Rangayyan and JEL
Desautels Application of shape analysis to mammographic calci cations IEEE
Transactions on Medical Imaging
c
IEEE
Figures and show the jzn j signatures and Fourierdescriptor
sequences for the benignmass and malignanttumor contours shown in Fig
ures a and a It is evident that the signatures reect the smoothness
or roughness of the contours the Fourier descriptors of the spiculated contour
indicate the presence of more highfrequency energy than those of the nearly
oval contour of the benign mass
Figure shows the results of ltering the benignmass contour in Figure
a using Fourier descriptors The coecients Z and Z provide two
circles with one of them tting the contour better than the other depending
upon the direction of traversal of the contour The combined use of Z
and Z has provided an ellipse that ts the original contour well The use
of additional Fourier descriptors has provided contours that t the original
contour better
Figure shows the results of ltering the malignanttumor contour in
Figure a using Fourier descriptors The inclusion of more highorder
coecients has led to contours that approximate the original contour to better
levels of t The ltered contours illustrate clearly the role played by high
order Fourier descriptors in representing the ner details of the given contour
Persoon and Fu showed that Fourier descriptors may be used to char
acterize the skeletons of objects with applications in character recognition
and machinepart recognition Lin and Chellappa showed that the clas
sication of D shapes based on Fourier descriptors is accurate even when
of the data are missing
Shape factor based upon Fourier descriptors In a procedure pro
posed by Shen et al to derive a single shape factor the Fourier
descriptors are normalized as follows Z is set equal to zero in order to
make the descriptors independent of position and each coecient is divided
by the magnitude of Z in order to normalize for size After these steps
the magnitudes of the Fourier descriptors are independent of position size
orientation and starting point of the contour note that the orientation and
starting point aect only the phase of the Fourier descriptors The normalized
Fourier descriptors Z
o
k are dened as
Z
o
k
k
Zk
jZ j
otherwise
Note For normalization as above the points of the contour must be in
dexed from to N in counterclockwise order in the opposite case
jZ j should be used
Contours with sharp excursions possess more highfrequency energy than
smooth contours However applying a weighting factor that increases with
frequency leads to unbounded values that are also sensitive to noise A shape
factor ff based upon the normalized Fourier descriptors was dened by Shen
a
b
FIGURE
a Signature with jzn j of the benignmass contour in Figure a
b Magnitude of the Fourier descriptors shown only for k
a
b
FIGURE
a Signature with jzn j of the malignanttumor contour in Figure a
b Magnitude of the Fourier descriptors shown only for k
a b
FIGURE
Filtering of the benignmass contour in Figure a using Fourier descrip
tors a Using coecients for k smaller circle in dashed line k
larger circle in dashed line and k f g ellipse in solid line b Us
ing coecients for k dashed line and k solid line The
original contour is indicated with a dotted line for reference
a b
FIGURE
Filtering of the malignanttumor contour in Figure a using Fourier de
scriptors a Using coecients for k smaller circle in dashed line
k larger circle in dashed line and k f g ellipse in solid line
b Using coecients for k dashed line and k solid
line The original contour is indicated with a dotted line for reference
et al as
ff
P
N
kN
jZ
o
k jjkj
P
N
kN
jZ
o
k j
The advantage of this measure is that it is limited to the range and is
not sensitive to noise which would not be the case if weights increasing with
frequency were used ff is invariant to translation rotation starting point
and contour size and increases in value as the object shape becomes more
complex and rough
Other forms of weighting could be used in Equation to derive several
variants or dierent shape factors based upon Fourier descriptors For exam
ple the normalized frequency given by
jkj
N
could be used to provide weights
increasing with frequency and the computation limited to frequencies up to
a fraction of the highest available frequency such as in order to limit the
eect of noise and highfrequency artifacts Highorder moments could also
be computed by using powers of the normalized frequency Subtraction from
unity as in Equation could then be removed so as to obtain shape factors
that increase with roughness
The values of ff for the contours of the objects in Figures and are
listed in Tables and The values of ff do not demonstrate the same
trends as those of the other shape factors listed in the tables for the same
contours Several shape factors that characterize dierent notions of shape
complexity may be required for ecient pattern classication of contours in
some applications see Sections and for illustrations
Malignant calcications that have elongated and rough contours lead to
larger ff values than benign calcications that are mostly smooth round
or oval in shape Furthermore tumors with microlobulations and jagged
boundaries are expected to have larger ff values than masses with smooth or
macrolobulated boundaries Shen et al applied ff to shape analysis
of mammographic calcications The details of this application are presented
in Section Rangayyan et al used ff to discriminate between benign
breast masses and malignant tumors and obtained an accuracy of see
Section Sahiner et al tested the classication performance of sev
eral shape factors and texture measures with a dataset of benign breast
masses and malignant tumors ff was found to give the best individual
performance with an accuracy of
Fractional Concavity
Most benign mass contours are smooth oval or have major portions of con
vex macrolobulations Some benign masses may have minor concavities and
spicules On the other hand malignant tumors typically possess both con
cave and convex segments as well as microlobulations and prominent spicules
Rangayyan et al proposed a measure of fractional concavity f
cc
of
contours to characterize and quantify these properties
In order to compute f
cc
after performing segmentation of the contour as
explained in Section the individual segments between successive inec
tion points are labeled as concave or convex parts A convex part is dened as
a segment of the contour that encloses a portion of the mass inside of the con
tour whereas a concave part is one formed by the presence of a background
region within the segment Figure shows a section of a mammogram with
a circumscribed benign mass overlaid with the contour drawn by a radiolo
gist specialized in mammography the black and white portions represent the
concave and convex parts respectively Figure shows a similar result of
the analysis of the contour of a spiculated malignant tumor
The contours used in the work of Rangayyan et al were manually
drawn and included artifacts and minor modulations that could lead to in
ecient representation for pattern classication The polygonal modeling
procedure described in Section was applied in order to reduce the eect
of the artifacts The cumulative length of the concave segments was com
puted using the polygonal model and normalized by the total length of the
contour to obtain f
cc
It is obvious that f
cc
is limited to the range and
is independent of rotation shift and the size scaling of the contour The
performance of f
cc
in discriminating between benign masses and malignant
tumors is illustrated in Sections and
Lee et al proposed an irregularity index for the classication of cuta
neous melanocytic lesions based upon their contours The index was derived
via an analysis of the curvature of the contour and the detection of local in
dentations concavities and protrusions convexities The irregularity index
was observed to have a higher correlation with clinical assessment of the le
sions than other shape factors based upon compactness see Section
and fractal analysis see Section
Analysis of Spicularity
It is known that invasive carcinomas due to their nature of inltration into
surrounding tissues form narrow stellate distortions or spicules at their
boundaries Based upon this observation Rangayyan et al proposed
a spiculation index SI to represent the degree of spiculation of a mass con
tour In order to emphasize narrow spicules and microlobulations a weight
ing factor was included to enhance the contributions of narrow spicules in the
computation of SI
For each curved part of a mass contour or the corresponding polygonal
model segment obtained as described in Sections and the ratio of
its length to the base width can represent the degree of narrowness or spicula
tion A nonlinear weighting function was proposed by Rangayyan et al
based upon the segments length S and angle of spiculation to deliver pro
gressively increasing weighting with increase in the narrowness of spiculation
of each segment Spicule candidates were identied as portions of the contour
delimited by pairs of successive points of inection The polygonal model
obtained as described in Section was used to compute the parameters
S and for each spicule candidate
If a spicule includesM polygonal segments then there existM angles at
the points of intersection of the successive segments Let s
m
m M
be the polygonal segments and !