ABSTRACT

Gasoline classification is an important problem in environmental and forensic applications. Several classification algorithms exist that attempt to assign gasoline samples to their correct categories. We demonstrate a method that improves classification performance by maximizing the hit rate without using a priori knowledge of the compounds in the gasoline samples. The method combines two steps: a variable reduction technique that removes redundant information from the data set while minimizing multivariate structural distortion, and a greedy Expectation-Maximization (EM) algorithm that optimally tunes the parameters of a Gaussian mixture model (GMM). Using gas chromatography-mass spectrometry (GCMS) spectral data, the method first clusters the samples into premium and regular gasoline and then discriminates each cluster into its winter and summer subgroups. Approximately 89% of the samples were correctly classified as premium or regular gasoline, and 98.8% of the samples were correctly classified according to their seasonal characteristics.